shunping commented on issue #31040: URL: https://github.com/apache/beam/issues/31040#issuecomment-2569764217
To clarify how textio behaves under various content-encoding, content-type, and compression settings, I've expanded the table in the Apache Beam GitHub issue [#18390](https://github.com/apache/beam/issues/18390#issuecomment-1422729486). The table compares the behavior across two Beam SDK versions: 2.52.0 (prior to the GCSIO migration) and 2.62.0 (the upcoming release). The last column shows the proposed behavior with my fix.

A few notes on how the data was generated:

- For the first 3 x 2 x 3 rows, the text data is gzipped locally and then uploaded to GCS. The `content-type` and `content-encoding` metadata values are then adjusted manually.
- For the row marked "copy default text file", the text data is copied/uploaded to GCS directly, without gzip.
- For the row marked "copy default gzip file", the gzipped text data is copied/uploaded to GCS.
- For the row marked "copy default text file with gzip-local flag", the **text data** is uploaded to GCS with that flag: `gcloud storage cp -Z ./textio-test-data.1k.txt gs://apache-beam-samples/textio/textio-test-data.gzip-local.1k.txt.gz`.
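For anyone who wants to reproduce the local half of this setup, here is a minimal sketch of the "gzip locally" step using only the Python standard library. It is an illustration rather than the script used to produce the test data: the helper names (`gzip_file`, `looks_gzipped`, `make_fixtures`), the line contents, and the assumption that "1k" means 1000 lines are all hypothetical. Uploading to GCS and adjusting object metadata are not shown.

```python
import gzip
import tempfile
from pathlib import Path
from typing import Tuple

# Every gzip stream starts with these two bytes (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def gzip_file(src: Path, dst: Path) -> None:
    """Compress src into dst, similar to gzipping a file locally before upload."""
    dst.write_bytes(gzip.compress(src.read_bytes()))


def looks_gzipped(path: Path) -> bool:
    """Return True if the file's first two bytes are the gzip magic number."""
    with path.open("rb") as f:
        return f.read(2) == GZIP_MAGIC


def make_fixtures(workdir: Path, lines: int = 1000) -> Tuple[Path, Path]:
    """Write a plain text file and a gzipped copy of it (hypothetical contents)."""
    plain = workdir / "textio-test-data.1k.txt"
    plain.write_text("".join(f"line {i}\n" for i in range(lines)), encoding="utf-8")
    gzipped = workdir / "textio-test-data.1k.txt.gz"
    gzip_file(plain, gzipped)
    return plain, gzipped


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        plain, gzipped = make_fixtures(Path(tmp))
        for path in (plain, gzipped):
            print(path.name, "gzipped payload:", looks_gzipped(path))
```

Checking the magic bytes shows what the stored payload actually is, independent of whatever `content-type` or `content-encoding` metadata the object carries after upload.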
