kennknowles opened a new issue, #19413:
URL: https://github.com/apache/beam/issues/19413
To reduce the size of uploaded files, we decided to gzip them before upload.
Unfortunately, we noticed that the uploaded files' metadata does not include
content-encoding 'gzip'. I rechecked the code and found that there is no way
to pass the gzip encoding to
```
apache_beam.io.gcp.gcsio.GcsIO.open()
```
I also noticed that apache_beam.io.gcp.gcsio.GcsUploader doesn't support
uploading gzipped files.
To resolve this problem, we need to allow passing a gzip_encoded option, which
can be passed on to apitools.base.py.transfer in
```
GcsUploader.__init__()
```
Is there any chance the required changes could be applied soon?
*What steps to reproduce the problem?*
1. Prepare a gzip-encoded file, for example a PDF
2. Upload it to GCS using
```
from apache_beam.io.gcp.gcsio import GcsIO

def upload_gzipped_pdf(gzipped_pdf, path):
    # Write the already-gzipped bytes to GCS; no content-encoding is set.
    with GcsIO().open(path, 'w') as f:
        f.write(gzipped_pdf)
```
3. Try to download the uploaded file via a browser
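
For step 1, a minimal sketch of preparing the gzip-encoded payload with the Python standard library. The helper name `gzip_payload` and the placeholder bytes are assumptions for illustration only; a real PDF would be read from disk instead.

```python
import gzip


def gzip_payload(raw: bytes) -> bytes:
    """Return a gzip-compressed copy of raw (hypothetical helper)."""
    return gzip.compress(raw)


# Placeholder bytes standing in for the contents of a real PDF file.
fake_pdf = b"%PDF-1.4\n" + b"example content\n" * 100

# This is the value that would be passed as gzipped_pdf in step 2.
gzipped_pdf = gzip_payload(fake_pdf)
```

The result starts with the gzip magic bytes rather than the `%PDF` header, which matters once the object is served without any encoding metadata.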
*What is the expected result?*
The file content is displayed properly
*What happens instead?*
The document is broken
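
A rough sketch of why the download looks broken, assuming the client only decompresses a response when it is told the content encoding is gzip. The function `body_seen_by_client` is a hypothetical stand-in for that client behaviour, not part of Beam or GCS.

```python
import gzip
from typing import Optional


def body_seen_by_client(stored: bytes, content_encoding: Optional[str]) -> bytes:
    """Emulate a client that decompresses only when told the body is gzip."""
    if content_encoding == "gzip":
        return gzip.decompress(stored)
    # No encoding metadata: the raw stored bytes are handed over unchanged.
    return stored


original = b"%PDF-1.4\nexample content\n"
stored = gzip.compress(original)  # what was uploaded

# Without content-encoding metadata the client gets raw gzip bytes.
without_header = body_seen_by_client(stored, None)

# With content-encoding 'gzip' the client recovers the original document.
with_header = body_seen_by_client(stored, "gzip")
```

In this sketch only the second case yields bytes a PDF viewer could open, which is consistent with the missing content-encoding metadata described above.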
*Possible resolution after implementing expected changes*
```
from apache_beam.io.gcp.gcsio import GcsIO

def upload_gzipped_pdf(gzipped_pdf, path):
    # gzip_encoded is the proposed option; it is not supported yet.
    with GcsIO().open(path, 'w', gzip_encoded=True) as f:
        f.write(gzipped_pdf)
```
Imported from Jira
[BEAM-7411](https://issues.apache.org/jira/browse/BEAM-7411). Original Jira may
contain additional context.
Reported by: p35.