grufino-blackbird commented on code in PR #22419:
URL: https://github.com/apache/beam/pull/22419#discussion_r933550128
##########
sdks/python/apache_beam/io/filesystem.py:
##########
@@ -166,6 +177,9 @@ def _initialize_decompressor(self):
       self._decompressor = bz2.BZ2Decompressor()
     elif self._compression_type == CompressionTypes.DEFLATE:
       self._decompressor = zlib.decompressobj()
+    elif self._compression_type == CompressionTypes.ZSTD:
+      self._decompressor = zstandard.ZstdDecompressor(
+          max_window_size=2147483648).decompressobj()
Review Comment:
Thank you for the suggestion. I agree, and I found this issue:
https://github.com/indygreg/python-zstandard/issues/157
Apparently the behavior is related to the compression level used. What do you
think about adding this as a code comment? The issue was closed, so it doesn't
seem like the library intends to change this, but I still think it's better to
keep the value, since it seems to work as a general recommendation (for my use
case I tested it on files of 10GB+ compressed, or 100GB+ decompressed, and it
works fine).
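
For illustration, here is a minimal sketch of how the value could be documented
in the code. The constant name `ZSTD_MAX_WINDOW_SIZE` and the helper
`make_zstd_decompressobj` are hypothetical and not part of the PR; the comment
wording is only a suggestion based on the issue linked above.

```python
# Hypothetical constant name; the PR passes the literal 2147483648 directly.
# 2147483648 bytes == 1 << 31 == 2 GiB.
# Rationale (see https://github.com/indygreg/python-zstandard/issues/157):
# the window a frame needs appears to depend on the compression level used
# to write it, so the limit is raised to let such frames be decompressed.
ZSTD_MAX_WINDOW_SIZE = 1 << 31


def make_zstd_decompressobj():
    """Return a streaming zstd decompressor with the raised window limit."""
    # Imported lazily so the constant is usable without the optional
    # third-party `zstandard` package installed.
    import zstandard

    return zstandard.ZstdDecompressor(
        max_window_size=ZSTD_MAX_WINDOW_SIZE).decompressobj()
```

Naming the value keeps the reason next to the number, so a future reader does
not have to rediscover why the limit was raised.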