grufino-blackbird commented on code in PR #22419:
URL: https://github.com/apache/beam/pull/22419#discussion_r933550128


##########
sdks/python/apache_beam/io/filesystem.py:
##########
@@ -166,6 +177,9 @@ def _initialize_decompressor(self):
       self._decompressor = bz2.BZ2Decompressor()
     elif self._compression_type == CompressionTypes.DEFLATE:
       self._decompressor = zlib.decompressobj()
+    elif self._compression_type == CompressionTypes.ZSTD:
+      self._decompressor = zstandard.ZstdDecompressor(
+          max_window_size=2147483648).decompressobj()

Review Comment:
   Thank you for the suggestion. I agree, and I found this issue:
https://github.com/indygreg/python-zstandard/issues/157
   Apparently the behavior is related to the compression level used. What do
you think about referencing this issue in a code comment? The issue was closed,
so it doesn't seem like the library intends to handle this automatically, but I
still think it's better to keep the value, since it seems to work as a general
recommendation. For my use case, I tested files of 10GB+ compressed or 100GB+
decompressed on several different worker sizes, from 5GB to 64GB of RAM, and it
works fine.
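
   A minimal sketch of how the issue reference could sit next to the value,
assuming a named constant and a small helper. `ZSTD_MAX_WINDOW_SIZE` and
`new_zstd_decompressobj` are hypothetical names used only for illustration and
are not part of the PR; the `zstandard` calls are the same ones used in the
diff above.

```python
# Hypothetical constant name, for illustration only.
# 2147483648 bytes == 2**31 (2 GiB), the value used in this PR.
# Background on why a larger window may be needed:
# https://github.com/indygreg/python-zstandard/issues/157
ZSTD_MAX_WINDOW_SIZE = 1 << 31


def new_zstd_decompressobj(max_window_size=ZSTD_MAX_WINDOW_SIZE):
  """Returns a streaming zstd decompressor with a raised window limit.

  Hypothetical helper; it wraps the same calls as the diff above.
  """
  # Imported lazily so this sketch can be loaded without the optional package.
  import zstandard
  return zstandard.ZstdDecompressor(
      max_window_size=max_window_size).decompressobj()
```

   Naming the value keeps the link to the upstream issue attached to the number
itself, so a later reader does not have to guess where 2147483648 came from.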



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
