pan3793 commented on PR #51182: URL: https://github.com/apache/spark/pull/51182#issuecomment-2986630488
@sandip-db TBH, the current approach (control flow based on try-catch of exceptions) seems too hacky, and I'd like to see a more detailed design of your next steps. (I don't think the question below got answered.)

> how do you define the behavior of "specify the compression at the session level"? always respect the session conf and ignore the filename suffix? or fall back to the codec suggested by the session conf when something goes wrong?

> This PR just adds decompression support.

You still need to ensure that Spark's zstd codec is compatible with Hadoop's implementation. In my experience, using the AirCompressor LZO codec to decompress files written via hadoop-lzo may silently produce randomly corrupted content, with no errors raised.
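The compatibility concern above can be probed with a round-trip check, sketched below. This is a minimal illustration, not part of the PR: it assumes Hadoop's `org.apache.hadoop.io.compress.ZStandardCodec` is on the classpath and that the Hadoop build has native libzstd support; to actually catch the hadoop-lzo/AirCompressor class of bug, the decompression side would need to be swapped for the alternative codec implementation under test.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.compress.ZStandardCodec
import org.apache.hadoop.util.ReflectionUtils

// Compress a payload with Hadoop's ZStandardCodec, then decompress it and
// compare byte-for-byte. Silent corruption (as seen with AirCompressor LZO
// reading hadoop-lzo output) shows up as a mismatch here, not as an exception.
val conf = new Configuration()
val codec = ReflectionUtils.newInstance(classOf[ZStandardCodec], conf)

val original = "line1\nline2\nline3\n".getBytes("UTF-8")

val compressed = new ByteArrayOutputStream()
val out = codec.createOutputStream(compressed)
out.write(original)
out.close()

// In a real compatibility test, this input stream would come from the
// *other* zstd implementation, e.g. the non-Hadoop codec Spark would use.
val in = codec.createInputStream(new ByteArrayInputStream(compressed.toByteArray))
val roundTripped = in.readAllBytes()
in.close()

assert(java.util.Arrays.equals(original, roundTripped),
  "decompressed bytes differ from the original payload")
```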
