pan3793 commented on PR #51182: URL: https://github.com/apache/spark/pull/51182#issuecomment-2986630488
@sandip-db TBH, the current approach (control flow based on try-catch of exceptions) seems too hacky, and I'd like to see a more detailed design of your next steps. (I don't think the question below got answered.)

> how do you define the behavior of "specify the compression at the session level"? always respect the session conf and ignore the filename suffix? or fall back to the codec suggested by the session conf when something goes wrong?

> This PR just adds decompression support.

You still need to ensure that Spark's zstd codec is compatible with Hadoop's implementation. In my experience, using the AirCompressor LZO codec to decompress files written via hadoop-lzo may silently produce randomly corrupted content, with no errors raised.
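The compatibility concern above can be probed with a round-trip check, sketched below. This is a minimal illustration, not part of the PR: it assumes Hadoop's `org.apache.hadoop.io.compress.ZStandardCodec` is on the classpath and that the Hadoop build has native libzstd support; to actually catch the hadoop-lzo/AirCompressor class of bug, the decompression side would need to be swapped for the alternative codec implementation under test.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.compress.ZStandardCodec
import org.apache.hadoop.util.ReflectionUtils

// Compress a payload with Hadoop's ZStandardCodec, then decompress it and
// compare byte-for-byte. Silent corruption (as seen with AirCompressor LZO
// reading hadoop-lzo output) shows up as a mismatch here, not as an exception.
val conf = new Configuration()
val codec = ReflectionUtils.newInstance(classOf[ZStandardCodec], conf)

val original = "line1\nline2\nline3\n".getBytes("UTF-8")

val compressed = new ByteArrayOutputStream()
val out = codec.createOutputStream(compressed)
out.write(original)
out.close()

// In a real compatibility test, this input stream would come from the
// *other* zstd implementation, e.g. the non-Hadoop codec Spark would use.
val in = codec.createInputStream(new ByteArrayInputStream(compressed.toByteArray))
val roundTripped = in.readAllBytes()
in.close()

assert(java.util.Arrays.equals(original, roundTripped),
  "decompressed bytes differ from the original payload")
```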
