amassalha opened a new issue, #38271:
URL: https://github.com/apache/arrow/issues/38271

   ### Describe the enhancement requested
   
   The current gzip decompression calls 'inflate' until it returns 
'Z_STREAM_END' or an error, but according to the gzip (zlib) documentation, 
this might not be enough:
   
   " inflate() will not automatically decode concatenated gzip members. 
inflate() will return Z_STREAM_END at the end of the gzip member. The state 
would need to be reset to continue decoding a subsequent gzip member. This must 
be done if there is more data after a gzip member, in order for the 
decompression to be compliant with the gzip standard (RFC 1952)." 
(https://www.zlib.net/manual.html)
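
   The behavior quoted above can be reproduced with a short sketch. It is 
written in Python against the standard-library `zlib` bindings purely for 
illustration; it is not Arrow's C++ codec, and the helper name 
`decompress_concatenated_gzip` is hypothetical. Creating a fresh decompressor 
for each member stands in for resetting the inflate state after 
`Z_STREAM_END`.

```python
import gzip
import zlib


def decompress_concatenated_gzip(data: bytes) -> bytes:
    """Decompress a buffer holding one or more back-to-back gzip members."""
    out = []
    remaining = data
    while remaining:
        # wbits=31 (16 + 15) tells zlib to expect a gzip header and trailer.
        member = zlib.decompressobj(wbits=31)
        out.append(member.decompress(remaining))
        if not member.eof:
            # Input ended before the member's end-of-stream marker.
            raise ValueError("truncated gzip member")
        # Bytes left after the end of this member belong to the next one.
        remaining = member.unused_data
    return b"".join(out)


two_members = gzip.compress(b"hello, ") + gzip.compress(b"world")

# A single decompressor stops at the end of the first member and
# leaves the second member untouched in unused_data.
first_only = zlib.decompressobj(wbits=31)
assert first_only.decompress(two_members) == b"hello, "
assert first_only.unused_data != b""

# Restarting on the leftover bytes recovers the full payload.
assert decompress_concatenated_gzip(two_members) == b"hello, world"
```

   A C++ implementation would presumably follow the same loop shape: on 
`Z_STREAM_END`, check whether input remains and, if so, reset the stream 
state before calling 'inflate' again.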
   
   This PR is meant to support reading Parquet files that contain more than 
one gzip member (example file attached).
   
[concatenated_gzip_members.zip](https://github.com/apache/arrow/files/12908697/concatenated_gzip_members.zip)
   
   
   ### Component(s)
   
   C++, Parquet

