pitrou commented on code in PR #38272:
URL: https://github.com/apache/arrow/pull/38272#discussion_r1399186681
##########
cpp/src/arrow/util/compression_zlib.cc:
##########
@@ -392,40 +395,48 @@ class GZipCodec : public Codec {
return 0;
}
- // Reset the stream for this block
- if (inflateReset(&stream_) != Z_OK) {
- return ZlibErrorPrefix("zlib inflateReset failed: ", stream_.msg);
- }
+ // inflate() will not automatically decode concatenated gzip members, keep calling
+ // inflate until reading all input data
+ while (read_input_bytes < input_length) {
+ // Reset the stream for this block
+ if (inflateReset(&stream_) != Z_OK) {
+ return ZlibErrorPrefix("zlib inflateReset failed: ", stream_.msg);
+ }
- int ret = 0;
- // gzip can run in streaming mode or non-streaming mode. We only
- // support the non-streaming use case where we present it the entire
- // compressed input and a buffer big enough to contain the entire
- // compressed output. In the case where we don't know the output,
- // we just make a bigger buffer and try the non-streaming mode
- // from the beginning again.
- while (ret != Z_STREAM_END) {
- stream_.next_in = const_cast<Bytef*>(reinterpret_cast<const Bytef*>(input));
- stream_.avail_in = static_cast<uInt>(input_length);
- stream_.next_out = reinterpret_cast<Bytef*>(output);
- stream_.avail_out = static_cast<uInt>(output_buffer_length);
-
- // We know the output size. In this case, we can use Z_FINISH
- // which is more efficient.
- ret = inflate(&stream_, Z_FINISH);
- if (ret == Z_STREAM_END || ret != Z_OK) break;
-
- // Failure, buffer was too small
- return Status::IOError("Too small a buffer passed to GZipCodec. InputLength=",
- input_length, " OutputLength=", output_buffer_length);
- }
+ int ret = 0;
+ // gzip can run in streaming mode or non-streaming mode. We only
+ // support the non-streaming use case where we present it the entire
+ // compressed input and a buffer big enough to contain the entire
+ // compressed output. In the case where we don't know the output,
+ // we just make a bigger buffer and try the non-streaming mode
+ // from the beginning again.
+ while (ret != Z_STREAM_END) {
Review Comment:
It seems this isn't a real loop since we either `break` or `return` from it.
Can you simplify this code?
(I understand that this is the original code reindented, but let's fix it
anyway)
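
To illustrate the point: the loop body runs at most once, so it could be replaced by a single call followed by plain branches. The following is only a sketch of that control flow, not the actual Arrow code. `InflateResult`, `DecodeStatus`, and `DecodeMember` are hypothetical stand-ins (for zlib's return codes, Arrow's `Status`, and the per-member decode step), and the handling of the non-`Z_STREAM_END` error case is an assumption, since the code after the loop is not shown in this hunk.

```cpp
#include <functional>
#include <string>

// Hypothetical stand-ins for zlib's Z_OK, Z_STREAM_END and error codes.
enum class InflateResult { kOk, kStreamEnd, kError };

// Hypothetical stand-in for the Status object returned by the codec.
struct DecodeStatus {
  bool ok;
  std::string message;
};

// `inflate_once` stands in for one `inflate(&stream_, Z_FINISH)` call made
// after next_in / avail_in / next_out / avail_out have been set up.
DecodeStatus DecodeMember(const std::function<InflateResult()>& inflate_once) {
  const InflateResult ret = inflate_once();
  if (ret == InflateResult::kOk) {
    // Same case as the `return` inside the old loop: inflate made progress
    // but did not finish, so the output buffer was too small.
    return {false, "Too small a buffer passed to GZipCodec."};
  }
  if (ret != InflateResult::kStreamEnd) {
    // Same case the old loop left via `break` with an error code
    // (assumed to be reported as a failure after the loop).
    return {false, "GZipCodec failed"};
  }
  // Z_STREAM_END: this gzip member was fully decoded.
  return {true, ""};
}
```

With this shape, the `while (ret != Z_STREAM_END)` wrapper and the combined `ret == Z_STREAM_END || ret != Z_OK` condition both disappear, and each outcome of the single `inflate` call has its own branch.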