Tianyi Wang has uploaded a new patch set (#3).

Change subject: IMPALA-5250: Unify decompressor output_length semantics
......................................................................

IMPALA-5250: Unify decompressor output_length semantics

This patch makes the semantics of the output_length parameter in
Codec::ProcessBlock the same across all codecs. In the existing code,
different decompressors treat output_length differently:
1. SnappyDecompressor needs output_length to be greater than or equal to
   the actual decompressed length, but it does not set it to the actual
   decompressed length after decompression.
2. SnappyBlockDecompressor and Lz4Decompressor require output_length to
   be exactly the same as the actual decompressed length; otherwise
   decompression fails.
3. Other decompressors need output_length to be greater than or equal to
   the actual decompressed length and will set it to the actual
   decompressed length if oversized.

This inconsistency leads to a bug where the error message is
nondeterministic when the compressed block is corrupted.

This patch makes all decompressors behave like a modified version of 3:
output_length must be greater than or equal to the actual decompressed
length, and it will be set to the actual decompressed length if
oversized. A decompression failure sets it to 0. A test case is added to
check that decompressors handle an oversized output buffer correctly.

Lz4Decompressor will use the "safe" instead of the "fast" decompression
function, because the latter is unsafe on corrupted data and requires the
decompressed length to be known. A benchmark was run on a 16-node cluster
and no performance impact was found.

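The unified contract described above can be sketched with a toy codec. This is only an illustration under stated assumptions, not the code in the patch: ToyProcessBlock and its run-length input format are hypothetical, and only the handling of output_length (capacity on entry, actual length on success, 0 on failure) is taken from the commit message.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical toy codec, NOT Impala's Codec::ProcessBlock. The input is a
// sequence of (count, byte) pairs; each pair expands to `count` copies of
// `byte`.
//
// output_length contract being illustrated:
//   - on entry, *output_length is the capacity of `output` and must be
//     greater than or equal to the actual decompressed length;
//   - on success, *output_length is set to the actual decompressed length;
//   - on failure, *output_length is set to 0.
bool ToyProcessBlock(int64_t input_length, const uint8_t* input,
                     int64_t* output_length, uint8_t* output) {
  assert(output_length != nullptr);
  const int64_t capacity = *output_length;
  *output_length = 0;  // Report 0 unless decompression succeeds.

  // A truncated (count, byte) pair means the block is corrupt.
  if (input_length < 0 || input_length % 2 != 0) return false;

  int64_t written = 0;
  for (int64_t i = 0; i < input_length; i += 2) {
    const int64_t count = input[i];
    if (count == 0) return false;                  // Corrupt: empty run.
    if (written + count > capacity) return false;  // Buffer too small.
    std::memset(output + written, input[i + 1], static_cast<size_t>(count));
    written += count;
  }

  *output_length = written;  // Shrink to the actual decompressed length.
  return true;
}
```

With the input {3, 'a', 2, 'b'} and a 16-byte output buffer, the call succeeds and output_length comes back as 5. With a 4-byte buffer, or with the input truncated to 3 bytes, the call fails and output_length comes back as 0, so a caller never sees a stale length after an error.
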
Change-Id: Ifd42942b169921a7eb53940c3762bc45bb82a993
---
M be/src/util/codec.h
M be/src/util/decompress-test.cc
M be/src/util/decompress.cc
3 files changed, 74 insertions(+), 45 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/8030/3
--
To view, visit http://gerrit.cloudera.org:8080/8030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifd42942b169921a7eb53940c3762bc45bb82a993
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang <tw...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Tianyi Wang <tw...@cloudera.com>