Tianyi Wang has uploaded a new patch set (#3).

Change subject: IMPALA-5250: Unify decompressor output_length semantics
......................................................................

IMPALA-5250: Unify decompressor output_length semantics

This patch makes the semantics of the output_length parameter in
Codec::ProcessBlock to be the same across all codecs. In existing code
different decompressor treats output_length differently:
1. SnappyDecompressor needs output_length to be greater than or equal
   to the actual decompressed length, but it does not set it to the
   actual decompressed length after decompression.
2. SnappyBlockDecompressor and Lz4Decompressor require output_length to
   be exactly the same as the actual decompressed length, otherwise
   decompression fails.
3. Other decompressors need output_length to be greater than or equal to
   the actual decompressed length and will set it to actual decompressed
   length if oversized.
This inconsistency leads to a bug where the error message is
undeterministic when the compressed block is corrupted. This patch makes
all decompressor behave like a modified version of 3:
Output_length should be greater than or equal to the actual decompressed
length and it will be set to actual decompressed length if oversized. A
decompression failure sets it to 0.
A testcase is added checking that decompressors can handle an oversized
output buffer correctly.
Lz4Decompressor will use the "safe" instead of the "fast" decompression
function, for the latter is insecure with corrupted data and requires
the decompressed length to be known. A benchmark is run on a 16-node
cluster and no performance impact is found.

Change-Id: Ifd42942b169921a7eb53940c3762bc45bb82a993
---
M be/src/util/codec.h
M be/src/util/decompress-test.cc
M be/src/util/decompress.cc
3 files changed, 74 insertions(+), 45 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/8030/3
-- 
To view, visit http://gerrit.cloudera.org:8080/8030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifd42942b169921a7eb53940c3762bc45bb82a993
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang <tw...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Tianyi Wang <tw...@cloudera.com>

Reply via email to