Joe McDonnell has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9459 )

Change subject: IMPALA-6543: Limit RowBatch serialization size to INT_MAX
......................................................................

IMPALA-6543: Limit RowBatch serialization size to INT_MAX

The serialization format of a row batch relies on tuple offsets.
In its current form, the tuple offsets are int32s. This means that
it is impossible to generate a valid serialization of a row batch
that is larger than INT_MAX.

This changes RowBatch::SerializeInternal() to return an error
if trying to serialize a row batch larger than INT_MAX. This
prevents a DCHECK on debug builds when creating a row larger
than 2GB.

This also changes the compression logic in RowBatch::Serialize()
to avoid a DCHECK if LZ4 will not be able to compress the
row batch. Instead, it returns an error.

This modifies row-batch-serialize-test to verify behavior at
each of the limits. Specifically:
 - RowBatches up to size LZ4_MAX_INPUT_SIZE succeed.
 - RowBatches with size range [LZ4_MAX_INPUT_SIZE+1, INT_MAX]
   fail on LZ4 compression.
 - RowBatches with size > INT_MAX fail with RowBatch too large.

Change-Id: I1f93131facf47bd8d86c3d4923b23186ee90fcbf
---
M be/src/runtime/row-batch-serialize-test.cc
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M common/thrift/generate_error_codes.py
4 files changed, 131 insertions(+), 12 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/9459/1
--
To view, visit http://gerrit.cloudera.org:8080/9459
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: newchange
Gerrit-Change-Id: I1f93131facf47bd8d86c3d4923b23186ee90fcbf
Gerrit-Change-Number: 9459
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell <[email protected]>
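
For readers skimming the three size ranges in the commit message, the
following is a minimal sketch of how they partition the possible
serialized sizes. It is not the code from this change:
ClassifySerializedSize, SerializeOutcome, and kLz4MaxInputSize are
hypothetical names used only for illustration, and the sketch assumes
that compression is always attempted when the compress flag is set.
The constant mirrors the value of LZ4_MAX_INPUT_SIZE in lz4.h
(0x7E000000) so that the example does not need the LZ4 headers.

```cpp
#include <climits>
#include <cstdint>

// Mirrors LZ4_MAX_INPUT_SIZE from lz4.h (0x7E000000 bytes). Redefined here
// only so the sketch is self-contained; real code would include lz4.h.
constexpr int64_t kLz4MaxInputSize = 0x7E000000;

// Hypothetical outcome type; the actual change reports errors via Status.
enum class SerializeOutcome {
  kOk,                   // size <= LZ4_MAX_INPUT_SIZE
  kLz4CompressionFails,  // size in [LZ4_MAX_INPUT_SIZE + 1, INT_MAX]
  kRowBatchTooLarge      // size > INT_MAX, int32 tuple offsets cannot address it
};

// Hypothetical helper (not part of Impala) that maps an uncompressed
// serialized size to the outcome described in the commit message.
constexpr SerializeOutcome ClassifySerializedSize(int64_t size, bool compress) {
  // Tuple offsets are int32s, so anything past INT_MAX cannot be serialized.
  if (size > INT_MAX) return SerializeOutcome::kRowBatchTooLarge;
  // LZ4 rejects inputs larger than LZ4_MAX_INPUT_SIZE.
  if (compress && size > kLz4MaxInputSize) {
    return SerializeOutcome::kLz4CompressionFails;
  }
  return SerializeOutcome::kOk;
}
```

The size is taken as int64_t so that the INT_MAX comparison happens
before any narrowing to int32; checking after a cast would let an
oversized batch wrap around and pass.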
