Joe McDonnell has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16278 )
Change subject: IMPALA-10005: Fix Snappy decompression for non-block filesystems
......................................................................

IMPALA-10005: Fix Snappy decompression for non-block filesystems

Snappy-compressed text always uses THdfsCompression::SNAPPY_BLOCKED
type compression in the backend. However, for non-block filesystems,
the frontend is incorrectly passing THdfsCompression::SNAPPY instead.
On debug builds, this leads to a DCHECK when trying to read
Snappy-compressed text. On release builds, it fails to decompress
the data.

This fixes the frontend to always pass THdfsCompression::SNAPPY_BLOCKED
for Snappy-compressed text.

This reworks query_test/test_compressed_formats.py to provide better
coverage:
 - Changed the RC and Seq test cases to verify that the file extension
   doesn't matter. Added Avro to this case as well.
 - Fixed the text case to use appropriate extensions (fixing IMPALA-9004)
 - Changed the utility function so it doesn't use Hive. This allows it
   to be enabled on non-HDFS filesystems like S3.
 - Changed the test to use unique_database and allow parallel execution.
 - Changed the test to run in the core job, so it now has coverage on
   the usual S3 test configuration. It is reasonably quick (1-2 minutes)
   and runs in parallel.

Testing:
 - Exhaustive job
 - Core s3 job
 - Changed the frontend to force it to use the code for non-block
   filesystems (i.e. the TFileSplitGeneratorSpec code) and
   verified that it is now able to read Snappy-compressed text.

Change-Id: I0879f2fc0bf75bb5c15cecb845ece46a901601ac
Reviewed-on: http://gerrit.cloudera.org:8080/16278
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Reviewed-by: Sahil Takiar <stak...@cloudera.com>
---
M fe/src/main/java/org/apache/impala/catalog/HdfsCompression.java
M tests/query_test/test_compressed_formats.py
2 files changed, 132 insertions(+), 84 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Sahil Takiar: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/16278
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I0879f2fc0bf75bb5c15cecb845ece46a901601ac
Gerrit-Change-Number: 16278
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fangyu....@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <stak...@cloudera.com>
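
Illustration of the frontend fix described in the commit message: the
sketch below models, in Python, the idea that a codec inferred from a
text file's extension is translated to the type the backend expects, with
Snappy always becoming SNAPPY_BLOCKED. This is not the code from
HdfsCompression.java. The suffix table, the function names
(codec_from_file_name, to_backend_type), and the use of plain strings in
place of THdfsCompression values are all assumptions made for the example.

```python
import os

# Hypothetical suffix table for illustration; the real frontend table
# lives in HdfsCompression.java and may contain different entries.
SUFFIX_TO_CODEC = {
    "gz": "GZIP",
    "bz2": "BZIP2",
    "snappy": "SNAPPY",
}


def codec_from_file_name(file_name):
    """Infer a codec label from the file extension (illustrative only)."""
    _, ext = os.path.splitext(file_name)
    return SUFFIX_TO_CODEC.get(ext.lstrip(".").lower(), "NONE")


def to_backend_type(codec):
    """Translate the frontend codec label to the type sent to the backend.

    Per the commit message, Snappy-compressed text is always read as
    SNAPPY_BLOCKED in the backend, so SNAPPY is never passed through as-is.
    """
    if codec == "SNAPPY":
        return "SNAPPY_BLOCKED"
    return codec
```

With this shape, every caller that goes through to_backend_type gets the
same answer for Snappy text, regardless of which split-generation path
produced the file descriptor.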