[Impala-ASF-CR] IMPALA-6077: remove Parquet BIT_PACKED def level support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9241 )

Change subject: IMPALA-6077: remove Parquet BIT_PACKED def level support

Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1920/

--
To view, visit http://gerrit.cloudera.org:8080/9241
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
Gerrit-Change-Number: 9241
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Csaba Ringhofer
Gerrit-Reviewer: Dan Hecht
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker
Gerrit-Reviewer: Tim Armstrong
Gerrit-Comment-Date: Mon, 12 Feb 2018 18:05:56 +
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-6077: remove Parquet BIT_PACKED def level support
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9241 )

Change subject: IMPALA-6077: remove Parquet BIT_PACKED def level support

Patch Set 6: Code-Review+2

--
To view, visit http://gerrit.cloudera.org:8080/9241
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
Gerrit-Change-Number: 9241
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Csaba Ringhofer
Gerrit-Reviewer: Dan Hecht
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker
Gerrit-Reviewer: Tim Armstrong
Gerrit-Comment-Date: Mon, 12 Feb 2018 18:05:47 +
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-6077: remove Parquet BIT_PACKED def level support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9241 )

Change subject: IMPALA-6077: remove Parquet BIT_PACKED def level support

Patch Set 5:

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1918/

--
To view, visit http://gerrit.cloudera.org:8080/9241
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
Gerrit-Change-Number: 9241
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Csaba Ringhofer
Gerrit-Reviewer: Dan Hecht
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker
Gerrit-Reviewer: Tim Armstrong
Gerrit-Comment-Date: Sat, 10 Feb 2018 08:04:21 +
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-6077: remove Parquet BIT_PACKED def level support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9241 )

Change subject: IMPALA-6077: remove Parquet BIT_PACKED def level support

Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1918/

--
To view, visit http://gerrit.cloudera.org:8080/9241
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
Gerrit-Change-Number: 9241
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Csaba Ringhofer
Gerrit-Reviewer: Dan Hecht
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker
Gerrit-Reviewer: Tim Armstrong
Gerrit-Comment-Date: Sat, 10 Feb 2018 04:30:27 +
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-6077: remove Parquet BIT_PACKED def level support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9241 )

Change subject: IMPALA-6077: remove Parquet BIT_PACKED def level support

Patch Set 5: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1915/

--
To view, visit http://gerrit.cloudera.org:8080/9241
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
Gerrit-Change-Number: 9241
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Csaba Ringhofer
Gerrit-Reviewer: Dan Hecht
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker
Gerrit-Reviewer: Tim Armstrong
Gerrit-Comment-Date: Sat, 10 Feb 2018 01:44:49 +
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-6077: remove Parquet BIT_PACKED def level support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9241 )

Change subject: IMPALA-6077: remove Parquet BIT_PACKED def level support

Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1915/

--
To view, visit http://gerrit.cloudera.org:8080/9241
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
Gerrit-Change-Number: 9241
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Csaba Ringhofer
Gerrit-Reviewer: Dan Hecht
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker
Gerrit-Reviewer: Tim Armstrong
Gerrit-Comment-Date: Fri, 09 Feb 2018 21:51:38 +
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-6077: remove Parquet BIT_PACKED def level support
Hello Lars Volker, Csaba Ringhofer, Dan Hecht,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9241

to look at the new patch set (#4).

Change subject: IMPALA-6077: remove Parquet BIT_PACKED def level support

IMPALA-6077: remove Parquet BIT_PACKED def level support

The encoding was added in an early version of the Parquet spec and
deprecated even in the Parquet 1.0 spec. Parquet-MR switched to
generating RLE at the same time as the spec changed in mid-2013.
Impala always wrote RLE: see
commit 6e293090e60aea300f9e83db67f56a5efd07c35c.

The Impala implementation of BIT_PACKED was never correct because
it implemented little endian bit unpacking instead of the big endian
unpacking required by the spec for levels.

Testing:
Updated tests to reflect expected behaviour for supported and
unsupported def level encodings.

Cherry-picks: not for 2.x.

Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M common/thrift/generate_error_codes.py
M testdata/workloads/functional-query/queries/QueryTest/parquet-def-levels.test
M tests/query_test/test_scanners.py
5 files changed, 53 insertions(+), 32 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/9241/4

--
To view, visit http://gerrit.cloudera.org:8080/9241
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
Gerrit-Change-Number: 9241
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Csaba Ringhofer
Gerrit-Reviewer: Dan Hecht
Gerrit-Reviewer: Lars Volker
Gerrit-Reviewer: Tim Armstrong

[Impala-ASF-CR] IMPALA-6077: remove Parquet BIT_PACKED def level support
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/9241 )

Change subject: IMPALA-6077: remove Parquet BIT_PACKED def level support

Patch Set 3: Code-Review+2

I assume this is not going to the 2.x branch?

--
To view, visit http://gerrit.cloudera.org:8080/9241
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
Gerrit-Change-Number: 9241
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Csaba Ringhofer
Gerrit-Reviewer: Dan Hecht
Gerrit-Reviewer: Lars Volker
Gerrit-Reviewer: Tim Armstrong
Gerrit-Comment-Date: Fri, 09 Feb 2018 21:38:58 +
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-6077: remove Parquet BIT_PACKED def level support
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/9241 )

Change subject: IMPALA-6077: remove Parquet BIT_PACKED def level support

Patch Set 3: Code-Review+1

--
To view, visit http://gerrit.cloudera.org:8080/9241
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
Gerrit-Change-Number: 9241
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Csaba Ringhofer
Gerrit-Reviewer: Lars Volker
Gerrit-Reviewer: Tim Armstrong
Gerrit-Comment-Date: Fri, 09 Feb 2018 19:42:51 +
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-6077: remove Parquet BIT_PACKED def level support
Hello Lars Volker, Csaba Ringhofer,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9241

to look at the new patch set (#3).

Change subject: IMPALA-6077: remove Parquet BIT_PACKED def level support

IMPALA-6077: remove Parquet BIT_PACKED def level support

The encoding was added in an early version of the Parquet spec and
deprecated even in the Parquet 1.0 spec. Parquet-MR switched to
generating RLE at the same time as the spec changed in mid-2013.
Impala always wrote RLE: see
commit 6e293090e60aea300f9e83db67f56a5efd07c35c.

The Impala implementation of BIT_PACKED was never correct because
it implemented little endian bit unpacking instead of the big endian
unpacking required by the spec for levels.

Testing:
Updated tests to reflect expected behaviour for supported and
unsupported def level encodings.

Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M common/thrift/generate_error_codes.py
M testdata/workloads/functional-query/queries/QueryTest/parquet-def-levels.test
M tests/query_test/test_scanners.py
5 files changed, 53 insertions(+), 32 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/9241/3

--
To view, visit http://gerrit.cloudera.org:8080/9241
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
Gerrit-Change-Number: 9241
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Csaba Ringhofer
Gerrit-Reviewer: Lars Volker

[Impala-ASF-CR] IMPALA-6077: remove Parquet BIT_PACKED def level support
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/9241 )

Change subject: IMPALA-6077: remove Parquet BIT_PACKED def level support

Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/9241/2/testdata/workloads/functional-query/queries/QueryTest/parquet-def-levels.test
File testdata/workloads/functional-query/queries/QueryTest/parquet-def-levels.test:

http://gerrit.cloudera.org:8080/#/c/9241/2/testdata/workloads/functional-query/queries/QueryTest/parquet-def-levels.test@55
PS2, Line 55: # IMPALA-6077: unsupported BIT_PACKED encoding fails when materializing columns.
            : select count(id), count(tinyint_col), count(smallint_col), count(int_col),
            : count(bigint_col), count(float_col), count(double_col), count(date_string_col),
            : count(string_col), count(timestamp_col), count(year), count(month), count(day)
            : from alltypesagg_bitpacked
This query seems to be the same as the next query, and should not materialize columns.

http://gerrit.cloudera.org:8080/#/c/9241/2/testdata/workloads/functional-query/queries/QueryTest/parquet-def-levels.test@64
PS2, Line 64: materializing
I am not 100% sure about this, but I think that if a column is not complex, and the stats are filled, then count can be served from column chunk stats without reading any data page, so this error will not be returned. This may not be a problem for this specific parquet file, but I would mention it in a comment, or replace the query with something that has to read the data pages.

http://gerrit.cloudera.org:8080/#/c/9241/2/testdata/workloads/functional-query/queries/QueryTest/parquet-def-levels.test@65
PS2, Line 65: select count(id), count(tinyint_col), count(smallint_col), count(int_col),
            : count(bigint_col), count(float_col), count(double_col), count(date_string_col),
            : count(string_col), count(timestamp_col), count(year), count(month), count(day)
Is it necessary to list every column here? If one column is enough for the test, then I would prefer if it were shorter (for the sake of readability).

--
To view, visit http://gerrit.cloudera.org:8080/9241
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
Gerrit-Change-Number: 9241
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Csaba Ringhofer
Gerrit-Reviewer: Lars Volker
Gerrit-Comment-Date: Fri, 09 Feb 2018 14:58:23 +
Gerrit-HasComments: Yes

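The concern in the Line 64 comment can be sketched as follows. This is only an illustration of the reviewer's reasoning, not a description of what Impala's scanner does; ColumnChunkMeta and count_from_stats are hypothetical names, and the fields merely mirror the num_values and null_count entries of Parquet column chunk metadata for a non-nested column.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ColumnChunkMeta:
    """Hypothetical stand-in for the per-chunk metadata in a Parquet footer."""
    num_values: int            # values in the chunk, NULLs included
    null_count: Optional[int]  # None when the writer did not fill in stats


def count_from_stats(chunks: List[ColumnChunkMeta]) -> Optional[int]:
    """Return count(col) from metadata alone, or None if a data-page scan is needed."""
    total = 0
    for chunk in chunks:
        if chunk.null_count is None:
            # Stats are missing, so the def levels would have to be decoded.
            return None
        total += chunk.num_values - chunk.null_count
    return total


print(count_from_stats([ColumnChunkMeta(10, 2), ColumnChunkMeta(5, 0)]))
print(count_from_stats([ColumnChunkMeta(10, None)]))
```

If a reader took such a shortcut, a count() query would never decode the def levels and so would never hit the unsupported-encoding error, which is why the comment suggests a query that has to read the data pages.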
[Impala-ASF-CR] IMPALA-6077: remove Parquet BIT_PACKED def level support
Hello Lars Volker,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9241

to look at the new patch set (#2).

Change subject: IMPALA-6077: remove Parquet BIT_PACKED def level support

IMPALA-6077: remove Parquet BIT_PACKED def level support

The encoding was added in an early version of the Parquet spec and
deprecated even in the Parquet 1.0 spec. Parquet-MR switched to
generating RLE at the same time as the spec changed in mid-2013.
Impala always wrote RLE: see
commit 6e293090e60aea300f9e83db67f56a5efd07c35c.

The Impala implementation of BIT_PACKED was never correct because
it implemented little endian bit unpacking instead of the big endian
unpacking required by the spec for levels.

Testing:
Updated tests to reflect expected behaviour for supported and
unsupported def level encodings.

Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M common/thrift/generate_error_codes.py
M testdata/workloads/functional-query/queries/QueryTest/parquet-def-levels.test
M tests/query_test/test_scanners.py
5 files changed, 37 insertions(+), 32 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/9241/2

--
To view, visit http://gerrit.cloudera.org:8080/9241
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
Gerrit-Change-Number: 9241
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Lars Volker

[Impala-ASF-CR] IMPALA-6077: remove Parquet BIT_PACKED def level support
Tim Armstrong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9241 )

Change subject: IMPALA-6077: remove Parquet BIT_PACKED def level support

IMPALA-6077: remove Parquet BIT_PACKED def level support

The encoding was added in an early version of the Parquet spec and
deprecated even in the Parquet 1.0 spec. Parquet-MR switched to
generating RLE at the same time as the spec changed in mid-2013.
Impala always wrote RLE: see
commit 6e293090e60aea300f9e83db67f56a5efd07c35c.

The Impala implementation of BIT_PACKED was never correct because
it implemented little endian bit unpacking instead of the big endian
unpacking required by the spec for levels.

Testing:
Updated tests to reflect expected behaviour for supported and
unsupported def level encodings.

Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M common/thrift/generate_error_codes.py
M testdata/workloads/functional-query/queries/QueryTest/parquet-def-levels.test
M tests/query_test/test_scanners.py
5 files changed, 37 insertions(+), 32 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/9241/1

--
To view, visit http://gerrit.cloudera.org:8080/9241
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
Gerrit-Change-Number: 9241
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong
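
The bit-order mismatch described in the commit message can be illustrated with a small sketch. It assumes that "little endian" and "big endian" unpacking here mean reading each byte from its least significant bit upward versus from its most significant bit downward, as in the Parquet encoding spec's bit-packing examples. The function names are made up for this illustration and this is not Impala's decoder; the input bytes are the spec's example of the values 0 through 7 at bit width 3 in the deprecated BIT_PACKED layout.

```python
from typing import List


def unpack_lsb_first(data: bytes, bit_width: int, count: int) -> List[int]:
    """Unpack values stored from the least significant bit of each byte upward."""
    bits = int.from_bytes(data, "little")
    mask = (1 << bit_width) - 1
    return [(bits >> (i * bit_width)) & mask for i in range(count)]


def unpack_msb_first(data: bytes, bit_width: int, count: int) -> List[int]:
    """Unpack values stored from the most significant bit of each byte downward."""
    bits = int.from_bytes(data, "big")
    total_bits = len(data) * 8
    mask = (1 << bit_width) - 1
    return [(bits >> (total_bits - (i + 1) * bit_width)) & mask
            for i in range(count)]


# Values 0..7 at bit width 3, packed MSB-first (deprecated BIT_PACKED layout).
msb_packed = bytes([0b00000101, 0b00111001, 0b01110111])

correct = unpack_msb_first(msb_packed, 3, 8)  # decoder matching the layout
wrong = unpack_lsb_first(msb_packed, 3, 8)    # decoder using the other bit order
print(correct)
print(wrong)
```

Decoding the same bytes with the mismatched bit order yields a different sequence of levels, which is consistent with the commit message's statement that the old BIT_PACKED path was never correct.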