[Impala-ASF-CR] IMPALA-10329 Change apt install retry times to 30
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16751 ) Change subject: IMPALA-10329 Change apt install retry times to 30 .. IMPALA-10329 Change apt install retry times to 30 Change apt install retry times to 30 in bootstrap_system.sh, because apt install has been timing out frequently recently. Also add a wait for apt's lock-frontend. Change-Id: Id664dd66874ac65d6b78e630c974a6a563408147 Reviewed-on: http://gerrit.cloudera.org:8080/16751 Reviewed-by: Jim Apple Tested-by: Impala Public Jenkins --- M bin/bootstrap_system.sh 1 file changed, 6 insertions(+), 1 deletion(-) Approvals: Jim Apple: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/16751 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Id664dd66874ac65d6b78e630c974a6a563408147 Gerrit-Change-Number: 16751 Gerrit-PatchSet: 2 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple
[Impala-ASF-CR] IMPALA-10329 Change apt install retry times to 30
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16751 ) Change subject: IMPALA-10329 Change apt install retry times to 30 .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16751 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id664dd66874ac65d6b78e630c974a6a563408147 Gerrit-Change-Number: 16751 Gerrit-PatchSet: 1 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Comment-Date: Fri, 20 Nov 2020 07:44:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8990: Fix flakiness in test set request pool
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/16749 ) Change subject: IMPALA-8990: Fix flakiness in test_set_request_pool .. Patch Set 2: hit IMPALA-9355 -- To view, visit http://gerrit.cloudera.org:8080/16749 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife06509e936443579ca60780013ce01352c8932e Gerrit-Change-Number: 16749 Gerrit-PatchSet: 2 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 20 Nov 2020 03:16:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10329 Change apt install retry times to 30
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16751 ) Change subject: IMPALA-10329 Change apt install retry times to 30 .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7692/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16751 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id664dd66874ac65d6b78e630c974a6a563408147 Gerrit-Change-Number: 16751 Gerrit-PatchSet: 1 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Comment-Date: Fri, 20 Nov 2020 02:30:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10329 Change apt install retry times to 30
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16751 ) Change subject: IMPALA-10329 Change apt install retry times to 30 .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6683/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16751 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id664dd66874ac65d6b78e630c974a6a563408147 Gerrit-Change-Number: 16751 Gerrit-PatchSet: 1 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Comment-Date: Fri, 20 Nov 2020 02:28:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10329 Change apt install retry times to 30
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/16751 ) Change subject: IMPALA-10329 Change apt install retry times to 30 .. Patch Set 1: Code-Review+2 Thank you! -- To view, visit http://gerrit.cloudera.org:8080/16751 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id664dd66874ac65d6b78e630c974a6a563408147 Gerrit-Change-Number: 16751 Gerrit-PatchSet: 1 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Comment-Date: Fri, 20 Nov 2020 02:27:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10329 Change apt install retry times to 30
zhaoren...@hotmail.com has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16751 ) Change subject: IMPALA-10329 Change apt install retry times to 30 .. IMPALA-10329 Change apt install retry times to 30 Change apt install retry times to 30 in bootstrap_system.sh, because apt install has been timing out frequently recently. Also add a wait for apt's lock-frontend. Change-Id: Id664dd66874ac65d6b78e630c974a6a563408147 --- M bin/bootstrap_system.sh 1 file changed, 6 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/16751/1 -- To view, visit http://gerrit.cloudera.org:8080/16751 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Id664dd66874ac65d6b78e630c974a6a563408147 Gerrit-Change-Number: 16751 Gerrit-PatchSet: 1 Gerrit-Owner: Anonymous Coward
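The change described above lives in the bash script bin/bootstrap_system.sh, and its diff is not shown in this notification. As a rough illustration of the two ideas in the description (a bounded retry count and waiting for a lock to be released), here is a minimal Python sketch. The function names `retry` and `wait_until_unlocked`, their parameters, and the default poll count are hypothetical and are not taken from the Impala script; only the retry count of 30 comes from the change subject.

```python
import time
from typing import Callable


def wait_until_unlocked(
    is_locked: Callable[[], bool],
    poll_seconds: float = 1.0,
    max_polls: int = 600,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until is_locked() reports False or max_polls is exhausted.

    Hypothetical stand-in for waiting on apt's lock-frontend.
    Returns True if the lock was observed to be free.
    """
    for _ in range(max_polls):
        if not is_locked():
            return True
        sleep(poll_seconds)
    return not is_locked()


def retry(
    action: Callable[[], bool],
    attempts: int = 30,  # 30 mirrors the retry count named in the change subject
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run action() until it returns True, at most `attempts` times."""
    for attempt in range(1, attempts + 1):
        if action():
            return True
        if attempt < attempts:
            # Only wait when another attempt will follow.
            sleep(delay_seconds)
    return False
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; a caller would typically call `wait_until_unlocked` first and then wrap the install step in `retry`.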
[Impala-ASF-CR] IMPALA-10329 Change apt install retry times to 30
zhaoren...@hotmail.com has abandoned this change. ( http://gerrit.cloudera.org:8080/16725 ) Change subject: IMPALA-10329 Change apt install retry times to 30 .. Abandoned duplicate -- To view, visit http://gerrit.cloudera.org:8080/16725 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: I790750da36ad53c87a830dfab6803a1862490daf Gerrit-Change-Number: 16725 Gerrit-PatchSet: 3 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple
[Impala-ASF-CR] IMPALA-10329 Change apt install retry times to 30
zhaoren...@hotmail.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/16725 ) Change subject: IMPALA-10329 Change apt install retry times to 30 .. Patch Set 3: Sorry, Jim, my development environment was recreated, so I created a new commit here: http://gerrit.cloudera.org:8080/16751 I will abandon this one. -- To view, visit http://gerrit.cloudera.org:8080/16725 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I790750da36ad53c87a830dfab6803a1862490daf Gerrit-Change-Number: 16725 Gerrit-PatchSet: 3 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Comment-Date: Fri, 20 Nov 2020 02:07:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10314: Optimize planning time for simple limits
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16723 ) Change subject: IMPALA-10314: Optimize planning time for simple limits .. Patch Set 5: (4 comments) http://gerrit.cloudera.org:8080/#/c/16723/5/fe/src/main/cup/sql-parser.cup File fe/src/main/cup/sql-parser.cup: http://gerrit.cloudera.org:8080/#/c/16723/5/fe/src/main/cup/sql-parser.cup@3115 PS5, Line 3115: KW_WHERE opt_plan_hints:pred_hints expr:e I guess this is a bit limiting in that it applies only to the whole where clause. Should it be part of the expr production below so it can be attached to any expression? I don't think this affects the functionality of this patch, since we're only checking the top-level statement anyway, but it seems like it would be more elegant to have the expr hint be associated with the expr in the parser? If there are complications with that, maybe a comment here explaining the limitation would be sufficient. http://gerrit.cloudera.org:8080/#/c/16723/5/fe/src/main/java/org/apache/impala/analysis/Predicate.java File fe/src/main/java/org/apache/impala/analysis/Predicate.java: http://gerrit.cloudera.org:8080/#/c/16723/5/fe/src/main/java/org/apache/impala/analysis/Predicate.java@30 PS5, Line 30: isAlwaysTrue_ maybe hasAlwaysTrueHint_ just to make it crystal-clear that it's not actually a guarantee? http://gerrit.cloudera.org:8080/#/c/16723/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/16723/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@869 PS5, Line 869: if (fsHasBlocks && fd.getNumFileBlocks() == 0) continue; nit: use braces for multi-line if http://gerrit.cloudera.org:8080/#/c/16723/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@870 PS5, Line 870: fd.getFileLength() > Yes. Totally agree. 
> We probably can live with the 0-row data files through
We already had to deal with a similar issue here https://impala.apache.org/docs/build/html/topics/impala_optimize_partition_key_scans.html ; we should document this similarly. Generally it doesn't make any sense to write files with 0 rows and it should be rare. Our experience is that some misbehaving tools can generate 0-row files (we've seen Spark do it with issues like https://issues.apache.org/jira/browse/SPARK-10216). You're right that the files are non-empty because they have the footer with the schema. I don't think there's an upper bound on the size either though, because they could have an arbitrarily complex schema. -- To view, visit http://gerrit.cloudera.org:8080/16723 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9d6a79263bc092e0f3e9a1d72da5618f3cc35574 Gerrit-Change-Number: 16723 Gerrit-PatchSet: 5 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 20 Nov 2020 01:29:42 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8990: Fix flakiness in test set request pool
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16749 ) Change subject: IMPALA-8990: Fix flakiness in test_set_request_pool .. Patch Set 2: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6681/ -- To view, visit http://gerrit.cloudera.org:8080/16749 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife06509e936443579ca60780013ce01352c8932e Gerrit-Change-Number: 16749 Gerrit-PatchSet: 2 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 20 Nov 2020 00:38:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10325 Parquet scan should use min/max statistics to skip pages based on equi-join predicate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16720 ) Change subject: IMPALA-10325 Parquet scan should use min/max statistics to skip pages based on equi-join predicate .. Patch Set 11: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/7691/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/16720 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I379405ee75b14929df7d6b5d20dabc6f51375691 Gerrit-Change-Number: 16720 Gerrit-PatchSet: 11 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 19 Nov 2020 21:58:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10325 Parquet scan should use min/max statistics to skip pages based on equi-join predicate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16720 ) Change subject: IMPALA-10325 Parquet scan should use min/max statistics to skip pages based on equi-join predicate .. Patch Set 11: (1 comment) http://gerrit.cloudera.org:8080/#/c/16720/11/common/thrift/PlanNodes.thrift File common/thrift/PlanNodes.thrift: http://gerrit.cloudera.org:8080/#/c/16720/11/common/thrift/PlanNodes.thrift@299 PS11, Line 299: 12: optional i32 overlap_predicate_start_index line has trailing whitespace -- To view, visit http://gerrit.cloudera.org:8080/16720 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I379405ee75b14929df7d6b5d20dabc6f51375691 Gerrit-Change-Number: 16720 Gerrit-PatchSet: 11 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 19 Nov 2020 21:37:49 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10325 Parquet scan should use min/max statistics to skip pages based on equi-join predicate
Qifan Chen has uploaded a new patch set (#11). ( http://gerrit.cloudera.org:8080/16720 ) Change subject: IMPALA-10325 Parquet scan should use min/max statistics to skip pages based on equi-join predicate .. IMPALA-10325 Parquet scan should use min/max statistics to skip pages based on equi-join predicate This patch adds the logic to utilize min/max stats for Parquet row groups or pages to skip these entities when they don't qualify an equi-join predicate. A new class of predicates called overlap predicates is introduced to aid in the determination of whether a Parquet row group or a page overlaps with a range computed from the hash join. If not, then the entire Parquet row group or the page is skipped. The new class of predicates coexists with the existing min/max conjuncts that are introduced based on the local scan predicates. Both classes of predicates can work individually or together with each other. The overlap predicates are evaluated after the existing min/max conjuncts. To be done: 1. Handle all data types; 2. Unit testing; 3. Core testing. 
Change-Id: I379405ee75b14929df7d6b5d20dabc6f51375691 --- M be/src/exec/exec-node.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-column-stats.cc M be/src/exec/parquet/parquet-column-stats.h M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/scan-node.cc M be/src/runtime/coordinator.cc M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java 14 files changed, 386 insertions(+), 19 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/16720/11 -- To view, visit http://gerrit.cloudera.org:8080/16720 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I379405ee75b14929df7d6b5d20dabc6f51375691 Gerrit-Change-Number: 16720 Gerrit-PatchSet: 11 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins
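The overlap check described in the commit message above can be illustrated with a small Python sketch. This is not the patch's C++ implementation: the `Range` type, the function names, and the use of closed integer intervals are assumptions made only for illustration. The idea shown is that a row group (or page) whose [min, max] statistics do not intersect the range of join keys computed from the hash join can be skipped.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Range:
    """Closed interval [low, high]; a hypothetical stand-in for min/max stats."""
    low: int
    high: int


def overlaps(stats: Range, join_range: Range) -> bool:
    """True when the two closed intervals share at least one value."""
    return stats.low <= join_range.high and join_range.low <= stats.high


def row_groups_to_scan(row_group_stats: List[Range], join_range: Range) -> List[int]:
    """Return indexes of row groups whose stats overlap the join key range.

    Row groups that do not overlap cannot contain a matching join key,
    so a scanner could skip them entirely.
    """
    return [i for i, s in enumerate(row_group_stats) if overlaps(s, join_range)]
```

For example, with row group stats [0, 9], [10, 19], [20, 29] and a join key range of [12, 21], only the second and third row groups need to be read.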
[Impala-ASF-CR] [WIP] IMPALA-10325 Parquet scan should use min/max statistics to skip pages based on equi-join predicate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16720 ) Change subject: [WIP] IMPALA-10325 Parquet scan should use min/max statistics to skip pages based on equi-join predicate .. Patch Set 9: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/7690/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/16720 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I379405ee75b14929df7d6b5d20dabc6f51375691 Gerrit-Change-Number: 16720 Gerrit-PatchSet: 9 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 19 Nov 2020 19:52:57 + Gerrit-HasComments: No
[Impala-ASF-CR] [WIP] IMPALA-10325 Parquet scan should use min/max statistics to skip pages based on equi-join predicate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16720 ) Change subject: [WIP] IMPALA-10325 Parquet scan should use min/max statistics to skip pages based on equi-join predicate .. Patch Set 9: (1 comment) http://gerrit.cloudera.org:8080/#/c/16720/9/common/thrift/PlanNodes.thrift File common/thrift/PlanNodes.thrift: http://gerrit.cloudera.org:8080/#/c/16720/9/common/thrift/PlanNodes.thrift@299 PS9, Line 299: 12: optional list slot_usage_map line has trailing whitespace -- To view, visit http://gerrit.cloudera.org:8080/16720 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I379405ee75b14929df7d6b5d20dabc6f51375691 Gerrit-Change-Number: 16720 Gerrit-PatchSet: 9 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 19 Nov 2020 19:36:26 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [WIP] IMPALA-10325 Parquet scan should use min/max statistics to skip pages based on equi-join predicate
Qifan Chen has uploaded a new patch set (#9). ( http://gerrit.cloudera.org:8080/16720 ) Change subject: [WIP] IMPALA-10325 Parquet scan should use min/max statistics to skip pages based on equi-join predicate .. [WIP] IMPALA-10325 Parquet scan should use min/max statistics to skip pages based on equi-join predicate This patch adds the logic to utilize min/max stats for Parquet row groups or pages to skip these entities when they don't qualify an equi-join predicate. A new class of predicates called overlap predicates is introduced to aid in the determination of whether a Parquet row group or a page overlaps with a range computed from the hash join. If not, then the entire Parquet row group or the page is skipped. The new class of predicates coexists with the existing min/max conjuncts that are introduced based on the local scan predicates. Both classes of predicates can work individually or together with each other. Change-Id: I379405ee75b14929df7d6b5d20dabc6f51375691 --- M be/src/exec/exec-node.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-column-stats.cc M be/src/exec/parquet/parquet-column-stats.h M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/scan-node.cc M be/src/runtime/coordinator.cc M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java 14 files changed, 442 insertions(+), 19 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/16720/9 -- To view, visit http://gerrit.cloudera.org:8080/16720 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I379405ee75b14929df7d6b5d20dabc6f51375691 Gerrit-Change-Number: 16720 Gerrit-PatchSet: 9 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-8990: Fix flakiness in test set request pool
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16749 ) Change subject: IMPALA-8990: Fix flakiness in test_set_request_pool .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6681/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16749 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife06509e936443579ca60780013ce01352c8932e Gerrit-Change-Number: 16749 Gerrit-PatchSet: 2 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 19 Nov 2020 19:13:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8990: Fix flakiness in test set request pool
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16749 ) Change subject: IMPALA-8990: Fix flakiness in test_set_request_pool .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16749 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife06509e936443579ca60780013ce01352c8932e Gerrit-Change-Number: 16749 Gerrit-PatchSet: 2 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 19 Nov 2020 19:13:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10332: Add file formats to HdfsScanNode's thrift representation.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/16728 ) Change subject: IMPALA-10332: Add file formats to HdfsScanNode's thrift representation. .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/16728/8/testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test File testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test: http://gerrit.cloudera.org:8080/#/c/16728/8/testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test@298 PS8, Line 298: | | file formats: [ORC] We shouldn't see this, as explain_level is only 2, right? -- To view, visit http://gerrit.cloudera.org:8080/16728 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iad6b8271bd248983f327c07883a3bedf50f25b5d Gerrit-Change-Number: 16728 Gerrit-PatchSet: 8 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 19 Nov 2020 18:44:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10332: Add file formats to HdfsScanNode's thrift representation.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16728 ) Change subject: IMPALA-10332: Add file formats to HdfsScanNode's thrift representation. .. Patch Set 8: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6680/ -- To view, visit http://gerrit.cloudera.org:8080/16728 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iad6b8271bd248983f327c07883a3bedf50f25b5d Gerrit-Change-Number: 16728 Gerrit-PatchSet: 8 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 19 Nov 2020 18:35:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10314: Optimize planning time for simple limits
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/16723 ) Change subject: IMPALA-10314: Optimize planning time for simple limits .. Patch Set 5: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/16723/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/16723/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@870 PS5, Line 870: fd.getFileLength() > The getFileLength() check is used in other places too..so I borrowed that f Yes. Totally agree. We probably can live with the 0-row data files through documentation. -- To view, visit http://gerrit.cloudera.org:8080/16723 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9d6a79263bc092e0f3e9a1d72da5618f3cc35574 Gerrit-Change-Number: 16723 Gerrit-PatchSet: 5 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 19 Nov 2020 17:52:23 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10314: Optimize planning time for simple limits
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/16723 ) Change subject: IMPALA-10314: Optimize planning time for simple limits .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/16723/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/16723/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@870 PS5, Line 870: fd.getFileLength() > Is it possible for a Parquet data file with empty # of rows pass this test? The getFileLength() check is used in other places too, so I borrowed that from generateScanRangeSpec(). For self-describing schema file formats like Parquet, yes, it is possible for the length to be non-zero and num_rows to be zero. I think that handling such cases will need some major rework of this computeScanRangeLocation() method since right now it is agnostic to the file format (it does care about the file system type but not so much the formats). Further, I believe other changes will be needed in the metadata catalog layer to ensure this FileMetaData is plumbed through, although I haven't looked closely into that. The trade-off is that the size of the metadata in the catalog cache could blow up for a large number of files, and we already run into significant memory issues. http://gerrit.cloudera.org:8080/#/c/16723/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@947 PS5, Line 947: if (isSimpleLimit && simpleLimitNumRows == : analyzer.getSimpleLimitStatus().second) { : // for the simple limit case if the estimated rows has already reached the limit : // there's no need to process more partitions : break; : } > This is good. 
Ack -- To view, visit http://gerrit.cloudera.org:8080/16723 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9d6a79263bc092e0f3e9a1d72da5618f3cc35574 Gerrit-Change-Number: 16723 Gerrit-PatchSet: 5 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 19 Nov 2020 16:51:29 + Gerrit-HasComments: Yes
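The trade-off discussed in this thread can be sketched in Python. The quoted Java at line 947 stops processing partitions once the estimated row count reaches the simple-limit value, and the check at line 870 relies on file length rather than the Parquet footer. The sketch below is an assumption-laden model, not the HdfsScanNode code: the function name, the data layout, and in particular the rule "each non-empty file counts as one estimated row" are hypothetical. That rule is exactly what a zero-row Parquet file with a non-empty footer would violate, which is the corner case the reviewers propose to handle through documentation.

```python
from typing import List, Sequence, Tuple


def select_partitions_for_simple_limit(
    partitions: Sequence[Tuple[str, Sequence[int]]],
    limit: int,
) -> Tuple[List[str], int]:
    """Pick partitions until enough rows are estimated to satisfy a LIMIT.

    partitions: (partition name, list of file lengths in bytes) pairs.
    limit: the LIMIT value, assumed to be >= 1.

    Assumption (hypothetical): every file with length > 0 holds at least
    one row. A Parquet file containing only a footer breaks this assumption.
    """
    selected: List[str] = []
    estimated_rows = 0
    for name, file_lengths in partitions:
        selected.append(name)
        for length in file_lengths:
            if length > 0:
                estimated_rows += 1
            if estimated_rows >= limit:
                break
        if estimated_rows >= limit:
            # Estimated rows already reached the limit, so there is
            # no need to process more partitions.
            break
    return selected, estimated_rows
```

With three partitions and a limit of 2, the loop stops after the second partition instead of visiting all of them, which is where the planning-time saving would come from.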
[Impala-ASF-CR] IMPALA-8990: Fix flakiness in test set request pool
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16749 ) Change subject: IMPALA-8990: Fix flakiness in test_set_request_pool .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16749 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife06509e936443579ca60780013ce01352c8932e Gerrit-Change-Number: 16749 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 19 Nov 2020 16:49:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10329 Change apt install retry times to 30
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/16725 ) Change subject: IMPALA-10329 Change apt install retry times to 30 .. Patch Set 3: > for 'why should it be done', no reason, just don't output to the > console, I already tested, adding it or not don't impact the logic. In that case, let's not redirect. I think that's more the style of the rest of the script. -- To view, visit http://gerrit.cloudera.org:8080/16725 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I790750da36ad53c87a830dfab6803a1862490daf Gerrit-Change-Number: 16725 Gerrit-PatchSet: 3 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Comment-Date: Thu, 19 Nov 2020 16:17:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/16721 ) Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. Patch Set 4: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/16721/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/16721/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2012 PS2, Line 2012: ncompleteTable && isSynchronizedIceber > We still need to invoke HMS dropTable for synchronized tables that don't ha I am also unsure about this scenario, but I preferred not to change the original handling. -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Thu, 19 Nov 2020 15:10:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9121: try to avoid ASAN error in hdfs-util-test
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16748 ) Change subject: IMPALA-9121: try to avoid ASAN error in hdfs-util-test .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic7b42be0f8b5d6c6a31095f9d1a278fd82bd500c Gerrit-Change-Number: 16748 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 19 Nov 2020 14:51:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9121: try to avoid ASAN error in hdfs-util-test
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16748 ) Change subject: IMPALA-9121: try to avoid ASAN error in hdfs-util-test .. IMPALA-9121: try to avoid ASAN error in hdfs-util-test I couldn't discern the likely root cause of the ASAN error, but have a hunch that it's a background thread accessing some data structure that is being torn down as the process exits. The tests in this file are simple so there shouldn't really be that much that can go wrong, except for the stuff started by ExecEnv::Init(). I modified the test to only initialize the necessary configs in ExecEnv, not start up the whole thing. Hopefully that makes the problem go away. Testing: Looped the test locally with ASAN. Change-Id: Ic7b42be0f8b5d6c6a31095f9d1a278fd82bd500c Reviewed-on: http://gerrit.cloudera.org:8080/16748 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/util/hdfs-util-test.cc 3 files changed, 18 insertions(+), 15 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/16748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ic7b42be0f8b5d6c6a31095f9d1a278fd82bd500c Gerrit-Change-Number: 16748 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] WiP: IMPALA-10237: Support Bucket and Truncate partition transforms as built-in functions
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16741 ) Change subject: WiP: IMPALA-10237: Support Bucket and Truncate partition transforms as built-in functions .. Patch Set 1: Thanks for working on this. The change looks great, though I became a bit unsure about whether we want to make these functions visible to the user. However, it's definitely useful during development. Maybe write C++ unit tests instead of e2e tests, and later we can decide the visibility of these functions. -- To view, visit http://gerrit.cloudera.org:8080/16741 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I485680cf79d96d578dd8cfbfd554bec468fe84bd Gerrit-Change-Number: 16741 Gerrit-PatchSet: 1 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 19 Nov 2020 14:46:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10314: Optimize planning time for simple limits
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/16723 ) Change subject: IMPALA-10314: Optimize planning time for simple limits .. Patch Set 5: (2 comments)
Looks very good! The empty parquet data file may be a corner case to worry about.
http://gerrit.cloudera.org:8080/#/c/16723/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:
http://gerrit.cloudera.org:8080/#/c/16723/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@870
PS5, Line 870: fd.getFileLength()
Is it possible for a Parquet data file with zero rows to pass this test? Note that because of the metadata portion, such a file will still have a nonzero number of bytes. See one case here: https://github.com/G-Research/ParquetSharp/issues/110. If we can look at the metadata of such a file, the number of rows is right there.
    struct FileMetaData {
      /** Version of this file **/
      1: required i32 version

      /** Parquet schema for this file. This schema contains metadata for all the columns.
       * The schema is represented as a tree with a single root. The nodes of the tree
       * are flattened to a list by doing a depth-first traversal.
       * The column metadata contains the path in the schema for that column which can be
       * used to map columns to nodes in the schema.
       * The first element is the root **/
      2: required list<SchemaElement> schema;

      /** Number of rows in this file **/
      3: required i64 num_rows
http://gerrit.cloudera.org:8080/#/c/16723/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@947
PS5, Line 947: if (isSimpleLimit && simpleLimitNumRows ==
             :     analyzer.getSimpleLimitStatus().second) {
             :   // for the simple limit case if the estimated rows has already reached the limit
             :   // there's no need to process more partitions
             :   break;
             : }
This is good.
-- To view, visit http://gerrit.cloudera.org:8080/16723 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9d6a79263bc092e0f3e9a1d72da5618f3cc35574 Gerrit-Change-Number: 16723 Gerrit-PatchSet: 5 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 19 Nov 2020 14:30:23 + Gerrit-HasComments: Yes
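The corner case raised in the review above can be illustrated with a small sketch. This is not Impala's planner code: the class SimpleLimitSketch, the FileInfo type and the filesVisited method are hypothetical names invented for illustration, and the sketch assumes that every file judged non-empty contributes at least one row to the estimate. It contrasts a byte-length check (in the spirit of fd.getFileLength()) with a check on the num_rows value from the Parquet footer, and stops early once the estimate reaches the limit.

```java
import java.util.List;

/** Hypothetical sketch; none of these names are taken from Impala. */
class SimpleLimitSketch {
  /** Stand-in for a file descriptor: byte length plus the footer's num_rows. */
  static final class FileInfo {
    final long lengthBytes;
    final long numRows;

    FileInfo(long lengthBytes, long numRows) {
      this.lengthBytes = lengthBytes;
      this.numRows = numRows;
    }
  }

  /**
   * Returns how many files are visited before the estimated row count reaches
   * the limit. Each file judged non-empty adds one row to the estimate.
   * With useFooterRowCount == false a file is non-empty if it has any bytes,
   * so a zero-row Parquet file (footer only) is wrongly counted.
   */
  static int filesVisited(List<FileInfo> files, long limit, boolean useFooterRowCount) {
    long estimatedRows = 0;
    int visited = 0;
    for (FileInfo f : files) {
      if (estimatedRows >= limit) {
        break; // limit already reached: no need to look at more files
      }
      visited++;
      boolean nonEmpty = useFooterRowCount ? f.numRows > 0 : f.lengthBytes > 0;
      if (nonEmpty) {
        estimatedRows++;
      }
    }
    return visited;
  }
}
```

With a zero-row file listed first, the length-based check stops after that one file even though it cannot supply any rows, while the footer-based check moves on to the next file.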
[Impala-ASF-CR] IMPALA-10332: Add file formats to HdfsScanNode's thrift representation.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16728 ) Change subject: IMPALA-10332: Add file formats to HdfsScanNode's thrift representation. .. Patch Set 8: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6680/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16728 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iad6b8271bd248983f327c07883a3bedf50f25b5d Gerrit-Change-Number: 16728 Gerrit-PatchSet: 8 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 19 Nov 2020 13:08:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10332: Add file formats to HdfsScanNode's thrift representation.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16728 ) Change subject: IMPALA-10332: Add file formats to HdfsScanNode's thrift representation. .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7689/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16728 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iad6b8271bd248983f327c07883a3bedf50f25b5d Gerrit-Change-Number: 16728 Gerrit-PatchSet: 7 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 19 Nov 2020 12:38:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10332: Add file formats to HdfsScanNode's thrift representation.
Daniel Becker has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/16728 ) Change subject: IMPALA-10332: Add file formats to HdfsScanNode's thrift representation. .. IMPALA-10332: Add file formats to HdfsScanNode's thrift representation. List all file formats that an HdfsScanNode needs to process in any fragment instance. It is possible that some file formats will not be needed in all fragment instances. This is a step towards sharing codegen between different Impala backends. Using the file formats provided in the thrift file, a backend can codegen code for file formats that are not needed in its own process but are needed in other fragment instances running on other backends, and the resulting binary can be shared between multiple backends. Codegenning for file formats will be done based on the thrift message and not on what is needed for the actual backend. This leads to some extra work in case a file format is not needed for the current backend and codegen sharing is not available (at this point it is not implemented). However, the overall number of such cases is low. Also adding the file formats to the node's explain string at level 3. Testing: - Added tests to verify that the file formats are present in the explain string at level 3. 
Change-Id: Iad6b8271bd248983f327c07883a3bedf50f25b5d --- M be/src/exec/hdfs-scan-node-base.cc M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test M testdata/workloads/functional-query/queries/QueryTest/explain-level3.test 5 files changed, 60 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/16728/7 -- To view, visit http://gerrit.cloudera.org:8080/16728 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iad6b8271bd248983f327c07883a3bedf50f25b5d Gerrit-Change-Number: 16728 Gerrit-PatchSet: 7 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins
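The idea in the commit message above, listing every file format a scan node needs in any fragment instance, amounts to taking a union over the per-instance format sets. The sketch below only illustrates that idea under stated assumptions: FileFormatUnionSketch, the FileFormat enum and formatsForScanNode are hypothetical names, not the fields or methods added by the patch.

```java
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch; names and enum values are illustrative only. */
class FileFormatUnionSketch {
  enum FileFormat { TEXT, PARQUET, ORC, AVRO }

  /**
   * Returns the union of the file formats needed by each fragment instance.
   * A backend that generates code from this set covers formats that only
   * other instances need, which is what would make the result shareable.
   */
  static Set<FileFormat> formatsForScanNode(List<List<FileFormat>> formatsPerInstance) {
    Set<FileFormat> all = EnumSet.noneOf(FileFormat.class);
    for (List<FileFormat> instanceFormats : formatsPerInstance) {
      all.addAll(instanceFormats);
    }
    return all;
  }
}
```

An instance that scans only Parquet files would still see TEXT in the set if another instance scans text files, which is the extra work the commit message accepts.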
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16721 ) Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7688/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Thu, 19 Nov 2020 11:58:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16721 ) Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7687/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Thu, 19 Nov 2020 11:48:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16721 ) Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. Patch Set 4: PS4 is only a rebase. -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Thu, 19 Nov 2020 11:42:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Hello Gabor Kaszab, wangsheng, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16721 to look at the new patch set (#4). Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. IMPALA-10152: Add support for Iceberg HiveCatalog HiveCatalog is one of Iceberg's catalog implementations. It uses the Hive metastore and it is the recommended catalog implementation when the table data is stored in object stores like S3. This commit updates the Iceberg version to a newer one, and it also retrieves Iceberg from the CDP distribution because that version of Iceberg is built against Hive 3 (Impala is only compatible with Hive 3). This commit makes HiveCatalog the default Iceberg catalog in Impala because it can be used in more environments (e.g. cloud stores), and it is more featureful. Also, other engines that store their table metadata in HMS will probably use HiveCatalog as well. Tables stored in HiveCatalog are similar to Kudu tables with HMS integration, i.e. modifying an Iceberg table via the Iceberg APIs also modifies the HMS table. So in CatalogOpExecutor we handle such Iceberg tables similarly to integrated Kudu tables. 
Testing: * Added e2e tests for creating, writing, and altering Iceberg tables * Added SHOW CREATE TABLE tests Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 --- M bin/impala-config.sh M common/thrift/CatalogObjects.thrift M fe/pom.xml M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java A fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M testdata/workloads/functional-query/queries/QueryTest/iceberg-alter.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test 14 files changed, 524 insertions(+), 90 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/16721/4 -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16721 ) Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. Patch Set 3: (3 comments)
http://gerrit.cloudera.org:8080/#/c/16721/1//COMMIT_MSG
Commit Message:
http://gerrit.cloudera.org:8080/#/c/16721/1//COMMIT_MSG@29
PS1, Line 29: e2e
> e2e
Done
http://gerrit.cloudera.org:8080/#/c/16721/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:
http://gerrit.cloudera.org:8080/#/c/16721/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2012
PS2, Line 2012:
> This will "double drop" Kudu tables where existingTbl instanceof Incomplet
We still need to invoke HMS dropTable for synchronized tables that don't have HMS integration enabled. So the "double drop" can only happen when
    existingTbl instanceof IncompleteTable && msTbl table could be retrieved && isHmsIntegrationAutomatic(msTbl)
I'm not sure if we can hit such a scenario with normal usage, but anyway I restricted this condition to Iceberg tables.
http://gerrit.cloudera.org:8080/#/c/16721/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2015
PS2, Line 2015: !isHmsIntegrationA
> it calls dropTable, so needsHmsDropTable would clearer
Done
-- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Thu, 19 Nov 2020 11:29:11 + Gerrit-HasComments: Yes
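The two conditions discussed in the reply above can be written out as plain boolean predicates. This is only one reading of the comment, not the code in CatalogOpExecutor: the class DoubleDropSketch and both method names and parameters are hypothetical, and the real checks operate on table objects rather than on flags.

```java
/** Hypothetical sketch of the conditions described in the review reply. */
class DoubleDropSketch {
  /**
   * Per the reply, a "double drop" is only possible when the cached table is
   * an IncompleteTable, the HMS table could still be retrieved, and HMS
   * integration is automatic for that table.
   */
  static boolean wouldDoubleDrop(boolean existingIsIncompleteTable,
      boolean msTblRetrieved, boolean hmsIntegrationAutomatic) {
    return existingIsIncompleteTable && msTblRetrieved && hmsIntegrationAutomatic;
  }

  /**
   * Per the reply, synchronized tables without HMS integration still need an
   * explicit HMS dropTable call (assumed reading of "needsHmsDropTable").
   */
  static boolean needsHmsDropTable(boolean isSynchronizedTable,
      boolean hmsIntegrationAutomatic) {
    return isSynchronizedTable && !hmsIntegrationAutomatic;
  }
}
```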
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Hello Gabor Kaszab, wangsheng, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16721 to look at the new patch set (#3). Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. IMPALA-10152: Add support for Iceberg HiveCatalog HiveCatalog is one of Iceberg's catalog implementations. It uses the Hive metastore and it is the recommended catalog implementation when the table data is stored in object stores like S3. This commit updates the Iceberg version to a newer one, and it also retrieves Iceberg from the CDP distribution because that version of Iceberg is built against Hive 3 (Impala is only compatible with Hive 3). This commit makes HiveCatalog the default Iceberg catalog in Impala because it can be used in more environments (e.g. cloud stores), and it is more featureful. Also, other engines that store their table metadata in HMS will probably use HiveCatalog as well. Tables stored in HiveCatalog are similar to Kudu tables with HMS integration, i.e. modifying an Iceberg table via the Iceberg APIs also modifies the HMS table. So in CatalogOpExecutor we handle such Iceberg tables similarly to integrated Kudu tables. 
Testing: * Added e2e tests for creating, writing, and altering Iceberg tables * Added SHOW CREATE TABLE tests Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 --- M bin/impala-config.sh M common/thrift/CatalogObjects.thrift M fe/pom.xml M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java A fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M testdata/workloads/functional-query/queries/QueryTest/iceberg-alter.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test 14 files changed, 524 insertions(+), 90 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/16721/3 -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng
[Impala-ASF-CR] IMPALA-9121: try to avoid ASAN error in hdfs-util-test
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16748 ) Change subject: IMPALA-9121: try to avoid ASAN error in hdfs-util-test .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6679/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic7b42be0f8b5d6c6a31095f9d1a278fd82bd500c Gerrit-Change-Number: 16748 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 19 Nov 2020 09:26:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9121: try to avoid ASAN error in hdfs-util-test
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16748 ) Change subject: IMPALA-9121: try to avoid ASAN error in hdfs-util-test .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic7b42be0f8b5d6c6a31095f9d1a278fd82bd500c Gerrit-Change-Number: 16748 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 19 Nov 2020 09:26:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9121: try to avoid ASAN error in hdfs-util-test
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16748 ) Change subject: IMPALA-9121: try to avoid ASAN error in hdfs-util-test .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic7b42be0f8b5d6c6a31095f9d1a278fd82bd500c Gerrit-Change-Number: 16748 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 19 Nov 2020 09:26:32 + Gerrit-HasComments: No