[Impala-ASF-CR] IMPALA-6802 (part 3): Clean up authorization tests
Adam Holley has posted comments on this change. ( http://gerrit.cloudera.org:8080/10358 ) Change subject: IMPALA-6802 (part 3): Clean up authorization tests .. Patch Set 2: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/10358 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9681cc3c7094db33ab7c5caa69b99dd803b908b7 Gerrit-Change-Number: 10358 Gerrit-PatchSet: 2 Gerrit-Owner: Fredy WijayaGerrit-Reviewer: Adam Holley Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Fredy Wijaya Gerrit-Comment-Date: Sat, 12 May 2018 03:44:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6907: Close stale connections to removed cluster members
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10327 ) Change subject: IMPALA-6907: Close stale connections to removed cluster members .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10327 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I41b7297cf665bf291b09b23524d19b1d10ab281d Gerrit-Change-Number: 10327 Gerrit-PatchSet: 5 Gerrit-Owner: Michael HoGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tianyi Wang Gerrit-Comment-Date: Sat, 12 May 2018 03:32:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6907: Close stale connections to removed cluster members
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10327 ) Change subject: IMPALA-6907: Close stale connections to removed cluster members .. IMPALA-6907: Close stale connections to removed cluster members Previously, ImpalaServer::MembershipCallback() is used by each Impala backend node to update cluster membership. It also removes stale connections to nodes which are no longer members of the cluster. However, the way it detects removed member is flawed as it relies on query_locations_ to determine whether stale connections may exist to the removed members. query_locations_ is a map of host name to a set of queries running on that host. A entry for a remote node only exists in query_locations_ if an Impalad node has acted as coordinator of a query with fragment instances scheduled to run on that remote node. This change fixes this problem by closing connections to remote hosts which are removed from the cluster regardless of whether it can be found in query_locations_. A new test is added to exercise this path by restarting Impalad backend nodes between queries. Also change impala_cluster.py to use bin/start-impala.sh to start Impala demon instead of directly forking and exec'ing Impalad. This is needed as start-impala.sh sets up the proper Java related environment variables. Change-Id: I41b7297cf665bf291b09b23524d19b1d10ab281d Reviewed-on: http://gerrit.cloudera.org:8080/10327 Reviewed-by: Michael HoTested-by: Impala Public Jenkins --- M be/src/service/impala-server.cc M tests/common/impala_cluster.py M tests/custom_cluster/test_restart_services.py 3 files changed, 56 insertions(+), 35 deletions(-) Approvals: Michael Ho: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/10327 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I41b7297cf665bf291b09b23524d19b1d10ab281d Gerrit-Change-Number: 10327 Gerrit-PatchSet: 6 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tianyi Wang
[Impala-ASF-CR](2.x) IMPALA-7010: don't run memory usage tests on non-HDFS
Hello Impala Public Jenkins, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/10387 to review the following change. Change subject: IMPALA-7010: don't run memory usage tests on non-HDFS .. IMPALA-7010: don't run memory usage tests on non-HDFS Moved a number of tests with tuned mem_limits. In some cases this required separating the tests from non-tuned functional tests. TestQueryMemLimit used very high and very low limits only, so seemed safe to run in all configurations. Change-Id: I9686195a29dde2d87b19ef8bb0e93e08f8bee662 Reviewed-on: http://gerrit.cloudera.org:8080/10370 Reviewed-by: Tim ArmstrongTested-by: Impala Public Jenkins --- A testdata/workloads/functional-query/queries/QueryTest/insert-mem-limit.test M testdata/workloads/functional-query/queries/QueryTest/insert.test M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test A testdata/workloads/functional-query/queries/QueryTest/kudu_insert_mem_limit.test A testdata/workloads/functional-query/queries/QueryTest/nested-types-tpch-mem-limit.test M testdata/workloads/functional-query/queries/QueryTest/nested-types-tpch.test M testdata/workloads/functional-query/queries/QueryTest/spilling-aggs.test R testdata/workloads/functional-query/queries/QueryTest/spilling-regression-exhaustive.test M tests/common/skip.py M tests/query_test/test_insert.py M tests/query_test/test_kudu.py M tests/query_test/test_mem_usage_scaling.py M tests/query_test/test_nested_types.py M tests/query_test/test_sort.py M tests/query_test/test_spilling.py 15 files changed, 198 insertions(+), 157 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/10387/1 -- To view, visit http://gerrit.cloudera.org:8080/10387 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: newchange Gerrit-Change-Id: I9686195a29dde2d87b19ef8bb0e93e08f8bee662 Gerrit-Change-Number: 10387 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR](2.x) IMPALA-7010: don't run memory usage tests on non-HDFS
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10387 ) Change subject: IMPALA-7010: don't run memory usage tests on non-HDFS .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2460/ -- To view, visit http://gerrit.cloudera.org:8080/10387 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: comment Gerrit-Change-Id: I9686195a29dde2d87b19ef8bb0e93e08f8bee662 Gerrit-Change-Number: 10387 Gerrit-PatchSet: 1 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 12 May 2018 02:47:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6957: calc thread resource requirement in planner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10256 ) Change subject: IMPALA-6957: calc thread resource requirement in planner .. Patch Set 8: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10256 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I982837ef883457fa4d2adc3bdbdc727353469140 Gerrit-Change-Number: 10256 Gerrit-PatchSet: 8 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 12 May 2018 01:43:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6957: calc thread resource requirement in planner
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10256 ) Change subject: IMPALA-6957: calc thread resource requirement in planner .. IMPALA-6957: calc thread resource requirement in planner This only factors in fragment execution threads. E.g. this does *not* try to account for the number of threads on the old Thrift RPC code path if that is enabled. This is loosely related to the old VCores estimate, but is different in that it: * Directly ties into the notion of required threads in ThreadResourceMgr. * Is a strict upper bound on the number of such threads, rather than an estimate. Does not include "optional" threads. ThreadResourceMgr in the backend bounds the number of "optional" threads per impalad, so the number of execution threads on a backend is limited by sum(required threads per query) + CpuInfo::num_cores() * FLAGS_num_threads_per_core DCHECKS in the backend enforce that the calculation is correct. They were actually hit in KuduScanNode because of some races in thread management leading to multiple "required" threads running. Now the first thread in the multithreaded scans never exits, which means that it's always safe for any of the other threads to exit early, which simplifies the logic a lot. Testing: Updated planner tests. Ran core tests. Change-Id: I982837ef883457fa4d2adc3bdbdc727353469140 Reviewed-on: http://gerrit.cloudera.org:8080/10256 Reviewed-by: Tim ArmstrongTested-by: Impala Public Jenkins --- M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scan-node.h M be/src/exec/kudu-scan-node.cc M be/src/exec/kudu-scan-node.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/query-state.cc M be/src/runtime/runtime-state.cc M be/src/runtime/thread-resource-mgr.cc M be/src/runtime/thread-resource-mgr.h M be/src/scheduling/admission-controller.cc M be/src/scheduling/query-schedule.h M be/src/scheduling/scheduler.cc M common/thrift/Frontend.thrift M common/thrift/ImpalaInternalService.thrift M common/thrift/Planner.thrift M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/PlanFragment.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java M fe/src/main/java/org/apache/impala/planner/ResourceProfileBuilder.java M fe/src/main/java/org/apache/impala/planner/SortNode.java M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test M testdata/workloads/functional-planner/queries/PlannerTest/max-row-size.test M testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test M testdata/workloads/functional-planner/queries/PlannerTest/partition-pruning.test M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test M testdata/workloads/functional-planner/queries/PlannerTest/sort-expr-materialization.test M testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test M testdata/workloads/functional-planner/queries/PlannerTest/tablesample.test M testdata/workloads/functional-planner/queries/PlannerTest/union.test M testdata/workloads/functional-query/queries/QueryTest/explain-level0.test M testdata/workloads/functional-query/queries/QueryTest/explain-level1.test M testdata/workloads/functional-query/queries/QueryTest/explain-level2.test M testdata/workloads/functional-query/queries/QueryTest/explain-level3.test 43 files changed, 1,977 insertions(+), 1,931 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/10256 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I982837ef883457fa4d2adc3bdbdc727353469140 Gerrit-Change-Number: 10256 Gerrit-PatchSet: 9 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala
[Impala-ASF-CR] IMPALA-6221: [DOCS] Correct syntax for Kudu Partition clause
Alex Rodoni has posted comments on this change. ( http://gerrit.cloudera.org:8080/10353 ) Change subject: IMPALA-6221: [DOCS] Correct syntax for Kudu Partition clause .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/10353/1/docs/topics/impala_create_table.xml File docs/topics/impala_create_table.xml: http://gerrit.cloudera.org:8080/#/c/10353/1/docs/topics/impala_create_table.xml@215 PS1, Line 215: > missing a comma Done -- To view, visit http://gerrit.cloudera.org:8080/10353 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8d8ccdcc8b28fcd44511040f1e05969af83b3d1e Gerrit-Change-Number: 10353 Gerrit-PatchSet: 1 Gerrit-Owner: Alex RodoniGerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Sat, 12 May 2018 01:33:17 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6221: [DOCS] Correct syntax for Kudu Partition clause
Hello Thomas Tauber-Marshall, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10353 to look at the new patch set (#2). Change subject: IMPALA-6221: [DOCS] Correct syntax for Kudu Partition clause .. IMPALA-6221: [DOCS] Correct syntax for Kudu Partition clause Change-Id: I8d8ccdcc8b28fcd44511040f1e05969af83b3d1e --- M docs/topics/impala_create_table.xml 1 file changed, 2 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/10353/2 -- To view, visit http://gerrit.cloudera.org:8080/10353 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8d8ccdcc8b28fcd44511040f1e05969af83b3d1e Gerrit-Change-Number: 10353 Gerrit-PatchSet: 2 Gerrit-Owner: Alex RodoniGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-7010: don't run memory usage tests on non-HDFS
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10370 ) Change subject: IMPALA-7010: don't run memory usage tests on non-HDFS .. Patch Set 5: This change did not cherrypick successfully into branch 2.x. To resolve this, please do the cherry-pick manually and submit it to Gerrit at refs/for/2.x or add an exception to the branch 2.x copy of bin/ignored_commits.json. Thanks, your friendly bot at https://jenkins.impala.io/job/cherrypick-2.x-and-test/482/ . -- To view, visit http://gerrit.cloudera.org:8080/10370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9686195a29dde2d87b19ef8bb0e93e08f8bee662 Gerrit-Change-Number: 10370 Gerrit-PatchSet: 5 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 12 May 2018 01:30:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6983: stress: don't write a null runtime profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10381 ) Change subject: IMPALA-6983: stress: don't write a null runtime profile .. Patch Set 1: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/2457/ -- To view, visit http://gerrit.cloudera.org:8080/10381 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic8299a8a97ad1f2bd1f2927e3111db8df1d3a3e5 Gerrit-Change-Number: 10381 Gerrit-PatchSet: 1 Gerrit-Owner: Michael BrownGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Nithya Janarthanan Gerrit-Comment-Date: Sat, 12 May 2018 00:59:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7003: Deflake erasure coding data loading
Tianyi Wang has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/10362 ) Change subject: IMPALA-7003: Deflake erasure coding data loading .. IMPALA-7003: Deflake erasure coding data loading Erasure coding data loading is flaky in two ways: 1. HBase sometimes doesn't work because of HBase-19369 2. Nested data loading sometimes fails because the HDFS namenode cannot find enough good datanodes. For problem 1, this patch enables erasure coding only on /test-warehouse directory. For problem 2, this patch sets dfs.namenode.redundancy.considerLoad to false, preventing namenode from excluding heavily-loaded datanodes. Change-Id: I219106cd3ec7ffab7a834700f2a722b165e5f66c --- M bin/impala-config.sh M testdata/bin/create-load-data.sh M testdata/bin/load-test-warehouse-snapshot.sh M testdata/bin/setup-hdfs-env.sh M testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl 5 files changed, 44 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/10362/2 -- To view, visit http://gerrit.cloudera.org:8080/10362 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I219106cd3ec7ffab7a834700f2a722b165e5f66c Gerrit-Change-Number: 10362 Gerrit-PatchSet: 2 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tianyi Wang
[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10158 ) Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify state .. IMPALA-5384, part 2: Simplify Coordinator locking and clarify state The is the final change to clarify and break up the Coordinator's lock. The state machine for the coordinator is made explicit, distinguishing between executing state and multiple terminal states. Logic to transition into a terminal state is centralized in one location and executes exactly once for each coordinator object. Derived from a patch for IMPALA_5384 by Marcel Kornacker. Testing: - exhaustive functional tests - stress test on minicluster with memory overcommitment. Verified from the logs that this exercises all these paths: - successful queries - client requested cancellation - error from exec FInstances RPC - error reported asynchronously via report status RPC - eos before backend execution completed Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e Reviewed-on: http://gerrit.cloudera.org:8080/10158 Reviewed-by: Dan HechtTested-by: Impala Public Jenkins --- M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/service/client-request-state.cc M be/src/service/impala-server.h M be/src/util/counting-barrier.h 6 files changed, 393 insertions(+), 392 deletions(-) Approvals: Dan Hecht: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/10158 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e Gerrit-Change-Number: 10158 Gerrit-PatchSet: 22 Gerrit-Owner: Dan Hecht Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10158 ) Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify state .. Patch Set 21: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10158 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e Gerrit-Change-Number: 10158 Gerrit-PatchSet: 21 Gerrit-Owner: Dan HechtGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 12 May 2018 00:46:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6907: Close stale connections to removed cluster members
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10327 ) Change subject: IMPALA-6907: Close stale connections to removed cluster members .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2459/ -- To view, visit http://gerrit.cloudera.org:8080/10327 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I41b7297cf665bf291b09b23524d19b1d10ab281d Gerrit-Change-Number: 10327 Gerrit-PatchSet: 5 Gerrit-Owner: Michael HoGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tianyi Wang Gerrit-Comment-Date: Sat, 12 May 2018 00:04:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/10060 ) Change subject: IMPALA-5216: Make admission control queuing async .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc@1147 PS7, Line 1147: void ClientRequestState::FinishExecQueryOrDmlRequest() { : #ifndef NDEBUG : SleepIfSetInDebugOptions(schedule_->query_options(), SLEEP_BEFORE_ADMISSION_MS); : #endif : // Remove if-check after IMPALA-5814. : if (ExecEnv::GetInstance()->admission_controller() != nullptr) { : Status admit_status = ExecEnv::GetInstance()->admission_controller()->AdmitQuery( : schedule_.get(), _outcome_); : : #ifndef NDEBUG : SleepIfSetInDebugOptions(schedule_->query_options(), SLEEP_AFTER_ADMISSION_MS); : #endif > Given how small the window is, I think it's better to go with the simpler a Yes, it is equivalent to cancellation happening right after this check, which would be caught by the cancellation check at Line 1173 -- To view, visit http://gerrit.cloudera.org:8080/10060 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf Gerrit-Change-Number: 10060 Gerrit-PatchSet: 8 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 11 May 2018 23:59:59 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/10158 ) Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify state .. Patch Set 21: (1 comment) http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc@476 PS18, Line 476: DCHECK(exec_status_.ok()) << exec_status_; > I was thinking the ($3 -> $4) already gives that info, once you get used to Agreed. That makes sense. -- To view, visit http://gerrit.cloudera.org:8080/10158 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e Gerrit-Change-Number: 10158 Gerrit-PatchSet: 21 Gerrit-Owner: Dan HechtGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 11 May 2018 23:59:25 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6957: calc thread resource requirement in planner
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/10256 ) Change subject: IMPALA-6957: calc thread resource requirement in planner .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/10256/4/testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test File testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test: http://gerrit.cloudera.org:8080/#/c/10256/4/testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test@64 PS4, Line 64: instances > Sounds like deferring it is a good idea :) Agree, definitely outside of the scope of this change -- To view, visit http://gerrit.cloudera.org:8080/10256 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I982837ef883457fa4d2adc3bdbdc727353469140 Gerrit-Change-Number: 10256 Gerrit-PatchSet: 4 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 11 May 2018 23:16:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/10060 ) Change subject: IMPALA-5216: Make admission control queuing async .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/10060/8/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/10060/8/be/src/service/impala-server.cc@1062 PS8, Line 1062: const PerBackendExecParams& per_backend_params = : request_state->schedule()->per_backend_exec_params(); : if (!per_backend_params.empty()) { : lock_guard l(query_locations_lock_); : for (const auto& entry : per_backend_params) { : const TNetworkAddress& hostport = entry.first; : // Query may have been removed already by cancellation path. In particular, if : // node to fail was last sender to an exchange, the coordinator will realise and : // fail the query at the same time the failure detection path does the same : // thing. They will harmlessly race to remove the query from this map. : QueryLocations::iterator it = query_locations_.find(hostport); : if (it != query_locations_.end()) { : it->second.erase(request_state->query_id()); : } : } : } forgot to change this part, since after last patch query locations are always added even if coord has not been started. Just need to enable this part to execute if a valid schedule exists -- To view, visit http://gerrit.cloudera.org:8080/10060 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf Gerrit-Change-Number: 10060 Gerrit-PatchSet: 8 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 11 May 2018 23:03:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6802 (part 3): Clean up authorization tests
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/10358 ) Change subject: IMPALA-6802 (part 3): Clean up authorization tests .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/10358/1/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java File fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java: http://gerrit.cloudera.org:8080/#/c/10358/1/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java@750 PS1, Line 750: .error(selectError("functional.alltypesagg")) : .error(selectError("functional.alltypesagg"), onServer(allExcept( > I'm not sure this test makes sense because you can only grant SELECT at the Done. I'll just remove all other onColumn error checks that don't make sense. -- To view, visit http://gerrit.cloudera.org:8080/10358 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9681cc3c7094db33ab7c5caa69b99dd803b908b7 Gerrit-Change-Number: 10358 Gerrit-PatchSet: 2 Gerrit-Owner: Fredy WijayaGerrit-Reviewer: Adam Holley Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Fredy Wijaya Gerrit-Comment-Date: Fri, 11 May 2018 22:53:49 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6802 (part 3): Clean up authorization tests
Fredy Wijaya has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/10358 ) Change subject: IMPALA-6802 (part 3): Clean up authorization tests .. IMPALA-6802 (part 3): Clean up authorization tests The third part of this patch is to rewrite the following authorization tests: - with - union - reset metadata - show Testing: - Added new authorization tests - Ran all front-end tests Change-Id: I9681cc3c7094db33ab7c5caa69b99dd803b908b7 Cherry-picks: not for 2.x --- M fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java 1 file changed, 443 insertions(+), 27 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/10358/2 -- To view, visit http://gerrit.cloudera.org:8080/10358 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9681cc3c7094db33ab7c5caa69b99dd803b908b7 Gerrit-Change-Number: 10358 Gerrit-PatchSet: 2 Gerrit-Owner: Fredy WijayaGerrit-Reviewer: Adam Holley Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Fredy Wijaya
[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/10060 ) Change subject: IMPALA-5216: Make admission control queuing async .. Patch Set 7: (40 comments) http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.h File be/src/scheduling/admission-controller.h: http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.h@44 PS7, Line 44: ClientRequestState::admit_outcome_ > if this enum is meant to be consumed publicly, then it's best to not refer Done http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.h@201 PS7, Line 201: admit_status > admit_outcome? Done http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.h@203 PS7, Line 203: cancelled > given that there's no Cancel() API in this class, how is that done? (I'm gu Done http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.h@204 PS7, Line 204: admit_status > admit_outcome Done http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.cc@112 PS7, Line 112: Q > nit: don't capitalize queue Done http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.cc@899 PS7, Line 899: false; > use && rather than ?:false Done http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.h File be/src/service/client-request-state.h: http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.h@72 PS7, Line 72: /// Must *not* be called with lock_ held. > given that the coordinator object is (unfortunately) current exposed by thi Changed coord() to GetCoord() and added those details to its comment instead. http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.h@114 PS7, Line 114: non-error states (RUNNING_STATE and FINISHED_STATE) > is PENDING_STATE one of those? added it to this method, and updated the implementation to honor the state transition order of: INITIALIZED_STATE -> PENDING_STATE -> RUNNING_STATE -> FINISHED_STATE http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.h@152 PS7, Line 152: coord_ > let's try not to talk about private members when documenting the public int Done http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.h@172 PS7, Line 172: Coordinator* coord() const { : return coord_exec_started_.Load() ? coord_.get() : nullptr; : } > it's confusing that coord() is not actually an accessor given it's name. If Renamed to GetCoord(). For the second part of your comment: That was my initial thought as well, but what happens is that there is a small window between calling coord->Exec() and setting the "coord_" variable in CRS, where the started fragments send in UpdateBackendExecStatus RPCs which we need to pass to the coord object that we currently do by looking up the CRS object directly in "client_request_state_map_". This worked up till now since coord_ is set before the Exec() call but now if we defer setting it, then we get a nullptr instead. I thought of 2 approaches to to solve this: 1. the one that I implemented using "coord_exec_started_" for controlling external access and adding methods CRS::UpdateBackendExecStatus(), CRS::UpdateFilter() that access the "coord_" object directly. OR 2. To defer setting the "coord_" variable in CRS and to serve the UpdateBackendExecStatus and UpdateFilter RPCs by keeping a list of active coordinators separately and looking them up there. Update: As discussed, I will try to look into a way of cleaning this up. http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.h@323 PS7, Line 323: Exec() has been successfully called on it > why do we bother creating the coordinator object ahead of when we call Exec see previous comment. http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc@481 PS7, Line 481: query-exec-state > Is there a reason we can't update the group name to client-request-state? I I dont see any reason not to. Since this only needs to be changed in 2 other places apart from this (seems like a small enough change), would you recommend I make the change in this commit itself? http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc@482 PS7, Line 482: SubmitToAdmissionController, this, : _controller_thread_, > I think it's a bit confusing that we call this the "admission controller" t Excellent idea. Done. I am a bit iffy about calling it the
[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async
Hello Tim Armstrong, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10060 to look at the new patch set (#8). Change subject: IMPALA-5216: Make admission control queuing async .. IMPALA-5216: Make admission control queuing async Implement asynchronous admission control queuing. This is achieved by running the admission control code-path in a separate thread. Major changes include: propagating cancellation to the admission control thread and dequeuing thread, and adding a new Query Operation State called "PENDING" that represents the state between completion of planning and starting of query execution. Testing: - Added a deterministic end to end test - Ran multiple stress tests successfully with a cancellation probability of 60% and with different values for the following parameters: max_requests, queue_wait_timeout_ms. Ensured that the impalad was in a valid state afterwards (no orphan fragments or wrong metrics). - Ran all exhaustive tests and ASAN core tests successfully. TODO: change terminology of "in_flight_query" to "submitted_queries". Need to identify all references of this terminology, eg. in comments, tests, variable names, etc. Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf --- M be/src/common/atomic.h M be/src/common/logging.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/scheduling/query-schedule.cc M be/src/scheduling/query-schedule.h M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-beeswax-server.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-http-handler.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/debug-util.cc M be/src/util/debug-util.h M be/src/util/promise-test.cc M be/src/util/promise.h M common/thrift/ImpalaService.thrift M testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters.test M tests/authorization/test_authorization.py M tests/beeswax/impala_beeswax.py M tests/common/impala_connection.py M tests/custom_cluster/test_admission_controller.py M tests/custom_cluster/test_krpc_mem_usage.py M tests/hs2/hs2_test_suite.py M tests/hs2/test_hs2.py M tests/query_test/test_observability.py M www/query_backends.tmpl 30 files changed, 649 insertions(+), 227 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/10060/8 -- To view, visit http://gerrit.cloudera.org:8080/10060 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf Gerrit-Change-Number: 10060 Gerrit-PatchSet: 8 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7010: don't run memory usage tests on non-HDFS
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10370 ) Change subject: IMPALA-7010: don't run memory usage tests on non-HDFS .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9686195a29dde2d87b19ef8bb0e93e08f8bee662 Gerrit-Change-Number: 10370 Gerrit-PatchSet: 4 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 11 May 2018 22:41:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7010: don't run memory usage tests on non-HDFS
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10370 ) Change subject: IMPALA-7010: don't run memory usage tests on non-HDFS .. IMPALA-7010: don't run memory usage tests on non-HDFS Moved a number of tests with tuned mem_limits. In some cases this required separating the tests from non-tuned functional tests. TestQueryMemLimit used very high and very low limits only, so seemed safe to run in all configurations. Change-Id: I9686195a29dde2d87b19ef8bb0e93e08f8bee662 Reviewed-on: http://gerrit.cloudera.org:8080/10370 Reviewed-by: Tim ArmstrongTested-by: Impala Public Jenkins --- A testdata/workloads/functional-query/queries/QueryTest/insert-mem-limit.test M testdata/workloads/functional-query/queries/QueryTest/insert.test M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test A testdata/workloads/functional-query/queries/QueryTest/kudu_insert_mem_limit.test A testdata/workloads/functional-query/queries/QueryTest/nested-types-tpch-mem-limit.test M testdata/workloads/functional-query/queries/QueryTest/nested-types-tpch.test M testdata/workloads/functional-query/queries/QueryTest/spilling-aggs.test R testdata/workloads/functional-query/queries/QueryTest/spilling-regression-exhaustive.test M tests/common/skip.py M tests/query_test/test_insert.py M tests/query_test/test_kudu.py M tests/query_test/test_mem_usage_scaling.py M tests/query_test/test_nested_types.py M tests/query_test/test_sort.py M tests/query_test/test_spilling.py 15 files changed, 196 insertions(+), 156 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/10370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I9686195a29dde2d87b19ef8bb0e93e08f8bee662 Gerrit-Change-Number: 10370 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6983: stress: don't write a null runtime profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10381 ) Change subject: IMPALA-6983: stress: don't write a null runtime profile .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2457/ -- To view, visit http://gerrit.cloudera.org:8080/10381 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic8299a8a97ad1f2bd1f2927e3111db8df1d3a3e5 Gerrit-Change-Number: 10381 Gerrit-PatchSet: 1 Gerrit-Owner: Michael BrownGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Nithya Janarthanan Gerrit-Comment-Date: Fri, 11 May 2018 22:35:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6983: stress: don't write a null runtime profile
Michael Brown has posted comments on this change. ( http://gerrit.cloudera.org:8080/10381 ) Change subject: IMPALA-6983: stress: don't write a null runtime profile .. Patch Set 1: > I don't have permissions to give +2 > Looks good to me Appreciate the review! It's good to have reviews from non-committers as well, and reviewing code is a way to get you on a path toward committership. https://cwiki.apache.org/confluence/display/IMPALA/Contributing+to+Impala http://impala.apache.org/bylaws.html -- To view, visit http://gerrit.cloudera.org:8080/10381 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic8299a8a97ad1f2bd1f2927e3111db8df1d3a3e5 Gerrit-Change-Number: 10381 Gerrit-PatchSet: 1 Gerrit-Owner: Michael BrownGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Nithya Janarthanan Gerrit-Comment-Date: Fri, 11 May 2018 22:35:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6957: calc thread resource requirement in planner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10256 ) Change subject: IMPALA-6957: calc thread resource requirement in planner .. Patch Set 8: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2456/ -- To view, visit http://gerrit.cloudera.org:8080/10256 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I982837ef883457fa4d2adc3bdbdc727353469140 Gerrit-Change-Number: 10256 Gerrit-PatchSet: 8 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 11 May 2018 22:29:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6957: calc thread resource requirement in planner
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10256 ) Change subject: IMPALA-6957: calc thread resource requirement in planner .. Patch Set 8: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10256 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I982837ef883457fa4d2adc3bdbdc727353469140 Gerrit-Change-Number: 10256 Gerrit-PatchSet: 8 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 11 May 2018 22:28:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6983: stress: don't write a null runtime profile
Nithya Janarthanan has posted comments on this change. ( http://gerrit.cloudera.org:8080/10381 ) Change subject: IMPALA-6983: stress: don't write a null runtime profile .. Patch Set 1: Code-Review+1 I don't have permissions to give +2 Looks good to me -- To view, visit http://gerrit.cloudera.org:8080/10381 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic8299a8a97ad1f2bd1f2927e3111db8df1d3a3e5 Gerrit-Change-Number: 10381 Gerrit-PatchSet: 1 Gerrit-Owner: Michael BrownGerrit-Reviewer: David Knupp Gerrit-Reviewer: Nithya Janarthanan Gerrit-Comment-Date: Fri, 11 May 2018 22:23:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6221: [DOCS] Correct syntax for Kudu Partition clause
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/10353 ) Change subject: IMPALA-6221: [DOCS] Correct syntax for Kudu Partition clause .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/10353/1/docs/topics/impala_create_table.xml File docs/topics/impala_create_table.xml: http://gerrit.cloudera.org:8080/#/c/10353/1/docs/topics/impala_create_table.xml@215 PS1, Line 215: missing a comma -- To view, visit http://gerrit.cloudera.org:8080/10353 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8d8ccdcc8b28fcd44511040f1e05969af83b3d1e Gerrit-Change-Number: 10353 Gerrit-PatchSet: 1 Gerrit-Owner: Alex RodoniGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Fri, 11 May 2018 21:52:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR](2.x) Ignore 3.x version update patch
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/10383 ) Change subject: Ignore 3.x version update patch .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: comment Gerrit-Change-Id: Iec50418e2a21d662bd2d8a7f8d55c24db367086c Gerrit-Change-Number: 10383 Gerrit-PatchSet: 1 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Fri, 11 May 2018 21:31:32 + Gerrit-HasComments: No
[Impala-ASF-CR](2.x) Ignore 3.x version update patch
Sailesh Mukil has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10383 ) Change subject: Ignore 3.x version update patch .. Ignore 3.x version update patch Change-Id: Iec50418e2a21d662bd2d8a7f8d55c24db367086c Reviewed-on: http://gerrit.cloudera.org:8080/10383 Reviewed-by: Michael BrownTested-by: Sailesh Mukil --- M bin/ignored_commits.json 1 file changed, 3 insertions(+), 1 deletion(-) Approvals: Michael Brown: Looks good to me, approved Sailesh Mukil: Verified -- To view, visit http://gerrit.cloudera.org:8080/10383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: merged Gerrit-Change-Id: Iec50418e2a21d662bd2d8a7f8d55c24db367086c Gerrit-Change-Number: 10383 Gerrit-PatchSet: 2 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR](2.x) Ignore 3.x version update patch
Michael Brown has posted comments on this change. ( http://gerrit.cloudera.org:8080/10383 ) Change subject: Ignore 3.x version update patch .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: comment Gerrit-Change-Id: Iec50418e2a21d662bd2d8a7f8d55c24db367086c Gerrit-Change-Number: 10383 Gerrit-PatchSet: 1 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Brown Gerrit-Comment-Date: Fri, 11 May 2018 21:30:39 + Gerrit-HasComments: No
[Impala-ASF-CR](2.x) Ignore 3.x version update patch
Sailesh Mukil has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10383 Change subject: Ignore 3.x version update patch .. Ignore 3.x version update patch Change-Id: Iec50418e2a21d662bd2d8a7f8d55c24db367086c --- M bin/ignored_commits.json 1 file changed, 3 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/10383/1 -- To view, visit http://gerrit.cloudera.org:8080/10383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: newchange Gerrit-Change-Id: Iec50418e2a21d662bd2d8a7f8d55c24db367086c Gerrit-Change-Number: 10383 Gerrit-PatchSet: 1 Gerrit-Owner: Sailesh Mukil
[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10158 ) Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify state .. Patch Set 21: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2455/ -- To view, visit http://gerrit.cloudera.org:8080/10158 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e Gerrit-Change-Number: 10158 Gerrit-PatchSet: 21 Gerrit-Owner: Dan HechtGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 11 May 2018 21:18:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10158 ) Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify state .. Patch Set 21: Code-Review+2 Rebase -- To view, visit http://gerrit.cloudera.org:8080/10158 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e Gerrit-Change-Number: 10158 Gerrit-PatchSet: 21 Gerrit-Owner: Dan HechtGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 11 May 2018 21:18:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10158 ) Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify state .. Patch Set 18: (1 comment) http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc@476 PS18, Line 476: DCHECK(exec_status_.ok()) << exec_status_; > Maybe we can have a larger discussion later, and leave this as it is now. B I was thinking the ($3 -> $4) already gives that info, once you get used to it. For an error that results in query error, it will look like: (EXECUTING -> ERROR) for other errors (that are effectively dropped), it will look like e.g.: (CANCELLED -> CANCELLED) (RETURNED_RESULTS -> RETURNED_RESULTS) (ERROR -> ERROR) which says that we didn't enter the error state for this status. I'm definitely open to tweaking the condition for this log in the future, but just thinking given the amount of active development in the code that will call this path (i.e. IMPALA-2990), it might be helpful to have at least initially. -- To view, visit http://gerrit.cloudera.org:8080/10158 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e Gerrit-Change-Number: 10158 Gerrit-PatchSet: 18 Gerrit-Owner: Dan HechtGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 11 May 2018 21:16:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/10158 ) Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify state .. Patch Set 20: Code-Review+2 (4 comments) Thanks! This LGTM, +2-ing it since it's already gone through multiple rounds of review. All comments are just responses for discussion's sake, and don't require any code changes. http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.h@325 PS18, Line 325: rySummary() and > ExecState is a scalar (enum). We pass scalar by value - no reason to have i I agree but on the other hand, keeping it const would help by avoiding accidental bugs such as overwriting the exec state in future changes, so it's better just to define the read only contract upfront. Also agree with no need for a ref. http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@279 PS14, Line 279: > Cancel() is not called for (execution) errors. Thats part of the point of t I think I meant CancelBackends() as the NotifyAll() lives there, but the comment you updated it to is much clearer. Thanks. http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc@457 PS18, Line 457: ret_status > exec_status_ is protected by the lock, so the copy is necessary. Fair point, I missed that. http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc@476 PS18, Line 476: DCHECK(exec_status_.ok()) << exec_status_; > My thinking was that it would be good to at least log the error, which is w Maybe we can have a larger discussion later, and leave this as it is now. But just as someone who frequents the logs to debug issues, if I see an error for a query, I would assume by default that the query failed, which could lead to some confusion, if it in fact didn't fail. Maybe a solution would be to meet in the middle and tag log messages such as this with something like " failed but query succeeded", etc. But we don't need to have that convo in this patch. -- To view, visit http://gerrit.cloudera.org:8080/10158 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e Gerrit-Change-Number: 10158 Gerrit-PatchSet: 20 Gerrit-Owner: Dan HechtGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 11 May 2018 21:11:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3307: Add support for IANA time-zone db
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/9986 ) Change subject: IMPALA-3307: Add support for IANA time-zone db .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/9986/5/be/src/runtime/runtime-state.h File be/src/runtime/runtime-state.h: http://gerrit.cloudera.org:8080/#/c/9986/5/be/src/runtime/runtime-state.h@317 PS5, Line 317: /// Query-global timezone used as local timezone when executing the query. who owns it? Let's at least say "Not owned." http://gerrit.cloudera.org:8080/#/c/9986/5/be/src/util/filesystem-util.h File be/src/util/filesystem-util.h: http://gerrit.cloudera.org:8080/#/c/9986/5/be/src/util/filesystem-util.h@58 PS5, Line 58: Real Should we call it GetCanonicalPath()? I assume it's related to IsCanonicalPath(), and so would be nice to name them similarly (and group them together). Is it the case that IsCanonicalPath() always returns true for the *real_path returned by GetRealPath()? -- To view, visit http://gerrit.cloudera.org:8080/9986 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I93c1fbffe81f067919706e30db0a34d0e58e7e77 Gerrit-Change-Number: 9986 Gerrit-PatchSet: 5 Gerrit-Owner: Attila JegesGerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 11 May 2018 21:07:49 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6957: calc thread resource requirement in planner
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10256 ) Change subject: IMPALA-6957: calc thread resource requirement in planner .. Patch Set 7: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10256 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I982837ef883457fa4d2adc3bdbdc727353469140 Gerrit-Change-Number: 10256 Gerrit-PatchSet: 7 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 11 May 2018 21:01:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6941: load more text scanner compression plugins
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10165 ) Change subject: IMPALA-6941: load more text scanner compression plugins .. Patch Set 11: Thanks -- To view, visit http://gerrit.cloudera.org:8080/10165 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If2a9c4a4a11bed81df706e9e834400bfedfe48e6 Gerrit-Change-Number: 10165 Gerrit-PatchSet: 11 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 11 May 2018 20:56:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6941: load more text scanner compression plugins
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10165 ) Change subject: IMPALA-6941: load more text scanner compression plugins .. Patch Set 11: I believe the test case below covers the interesting cases testdata/workloads/functional-query/queries/QueryTest/unsupported-compression-partitions.test -- To view, visit http://gerrit.cloudera.org:8080/10165 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If2a9c4a4a11bed81df706e9e834400bfedfe48e6 Gerrit-Change-Number: 10165 Gerrit-PatchSet: 11 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 11 May 2018 20:22:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6907: Close stale connections to removed cluster members
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10327 ) Change subject: IMPALA-6907: Close stale connections to removed cluster members .. Patch Set 5: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/2453/ -- To view, visit http://gerrit.cloudera.org:8080/10327 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I41b7297cf665bf291b09b23524d19b1d10ab281d Gerrit-Change-Number: 10327 Gerrit-PatchSet: 5 Gerrit-Owner: Michael HoGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tianyi Wang Gerrit-Comment-Date: Fri, 11 May 2018 20:07:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6957: calc thread resource requirement in planner
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10256 ) Change subject: IMPALA-6957: calc thread resource requirement in planner .. Patch Set 6: (3 comments) http://gerrit.cloudera.org:8080/#/c/10256/6/common/thrift/Frontend.thrift File common/thrift/Frontend.thrift: http://gerrit.cloudera.org:8080/#/c/10256/6/common/thrift/Frontend.thrift@416 PS6, Line 416: max_per_host_required_threads > should we just go ahead and call this a reservation? Or, we can wait for l Done. Went through and updated variables and comments to use thread_reservation/mem_reservation. There were some minor formatting changes from clang-format too. http://gerrit.cloudera.org:8080/#/c/10256/6/common/thrift/Planner.thrift File common/thrift/Planner.thrift: http://gerrit.cloudera.org:8080/#/c/10256/6/common/thrift/Planner.thrift@80 PS6, Line 80: required_threads > same question about reservation. Done http://gerrit.cloudera.org:8080/#/c/10256/6/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java File fe/src/main/java/org/apache/impala/planner/KuduScanNode.java: http://gerrit.cloudera.org:8080/#/c/10256/6/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java@276 PS6, Line 276: .setMemEstimateBytes(0) > why do that? The ResourceProfileBuilder requires setting a mem estimate (to avoid PlanNodes implicitly leaving it at zero). This is just preserving the old behaviour (which doesn't make sense, but I've been deferring making piecemeal changes to the estimates - IMPALA-5013). -- To view, visit http://gerrit.cloudera.org:8080/10256 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I982837ef883457fa4d2adc3bdbdc727353469140 Gerrit-Change-Number: 10256 Gerrit-PatchSet: 6 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 11 May 2018 19:58:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6957: calc thread resource requirement in planner
Hello Bikramjeet Vig, Alex Behm, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10256 to look at the new patch set (#7). Change subject: IMPALA-6957: calc thread resource requirement in planner .. IMPALA-6957: calc thread resource requirement in planner This only factors in fragment execution threads. E.g. this does *not* try to account for the number of threads on the old Thrift RPC code path if that is enabled. This is loosely related to the old VCores estimate, but is different in that it: * Directly ties into the notion of required threads in ThreadResourceMgr. * Is a strict upper bound on the number of such threads, rather than an estimate. Does not include "optional" threads. ThreadResourceMgr in the backend bounds the number of "optional" threads per impalad, so the number of execution threads on a backend is limited by sum(required threads per query) + CpuInfo::num_cores() * FLAGS_num_threads_per_core DCHECKS in the backend enforce that the calculation is correct. They were actually hit in KuduScanNode because of some races in thread management leading to multiple "required" threads running. Now the first thread in the multithreaded scans never exits, which means that it's always safe for any of the other threads to exit early, which simplifies the logic a lot. Testing: Updated planner tests. Ran core tests. Change-Id: I982837ef883457fa4d2adc3bdbdc727353469140 --- M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scan-node.h M be/src/exec/kudu-scan-node.cc M be/src/exec/kudu-scan-node.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/query-state.cc M be/src/runtime/runtime-state.cc M be/src/runtime/thread-resource-mgr.cc M be/src/runtime/thread-resource-mgr.h M be/src/scheduling/admission-controller.cc M be/src/scheduling/query-schedule.h M be/src/scheduling/scheduler.cc M common/thrift/Frontend.thrift M common/thrift/ImpalaInternalService.thrift M common/thrift/Planner.thrift M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/PlanFragment.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java M fe/src/main/java/org/apache/impala/planner/ResourceProfileBuilder.java M fe/src/main/java/org/apache/impala/planner/SortNode.java M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test M testdata/workloads/functional-planner/queries/PlannerTest/max-row-size.test M testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test M testdata/workloads/functional-planner/queries/PlannerTest/partition-pruning.test M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test M testdata/workloads/functional-planner/queries/PlannerTest/sort-expr-materialization.test M testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test M testdata/workloads/functional-planner/queries/PlannerTest/tablesample.test M testdata/workloads/functional-planner/queries/PlannerTest/union.test M testdata/workloads/functional-query/queries/QueryTest/explain-level0.test M testdata/workloads/functional-query/queries/QueryTest/explain-level1.test M testdata/workloads/functional-query/queries/QueryTest/explain-level2.test M testdata/workloads/functional-query/queries/QueryTest/explain-level3.test 43 files changed, 1,977 insertions(+), 1,931 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/10256/7 -- To view, visit http://gerrit.cloudera.org:8080/10256 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I982837ef883457fa4d2adc3bdbdc727353469140 Gerrit-Change-Number: 10256 Gerrit-PatchSet: 7 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6983: stress: don't write a null runtime profile
Michael Brown has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10381 Change subject: IMPALA-6983: stress: don't write a null runtime profile .. IMPALA-6983: stress: don't write a null runtime profile This patch fixes a problem in which the writing of profiles always assumed profiles would be collected during binary. That's not the case when queries are too expensive to run. The simple fix is to just check for None and not perform the write. Change-Id: Ic8299a8a97ad1f2bd1f2927e3111db8df1d3a3e5 --- M tests/stress/concurrent_select.py 1 file changed, 3 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/10381/1 -- To view, visit http://gerrit.cloudera.org:8080/10381 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ic8299a8a97ad1f2bd1f2927e3111db8df1d3a3e5 Gerrit-Change-Number: 10381 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Brown
[Impala-ASF-CR] IMPALA-7010: don't run memory usage tests on non-HDFS
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10370 ) Change subject: IMPALA-7010: don't run memory usage tests on non-HDFS .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2454/ -- To view, visit http://gerrit.cloudera.org:8080/10370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9686195a29dde2d87b19ef8bb0e93e08f8bee662 Gerrit-Change-Number: 10370 Gerrit-PatchSet: 4 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 11 May 2018 19:14:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7010: don't run memory usage tests on non-HDFS
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10370 ) Change subject: IMPALA-7010: don't run memory usage tests on non-HDFS .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9686195a29dde2d87b19ef8bb0e93e08f8bee662 Gerrit-Change-Number: 10370 Gerrit-PatchSet: 4 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 11 May 2018 19:14:05 + Gerrit-HasComments: No
[Impala-ASF-CR] Update version to 3.1.0-SNAPSHOT
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10360 ) Change subject: Update version to 3.1.0-SNAPSHOT .. Patch Set 2: This change did not cherrypick successfully into branch 2.x. To resolve this, please do the cherry-pick manually and submit it to Gerrit at refs/for/2.x or add an exception to the branch 2.x copy of bin/ignored_commits.json. Thanks, your friendly bot at https://jenkins.impala.io/job/cherrypick-2.x-and-test/479/ . -- To view, visit http://gerrit.cloudera.org:8080/10360 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia2e2df73d27fa332d17fec4e9aa469ea91b14167 Gerrit-Change-Number: 10360 Gerrit-PatchSet: 2 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Fri, 11 May 2018 18:58:13 + Gerrit-HasComments: No
[Impala-ASF-CR] Update version to 3.1.0-SNAPSHOT
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/10360 ) Change subject: Update version to 3.1.0-SNAPSHOT .. Patch Set 2: Sorry for the spam; I'm testing something on the cherrypicker jenkins job to send a message via gerrit review, and I'm struggling a little bit with the syntax! -- To view, visit http://gerrit.cloudera.org:8080/10360 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia2e2df73d27fa332d17fec4e9aa469ea91b14167 Gerrit-Change-Number: 10360 Gerrit-PatchSet: 2 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Fri, 11 May 2018 18:54:38 + Gerrit-HasComments: No
[Impala-ASF-CR] Update version to 3.1.0-SNAPSHOT
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/10360 ) Change subject: Update version to 3.1.0-SNAPSHOT .. Patch Set 2: test test -- To view, visit http://gerrit.cloudera.org:8080/10360 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia2e2df73d27fa332d17fec4e9aa469ea91b14167 Gerrit-Change-Number: 10360 Gerrit-PatchSet: 2 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Fri, 11 May 2018 18:54:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7010: don't run memory usage tests on non-HDFS
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10370 ) Change subject: IMPALA-7010: don't run memory usage tests on non-HDFS .. Patch Set 3: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/10370/2/tests/common/skip.py File tests/common/skip.py: http://gerrit.cloudera.org:8080/#/c/10370/2/tests/common/skip.py@130 PS2, Line 130: qualified_path = pytest.mark.skipif(IS_LOCAL, : reason="Tests rely on HDFS qualified paths") > Done. The local filesystem tests run with a single impalad, so it ends up w Ah, didn't realize that's that we only started one node for local fs tests. I'm not sure why that is, but okay. -- To view, visit http://gerrit.cloudera.org:8080/10370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9686195a29dde2d87b19ef8bb0e93e08f8bee662 Gerrit-Change-Number: 10370 Gerrit-PatchSet: 3 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 11 May 2018 18:54:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Update version to 3.1.0-SNAPSHOT
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/10360 ) Change subject: Update version to 3.1.0-SNAPSHOT .. Patch Set 2: --help -- To view, visit http://gerrit.cloudera.org:8080/10360 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia2e2df73d27fa332d17fec4e9aa469ea91b14167 Gerrit-Change-Number: 10360 Gerrit-PatchSet: 2 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Fri, 11 May 2018 18:53:10 + Gerrit-HasComments: No
[Impala-ASF-CR] Update version to 3.1.0-SNAPSHOT
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/10360 ) Change subject: Update version to 3.1.0-SNAPSHOT .. Patch Set 2: test -- To view, visit http://gerrit.cloudera.org:8080/10360 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia2e2df73d27fa332d17fec4e9aa469ea91b14167 Gerrit-Change-Number: 10360 Gerrit-PatchSet: 2 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Fri, 11 May 2018 18:52:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7010: don't run memory usage tests on non-HDFS
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10370 ) Change subject: IMPALA-7010: don't run memory usage tests on non-HDFS .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/10370/2/testdata/workloads/functional-query/queries/QueryTest/insert-mem-limit.test File testdata/workloads/functional-query/queries/QueryTest/insert-mem-limit.test: http://gerrit.cloudera.org:8080/#/c/10370/2/testdata/workloads/functional-query/queries/QueryTest/insert-mem-limit.test@3 PS2, Line 3: # Check that hdfs writers respects mem_limit. > maybe add comments to the test files saying the mem limit is tuned for 3 no Done http://gerrit.cloudera.org:8080/#/c/10370/2/tests/common/skip.py File tests/common/skip.py: http://gerrit.cloudera.org:8080/#/c/10370/2/tests/common/skip.py@130 PS2, Line 130: mem_usage_different = pytest.mark.skipif(IS_LOCAL, : reason="Memory limit too low when running single node") > do we still need that? should all of these tests be "tuned_for_minicluster" Done. The local filesystem tests run with a single impalad, so it ends up with 3x the memory consumption in some cases. -- To view, visit http://gerrit.cloudera.org:8080/10370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9686195a29dde2d87b19ef8bb0e93e08f8bee662 Gerrit-Change-Number: 10370 Gerrit-PatchSet: 2 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 11 May 2018 18:41:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7010: don't run memory usage tests on non-HDFS
Hello Sailesh Mukil, Impala Public Jenkins, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10370 to look at the new patch set (#3). Change subject: IMPALA-7010: don't run memory usage tests on non-HDFS .. IMPALA-7010: don't run memory usage tests on non-HDFS Moved a number of tests with tuned mem_limits. In some cases this required separating the tests from non-tuned functional tests. TestQueryMemLimit used very high and very low limits only, so seemed safe to run in all configurations. Change-Id: I9686195a29dde2d87b19ef8bb0e93e08f8bee662 --- A testdata/workloads/functional-query/queries/QueryTest/insert-mem-limit.test M testdata/workloads/functional-query/queries/QueryTest/insert.test M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test A testdata/workloads/functional-query/queries/QueryTest/kudu_insert_mem_limit.test A testdata/workloads/functional-query/queries/QueryTest/nested-types-tpch-mem-limit.test M testdata/workloads/functional-query/queries/QueryTest/nested-types-tpch.test M testdata/workloads/functional-query/queries/QueryTest/spilling-aggs.test R testdata/workloads/functional-query/queries/QueryTest/spilling-regression-exhaustive.test M tests/common/skip.py M tests/query_test/test_insert.py M tests/query_test/test_kudu.py M tests/query_test/test_mem_usage_scaling.py M tests/query_test/test_nested_types.py M tests/query_test/test_sort.py M tests/query_test/test_spilling.py 15 files changed, 196 insertions(+), 156 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/10370/3 -- To view, visit http://gerrit.cloudera.org:8080/10370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9686195a29dde2d87b19ef8bb0e93e08f8bee662 Gerrit-Change-Number: 10370 Gerrit-PatchSet: 3 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-7010: don't run memory usage tests on non-HDFS
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10370 ) Change subject: IMPALA-7010: don't run memory usage tests on non-HDFS .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/10370/2/testdata/workloads/functional-query/queries/QueryTest/insert-mem-limit.test File testdata/workloads/functional-query/queries/QueryTest/insert-mem-limit.test: http://gerrit.cloudera.org:8080/#/c/10370/2/testdata/workloads/functional-query/queries/QueryTest/insert-mem-limit.test@3 PS2, Line 3: # Check that hdfs writers respects mem_limit. maybe add comments to the test files saying the mem limit is tuned for 3 node hdfs cluster, to help prevent these from accidentally being added back to the full matrix. http://gerrit.cloudera.org:8080/#/c/10370/2/tests/common/skip.py File tests/common/skip.py: http://gerrit.cloudera.org:8080/#/c/10370/2/tests/common/skip.py@130 PS2, Line 130: mem_usage_different = pytest.mark.skipif(IS_LOCAL, : reason="Memory limit too low when running single node") do we still need that? should all of these tests be "tuned_for_minicluster". And I don't understand the reason here anyway - local fs tests still use the minicluster (as do hdfs and isilon for that matter). They just use a different default filesystem. -- To view, visit http://gerrit.cloudera.org:8080/10370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9686195a29dde2d87b19ef8bb0e93e08f8bee662 Gerrit-Change-Number: 10370 Gerrit-PatchSet: 2 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Fri, 11 May 2018 17:36:38 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6907: Close stale connections to removed cluster members
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10327 ) Change subject: IMPALA-6907: Close stale connections to removed cluster members .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2453/ -- To view, visit http://gerrit.cloudera.org:8080/10327 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I41b7297cf665bf291b09b23524d19b1d10ab281d Gerrit-Change-Number: 10327 Gerrit-PatchSet: 5 Gerrit-Owner: Michael HoGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tianyi Wang Gerrit-Comment-Date: Fri, 11 May 2018 17:31:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6907: Close stale connections to removed cluster members
Hello Tianyi Wang, Sailesh Mukil, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10327 to look at the new patch set (#5). Change subject: IMPALA-6907: Close stale connections to removed cluster members .. IMPALA-6907: Close stale connections to removed cluster members Previously, ImpalaServer::MembershipCallback() is used by each Impala backend node to update cluster membership. It also removes stale connections to nodes which are no longer members of the cluster. However, the way it detects removed member is flawed as it relies on query_locations_ to determine whether stale connections may exist to the removed members. query_locations_ is a map of host name to a set of queries running on that host. A entry for a remote node only exists in query_locations_ if an Impalad node has acted as coordinator of a query with fragment instances scheduled to run on that remote node. This change fixes this problem by closing connections to remote hosts which are removed from the cluster regardless of whether it can be found in query_locations_. A new test is added to exercise this path by restarting Impalad backend nodes between queries. Also change impala_cluster.py to use bin/start-impala.sh to start Impala demon instead of directly forking and exec'ing Impalad. This is needed as start-impala.sh sets up the proper Java related environment variables. Change-Id: I41b7297cf665bf291b09b23524d19b1d10ab281d --- M be/src/service/impala-server.cc M tests/common/impala_cluster.py M tests/custom_cluster/test_restart_services.py 3 files changed, 56 insertions(+), 35 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/10327/5 -- To view, visit http://gerrit.cloudera.org:8080/10327 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I41b7297cf665bf291b09b23524d19b1d10ab281d Gerrit-Change-Number: 10327 Gerrit-PatchSet: 5 Gerrit-Owner: Michael HoGerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tianyi Wang
[Impala-ASF-CR] IMPALA-6131: Track time of last statistics update in metadata
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/10116 ) Change subject: IMPALA-6131: Track time of last statistics update in metadata .. Patch Set 10: (8 comments) http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1235 PS10, Line 1235:* TODO: the following paragraph seems at least partially obsolate > Why not clean it up then? Point (1) is obsolete but point (2) is still vali Done http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/catalog/KuduTable.java File fe/src/main/java/org/apache/impala/catalog/KuduTable.java: http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/catalog/KuduTable.java@244 PS10, Line 244: Long.toString(System.currentTimeMillis() / 1000)); > Doesn't the HMS set this in alter_table()? It sets it if the property does not exist, but this would not work well for Kudu tables: after calling alter_table here, the table will not be loaded from HMS again (unlike in HDFS tables, where this is done to ensure HMS-Impala consistency). This means that if I would remove "transient_lastDdlTime" here, then HMS would calculate a valid value, but Impala catalogd would not know about this value. http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@785 PS10, Line 785: msTbl.putToParameters("impala.lastComputeStatsTime", > Create a constant for the table property in Table similar to what we have i I have put the constant to HdfsTable, because all the other property keys reside their, even those that are used by Kudu tables too. http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2868 PS10, Line 2868: // HMS updates transient_lastDdlTime if it is removed. > I understand what this comment says, but I don't understand it's relevance This comment remained here by mistake. http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2870 PS10, Line 2870: Long.toString(System.currentTimeMillis() / 1000)); > We're doing "System.currentTimeMillis() / 1000" in quite a few places, I su I have created a bit different helper function. http://gerrit.cloudera.org:8080/#/c/10116/10/tests/metadata/test_last_ddl_time_update.py File tests/metadata/test_last_ddl_time_update.py: http://gerrit.cloudera.org:8080/#/c/10116/10/tests/metadata/test_last_ddl_time_update.py@74 PS10, Line 74: def run_test(self, query, expect_changed_ddl_time, expect_changed_stats_time): > Passing a ";" delimited string is really weird. Why not pass a list of quer The "invalidate metadata %(TBL)s; describe %(TBL)s" combo was used the check that in case of Kudu tables, lastDdlTime is not increased by reloading the table (it used to increase it), so an extra query is needed after INVALIDATE METADATA to load the table. The two can be separated (like for other init queries) but I felt that they belong together. I replaced the string splitting with variable argument list - I can replace this with two separate calls if needed. http://gerrit.cloudera.org:8080/#/c/10116/10/tests/metadata/test_last_ddl_time_update.py@93 PS10, Line 93: # Hive uses a seconds granularity on the last ddl time. > Not just Hive - we do too. Just say "All times are stored in seconds" or so Isn't it an HMS convention in this case? Or is there a reason behind not using higher precision timestamp? http://gerrit.cloudera.org:8080/#/c/10116/10/tests/metadata/test_last_ddl_time_update.py@155 PS10, Line 155: h.expect_no_time_change("drop incremental stats %(TBL)s partition (j=1, s='2012')") > use "drop stats" instead to wipe everything (including incremental stats) Are you sure about this? I use DROP INCREMENTAL STATS on purpose to check +1 statement's effect in the test suite. -- To view, visit http://gerrit.cloudera.org:8080/10116 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5 Gerrit-Change-Number: 10116 Gerrit-PatchSet: 10 Gerrit-Owner: Csaba RinghoferGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri,
[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10158 ) Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify state .. Patch Set 18: (1 comment) http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc@457 PS18, Line 457: ret_status > Just fyi, there's no copy because return-value-optimisation applies to ret_ RVO eliminates the copy of ret_value into the thing that is accepting it, but there's still a copy from exec_status_ while holding the lock. But, yeah, agree with your point that it ultimately we end up with one copy from exec_status_ in both cases. -- To view, visit http://gerrit.cloudera.org:8080/10158 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e Gerrit-Change-Number: 10158 Gerrit-PatchSet: 18 Gerrit-Owner: Dan HechtGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 11 May 2018 17:15:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6941: load more text scanner compression plugins
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10165 ) Change subject: IMPALA-6941: load more text scanner compression plugins .. Patch Set 11: I wasn't worried so much about the load data part, but more once the partition with those files exists (either via load data or via some other engine), do we produce the proper runtime error. maybe the case is the same as the text parsing error test case, and so we do have coverage for that? -- To view, visit http://gerrit.cloudera.org:8080/10165 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If2a9c4a4a11bed81df706e9e834400bfedfe48e6 Gerrit-Change-Number: 10165 Gerrit-PatchSet: 11 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 11 May 2018 17:11:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6957: calc thread resource requirement in planner
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10256 ) Change subject: IMPALA-6957: calc thread resource requirement in planner .. Patch Set 6: (3 comments) http://gerrit.cloudera.org:8080/#/c/10256/6/common/thrift/Frontend.thrift File common/thrift/Frontend.thrift: http://gerrit.cloudera.org:8080/#/c/10256/6/common/thrift/Frontend.thrift@416 PS6, Line 416: max_per_host_required_threads should we just go ahead and call this a reservation? Or, we can wait for later if/when we actually put some backend reservation accounting/limit in place. (Though we can still think of it as a reservation of an unlimited resource currently). Just thinking it would be nice standardize on resource terminology - this is effectively the min-thread-reservation. (and then we should clarify the thing above is a mem reservation). Okay to defer or argue against of course. http://gerrit.cloudera.org:8080/#/c/10256/6/common/thrift/Planner.thrift File common/thrift/Planner.thrift: http://gerrit.cloudera.org:8080/#/c/10256/6/common/thrift/Planner.thrift@80 PS6, Line 80: required_threads same question about reservation. http://gerrit.cloudera.org:8080/#/c/10256/6/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java File fe/src/main/java/org/apache/impala/planner/KuduScanNode.java: http://gerrit.cloudera.org:8080/#/c/10256/6/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java@276 PS6, Line 276: .setMemEstimateBytes(0) why do that? -- To view, visit http://gerrit.cloudera.org:8080/10256 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I982837ef883457fa4d2adc3bdbdc727353469140 Gerrit-Change-Number: 10256 Gerrit-PatchSet: 6 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 11 May 2018 17:09:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6941: load more text scanner compression plugins
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10165 ) Change subject: IMPALA-6941: load more text scanner compression plugins .. Patch Set 11: Could you have a look at the fe changes, Alex? The main change is to remove (untested) handling of unsupported compression types in the catalog and "LOAD DATA" planning that I convinced myself were useless. Instead the check for support happens during query execution. One key thing to keep in mind is that unrecognised suffixes are treated as uncompressed, e.g. some_file.xz would be treated as an uncompressed file because we don't know about the ".xz" suffix. -- To view, visit http://gerrit.cloudera.org:8080/10165 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If2a9c4a4a11bed81df706e9e834400bfedfe48e6 Gerrit-Change-Number: 10165 Gerrit-PatchSet: 11 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 11 May 2018 17:05:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6941: load more text scanner compression plugins
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10165 ) Change subject: IMPALA-6941: load more text scanner compression plugins .. Patch Set 11: There was no existing test coverage for the error path of "LOAD DATA" - the only tests load uncompressed files. I think we probably should have that coverage for completeness but I feel a bit better about it now that there's no special code path that is untested - "LOAD DATA" is agnostic to file extensions now. -- To view, visit http://gerrit.cloudera.org:8080/10165 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If2a9c4a4a11bed81df706e9e834400bfedfe48e6 Gerrit-Change-Number: 10165 Gerrit-PatchSet: 11 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 11 May 2018 17:01:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10158 ) Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify state .. Patch Set 18: (1 comment) http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc@457 PS18, Line 457: ret_status > exec_status_ is protected by the lock, so the copy is necessary. Just fyi, there's no copy because return-value-optimisation applies to ret_status. -- To view, visit http://gerrit.cloudera.org:8080/10158 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e Gerrit-Change-Number: 10158 Gerrit-PatchSet: 18 Gerrit-Owner: Dan HechtGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 11 May 2018 16:57:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6941: load more text scanner compression plugins
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10165 ) Change subject: IMPALA-6941: load more text scanner compression plugins .. Patch Set 11: Code-Review+1 That seems okay to me, though would be good to have Alex look at at least that last change. Why didn't a test need to be updated with that change? Shouldn't it cause some case to go from a table loading error to a runtime error? -- To view, visit http://gerrit.cloudera.org:8080/10165 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If2a9c4a4a11bed81df706e9e834400bfedfe48e6 Gerrit-Change-Number: 10165 Gerrit-PatchSet: 11 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 11 May 2018 16:51:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10158 ) Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify state .. Patch Set 20: Code-Review+1 Carry +1s -- To view, visit http://gerrit.cloudera.org:8080/10158 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e Gerrit-Change-Number: 10158 Gerrit-PatchSet: 20 Gerrit-Owner: Dan HechtGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 11 May 2018 16:44:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6131: Track time of last statistics update in metadata
Hello Lars Volker, Zoltan Borok-Nagy, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10116 to look at the new patch set (#11). Change subject: IMPALA-6131: Track time of last statistics update in metadata .. IMPALA-6131: Track time of last statistics update in metadata The timestamp of the last COMPUTE STATS operation is saved to table property "impala.lastComputeStatsTime". The format is the same as in "transient_lastDdlTime", so the two can be compared to check if the schema has changed since computing statistics. Other changes: - Handling of "transient_lastDdlTime" is simplified - the old logic set it to current time + 1, if the old version was >= current time, to ensure that it is always increased by DDL operations. This was useful in the past, as IMPALA-387 used lastDdlTime to check if partition data needs to be reloaded, but since IMPALA-1480, Impala does not rely on lastDdlTime at all. - Computing / setting stats on HDFS tables no longer increases "transient_lastDdlTime". - When Kudu tables are (re)loaded, it is checked if their HMS representation is up to date, and if it is, then IMetaStoreClient.alter_table() is not called. The old logic always called alter_table() after loading metadata from Kudu. This change was needed to ensure that "transient_lastDdlTime" works similarly in HDFS and Kudu tables, and should also make (re)loading Kudu tables faster. Notes: - Kudu will be able to sync its tables to HMS in the near future (see KUDU-2191), so the Kudu metadata handling in Impala may need to be redesigned. Testing: tests/metadata/test_last_ddl_time_update.py is extended by - also checking "impala.lastComputeStatsTime" - testing more SQL statements - tests for Kudu tables Note that test_last_ddl_time_update.py is ran only in exhaustive testing. Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5 --- M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java M tests/metadata/test_last_ddl_time_update.py 8 files changed, 220 insertions(+), 173 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/10116/11 -- To view, visit http://gerrit.cloudera.org:8080/10116 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5 Gerrit-Change-Number: 10116 Gerrit-PatchSet: 11 Gerrit-Owner: Csaba RinghoferGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10158 ) Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify state .. Patch Set 20: (12 comments) Thanks for taking a look Sailesh. http://gerrit.cloudera.org:8080/#/c/10158/19/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: http://gerrit.cloudera.org:8080/#/c/10158/19/be/src/runtime/coordinator.h@93 PS19, Line 93: /// 2. exec_state_lock_, backend_states_init_lock_, filter_lock_, > What about backend_states_init_lock_ ? Done http://gerrit.cloudera.org:8080/#/c/10158/19/be/src/runtime/coordinator.h@206 PS19, Line 206: > comment requires an update. referenced the class level comment. http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.h@325 PS18, Line 325: rySummary() and > const ref ExecState is a scalar (enum). We pass scalar by value - no reason to have indirection. I can make them const though, if you prefer, given these are leafs routines. Generally, I'm not a huge fan of const scalar args because it leads to const creep with little benefit (compiler has to perform analysis anyway, and usually easy enough to see when non-pointers are not modified/aliased). http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.h@350 PS18, Line 350: in all cases releases resources and cal > These look like they're read only, so they should be passed as const refs. Scalar and in registers, so not going to pass by reference. http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.h@357 PS18, Line 357: > const ref made it const http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.h@360 PS18, Line 360: > const ref made it const http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@127 PS14, Line 127: (); > You mean Cancel() and UpdateBackendExecStatus()? Maybe this is not worth me This comment was pre-existing but I'm fine with removing it. It's covered in the class comment. http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@279 PS14, Line 279: > To be more explicit, we should say "... or when Cancel() is called." Cancel() is not called for (execution) errors. Thats part of the point of the patch is to distinguish canceled state and error state. (It's true that the impala server layer can implement an error at that layer using Cancel(), but that's not an execution error, and I'd rather not muddle the terminology). Will update this comment to talk about backend cancellation, though. http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc@457 PS18, Line 457: ret_status > Preferable to get rid of the automatic variable 'ret_status' and just retur exec_status_ is protected by the lock, so the copy is necessary. http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc@462 PS18, Line 462: Status ret_status; > Same here. Same locking. http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc@476 PS18, Line 476: DCHECK(exec_status_.ok()) << exec_status_; > If an error happens after exec_state_ is already set to ExecState::RETURNED My thinking was that it would be good to at least log the error, which is why I wrote the code that way. Alternatively, we could only log when a state transition happens, but it seems like it might be good to know when an backend error occurs even if not failing the query, but I'm also not a fan of log spew. Are you arguing for one way or the other? http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc@486 PS18, Line 486: !status.ok() && !status.IsCancelled() && e > (!status.ok() && !status.IsCancelled() && exec_status_.IsCancelled()) This path should be rare, but sure we can avoid the copy in that case. -- To view, visit http://gerrit.cloudera.org:8080/10158 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e Gerrit-Change-Number: 10158 Gerrit-PatchSet: 20 Gerrit-Owner: Dan HechtGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 11 May 2018 16:37:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state
Hello Michael Ho, Sailesh Mukil, Tim Armstrong, Bikramjeet Vig, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10158 to look at the new patch set (#20). Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify state .. IMPALA-5384, part 2: Simplify Coordinator locking and clarify state The is the final change to clarify and break up the Coordinator's lock. The state machine for the coordinator is made explicit, distinguishing between executing state and multiple terminal states. Logic to transition into a terminal state is centralized in one location and executes exactly once for each coordinator object. Derived from a patch for IMPALA_5384 by Marcel Kornacker. Testing: - exhaustive functional tests - stress test on minicluster with memory overcommitment. Verified from the logs that this exercises all these paths: - successful queries - client requested cancellation - error from exec FInstances RPC - error reported asynchronously via report status RPC - eos before backend execution completed Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e --- M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/service/client-request-state.cc M be/src/service/impala-server.h M be/src/util/counting-barrier.h 6 files changed, 393 insertions(+), 392 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/10158/20 -- To view, visit http://gerrit.cloudera.org:8080/10158 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e Gerrit-Change-Number: 10158 Gerrit-PatchSet: 20 Gerrit-Owner: Dan HechtGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6997: Avoid redundant dumping in SetMemLimitExceeded()
Zoram Thanga has posted comments on this change. ( http://gerrit.cloudera.org:8080/10364 ) Change subject: IMPALA-6997: Avoid redundant dumping in SetMemLimitExceeded() .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/10364/1/be/src/runtime/runtime-state.cc File be/src/runtime/runtime-state.cc: http://gerrit.cloudera.org:8080/#/c/10364/1/be/src/runtime/runtime-state.cc@210 PS1, Line 210: // (e.g. once per row) before the fragment aborts. See IMPALA-6997. I wonder if we should explicitly check that the existing error is TErrorCode::MEM_LIMIT_EXCEEDED? I mean, could we come here after hitting some other error condition? -- To view, visit http://gerrit.cloudera.org:8080/10364 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I92b87f370a68a2f695ebbc2520a98dd143730701 Gerrit-Change-Number: 10364 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnellGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Fri, 11 May 2018 16:23:29 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6802 (part 3): Clean up authorization tests
Adam Holley has posted comments on this change. ( http://gerrit.cloudera.org:8080/10358 ) Change subject: IMPALA-6802 (part 3): Clean up authorization tests .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/10358/1/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java File fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java: http://gerrit.cloudera.org:8080/#/c/10358/1/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java@750 PS1, Line 750: .error(selectError("functional.alltypes"), onColumn("functional", : "alltypes", "id", allExcept(TPrivilegeLevel.ALL, TPrivilegeLevel.SELECT))); I'm not sure this test makes sense because you can only grant SELECT at the column level. Maybe change the column name and remove "allExcept". I know this pattern is already used a lot of other places, but looking at the select tests, I don't think it's covered that the user is granted select to other (possibly all) columns except one, and that it should still fail. Maybe the existing tests can remain if case additional column privileges are added, but we still need to cover the other case. -- To view, visit http://gerrit.cloudera.org:8080/10358 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9681cc3c7094db33ab7c5caa69b99dd803b908b7 Gerrit-Change-Number: 10358 Gerrit-PatchSet: 1 Gerrit-Owner: Fredy WijayaGerrit-Reviewer: Adam Holley Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Fredy Wijaya Gerrit-Comment-Date: Fri, 11 May 2018 15:02:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3307: Add support for IANA time-zone db
Hello Gabor Kaszab, Zoltan Borok-Nagy, Csaba Ringhofer, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9986 to look at the new patch set (#6). Change subject: IMPALA-3307: Add support for IANA time-zone db .. IMPALA-3307: Add support for IANA time-zone db Impala currently uses two different libraries for timestamp manipulations: boost and glibc. Issues with boost: - Time-zone database is currently hard coded in timezone_db.cc. Impala admins cannot update it without upgrading Impala. - Time-zone database is flat, therefore can’t track year-to-year changes. - Time-zone database is not updated on a regular basis. Issues with glibc: - Uses /usr/share/zoneinfo/ database which could be out of sync on some of the nodes in the Impala cluster. - Uses the host system’s local time-zone. Different nodes in the Impala cluster might use a different local time-zone. - Conversion functions take a global lock, which causes severe performance degradation. In addition to the issues above, the fact that /usr/share/zoneinfo/ and the hard-coded boost time-zone database are both in use is a source of inconsistency in itself. This patch makes the following changes: - Instead of boost and glibc, impalad uses Google's CCTZ to implement time-zone conversions. - Introduces a new startup flag (--hdfs_zone_info_dir) to impalad to specify an HDFS/S3/ADLS location that contains the shared compiled IANA time-zone database. If the startup flag is set, impalad will use the specified time-zone database. Otherwise, impalad will use the default /usr/share/zoneinfo time-zone database. - impalad reads the entire time-zone database into an in-memory map on startup for fast lookups. - The name of the coordinator node’s local time-zone is saved to the query context when preparing query execution. This time-zone is used whenever the current time-zone is referred afterwards in an execution node. - Introduces a new startup flag (--hdfs_zone_abbrev_conf) to impalad to specify an HDFS/S3/ADLS path to a shared config file that contains definitions for non-standard time-zone abbreviations. Change-Id: I93c1fbffe81f067919706e30db0a34d0e58e7e77 --- M CMakeLists.txt M be/CMakeLists.txt M be/src/benchmarks/CMakeLists.txt A be/src/benchmarks/convert-timestamp-benchmark.cc M be/src/common/global-types.h M be/src/exec/data-source-scan-node.cc M be/src/exec/data-source-scan-node.h M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/parquet-column-readers.cc M be/src/exprs/CMakeLists.txt M be/src/exprs/aggregate-functions-ir.cc M be/src/exprs/cast-functions-ir.cc M be/src/exprs/decimal-operators-ir.cc M be/src/exprs/decimal-operators.h M be/src/exprs/expr-test.cc M be/src/exprs/literal.cc M be/src/exprs/timestamp-functions-ir.cc M be/src/exprs/timestamp-functions.cc A be/src/exprs/timezone_db-test.cc M be/src/exprs/timezone_db.cc M be/src/exprs/timezone_db.h M be/src/runtime/raw-value-test.cc M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/timestamp-test.cc M be/src/runtime/timestamp-value.cc M be/src/runtime/timestamp-value.h M be/src/runtime/timestamp-value.inline.h M be/src/service/impala-server.cc M be/src/service/impalad-main.cc M be/src/util/filesystem-util-test.cc M be/src/util/filesystem-util.cc M be/src/util/filesystem-util.h M be/src/util/hdfs-util-test.cc M be/src/util/hdfs-util.cc M be/src/util/hdfs-util.h M be/src/util/time-test.cc M be/src/util/time.cc M be/src/util/time.h M bin/bootstrap_toolchain.py M bin/impala-config.sh A cmake_modules/FindCctz.cmake M common/thrift/ImpalaInternalService.thrift M common/thrift/metrics.json M fe/src/test/java/org/apache/impala/testutil/TestUtils.java M testdata/bin/create-load-data.sh M testdata/data/timezoneverification.csv A testdata/tzdb/abbrev.conf A testdata/tzdb/zoneinfo/AmerICA/ArgeNTINA/MendOZA A testdata/tzdb/zoneinfo/AmerICA/CancUN A testdata/tzdb/zoneinfo/UTC M testdata/workloads/functional-query/queries/QueryTest/exprs.test A tests/custom_cluster/test_custom_tzdb.py 53 files changed, 2,569 insertions(+), 1,092 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/9986/6 -- To view, visit http://gerrit.cloudera.org:8080/9986 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I93c1fbffe81f067919706e30db0a34d0e58e7e77 Gerrit-Change-Number: 9986 Gerrit-PatchSet: 6 Gerrit-Owner: Attila JegesGerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-3307: Add support for IANA time-zone db
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/9986 ) Change subject: IMPALA-3307: Add support for IANA time-zone db .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/9986/5/be/src/runtime/timestamp-value.cc File be/src/runtime/timestamp-value.cc: http://gerrit.cloudera.org:8080/#/c/9986/5/be/src/runtime/timestamp-value.cc@139 PS5, Line 139: // In case the resulting 'time_point' is ambiguous, we have to invalidate : // TimestampValue. : // 'civil_lookup' members and the details of handling ambiguity are described at: : // https://github.com/google/cctz/blob/a2dd3d0fbc811fe0a1d4d2dbb0341f1a3d28cb2a/ : // include/cctz/time_zone.h#L106 : if (UNLIKELY(from_cl.kind != cctz::time_zone::civil_lookup::UNIQUE) I have investigated a bit about this: - there is a Jira that complains about this behavior: https://issues.apache.org/jira/browse/IMPALA-3169 - Hive does not work like this, it returns a "valid" timestamp for repeated/skipped hours: select to_utc_timestamp(cast("2011-03-13 02:15:00" as timestamp), "America/Los_Angeles"), to_utc_timestamp(cast("2011-11-06 01:15:00" as timestamp), "America/Los_Angeles") result: 2011-03-13 10:15:00.0 2011-11-06 09:15:00.0 I think that we should do the same, at least for repeated values. I can imagine several valid queries where this would be the correct behavior, for example when we filter for a time interval. So I vote for solving IMPALA-3169 in this patch by choosing pre or post time in non UNIQUE cases too. If there are no test cases yet for skipped/repeated hours, then we should create some and expect the same results that Hive returns. -- To view, visit http://gerrit.cloudera.org:8080/9986 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I93c1fbffe81f067919706e30db0a34d0e58e7e77 Gerrit-Change-Number: 9986 Gerrit-PatchSet: 5 Gerrit-Owner: Attila JegesGerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 11 May 2018 14:04:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3307: Add support for IANA time-zone db
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/9986 ) Change subject: IMPALA-3307: Add support for IANA time-zone db .. Patch Set 4: (18 comments) http://gerrit.cloudera.org:8080/#/c/9986/4/CMakeLists.txt File CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/9986/4/CMakeLists.txt@281 PS4, Line 281: Cctz > nit: CCTZ Done http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/expr-test.cc@139 PS4, Line 139: new_time_zone_(time_zone), new_tz_ > From reading the names of these 2 variables it's not clear what de differen Done http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/expr-test.cc@140 PS4, Line 140: /*overwrite*/ > Do you need this comment? Removed it http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/expr-test.cc@153 PS4, Line 153: Timezone * > nit: Timezone* Expect..() Done http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/expr-test.cc@155 PS4, Line 155: new_tz_ = TimezoneDatabase::FindTimezone(new_time_zone_); > I wonder if it makes sense to do this assignment in the constructor and the The main reason I implemented the class this way was that some tests that use 'ScopedTimeZoneOverride' don't need the actual Timezone pointer. On the other hand, checking always that the timezone name is valid doesn't hurt anyone. Done. http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions-ir.cc File be/src/exprs/timestamp-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions-ir.cc@503 PS4, Line 503: namespace > Why did you need this namespace? Putting stuff into an unnamed namespace is a common C++ idiom: it says that everything in the unnamed namespace is "local" to this translation unit. They're not visible from the outside, and their names won't clash with names in other translation units. Impala uses unnamed namespaces elsewhere too. http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions-ir.cc@504 PS4, Line 504: / TODO > What is the plan to get rid of the "Duplicate code" TODOs in this review? I removed the TODO comment. I cannot think of a better place for this function. We can create a shared class for this function only but that would be awkward. http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions.cc File be/src/exprs/timestamp-functions.cc: http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions.cc@55 PS4, Line 55: // This should raise some sort of error or at least return null. Hive just ignores it. > Shouldn't this be a TODO? Fixed the comment. http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions.cc@87 PS4, Line 87: // This should raise some sort of error or at least return null. Hive just ignores it. > Same as above Fixed the comment. http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db-test.cc File be/src/exprs/timezone_db-test.cc: http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db-test.cc@57 PS4, Line 57: TzAbbev > nit: TzAbbrev? Done http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db-test.cc@68 PS4, Line 68: // Abbreviations must start with an uppercase letter. > If it has to start with an uppercase letter, can we add a test this with an Done http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db-test.cc@105 PS4, Line 105: // Misformatted time-zone names. > Can you again play around with upper vs lower case letters here? Done http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.h File be/src/exprs/timezone_db.h: http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.h@1 PS4, Line 1: // with the License. You may obtain a copy of the License at > Hmm, is the top of the Apache comment missing? Not sure what happened there. I probably inadvertently executed some mysterious vim command :) http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.h@91 PS4, Line 91: tz_seg > nit: might be just my preference but tz_segment is still short and I think Done http://gerrit.cloudera.org:8080/#/c/9986/1/be/src/exprs/timezone_db.cc File be/src/exprs/timezone_db.cc: http://gerrit.cloudera.org:8080/#/c/9986/1/be/src/exprs/timezone_db.cc@147 PS1, Line 147: > What I meant is that you could get rid of the offset_sec paramater if you c I understand. The problem is that then we would have to allocate an int64_t in the function and return a pointer to it wrapped in unique_ptr to avoid memory leaks. Seems more pain then gain. http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.cc File be/src/exprs/timezone_db.cc: http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.cc@365 PS4, Line 365: hdfsFile hdfs_file = hdfsOpenFile( > Just for my information: