[Impala-ASF-CR] IMPALA-8058: Fallback for HBase key scan range estimation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12192 ) Change subject: IMPALA-8058: Fallback for HBase key scan range estimation .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1832/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12192 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic01147abcb6b184071ba28b55aedc3bc49b322ce Gerrit-Change-Number: 12192 Gerrit-PatchSet: 2 Gerrit-Owner: Paul Rogers Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Sat, 19 Jan 2019 03:02:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8058: Fallback for HBase key scan range estimation
Paul Rogers has posted comments on this change. ( http://gerrit.cloudera.org:8080/12192 ) Change subject: IMPALA-8058: Fallback for HBase key scan range estimation .. Patch Set 2: (3 comments) Addressed code review comments. Rebased on latest master, which now shows cardinality estimates in the EXPLAIN plan, so updated the new test file accordingly. http://gerrit.cloudera.org:8080/#/c/12192/1/fe/src/main/java/org/apache/impala/catalog/FeHBaseTable.java File fe/src/main/java/org/apache/impala/catalog/FeHBaseTable.java: http://gerrit.cloudera.org:8080/#/c/12192/1/fe/src/main/java/org/apache/impala/catalog/FeHBaseTable.java@409 PS1, Line 409: > nit: add tablename, startKey and endKey too? Done http://gerrit.cloudera.org:8080/#/c/12192/1/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java File fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java: http://gerrit.cloudera.org:8080/#/c/12192/1/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java@304 PS1, Line 304: // No useful estimate. Rely on HMS row count stats. > Probably mention here that this doesn't work as expected if the stats are m Done http://gerrit.cloudera.org:8080/#/c/12192/1/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java@304 PS1, Line 304: // No useful estimate. Rely on HMS row count stats. > Probably mention here that this doesn't work as expected if the stats are m Done -- To view, visit http://gerrit.cloudera.org:8080/12192 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic01147abcb6b184071ba28b55aedc3bc49b322ce Gerrit-Change-Number: 12192 Gerrit-PatchSet: 2 Gerrit-Owner: Paul Rogers Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Sat, 19 Jan 2019 02:29:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8058: Fallback for HBase key scan range estimation
Hello Bharath Vissapragada, Zoram Thanga, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12192 to look at the new patch set (#2). Change subject: IMPALA-8058: Fallback for HBase key scan range estimation .. IMPALA-8058: Fallback for HBase key scan range estimation Impala supports "pushing" of HBase key range predicates to HBase so that Impala reads only rows within the target key range. The planner estimates the cardinality of such scans by sampling the rows within the range. However, we have seen cases where sampling returns rows for unknown reasons. The planner then ends up without a good cardinality estimate. (Specifically, the code does a division by zero and produces a huge estimate. See the ticket for details.) Impala appears to use the sampling strategy to compute cardinality because HBase uses generally do not gather table stats. The resulting estimates are often off by 2x or more. This is a problem in tests as it causes cardinality numbers to vary greatly from the expected values. Fortunately, tests do gather HMS stats. There may be cases where users do as well. This fix exploints that fact. This fix: * Creates a fall-back strategy that uses table cardinality from HMS and the selectivity of the key predicates to estimate cardinality when the sampling approach fails. * The fall-back strategy requires tracking the predicates used for HBase keys so that their selectivity can be applied during fall-back calculations. * Moved HBase key calculation out of the SingleNodePlanner into the HBase scan node as suggested by a "TO DO" in the code. Doing so simplified the new code. * In the spirit of IMPALA-7919, adds the key predicates to the HBase scan node in the EXPLAIN output. Testing: * Adds a query context option to disable the normal key sampling to force the use of the fall-back. Used for testing. * Adds a new set of HBase test cases that use the new feature to check plans with the fall-back approach. * Reran all existing tests. * Compared cardinality numbers for the two modes: sampling and HMS using the cardinality features of IMPALA-8021. The two approaches provide different results, but this is mostly due to the missing selectivity estimates for inequality operators. (That's a fix for another time.) Change-Id: Ic01147abcb6b184071ba28b55aedc3bc49b322ce --- M common/thrift/ImpalaInternalService.thrift M fe/src/main/java/org/apache/impala/catalog/FeHBaseTable.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test A testdata/workloads/functional-planner/queries/PlannerTest/hbase-no-key-est.test M testdata/workloads/functional-planner/queries/PlannerTest/hbase.test M testdata/workloads/functional-planner/queries/PlannerTest/joins.test 10 files changed, 512 insertions(+), 116 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/12192/2 -- To view, visit http://gerrit.cloudera.org:8080/12192 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic01147abcb6b184071ba28b55aedc3bc49b322ce Gerrit-Change-Number: 12192 Gerrit-PatchSet: 2 Gerrit-Owner: Paul Rogers Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Zoram Thanga
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 20: (36 comments) http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@307 PS21, Line 307: BackendConfig.INSTANCE.getHMSPollingIntervalInSeconds() > eventPollingInterval Done http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@319 PS21, Line 319: LOG.error("Unable to fetch the current notification event id from metastore." : + "Metastore event processing will be disabled.", : e); > nit: Multiple places in the code, try to format in fewer lines. Something l I used the command given in https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65868536 as advised by some of the other members of the team. Unfortunately, it doesn't like some of these lines. http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@324 PS21, Line 324: e); > Will the code that handles this exception write to the log both the message The LOG.error above logs the trace as well since it takes in the underlying cause exception as the second argument. The exception is uncaught in this particular case on the caller side, since we want CatalogD to not come up if it is configured to using event processing but for some reasons isn't able to instantiate one. http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1206 PS21, Line 1206: // Even though we get the current notification event id before stopping the event > clarification: reset() is equivalent to invalidating everything and fetchin This is similar to Tim's comment and our discussion. The line 1216 gets the currentEventID (latest) from HMS. And the event processing begins from that id. So the event processing jumps to the latest event id once the reset() is complete. There is still a slight chance that we will reprocess some of the events which are generated during reset() execution. It is better than missing those events which would lead to inconsistent state between catalog and HMS as far as event processing is concerned. The comment gives details of that case. http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1216 PS21, Line 1216: long currentEventId = metastoreEventProcessor_.getCurrentEventId(); > Should this be here? Or in the event processor itself? If here, why is it n It needs to be here to avoid the race condition which can lead to possible missed events. We want to get the latestEventId before triggering the reset() so that we can then restart the event processing after reset from this eventId. If we move the code in the start() method, there is a chance that we will miss processing some of the events which happen during after reset is complete, but before we started the event processing. See Tim's comment in the earlier review comments which gives a nice example. http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1360 PS21, Line 1360: Reference dbWasFound, Reference tblWasFound) { > I still think that using the reference is non-standard Java. A simple solut used the second option of return boolean and throw exception when db is not found. http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1363 PS21, Line 1363: Db db = getDb(dbName); > Let's think about this. We return the flags because we want to know if the if getting db before acquiring the lock has a race then it looks like addTable and renameTable also has a race. I will create separate JIRAs for fixing them http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1372 PS21, Line 1372: tblWasFound.setRef(true); > I wonder, do we really need to know if the table existed or not? Is it even consider a case where user do create, drop, create with the same table name from Impala. In case of if the create event from the first create statement, the table presented in the event is stale and should not be used to add to the catalog. I think it is useful to log this information to make sure that the create event was ignored in such a case.
[Impala-ASF-CR] Update ASF copyright to current year, 2019
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12242 ) Change subject: Update ASF copyright to current year, 2019 .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/12242 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia76bfc03122ebdaedbe1cfdd4e166228c399257f Gerrit-Change-Number: 12242 Gerrit-PatchSet: 1 Gerrit-Owner: Jim Apple Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Sat, 19 Jan 2019 01:20:39 + Gerrit-HasComments: No
[Impala-ASF-CR] Update ASF copyright to current year, 2019
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/12242 ) Change subject: Update ASF copyright to current year, 2019 .. Update ASF copyright to current year, 2019 Change-Id: Ia76bfc03122ebdaedbe1cfdd4e166228c399257f Reviewed-on: http://gerrit.cloudera.org:8080/12242 Reviewed-by: Fredy Wijaya Tested-by: Impala Public Jenkins --- M NOTICE.txt 1 file changed, 1 insertion(+), 1 deletion(-) Approvals: Fredy Wijaya: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/12242 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ia76bfc03122ebdaedbe1cfdd4e166228c399257f Gerrit-Change-Number: 12242 Gerrit-PatchSet: 2 Gerrit-Owner: Jim Apple Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-7694: Add host resource usage metrics to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12069 ) Change subject: IMPALA-7694: Add host resource usage metrics to profile .. Patch Set 11: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1831/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12069 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3aedc20c553ab8d7ed50f72a1a936eba151487d9 Gerrit-Change-Number: 12069 Gerrit-PatchSet: 11 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 19 Jan 2019 00:04:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7731: Add Read/Exchange counters to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12229 ) Change subject: IMPALA-7731: Add Read/Exchange counters to profile .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1830/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12229 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648 Gerrit-Change-Number: 12229 Gerrit-PatchSet: 4 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Fri, 18 Jan 2019 23:53:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7694: Add host resource usage metrics to profile
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/12069 ) Change subject: IMPALA-7694: Add host resource usage metrics to profile .. Patch Set 11: PS 11 rebases PS10 -- To view, visit http://gerrit.cloudera.org:8080/12069 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3aedc20c553ab8d7ed50f72a1a936eba151487d9 Gerrit-Change-Number: 12069 Gerrit-PatchSet: 11 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 Jan 2019 23:30:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7694: Add host resource usage metrics to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12069 ) Change subject: IMPALA-7694: Add host resource usage metrics to profile .. Patch Set 11: (10 comments) http://gerrit.cloudera.org:8080/#/c/12069/11/bin/plot_profile_resource_usage.py File bin/plot_profile_resource_usage.py: http://gerrit.cloudera.org:8080/#/c/12069/11/bin/plot_profile_resource_usage.py@27 PS11, Line 27: d flake8: E501 line too long (162 > 90 characters) http://gerrit.cloudera.org:8080/#/c/12069/11/bin/plot_profile_resource_usage.py@31 PS11, Line 31: from impala_py_lib import profiles flake8: E402 module level import not at top of file http://gerrit.cloudera.org:8080/#/c/12069/11/bin/plot_profile_resource_usage.py@32 PS11, Line 32: import argparse flake8: E402 module level import not at top of file http://gerrit.cloudera.org:8080/#/c/12069/11/bin/plot_profile_resource_usage.py@33 PS11, Line 33: import sys flake8: E402 module level import not at top of file http://gerrit.cloudera.org:8080/#/c/12069/11/bin/plot_profile_resource_usage.py@34 PS11, Line 34: import matplotlib flake8: E402 module level import not at top of file http://gerrit.cloudera.org:8080/#/c/12069/11/bin/plot_profile_resource_usage.py@36 PS11, Line 36: import matplotlib.pyplot as plt flake8: E402 module level import not at top of file http://gerrit.cloudera.org:8080/#/c/12069/11/lib/python/impala_py_lib/profiles.py File lib/python/impala_py_lib/profiles.py: http://gerrit.cloudera.org:8080/#/c/12069/11/lib/python/impala_py_lib/profiles.py@28 PS11, Line 28: def decode_profile_line(line): flake8: E302 expected 2 blank lines, found 1 http://gerrit.cloudera.org:8080/#/c/12069/11/tests/common/impala_service.py File tests/common/impala_service.py: http://gerrit.cloudera.org:8080/#/c/12069/11/tests/common/impala_service.py@91 PS11, Line 91: ; flake8: E703 statement ends with a semicolon http://gerrit.cloudera.org:8080/#/c/12069/11/tests/observability/test_plot_profile_resource_usage.py File tests/observability/test_plot_profile_resource_usage.py: http://gerrit.cloudera.org:8080/#/c/12069/11/tests/observability/test_plot_profile_resource_usage.py@25 PS11, Line 25: class TestPlotProfileResourceUsage(ImpalaTestSuite): flake8: E302 expected 2 blank lines, found 1 http://gerrit.cloudera.org:8080/#/c/12069/11/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/12069/11/tests/query_test/test_observability.py@399 PS11, Line 399: e flake8: F841 local variable 'expected_strs' is assigned to but never used -- To view, visit http://gerrit.cloudera.org:8080/12069 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3aedc20c553ab8d7ed50f72a1a936eba151487d9 Gerrit-Change-Number: 12069 Gerrit-PatchSet: 11 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 Jan 2019 23:25:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7694: Add host resource usage metrics to profile
Hello Michael Ho, Philip Zeyliger, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12069 to look at the new patch set (#11). Change subject: IMPALA-7694: Add host resource usage metrics to profile .. IMPALA-7694: Add host resource usage metrics to profile This change adds a mechanism to collect host resource usage metrics to profiles. Metric collection can be controlled through a new query option 'RESOURCE_TRACE_RATIO'. It specifies the probability with which metrics collection will be enabled. Collection always happens per query for all executors that run one or more fragment instances of the query. This mechanism adds a new time series counter class that collects all measured values and does not re-sample them. It will re-sample values when printing them into a string profile, preserving up to 64 values, but Thrift profiles will contain the full list of values. We add a new section "Per Node Profiles" to the profile to store and show these values: Per Node Profiles: lv-desktop:22000: CpuIoWaitPercentage (500.000ms): 0, 0 CpuSysPercentage (500.000ms): 1, 1 CpuUserPercentage (500.000ms): 4, 0 - ScratchBytesRead: 0 - ScratchBytesWritten: 0 - ScratchFileUsedBytes: 0 - ScratchReads: 0 (0) - ScratchWrites: 0 (0) - TotalEncryptionTime: 0.000ns - TotalReadBlockTime: 0.000ns This change also uses the aforementioned mechanism to collect CPU usage metrics (user, system, and IO wait time). This change also adds a tool to decode a Thrift profile and plot the contained usage metrics using matplotlib. Example: https://user-images.githubusercontent.com/151514/49830685-bb7efd80-fd46-11e8-8e23-9f5bc47635c1.png This change also includes a few minor improvements to make the resulting code more readable: - Extend the PeriodicCounterUpdater to call functions to update global metrics before updating the counters. - Expose the scratch profile within the per node resource usage section. - Improve documentation of the profile counter classes. - Remove synchronization from StreamingSampler. - Remove a few pieces of dead code that otherwise would have required updates. - Factor some common code for profile decoding into the Impala python library Testing: This change contains a unit test for the system level metrics collection and e2e tests for the profile changes. It also contains a test for the plotting tool. Change-Id: I3aedc20c553ab8d7ed50f72a1a936eba151487d9 --- M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-recvr.cc M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-state.cc M be/src/service/impala-server.cc M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/CMakeLists.txt M be/src/util/periodic-counter-updater.cc M be/src/util/periodic-counter-updater.h M be/src/util/pretty-printer.h M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile-test.cc M be/src/util/runtime-profile.cc M be/src/util/runtime-profile.h M be/src/util/streaming-sampler.h A be/src/util/system-state-info-test.cc A be/src/util/system-state-info.cc A be/src/util/system-state-info.h M bin/parse-thrift-profile.py A bin/plot_profile_resource_usage.py M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M common/thrift/Metrics.thrift M common/thrift/RuntimeProfile.thrift A lib/python/impala_py_lib/profiles.py M tests/beeswax/impala_beeswax.py M tests/common/impala_service.py A tests/observability/test_plot_profile_resource_usage.py M tests/query_test/test_observability.py 37 files changed, 1,284 insertions(+), 256 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/69/12069/11 -- To view, visit http://gerrit.cloudera.org:8080/12069 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3aedc20c553ab8d7ed50f72a1a936eba151487d9 Gerrit-Change-Number: 12069 Gerrit-PatchSet: 11 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7731: Add Read/Exchange counters to profile
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/12229 ) Change subject: IMPALA-7731: Add Read/Exchange counters to profile .. Patch Set 4: (10 comments) http://gerrit.cloudera.org:8080/#/c/12229/2/be/src/runtime/coordinator-backend-state.cc File be/src/runtime/coordinator-backend-state.cc: http://gerrit.cloudera.org:8080/#/c/12229/2/be/src/runtime/coordinator-backend-state.cc@251 PS2, Line 251: } > Is this how we determine whether the fragment had a scan in it or not? Poss Done http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator-backend-state.cc File be/src/runtime/coordinator-backend-state.cc: http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator-backend-state.cc@524 PS3, Line 524: RuntimeProfile::Counter* c = p->GetCounter(ScanNode::SCAN_RANGES_COMPLETE_COUNTER); > ?? Sry, done. http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator-backend-state.cc@531 PS3, Line 531: p->GetCounter(KrpcDataStreamSender::TOTAL_BYTES_SENT_COUNTER); > This this be a constant somewhere? Done http://gerrit.cloudera.org:8080/#/c/12229/2/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/12229/2/be/src/runtime/coordinator.cc@779 PS2, Line 779: total_utilization.bytes_read); > Should we add comments here about what these new counters mean? (Especially I added comments to all of them. Once IMPALA-7550 comes around we can mode them to the counter definition. (WIP here: https://gerrit.cloudera.org/#/c/12116/) http://gerrit.cloudera.org:8080/#/c/12229/2/be/src/runtime/coordinator.cc@794 PS2, Line 794: double xchg_scan_ratio = 0; > nit: extra line Done http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator.cc@790 PS3, Line 790: // node. > Is it the case that It follows directly from these three COUNTER_SET calls, the addition for TotalBytesSent happens in L783. I added a test to the python tests. http://gerrit.cloudera.org:8080/#/c/12229/2/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/12229/2/tests/query_test/test_observability.py@391 PS2, Line 391: "TotalInnerBytesSent", "ExchangeScanRatio", > line has trailing whitespace Done http://gerrit.cloudera.org:8080/#/c/12229/2/tests/query_test/test_observability.py@391 PS2, Line 391: > This query takes 3.4 secs on my machine, which is far from extreme, but I w I picked a query that does a partitioned exchange without being able to push predicates down to the scan, so that the TxRatio would be large. This was the simplest one that I could come up with. It takes 2.4 s on my machine. Do you have an idea for a faster one. http://gerrit.cloudera.org:8080/#/c/12229/2/tests/query_test/test_observability.py@391 PS2, Line 391: > flake8: W291 trailing whitespace Done http://gerrit.cloudera.org:8080/#/c/12229/3/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/12229/3/tests/query_test/test_observability.py@390 PS3, Line 390: expected_counters = ["TotalBytesRead", "TotalBytesSent", "TotalScanBytesSent", > Should we add small tests for the other new metrics you have? I added a second test that checks all counters are present, and amended this one to check that the *BytesSent counters add up. -- To view, visit http://gerrit.cloudera.org:8080/12229 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648 Gerrit-Change-Number: 12229 Gerrit-PatchSet: 4 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Fri, 18 Jan 2019 23:20:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7731: Add Read/Exchange counters to profile
Hello Philip Zeyliger, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12229 to look at the new patch set (#4). Change subject: IMPALA-7731: Add Read/Exchange counters to profile .. IMPALA-7731: Add Read/Exchange counters to profile Selective scans (and by extension selective fragment instances) take higher performance hits when reading data remotely. They can be identified by a low ratio between data being transmitted vs data being read from HDFS. This change adds several counters to the profile to make it easier to identify queries based on their scan instance selectivity. * TotalBytesSent - The total number of bytes sent by a query in exchange nodes. Does not include remote reads, data written to disk, or data sent to the client. * TotalScanBytesSent - The total number of bytes sent by fragment instances that had a scan node in their plan. * TotalInnerBytesSent - The total number of bytes sent by fragment instances that did not have a scan node in their plan, i.e. that received their input data from other instances through exchange node. * ExchangeScanRatio - The ratio between TotalScanBytesSent and TotalBytesRead, i.e. the selectivity over all fragment instances that had a scan node in their plan. This counter is also added to each fragment instance. * InnerNodeSelectivityRatio - The ratio between bytes sent by instances with a scan node in their plan and instances without a scan node in their plan. This indicates how well the inner nodes of the execution plan reduced the data volume. Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648 --- M be/src/exec/scan-node.cc M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/util/pretty-printer.h M tests/query_test/test_observability.py 12 files changed, 141 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/29/12229/4 -- To view, visit http://gerrit.cloudera.org:8080/12229 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648 Gerrit-Change-Number: 12229 Gerrit-PatchSet: 4 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger
[Impala-ASF-CR] IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN(S)
Paul Rogers has posted comments on this change. ( http://gerrit.cloudera.org:8080/12181 ) Change subject: IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN(S) .. Patch Set 8: (3 comments) http://gerrit.cloudera.org:8080/#/c/12181/3/fe/src/main/cup/sql-parser.cup File fe/src/main/cup/sql-parser.cup: http://gerrit.cloudera.org:8080/#/c/12181/3/fe/src/main/cup/sql-parser.cup@1097 PS3, Line 1097: KW_ALTER KW_TABLE table_name:table KW_ADD KW_COLUMN if_not_exists_val:if_not_exists > Is there anything in particular that you have in mind to be pulled out into Each statement here starts with ALTER TABLE . In parser-speak: alter_tbl_stmt ::= KW_ALTER KW_TABLE table_name:table alter_table_tail alter_table_tail ::= KW_ADD KW_COLUMN ... | KW_ADD ... | KW_DROP ... Would need to pass the column name around somehow, which is probably why we have the current structure. Not a huge issue, we can always clean this up later. http://gerrit.cloudera.org:8080/#/c/12181/8/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/12181/8/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2048 PS8, Line 2048: Column col = tbl.getColumn(column.getColumnName()); What is the concurrency model? Suppose I start 100 threads each adding the same columns simultaneously. Is there a race condition between the check here and the actual add below? Also, suppose I add 50 threads which remove the same columns. Is there a race condition where we don't add a column because it occurs in our cache, though it has actually been removed from the table in HMS? We can't fix this issue, really, because there is no global locking. Still, we should think about expected behavior. http://gerrit.cloudera.org:8080/#/c/12181/8/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java: http://gerrit.cloudera.org:8080/#/c/12181/8/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@328 PS8, Line 328: AnalyzesOk("alter table functional.alltypes add column if not exists INT_COL int"); Just a side note: While this kind of test is better than no test, it is rather limited. Doing absolutely nothing in analyze() will allow this test to pass. Would be better to do a sampling of deeper tests. That is, get back the statement and probe to ensure that the table name was set and resolved. That the column names and types were set. Such checks are implicitly done later, but catching bugs is easier of we verify the data structures at each step. Adding deep tests is a bit more work, so not required here as we've not done this in the past. Offering it as something to consider in the future. (I've started adding such tests for the analyzer, etc.) -- To view, visit http://gerrit.cloudera.org:8080/12181 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I60ed22c8a8eefa10e94ad3dedf32fe67c16642d9 Gerrit-Change-Number: 12181 Gerrit-PatchSet: 8 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Comment-Date: Fri, 18 Jan 2019 23:01:00 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7731: Add Read/Exchange counters to profile
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/12229 ) Change subject: IMPALA-7731: Add Read/Exchange counters to profile .. Patch Set 3: Code-Review+2 (4 comments) Thanks! http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator-backend-state.cc File be/src/runtime/coordinator-backend-state.cc: http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator-backend-state.cc@524 PS3, Line 524: //if (!p->metadata().__isset.plan_node_id) continue; ?? http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator-backend-state.cc@531 PS3, Line 531: RuntimeProfile::Counter* bytes_sent = p->GetCounter("TotalBytesSent"); This this be a constant somewhere? http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator.cc@790 PS3, Line 790: // node. Is it the case that TotalBytesSent == TotalScanBytesSent + TotalInnerBytesSent ? Do we have a place to DCHECK assert that and/or test it? http://gerrit.cloudera.org:8080/#/c/12229/3/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/12229/3/tests/query_test/test_observability.py@390 PS3, Line 390: def test_exchange_scan_ratio_non_zero(self): Should we add small tests for the other new metrics you have? -- To view, visit http://gerrit.cloudera.org:8080/12229 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648 Gerrit-Change-Number: 12229 Gerrit-PatchSet: 3 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Fri, 18 Jan 2019 22:40:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN(S)
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12181 ) Change subject: IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN(S) .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1829/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12181 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I60ed22c8a8eefa10e94ad3dedf32fe67c16642d9 Gerrit-Change-Number: 12181 Gerrit-PatchSet: 8 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Comment-Date: Fri, 18 Jan 2019 22:38:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7731: Add Read/Exchange counters to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12229 ) Change subject: IMPALA-7731: Add Read/Exchange counters to profile .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1828/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12229 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648 Gerrit-Change-Number: 12229 Gerrit-PatchSet: 3 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Fri, 18 Jan 2019 22:24:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3323: Fix unrecognizable shell option when --config file is specified
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12245 ) Change subject: IMPALA-3323: Fix unrecognizable shell option when --config_file is specified .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1827/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12245 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff371d038fa77ba659e9b7c7a4ed5b374237f2ea Gerrit-Change-Number: 12245 Gerrit-PatchSet: 3 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Brown Gerrit-Comment-Date: Fri, 18 Jan 2019 22:18:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN(S)
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/12181 ) Change subject: IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN(S) .. Patch Set 8: Code-Review+1 (5 comments) Carry Bharath's +1. http://gerrit.cloudera.org:8080/#/c/12181/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/12181/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@612 PS7, Line 612: > nit: Column(s) have.., here and below. Done http://gerrit.cloudera.org:8080/#/c/12181/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2046 PS7, Line 2046: colsToAdd = new Ar > nit: I think colsToAdd will be more clear. Please ignore if you disagree. Yeah that's a better name. Done. http://gerrit.cloudera.org:8080/#/c/12181/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2049 PS7, Line 2049: if (ifNotExists && col != null) continue; : if (col != null) { : > single line and else if below? Done http://gerrit.cloudera.org:8080/#/c/12181/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2053 PS7, Line 2053: tName(), tbl.getN > We are past analysis at this point. Throw a CatalogException? Also, include Done http://gerrit.cloudera.org:8080/#/c/12181/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2057 PS7, Line 2057: > we don't use it anymore, update? Done -- To view, visit http://gerrit.cloudera.org:8080/12181 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I60ed22c8a8eefa10e94ad3dedf32fe67c16642d9 Gerrit-Change-Number: 12181 Gerrit-PatchSet: 8 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Comment-Date: Fri, 18 Jan 2019 22:04:27 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN(S)
Fredy Wijaya has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/12181 ) Change subject: IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN(S) .. IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN(S) This patch adds IF NOT EXISTS support in ALTER TABLE ADD COLUMN and ALTER TABLE ADD COLUMNS. If IF NOT EXISTS is specified and a column already exists with this name, no error is thrown. If IF NOT EXISTS is specified for multiple columns and a column already exists, no error is thrown and a new column that does not exist will be added. Syntax: ALTER TABLE tbl ADD COLUMN [IF NOT EXISTS] i int ALTER TABLE tbl ADD [IF NOT EXISTS] COLUMNS (i int, j int) Testing: - Added new FE tests - Ran all FE tests - Updated E2E DDL tests - Ran all E2E DDL tests Change-Id: I60ed22c8a8eefa10e94ad3dedf32fe67c16642d9 --- M common/thrift/JniCatalog.thrift M fe/src/main/cup/sql-parser.cup R fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java C fe/src/main/java/org/apache/impala/analysis/AlterTableReplaceColsStmt.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java M fe/src/test/java/org/apache/impala/analysis/AuthorizationStmtTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M testdata/workloads/functional-query/queries/QueryTest/alter-table.test M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test 11 files changed, 398 insertions(+), 182 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/12181/8 -- To view, visit http://gerrit.cloudera.org:8080/12181 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I60ed22c8a8eefa10e94ad3dedf32fe67c16642d9 Gerrit-Change-Number: 12181 Gerrit-PatchSet: 8 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers
[Impala-ASF-CR] IMPALA-7731: Add Read/Exchange counters to profile
Hello Philip Zeyliger, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12229 to look at the new patch set (#3). Change subject: IMPALA-7731: Add Read/Exchange counters to profile .. IMPALA-7731: Add Read/Exchange counters to profile Selective scans (and by extension selective fragment instances) take higher performance hits when reading data remotely. They can be identified by a low ratio between data being transmitted vs data being read from HDFS. This change adds several counters to the profile to make it easier to identify queries based on their scan instance selectivity. * TotalBytesRead - The total number of bytes read by a query. * TotalBytesSent - The total number of bytes sent by a query in exchange nodes. Does not include remote reads, data written to disk, or data sent to the client. * TotalScanBytesSent - The total number of bytes sent by fragment instances that had a scan node in their plan. * TotalInnerBytesSent - The total number of bytes sent by fragment instances that did not have a scan node in their plan, i.e. that received their input data from other instances through exchange node. * ExchangeScanRatio - The ratio between TotalScanBytesSent and TotalBytesRead, i.e. the selectivity over all fragment instances that had a scan node in their plan. This counter is also added to each fragment instance. * InnerNodeSelectivityRatio - The ratio between bytes sent by instances with a scan node in their plan and instances without a scan node in their plan. This indicates how well the inner nodes of the execution plan reduced the data volume. Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648 --- M be/src/exec/scan-node.cc M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/util/pretty-printer.h M tests/query_test/test_observability.py 11 files changed, 112 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/29/12229/3 -- To view, visit http://gerrit.cloudera.org:8080/12229 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648 Gerrit-Change-Number: 12229 Gerrit-PatchSet: 3 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger
[Impala-ASF-CR] IMPALA-3323: Fix unrecognizable shell option when --config file is specified
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12245 ) Change subject: IMPALA-3323: Fix unrecognizable shell option when --config_file is specified .. Patch Set 2: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/1826/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/12245 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff371d038fa77ba659e9b7c7a4ed5b374237f2ea Gerrit-Change-Number: 12245 Gerrit-PatchSet: 2 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Brown Gerrit-Comment-Date: Fri, 18 Jan 2019 21:52:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3323: Fix unrecognizable shell option when --config file is specified
Fredy Wijaya has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/12245 ) Change subject: IMPALA-3323: Fix unrecognizable shell option when --config_file is specified .. IMPALA-3323: Fix unrecognizable shell option when --config_file is specified Impala shell defines a dictionary of default values for some shell options. Before this patch, the logic for --config_file checks if a shell option exists by using the default value dictionary, which does not contain the exhaustive list of shell options. This causes a valid option in the Impala shell config file to be treated as unrecognizable shell option due to the option not having a default value. The patch fixes the issue by changing the logic that checks for the existence of an option using the option list from optparse. The patch also fixes the missing dest parameter for ldap_password_cmd option. Testing: - Updated test_shell_commandline::test_config_file - Ran all shell tests Change-Id: Iff371d038fa77ba659e9b7c7a4ed5b374237f2ea --- M shell/impala_shell.py M shell/option_parser.py M tests/shell/good_impalarc M tests/shell/test_shell_commandline.py 4 files changed, 21 insertions(+), 11 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/12245/3 -- To view, visit http://gerrit.cloudera.org:8080/12245 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iff371d038fa77ba659e9b7c7a4ed5b374237f2ea Gerrit-Change-Number: 12245 Gerrit-PatchSet: 3 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Brown
[Impala-ASF-CR] IMPALA-3323: Fix unrecognizable shell option when --config file is specified
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12245 ) Change subject: IMPALA-3323: Fix unrecognizable shell option when --config_file is specified .. Patch Set 1: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/1825/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/12245 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff371d038fa77ba659e9b7c7a4ed5b374237f2ea Gerrit-Change-Number: 12245 Gerrit-PatchSet: 1 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Brown Gerrit-Comment-Date: Fri, 18 Jan 2019 21:33:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7988: support loading data with dockerized Impalas
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12189 ) Change subject: IMPALA-7988: support loading data with dockerized Impalas .. Patch Set 6: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/12189 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I98fb9c4f5a3a3bb15c7809eab28ec8e5f63ff517 Gerrit-Change-Number: 12189 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 Jan 2019 21:33:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7988: support loading data with dockerized Impalas
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/12189 ) Change subject: IMPALA-7988: support loading data with dockerized Impalas .. IMPALA-7988: support loading data with dockerized Impalas This patch does the work to load data and run some end-to-end query tests on a dockerised cluster. Changes were required in start-impala-cluster.py/ImpalaCluster and in some configuration files. ImpalaCluster is used for various things, including discovering service ports and testing for cluster readiness. This patch adds basic support and uses it from start-impala-cluster.py to check for cluster readiness. Some logic is moved from start-impala-cluster.py to ImpalaCluster. Limitations: * We're fairly inconsistent about whether services listen only on a single interface (e.g. loopback, traditionally) or whether it listens on all interfaces. This doesn't fix all of those issues. E.g. HDFS datanodes listen on all interfaces to work around some issues. * Many tests don't pass yet, particularly those using ImpalaCluster(), which isn't initialised with the appropriate docker arguments. Testing: Did a full data load locally using a dockerised Impala cluster: START_CLUSTER_ARGS="--docker_network=impala-cluster" \ TEST_START_CLUSTER_ARGS="--docker_network=impala-cluster" \ ./buildall.sh -format -testdata -ninja -notests -skiptests -noclean Ran a selection of end-to-end tests touching HDFS, Kudu and HBase tables after I loaded data locally. Ran exhaustive tests with non-dockerised impala cluster. Change-Id: I98fb9c4f5a3a3bb15c7809eab28ec8e5f63ff517 Reviewed-on: http://gerrit.cloudera.org:8080/12189 Reviewed-by: Tim Armstrong Tested-by: Impala Public Jenkins --- M bin/impala-config.sh M bin/start-impala-cluster.py M fe/src/test/resources/hbase-site.xml.template M fe/src/test/resources/hive-default.xml M fe/src/test/resources/mysql-hive-site.xml.template M fe/src/test/resources/postgresql-hive-site.xml.template M testdata/bin/create-load-data.sh M testdata/bin/run-all.sh M testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl M testdata/cluster/node_templates/common/etc/kudu/tserver.conf.tmpl M tests/common/impala_cluster.py 11 files changed, 358 insertions(+), 229 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/12189 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I98fb9c4f5a3a3bb15c7809eab28ec8e5f63ff517 Gerrit-Change-Number: 12189 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] Update ASF copyright to current year, 2019
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12242 ) Change subject: Update ASF copyright to current year, 2019 .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3655/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/12242 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia76bfc03122ebdaedbe1cfdd4e166228c399257f Gerrit-Change-Number: 12242 Gerrit-PatchSet: 1 Gerrit-Owner: Jim Apple Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 18 Jan 2019 21:26:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7694: Add host resource usage metrics to profile
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/12069 ) Change subject: IMPALA-7694: Add host resource usage metrics to profile .. Patch Set 10: > What was the bug you found? For step size 'step', the number of values we print is (num + 1) / step, but the code used num/step which printed the wrong number for 127 samples. -- To view, visit http://gerrit.cloudera.org:8080/12069 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3aedc20c553ab8d7ed50f72a1a936eba151487d9 Gerrit-Change-Number: 12069 Gerrit-PatchSet: 10 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 Jan 2019 21:22:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3323: Fix unrecognizable shell option when --config file is specified
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/12245 ) Change subject: IMPALA-3323: Fix unrecognizable shell option when --config_file is specified .. Patch Set 2: (2 comments) Fixed some flake8 errors. http://gerrit.cloudera.org:8080/#/c/12245/1/shell/option_parser.py File shell/option_parser.py: http://gerrit.cloudera.org:8080/#/c/12245/1/shell/option_parser.py@55 PS1, Line 55: raise InvalidOptionValueError("Unexpected value in configuration file. '" + value > flake8: E302 expected 2 blank lines, found 1 Done http://gerrit.cloudera.org:8080/#/c/12245/1/shell/option_parser.py@75 PS1, Line 75: else: > flake8: E302 expected 2 blank lines, found 1 Done -- To view, visit http://gerrit.cloudera.org:8080/12245 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff371d038fa77ba659e9b7c7a4ed5b374237f2ea Gerrit-Change-Number: 12245 Gerrit-PatchSet: 2 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Brown Gerrit-Comment-Date: Fri, 18 Jan 2019 21:20:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3323: Fix unrecognizable shell option when --config file is specified
Fredy Wijaya has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/12245 ) Change subject: IMPALA-3323: Fix unrecognizable shell option when --config_file is specified .. IMPALA-3323: Fix unrecognizable shell option when --config_file is specified Impala shell defines a dictionary of default values for some shell options. Before this patch, the logic for --config_file checks if a shell option exists by using the default value dictionary, which does not contain the exhaustive list of shell options. This causes a valid option in the Impala shell config file to be treated as unrecognizable shell option due to the option not having a default value. The patch fixes the issue by changing the logic that checks for the existence of an option using the option list from optparse. The patch also fixes the missing dest parameter for ldap_password_cmd option. Testing: - Updated test_shell_commandline::test_config_file - Ran all shell tests Change-Id: Iff371d038fa77ba659e9b7c7a4ed5b374237f2ea --- M shell/impala_shell.py M shell/option_parser.py M tests/shell/good_impalarc M tests/shell/test_shell_commandline.py 4 files changed, 21 insertions(+), 11 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/12245/2 -- To view, visit http://gerrit.cloudera.org:8080/12245 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iff371d038fa77ba659e9b7c7a4ed5b374237f2ea Gerrit-Change-Number: 12245 Gerrit-PatchSet: 2 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Brown
[Impala-ASF-CR] IMPALA-8073: Fix FE tests that leak HMS connections
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12241 ) Change subject: IMPALA-8073: Fix FE tests that leak HMS connections .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1824/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12241 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I56930bc5f7a7bda4db11d4447461f907b1017b91 Gerrit-Change-Number: 12241 Gerrit-PatchSet: 6 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 Jan 2019 21:17:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8073: Fix FE tests that leak HMS connections
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12241 ) Change subject: IMPALA-8073: Fix FE tests that leak HMS connections .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1823/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12241 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I56930bc5f7a7bda4db11d4447461f907b1017b91 Gerrit-Change-Number: 12241 Gerrit-PatchSet: 5 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 Jan 2019 21:16:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3323: Fix unrecognizable shell option when --config file is specified
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12245 ) Change subject: IMPALA-3323: Fix unrecognizable shell option when --config_file is specified .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/12245/1/shell/option_parser.py File shell/option_parser.py: http://gerrit.cloudera.org:8080/#/c/12245/1/shell/option_parser.py@55 PS1, Line 55: def parse_shell_options(options, defaults, option_list): flake8: E302 expected 2 blank lines, found 1 http://gerrit.cloudera.org:8080/#/c/12245/1/shell/option_parser.py@75 PS1, Line 75: def get_config_from_file(config_filename, option_list): flake8: E302 expected 2 blank lines, found 1 -- To view, visit http://gerrit.cloudera.org:8080/12245 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff371d038fa77ba659e9b7c7a4ed5b374237f2ea Gerrit-Change-Number: 12245 Gerrit-PatchSet: 1 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Brown Gerrit-Comment-Date: Fri, 18 Jan 2019 21:15:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3323: Fix unrecognizable shell option when --config file is specified
Fredy Wijaya has uploaded this change for review. ( http://gerrit.cloudera.org:8080/12245 Change subject: IMPALA-3323: Fix unrecognizable shell option when --config_file is specified .. IMPALA-3323: Fix unrecognizable shell option when --config_file is specified Impala shell defines a dictionary of default values for some shell options. Before this patch, the logic for --config_file checks if a shell option exists by using the default value dictionary, which does not contain the exhaustive list of shell options. This causes a valid option in the Impala shell config file to be treated as unrecognizable shell option due to the option not having a default value. The patch fixes the issue by changing the logic that checks for the existence of an option using the option list from optparse. The patch also fixes the missing dest parameter for ldap_password_cmd option. Testing: - Updated test_shell_commandline::test_config_file - Ran all shell tests Change-Id: Iff371d038fa77ba659e9b7c7a4ed5b374237f2ea --- M shell/impala_shell.py M shell/option_parser.py M tests/shell/good_impalarc M tests/shell/test_shell_commandline.py 4 files changed, 14 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/12245/1 -- To view, visit http://gerrit.cloudera.org:8080/12245 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Iff371d038fa77ba659e9b7c7a4ed5b374237f2ea Gerrit-Change-Number: 12245 Gerrit-PatchSet: 1 Gerrit-Owner: Fredy Wijaya
[Impala-ASF-CR] IMPALA-6664: Tag log statements with fragment or query ids.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12129 ) Change subject: IMPALA-6664: Tag log statements with fragment or query ids. .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1822/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12129 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6634ef9d1a7346339f24f2d40a7a3aa36a535da8 Gerrit-Change-Number: 12129 Gerrit-PatchSet: 6 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Fri, 18 Jan 2019 21:08:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12244 ) Change subject: IMPALA-8092: Add an admission controller debug page .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1821/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 Jan 2019 20:51:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8073: Fix FE tests that leak HMS connections
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/12241 ) Change subject: IMPALA-8073: Fix FE tests that leak HMS connections .. Patch Set 6: (2 comments) http://gerrit.cloudera.org:8080/#/c/12241/5/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java File fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java: http://gerrit.cloudera.org:8080/#/c/12241/5/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java@851 PS5, Line 851: // running the query with authtest/hostn...@realm.com user which is converted to > line too long (91 > 90) Done http://gerrit.cloudera.org:8080/#/c/12241/5/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java@1032 PS5, Line 1032: // Create an analysis context + FE with the test user > line too long (91 > 90) Done -- To view, visit http://gerrit.cloudera.org:8080/12241 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I56930bc5f7a7bda4db11d4447461f907b1017b91 Gerrit-Change-Number: 12241 Gerrit-PatchSet: 6 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 Jan 2019 20:47:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8073: Fix FE tests that leak HMS connections
Fredy Wijaya has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/12241 ) Change subject: IMPALA-8073: Fix FE tests that leak HMS connections .. IMPALA-8073: Fix FE tests that leak HMS connections This patch fixes FE tests that leak the HMS connections. I was to reproduce the issue by in SentryTestProxy by looping the test multiple times. Testing: - Ran SentryProxyTest in a loop without any issue. - Ran all FE tests Change-Id: I56930bc5f7a7bda4db11d4447461f907b1017b91 --- M fe/src/main/java/org/apache/impala/catalog/Catalog.java M fe/src/test/java/org/apache/impala/analysis/AuditingTest.java M fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java M fe/src/test/java/org/apache/impala/analysis/StmtMetadataLoaderTest.java M fe/src/test/java/org/apache/impala/catalog/CatalogObjectToFromThriftTest.java M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java M fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java M fe/src/test/java/org/apache/impala/catalog/PartialCatalogInfoTest.java M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java M fe/src/test/java/org/apache/impala/testutil/BlockIdGenerator.java M fe/src/test/java/org/apache/impala/testutil/ImpaladTestCatalog.java M fe/src/test/java/org/apache/impala/util/SentryProxyTest.java 12 files changed, 206 insertions(+), 180 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/12241/6 -- To view, visit http://gerrit.cloudera.org:8080/12241 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I56930bc5f7a7bda4db11d4447461f907b1017b91 Gerrit-Change-Number: 12241 Gerrit-PatchSet: 6 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-8073: Fix FE tests that leak HMS connections
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/12241 ) Change subject: IMPALA-8073: Fix FE tests that leak HMS connections .. Patch Set 5: - Updated Catalog interface to implement AutoCloseable. - Fixed all other tests that leak HMS connections. -- To view, visit http://gerrit.cloudera.org:8080/12241 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I56930bc5f7a7bda4db11d4447461f907b1017b91 Gerrit-Change-Number: 12241 Gerrit-PatchSet: 5 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 Jan 2019 20:45:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8073: Fix FE tests that leak HMS connections
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12241 ) Change subject: IMPALA-8073: Fix FE tests that leak HMS connections .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/12241/5/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java File fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java: http://gerrit.cloudera.org:8080/#/c/12241/5/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java@851 PS5, Line 851: // running the query with authtest/hostn...@realm.com user which is converted to user line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/12241/5/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java@1032 PS5, Line 1032: // Create an analysis context + FE with the test user (as defined in the policy file) line too long (91 > 90) -- To view, visit http://gerrit.cloudera.org:8080/12241 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I56930bc5f7a7bda4db11d4447461f907b1017b91 Gerrit-Change-Number: 12241 Gerrit-PatchSet: 5 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 Jan 2019 20:45:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8073: Fix FE tests that leak HMS connections
Fredy Wijaya has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/12241 ) Change subject: IMPALA-8073: Fix FE tests that leak HMS connections .. IMPALA-8073: Fix FE tests that leak HMS connections This patch fixes FE tests that leak the HMS connections. I was to reproduce the issue by in SentryTestProxy by looping the test multiple times. Testing: - Ran SentryProxyTest in a loop without any issue. - Ran all FE tests Change-Id: I56930bc5f7a7bda4db11d4447461f907b1017b91 --- M fe/src/main/java/org/apache/impala/catalog/Catalog.java M fe/src/test/java/org/apache/impala/analysis/AuditingTest.java M fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java M fe/src/test/java/org/apache/impala/analysis/StmtMetadataLoaderTest.java M fe/src/test/java/org/apache/impala/catalog/CatalogObjectToFromThriftTest.java M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java M fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java M fe/src/test/java/org/apache/impala/catalog/PartialCatalogInfoTest.java M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java M fe/src/test/java/org/apache/impala/testutil/BlockIdGenerator.java M fe/src/test/java/org/apache/impala/testutil/ImpaladTestCatalog.java M fe/src/test/java/org/apache/impala/util/SentryProxyTest.java 12 files changed, 205 insertions(+), 180 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/12241/5 -- To view, visit http://gerrit.cloudera.org:8080/12241 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I56930bc5f7a7bda4db11d4447461f907b1017b91 Gerrit-Change-Number: 12241 Gerrit-PatchSet: 5 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7694: Add host resource usage metrics to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12069 ) Change subject: IMPALA-7694: Add host resource usage metrics to profile .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1820/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12069 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3aedc20c553ab8d7ed50f72a1a936eba151487d9 Gerrit-Change-Number: 12069 Gerrit-PatchSet: 10 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 Jan 2019 20:44:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6664: Tag log statements with fragment or query ids.
Hello Paul Rogers, Zoltan Borok-Nagy, Zoram Thanga, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12129 to look at the new patch set (#6). Change subject: IMPALA-6664: Tag log statements with fragment or query ids. .. IMPALA-6664: Tag log statements with fragment or query ids. This implements much of the desire in IMPALA-6664 to tag all log statements with their query ids. It re-uses the existing ThreadDebugInfo infrastructure as well as the existing InstallLogMessageListenerFunction() patch to glog (currently used for log redaction) to prefix log messages with fragment ids or query ids, when available. The fragment id is the query id with the last bits incremented, so it's possible to correlate a given query's log messages. For example: $ grep 85420d575b9ff4b9:402b8868 logs/cluster/impalad.INFO I0108 10:39:16.453958 14752 impala-server.cc:1052] 85420d575b9ff4b9:402b8868] Registered query query_id=85420d575b9ff4b9:402b8868 session_id=aa45e480434f0516:101ae5ac12679d94 I0108 10:39:16.454738 14752 Frontend.java:1242] 85420d575b9ff4b9:402b8868] Analyzing query: select count(*) from tpcds.web_sales I0108 10:39:16.456627 14752 Frontend.java:1282] 85420d575b9ff4b9:402b8868] Analysis finished. I0108 10:39:16.463538 14818 admission-controller.cc:598] 85420d575b9ff4b9:402b8868] Schedule for id=85420d575b9ff4b9:402b8868 in pool_name=default-pool per_host_mem_estimate=180.02 MB PoolConfig: max_requests=-1 max_queued=200 max_mem=-1.00 B I0108 10:39:16.463603 14818 admission-controller.cc:603] 85420d575b9ff4b9:402b8868] Stats: agg_num_running=0, agg_num_queued=0, agg_mem_reserved=0, local_host(local_mem_admitted=0, num_admitted_running=0, num_queued=0, backend_mem_reserved=0) I0108 10:39:16.463780 14818 admission-controller.cc:635] 85420d575b9ff4b9:402b8868] Admitted query id=85420d575b9ff4b9:402b8868 I0108 10:39:16.463896 14818 coordinator.cc:93] 85420d575b9ff4b9:402b8868] Exec() query_id=85420d575b9ff4b9:402b8868 stmt=select count(*) from tpcds.web_sales I0108 10:39:16.464795 14818 coordinator.cc:356] 85420d575b9ff4b9:402b8868] starting execution on 2 backends for query_id=85420d575b9ff4b9:402b8868 I0108 10:39:16.466384 24891 impala-internal-service.cc:49] ExecQueryFInstances(): query_id=85420d575b9ff4b9:402b8868 coord=pannier.sf.cloudera.com:22000 #instances=2 I0108 10:39:16.467339 14818 coordinator.cc:370] 85420d575b9ff4b9:402b8868] started execution on 2 backends for query_id=85420d575b9ff4b9:402b8868 I0108 10:39:16.467536 14823 query-state.cc:579] 85420d575b9ff4b9:402b8868] Executing instance. instance_id=85420d575b9ff4b9:402b8868 fragment_idx=0 per_fragment_instance_idx=0 coord_state_idx=0 #in-flight=1 I0108 10:39:16.467627 14824 query-state.cc:579] 85420d575b9ff4b9:402b88680001] Executing instance. instance_id=85420d575b9ff4b9:402b88680001 fragment_idx=1 per_fragment_instance_idx=0 coord_state_idx=0 #in-flight=2 I0108 10:39:16.820933 14824 query-state.cc:587] 85420d575b9ff4b9:402b88680001] Instance completed. instance_id=85420d575b9ff4b9:402b88680001 #in-flight=1 status=OK I0108 10:39:17.122299 14823 krpc-data-stream-mgr.cc:294] 85420d575b9ff4b9:402b8868] DeregisterRecvr(): fragment_instance_id=85420d575b9ff4b9:402b8868, node=2 I0108 10:39:17.123500 24038 coordinator.cc:709] Backend completed: host=pannier.sf.cloudera.com:22001 remaining=2 query_id=85420d575b9ff4b9:402b8868 I0108 10:39:17.123509 24038 coordinator-backend-state.cc:265] query_id=85420d575b9ff4b9:402b8868: first in-progress backend: pannier.sf.cloudera.com:22000 I0108 10:39:17.167752 14752 impala-beeswax-server.cc:197] 85420d575b9ff4b9:402b8868] get_results_metadata(): query_id=85420d575b9ff4b9:402b8868 I0108 10:39:17.168762 14752 coordinator.cc:483] 85420d575b9ff4b9:402b8868] ExecState: query id=85420d575b9ff4b9:402b8868 execution completed I0108 10:39:17.168808 14752 coordinator.cc:608] 85420d575b9ff4b9:402b8868] Coordinator waiting for backends to finish, 1 remaining. query_id=85420d575b9ff4b9:402b8868 I0108 10:39:17.168880 14823 query-state.cc:587] 85420d575b9ff4b9:402b8868] Instance completed. instance_id=85420d575b9ff4b9:402b8868 #in-flight=0 status=OK I0108 10:39:17.168977 14821 query-state.cc:252] UpdateBackendExecState(): last report for 85420d575b9ff4b9:402b8868 I0108 10:39:17.174401 24038 coordinator.cc:709] Backend completed: host=pannier.sf.cloudera.com:22000 remaining=1 query_id=85420d575b9ff4b9:402b8868 I0108 10:39:17.174513 14752 coordinator.cc:814] 85420d575b9ff4b9:402b8868] Release admission control
[Impala-ASF-CR] IMPALA-7694: Add host resource usage metrics to profile
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/12069 ) Change subject: IMPALA-7694: Add host resource usage metrics to profile .. Patch Set 10: Code-Review+2 What was the bug you found? -- To view, visit http://gerrit.cloudera.org:8080/12069 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3aedc20c553ab8d7ed50f72a1a936eba151487d9 Gerrit-Change-Number: 12069 Gerrit-PatchSet: 10 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 Jan 2019 20:32:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12244 ) Change subject: IMPALA-8092: Add an admission controller debug page .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/service/impala-http-handler.cc File be/src/service/impala-http-handler.cc: http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/service/impala-http-handler.cc@866 PS1, Line 866: void ImpalaHttpHandler::AdmissionControllerStateHandler(const Webserver::ArgumentMap& args, line too long (91 > 90) -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 Jan 2019 20:14:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Bikramjeet Vig has uploaded this change for review. ( http://gerrit.cloudera.org:8080/12244 Change subject: IMPALA-8092: Add an admission controller debug page .. IMPALA-8092: Add an admission controller debug page This patch adds a new debug page "admission_controller" that provides the following details about all resource pools: - Pool configuration - Relevant pool stats - Queued Queries in order of being queued (local to the coordinator) - Running Queries (local to this coordinator) - Histogram of the distribution of peak memory used by queries admitted to the pool The aforementioned details are also available as JSON from the same http endpoint. Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 --- M be/src/runtime/coordinator.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/service/impala-http-handler.cc M be/src/service/impala-http-handler.h A www/admission_controller.tmpl 6 files changed, 659 insertions(+), 9 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/12244/1 -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig
[Impala-ASF-CR] IMPALA-7694: Add host resource usage metrics to profile
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/12069 ) Change subject: IMPALA-7694: Add host resource usage metrics to profile .. Patch Set 10: (5 comments) Thanks for the reviews. Please see PS10. http://gerrit.cloudera.org:8080/#/c/12069/8/be/src/util/pretty-printer.h File be/src/util/pretty-printer.h: http://gerrit.cloudera.org:8080/#/c/12069/8/be/src/util/pretty-printer.h@139 PS8, Line 139: // Printed as integer values : case TUnit::BASIS_POINTS: { : DCHECK_LE(value, 1); : ss << (value / 100); : > Add "Prints integer values" as a comment on line 139-140, and we can call i Done http://gerrit.cloudera.org:8080/#/c/12069/8/be/src/util/runtime-profile.cc File be/src/util/runtime-profile.cc: http://gerrit.cloudera.org:8080/#/c/12069/8/be/src/util/runtime-profile.cc@793 PS8, Line 793: stream << endl; > (Showing X of Y values from Thrift Profile)? Done http://gerrit.cloudera.org:8080/#/c/12069/8/be/src/util/system-state-info.h File be/src/util/system-state-info.h: http://gerrit.cloudera.org:8080/#/c/12069/8/be/src/util/system-state-info.h@70 PS8, Line 70: enum PROC_STAT_CPU_VALUES { : CPU_USER, : CPU_NICE, : CPU_SYSTEM, : CPU_IDLE, : CPU_IOWAIT : }; : : /// We store the CPU usage values in an array so that > I think I do prefer a struct, but I'm open to leaving it the way it is. I checked with Tim and we prefer to leave it the way it is for now. http://gerrit.cloudera.org:8080/#/c/12069/8/be/src/util/system-state-info.cc File be/src/util/system-state-info.cc: http://gerrit.cloudera.org:8080/#/c/12069/8/be/src/util/system-state-info.cc@33 PS8, Line 33: SystemStateInfo::SystemStateInfo() { > Indeed, TIL. It looks cleaner to me, although I had to search the internet I saw that this now angers clang-tidy: /home/ubuntu/Impala/be/src/util/system-state-info.cc:33:52: warning: missing field 'system' initializer [clang-diagnostic-missing-field-initializers] I'll wait for you thoughts on array vs struct in general, but if we stay with an array I'd rather use memset than add #pragma lines to ignore this. Alternatively we could disable the warning in clang-tidy based on this discussion that points out that newer versions of GCC don't warn about this anymore. http://gerrit.cloudera.org:8080/#/c/12069/8/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/12069/8/tests/query_test/test_observability.py@419 PS8, Line 419: """Tests that the thrift profile contains a time series counter for CPU resource > Roughly, a C++ unit test could do: Thanks for the pointer, I added such a test. It turned out a bit more complicated than that, but I also found a bug. :) -- To view, visit http://gerrit.cloudera.org:8080/12069 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3aedc20c553ab8d7ed50f72a1a936eba151487d9 Gerrit-Change-Number: 12069 Gerrit-PatchSet: 10 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 Jan 2019 20:10:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7694: Add host resource usage metrics to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12069 ) Change subject: IMPALA-7694: Add host resource usage metrics to profile .. Patch Set 10: (10 comments) http://gerrit.cloudera.org:8080/#/c/12069/10/bin/plot_profile_resource_usage.py File bin/plot_profile_resource_usage.py: http://gerrit.cloudera.org:8080/#/c/12069/10/bin/plot_profile_resource_usage.py@27 PS10, Line 27: d flake8: E501 line too long (162 > 90 characters) http://gerrit.cloudera.org:8080/#/c/12069/10/bin/plot_profile_resource_usage.py@31 PS10, Line 31: from impala_py_lib import profiles flake8: E402 module level import not at top of file http://gerrit.cloudera.org:8080/#/c/12069/10/bin/plot_profile_resource_usage.py@32 PS10, Line 32: import argparse flake8: E402 module level import not at top of file http://gerrit.cloudera.org:8080/#/c/12069/10/bin/plot_profile_resource_usage.py@33 PS10, Line 33: import sys flake8: E402 module level import not at top of file http://gerrit.cloudera.org:8080/#/c/12069/10/bin/plot_profile_resource_usage.py@34 PS10, Line 34: import matplotlib flake8: E402 module level import not at top of file http://gerrit.cloudera.org:8080/#/c/12069/10/bin/plot_profile_resource_usage.py@36 PS10, Line 36: import matplotlib.pyplot as plt flake8: E402 module level import not at top of file http://gerrit.cloudera.org:8080/#/c/12069/10/lib/python/impala_py_lib/profiles.py File lib/python/impala_py_lib/profiles.py: http://gerrit.cloudera.org:8080/#/c/12069/10/lib/python/impala_py_lib/profiles.py@28 PS10, Line 28: def decode_profile_line(line): flake8: E302 expected 2 blank lines, found 1 http://gerrit.cloudera.org:8080/#/c/12069/10/tests/common/impala_service.py File tests/common/impala_service.py: http://gerrit.cloudera.org:8080/#/c/12069/10/tests/common/impala_service.py@91 PS10, Line 91: ; flake8: E703 statement ends with a semicolon http://gerrit.cloudera.org:8080/#/c/12069/10/tests/observability/test_plot_profile_resource_usage.py File tests/observability/test_plot_profile_resource_usage.py: http://gerrit.cloudera.org:8080/#/c/12069/10/tests/observability/test_plot_profile_resource_usage.py@25 PS10, Line 25: class TestPlotProfileResourceUsage(ImpalaTestSuite): flake8: E302 expected 2 blank lines, found 1 http://gerrit.cloudera.org:8080/#/c/12069/10/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/12069/10/tests/query_test/test_observability.py@369 PS10, Line 369: e flake8: F841 local variable 'expected_strs' is assigned to but never used -- To view, visit http://gerrit.cloudera.org:8080/12069 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3aedc20c553ab8d7ed50f72a1a936eba151487d9 Gerrit-Change-Number: 12069 Gerrit-PatchSet: 10 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 Jan 2019 20:10:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7694: Add host resource usage metrics to profile
Hello Michael Ho, Philip Zeyliger, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12069 to look at the new patch set (#10). Change subject: IMPALA-7694: Add host resource usage metrics to profile .. IMPALA-7694: Add host resource usage metrics to profile This change adds a mechanism to collect host resource usage metrics to profiles. Metric collection can be controlled through a new query option 'RESOURCE_TRACE_RATIO'. It specifies the probability with which metrics collection will be enabled. Collection always happens per query for all executors that run one or more fragment instances of the query. This mechanism adds a new time series counter class that collects all measured values and does not re-sample them. It will re-sample values when printing them into a string profile, preserving up to 64 values, but Thrift profiles will contain the full list of values. We add a new section "Per Node Profiles" to the profile to store and show these values: Per Node Profiles: lv-desktop:22000: CpuIoWaitPercentage (500.000ms): 0, 0 CpuSysPercentage (500.000ms): 1, 1 CpuUserPercentage (500.000ms): 4, 0 - ScratchBytesRead: 0 - ScratchBytesWritten: 0 - ScratchFileUsedBytes: 0 - ScratchReads: 0 (0) - ScratchWrites: 0 (0) - TotalEncryptionTime: 0.000ns - TotalReadBlockTime: 0.000ns This change also uses the aforementioned mechanism to collect CPU usage metrics (user, system, and IO wait time). This change also adds a tool to decode a Thrift profile and plot the contained usage metrics using matplotlib. Example: https://user-images.githubusercontent.com/151514/49830685-bb7efd80-fd46-11e8-8e23-9f5bc47635c1.png This change also includes a few minor improvements to make the resulting code more readable: - Extend the PeriodicCounterUpdater to call functions to update global metrics before updating the counters. - Expose the scratch profile within the per node resource usage section. - Improve documentation of the profile counter classes. - Remove synchronization from StreamingSampler. - Remove a few pieces of dead code that otherwise would have required updates. - Factor some common code for profile decoding into the Impala python library Testing: This change contains a unit test for the system level metrics collection and e2e tests for the profile changes. It also contains a test for the plotting tool. Change-Id: I3aedc20c553ab8d7ed50f72a1a936eba151487d9 --- M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-recvr.cc M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-state.cc M be/src/service/impala-server.cc M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/CMakeLists.txt M be/src/util/periodic-counter-updater.cc M be/src/util/periodic-counter-updater.h M be/src/util/pretty-printer.h M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile-test.cc M be/src/util/runtime-profile.cc M be/src/util/runtime-profile.h M be/src/util/streaming-sampler.h A be/src/util/system-state-info-test.cc A be/src/util/system-state-info.cc A be/src/util/system-state-info.h M bin/parse-thrift-profile.py A bin/plot_profile_resource_usage.py M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M common/thrift/Metrics.thrift M common/thrift/RuntimeProfile.thrift A lib/python/impala_py_lib/profiles.py M tests/beeswax/impala_beeswax.py M tests/common/impala_service.py A tests/observability/test_plot_profile_resource_usage.py M tests/query_test/test_observability.py 37 files changed, 1,283 insertions(+), 255 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/69/12069/10 -- To view, visit http://gerrit.cloudera.org:8080/12069 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3aedc20c553ab8d7ed50f72a1a936eba151487d9 Gerrit-Change-Number: 12069 Gerrit-PatchSet: 10 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7980: Fix spinning threads because of buggy handling of num unqueued files .
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/12097 ) Change subject: IMPALA-7980: Fix spinning threads because of buggy handling of num_unqueued_files_. .. Patch Set 2: (10 comments) http://gerrit.cloudera.org:8080/#/c/12097/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12097/2//COMMIT_MSG@19 PS2, Line 19: set to 1 initially; decrement at exit points of : IssueInitialRanges that seems like the negation of what I'd expect given the name above? http://gerrit.cloudera.org:8080/#/c/12097/2//COMMIT_MSG@21 PS2, Line 21: base-sequence-scanner.cc. did you consider an approach like adding something like this to the top level base class: virtual bool AllScanRangesIssued() = 0; and then in parquet you can implement it as 'return initial_scan_ranges_issued_;' and then all the fancy counter stuff can go into base-sequence-scanner.cc? http://gerrit.cloudera.org:8080/#/c/12097/2//COMMIT_MSG@38 PS2, Line 38: ~30 threads to be spinning in the kernel. We were able to use perf to finda lot typo http://gerrit.cloudera.org:8080/#/c/12097/2//COMMIT_MSG@74 PS2, Line 74: a an typo http://gerrit.cloudera.org:8080/#/c/12097/2/be/src/exec/hdfs-scan-node-base.h File be/src/exec/hdfs-scan-node-base.h: http://gerrit.cloudera.org:8080/#/c/12097/2/be/src/exec/hdfs-scan-node-base.h@266 PS2, Line 266: Should Must? http://gerrit.cloudera.org:8080/#/c/12097/2/be/src/exec/hdfs-scan-node-base.h@277 PS2, Line 277: plugins impala has plugins? perhaps we can use [[deprecated]] or something here? ATTRIBUTE_DEPRECATED? (do we have that in impala's copy of gutil?) http://gerrit.cloudera.org:8080/#/c/12097/2/be/src/exec/hdfs-scan-node-base.h@287 PS2, Line 287: no_more_add_disk_io_ranges_possible_.Add(delta); can you DCHECK that the result doesn't go below zero? http://gerrit.cloudera.org:8080/#/c/12097/2/be/src/exec/hdfs-scan-node-base.h@471 PS2, Line 471: no_more_add_disk_io_ranges_possible_ > I feel like this should be called more_add_disk_io_ranges_possible_, since yea, I was also confused by the inverted naming here. If we want to make this reflect its actual numeric value, maybe something like remaining_scan_range_addition_steps ? ie the Parquet scanner has one "scan range addition step" whereas the text ones have one such step per file? Or perhaps remaining_scan_range_issuances_? http://gerrit.cloudera.org:8080/#/c/12097/2/be/src/exec/hdfs-scan-node-base.cc File be/src/exec/hdfs-scan-node-base.cc: http://gerrit.cloudera.org:8080/#/c/12097/2/be/src/exec/hdfs-scan-node-base.cc@453 PS2, Line 453: UpdateNoMoreAddDiskIoRangesBarrier(-1); is it important to not decrement this in the error cases below on line 471-494? ie why not just unconditionally decrement this, in the same way that you unconditionally set initial_ranges_issued_ = true on L448? http://gerrit.cloudera.org:8080/#/c/12097/2/be/src/exec/hdfs-scan-node.h File be/src/exec/hdfs-scan-node.h: http://gerrit.cloudera.org:8080/#/c/12097/2/be/src/exec/hdfs-scan-node.h@139 PS2, Line 139: : /// Number of times scanner thread didn't find work to do. : RuntimeProfile::Counter* scanner_thread_workless_loops_counter_ = nullptr; hrm, I see the value of this for your test, but not entirely convinced it makes sense to keep around in profiles. Curious on others' opinions -- To view, visit http://gerrit.cloudera.org:8080/12097 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I133de13238d3d05c510e2ff771d48979125735b1 Gerrit-Change-Number: 12097 Gerrit-PatchSet: 2 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Fri, 18 Jan 2019 19:31:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8073: Fix flaky SentryProxyTest
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12241 ) Change subject: IMPALA-8073: Fix flaky SentryProxyTest .. Patch Set 3: Also, you could implement AutoCloseable and use try-with-resources. -- To view, visit http://gerrit.cloudera.org:8080/12241 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I56930bc5f7a7bda4db11d4447461f907b1017b91 Gerrit-Change-Number: 12241 Gerrit-PatchSet: 3 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 Jan 2019 19:42:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN(S)
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12181 ) Change subject: IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN(S) .. Patch Set 7: Code-Review+1 (5 comments) Nice test coverage. Paul, looks like you had some comments about refactoring. Can you take the final pass? http://gerrit.cloudera.org:8080/#/c/12181/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/12181/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@612 PS7, Line 612: Column nit: Column(s) have.., here and below. http://gerrit.cloudera.org:8080/#/c/12181/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2046 PS7, Line 2046: nonExistingColumns nit: I think colsToAdd will be more clear. Please ignore if you disagree. http://gerrit.cloudera.org:8080/#/c/12181/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2049 PS7, Line 2049: if (ifNotExists && col != null) { : continue; : } single line and else if below? http://gerrit.cloudera.org:8080/#/c/12181/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2053 PS7, Line 2053: AnalysisException We are past analysis at this point. Throw a CatalogException? Also, include the tbl name? http://gerrit.cloudera.org:8080/#/c/12181/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2057 PS7, Line 2057: and ifNotExists is true, do nothing we don't use it anymore, update? -- To view, visit http://gerrit.cloudera.org:8080/12181 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I60ed22c8a8eefa10e94ad3dedf32fe67c16642d9 Gerrit-Change-Number: 12181 Gerrit-PatchSet: 7 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Comment-Date: Fri, 18 Jan 2019 19:41:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Update ASF copyright to current year, 2019
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12242 ) Change subject: Update ASF copyright to current year, 2019 .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1819/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12242 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia76bfc03122ebdaedbe1cfdd4e166228c399257f Gerrit-Change-Number: 12242 Gerrit-PatchSet: 1 Gerrit-Owner: Jim Apple Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 18 Jan 2019 19:18:29 + Gerrit-HasComments: No
[Impala-ASF-CR] Update ASF copyright to current year, 2019
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/12242 ) Change subject: Update ASF copyright to current year, 2019 .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/12242 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia76bfc03122ebdaedbe1cfdd4e166228c399257f Gerrit-Change-Number: 12242 Gerrit-PatchSet: 1 Gerrit-Owner: Jim Apple Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 18 Jan 2019 19:11:35 + Gerrit-HasComments: No
[Impala-ASF-CR] Update ASF copyright to current year, 2019
Jim Apple has uploaded this change for review. ( http://gerrit.cloudera.org:8080/12242 Change subject: Update ASF copyright to current year, 2019 .. Update ASF copyright to current year, 2019 Change-Id: Ia76bfc03122ebdaedbe1cfdd4e166228c399257f --- M NOTICE.txt 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/12242/1 -- To view, visit http://gerrit.cloudera.org:8080/12242 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ia76bfc03122ebdaedbe1cfdd4e166228c399257f Gerrit-Change-Number: 12242 Gerrit-PatchSet: 1 Gerrit-Owner: Jim Apple
[Impala-ASF-CR] IMPALA-8073: Fix flaky SentryProxyTest
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12241 ) Change subject: IMPALA-8073: Fix flaky SentryProxyTest .. Patch Set 3: It looks like a few other tests like PartialCatalogInfoTest, CatalogTest, LocalCatalogTest, CatalogdTableInvalidatorTest and FrontendTestBase create catalogs and I don't think I see them close them - do they need to be fixed too? -- To view, visit http://gerrit.cloudera.org:8080/12241 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I56930bc5f7a7bda4db11d4447461f907b1017b91 Gerrit-Change-Number: 12241 Gerrit-PatchSet: 3 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 Jan 2019 18:35:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8073: Fix flaky SentryProxyTest
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12241 ) Change subject: IMPALA-8073: Fix flaky SentryProxyTest .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1818/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12241 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I56930bc5f7a7bda4db11d4447461f907b1017b91 Gerrit-Change-Number: 12241 Gerrit-PatchSet: 3 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 Jan 2019 18:25:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7988: support loading data with dockerized Impalas
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12189 ) Change subject: IMPALA-7988: support loading data with dockerized Impalas .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1817/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12189 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I98fb9c4f5a3a3bb15c7809eab28ec8e5f63ff517 Gerrit-Change-Number: 12189 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 Jan 2019 18:02:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7800: Reject new connections after --fe service threads
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/12226 ) Change subject: IMPALA-7800: Reject new connections after --fe_service_threads .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/12226/6/tests/custom_cluster/test_frontend_connection_limit.py File tests/custom_cluster/test_frontend_connection_limit.py: http://gerrit.cloudera.org:8080/#/c/12226/6/tests/custom_cluster/test_frontend_connection_limit.py@49 PS6, Line 49: client.execute(query) Come to think of it, I don't even think you need to do a query. Just connecting eats the thread on the server side, so I don't think you need threading in the test at all. Just make 5 impala clients, and they'll each eat the thread. -- To view, visit http://gerrit.cloudera.org:8080/12226 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4b9d48aaff9e3b5854b5121798211c641c58a559 Gerrit-Change-Number: 12226 Gerrit-PatchSet: 6 Gerrit-Owner: Zoram Thanga Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Fri, 18 Jan 2019 18:00:17 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7800: Reject new connections after --fe service threads
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/12226 ) Change subject: IMPALA-7800: Reject new connections after --fe_service_threads .. Patch Set 6: (5 comments) The historical tidbit that struck my interests is the following: * This helps solve IMPALA-4135, where connections were timing out while waiting in the * OS accept queue, by ensuring that accept() is called as quickly as possible. */ class TAcceptQueueServer : public TServer { So, at some point in the past, users, when connecting to a busy impala, would hit a timeout. For our purposes, let's pretend that "busy" is that all fe_service_threads are handling a session, and those sessions are ... lasting forever. Then, we decided, with IMPALA-4135, to make that "connect timeout" not work, by accepting the connection right away, and users would block. And now, we have an option to go full circle, where the user will get rejected right away, albeit without a timeout, and with an error message. Part of the problem stems from the fact that "idle" sessions still eat up a thread, at least for impala-shell. (I ran bin/start-impala-cluster.py --impalad_args=-fe_service_threads=1 and ran two impala-shell's. One of them is just stuck, with the TCP connection opened, but progress blocked in "self.imp_service.PingImpalaService()" from the perspective of impala-shell.) What's the motivation to make this change, and, also to default it off? We can already tell that a user has N sessions because they're exposed in the UI. I think the metrics impala.thrift-server.beeswax-frontend.connection-setup-queue-size impala.thrift-server.beeswax-frontend.connections-in-use impala.thrift-server.beeswax-frontend.total-connections might be somewhat insufficient to tell us we're in this state, but exposing the task queue size directly as a metric is something we can do. Today, without this flag, if there are many non-interactive, but well-behaved users, they can submit 256 queries concurrently, and only 64 of them will run at a time because of the connection limit, but everyone will eventually drain out. With this flag, we'll see cases where some of this queries get rejected. http://gerrit.cloudera.org:8080/#/c/12226/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12226/6//COMMIT_MSG@18 PS6, Line 18: By default we retain the current behavior. If What's the point of this work if we don't change the default? When do we expect it to be used? http://gerrit.cloudera.org:8080/#/c/12226/6/be/src/rpc/TAcceptQueueServer.cpp File be/src/rpc/TAcceptQueueServer.cpp: http://gerrit.cloudera.org:8080/#/c/12226/6/be/src/rpc/TAcceptQueueServer.cpp@166 PS6, Line 166: if (FLAGS_reject_request_on_max_fe_threads) { : Synchronized s(tasksMonitor_); : if (maxTasks_ > 0 && tasks_.size() >= maxTasks_) { : LOG(INFO) << "All " << maxTasks_ << " server threads are in use. " : << "Rejecting new connection request."; : throw TTransportException("Server busy. Please try again later."); : } : } Regardless of the value of FLAGS_reject_request_on_max_fe_threads, do you want to (a) warn that tasks are sitting the queue in the log? Note that we already have queue_size_metric_, so maybe it's already monitored. http://gerrit.cloudera.org:8080/#/c/12226/6/tests/custom_cluster/test_frontend_connection_limit.py File tests/custom_cluster/test_frontend_connection_limit.py: http://gerrit.cloudera.org:8080/#/c/12226/6/tests/custom_cluster/test_frontend_connection_limit.py@35 PS6, Line 35: statement = "select * from tpch_parquet.lineitem limit 5" Please use a query with a "sleep(...)" in it. 'select * limit 5" can execute very quickly, and your test_server_busy() test might be flaky because of it. http://gerrit.cloudera.org:8080/#/c/12226/6/tests/custom_cluster/test_frontend_connection_limit.py@47 PS6, Line 47: query = self.statement.format() I don't think you need the .format(), either here or in line 99. I also don't think you need the variable query, which you only use once, and is just self.statement. http://gerrit.cloudera.org:8080/#/c/12226/6/tests/custom_cluster/test_frontend_connection_limit.py@50 PS6, Line 50: except Exception as e: : print 'Unexpected exception: ' + e.message : sys.exit(1) It's not nice to sys.exit() from tests. The exception is enough. Just let it percolate out. -- To view, visit http://gerrit.cloudera.org:8080/12226 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4b9d48aaff9e3b5854b5121798211c641c58a559 Gerrit-Change-Number: 12226 Gerrit-PatchSet: 6 Gerrit-Owner: Zoram Thanga Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-8073: Fix flaky SentryProxyTest
Fredy Wijaya has uploaded this change for review. ( http://gerrit.cloudera.org:8080/12241 Change subject: IMPALA-8073: Fix flaky SentryProxyTest .. IMPALA-8073: Fix flaky SentryProxyTest This patch fixes SentryProxyTest that leaks the HMS connections. I was able to reproduce the issue by looping the test multiple times. Testing: - Ran SentryProxyTest in a loop without any issue. Change-Id: I56930bc5f7a7bda4db11d4447461f907b1017b91 --- M fe/src/test/java/org/apache/impala/util/SentryProxyTest.java 1 file changed, 66 insertions(+), 56 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/12241/3 -- To view, visit http://gerrit.cloudera.org:8080/12241 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I56930bc5f7a7bda4db11d4447461f907b1017b91 Gerrit-Change-Number: 12241 Gerrit-PatchSet: 3 Gerrit-Owner: Fredy Wijaya
[Impala-ASF-CR] IMPALA-7800: Reject new connections after --fe service threads
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12226 ) Change subject: IMPALA-7800: Reject new connections after --fe_service_threads .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1816/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12226 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4b9d48aaff9e3b5854b5121798211c641c58a559 Gerrit-Change-Number: 12226 Gerrit-PatchSet: 6 Gerrit-Owner: Zoram Thanga Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Fri, 18 Jan 2019 17:54:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN(S)
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12181 ) Change subject: IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN(S) .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1815/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12181 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I60ed22c8a8eefa10e94ad3dedf32fe67c16642d9 Gerrit-Change-Number: 12181 Gerrit-PatchSet: 7 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Comment-Date: Fri, 18 Jan 2019 17:29:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7800: Reject new connections after --fe service threads
Zoram Thanga has uploaded this change for review. ( http://gerrit.cloudera.org:8080/12226 Change subject: IMPALA-7800: Reject new connections after --fe_service_threads .. IMPALA-7800: Reject new connections after --fe_service_threads The current implementation of the FE thrift server waits indefinitely to open the new session, if the maximum number of FE service threads specified by --fe_service_threads has been allocated. This patch introduces a startup flag to control how the server should treat new connection requests if we have run out of the configured number of server threads. By default we retain the current behavior. If --reject_request_on_max_fe_threads=true, however, the new connection requests are rejected by the server. Testing: Added a new custom cluster test suite to exercise the new code. Ran core tests. Change-Id: I4b9d48aaff9e3b5854b5121798211c641c58a559 --- M be/src/rpc/TAcceptQueueServer.cpp A tests/custom_cluster/test_frontend_connection_limit.py 2 files changed, 115 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/12226/6 -- To view, visit http://gerrit.cloudera.org:8080/12226 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I4b9d48aaff9e3b5854b5121798211c641c58a559 Gerrit-Change-Number: 12226 Gerrit-PatchSet: 6 Gerrit-Owner: Zoram Thanga
[Impala-ASF-CR] IMPALA-7988: support loading data with dockerized Impalas
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12189 ) Change subject: IMPALA-7988: support loading data with dockerized Impalas .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3654/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/12189 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I98fb9c4f5a3a3bb15c7809eab28ec8e5f63ff517 Gerrit-Change-Number: 12189 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 Jan 2019 17:31:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7988: support loading data with dockerized Impalas
Hello Joe McDonnell, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12189 to look at the new patch set (#6). Change subject: IMPALA-7988: support loading data with dockerized Impalas .. IMPALA-7988: support loading data with dockerized Impalas This patch does the work to load data and run some end-to-end query tests on a dockerised cluster. Changes were required in start-impala-cluster.py/ImpalaCluster and in some configuration files. ImpalaCluster is used for various things, including discovering service ports and testing for cluster readiness. This patch adds basic support and uses it from start-impala-cluster.py to check for cluster readiness. Some logic is moved from start-impala-cluster.py to ImpalaCluster. Limitations: * We're fairly inconsistent about whether services listen only on a single interface (e.g. loopback, traditionally) or whether it listens on all interfaces. This doesn't fix all of those issues. E.g. HDFS datanodes listen on all interfaces to work around some issues. * Many tests don't pass yet, particularly those using ImpalaCluster(), which isn't initialised with the appropriate docker arguments. Testing: Did a full data load locally using a dockerised Impala cluster: START_CLUSTER_ARGS="--docker_network=impala-cluster" \ TEST_START_CLUSTER_ARGS="--docker_network=impala-cluster" \ ./buildall.sh -format -testdata -ninja -notests -skiptests -noclean Ran a selection of end-to-end tests touching HDFS, Kudu and HBase tables after I loaded data locally. Ran exhaustive tests with non-dockerised impala cluster. Change-Id: I98fb9c4f5a3a3bb15c7809eab28ec8e5f63ff517 --- M bin/impala-config.sh M bin/start-impala-cluster.py M fe/src/test/resources/hbase-site.xml.template M fe/src/test/resources/hive-default.xml M fe/src/test/resources/mysql-hive-site.xml.template M fe/src/test/resources/postgresql-hive-site.xml.template M testdata/bin/create-load-data.sh M testdata/bin/run-all.sh M testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl M testdata/cluster/node_templates/common/etc/kudu/tserver.conf.tmpl M tests/common/impala_cluster.py 11 files changed, 358 insertions(+), 229 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12189/6 -- To view, visit http://gerrit.cloudera.org:8080/12189 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I98fb9c4f5a3a3bb15c7809eab28ec8e5f63ff517 Gerrit-Change-Number: 12189 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7988: support loading data with dockerized Impalas
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12189 ) Change subject: IMPALA-7988: support loading data with dockerized Impalas .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/12189 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I98fb9c4f5a3a3bb15c7809eab28ec8e5f63ff517 Gerrit-Change-Number: 12189 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 Jan 2019 17:31:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7979: Enhance decoders to support value-skipping
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12172 ) Change subject: IMPALA-7979: Enhance decoders to support value-skipping .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1814/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12172 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib848f1bd71735fe84e8064daf700417b32589f57 Gerrit-Change-Number: 12172 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 18 Jan 2019 17:30:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7988: support loading data with dockerized Impalas
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12189 ) Change subject: IMPALA-7988: support loading data with dockerized Impalas .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/12189/5/tests/common/impala_cluster.py File tests/common/impala_cluster.py: http://gerrit.cloudera.org:8080/#/c/12189/5/tests/common/impala_cluster.py@223 PS5, Line 223: _find_docker_containers > Nit: I believe you can make this fully private (__find_docker_containers) i Done -- To view, visit http://gerrit.cloudera.org:8080/12189 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I98fb9c4f5a3a3bb15c7809eab28ec8e5f63ff517 Gerrit-Change-Number: 12189 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 Jan 2019 17:28:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7988: support loading data with dockerized Impalas
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/12189 ) Change subject: IMPALA-7988: support loading data with dockerized Impalas .. Patch Set 5: Code-Review+2 (1 comment) This looks good to me. http://gerrit.cloudera.org:8080/#/c/12189/5/tests/common/impala_cluster.py File tests/common/impala_cluster.py: http://gerrit.cloudera.org:8080/#/c/12189/5/tests/common/impala_cluster.py@223 PS5, Line 223: _find_docker_containers Nit: I believe you can make this fully private (__find_docker_containers) if you want. It looks like we only use _get_container_info elsewhere. Not important. -- To view, visit http://gerrit.cloudera.org:8080/12189 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I98fb9c4f5a3a3bb15c7809eab28ec8e5f63ff517 Gerrit-Change-Number: 12189 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Fri, 18 Jan 2019 17:20:35 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7979: Enhance decoders to support value-skipping
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/12172 ) Change subject: IMPALA-7979: Enhance decoders to support value-skipping .. Patch Set 4: (5 comments) http://gerrit.cloudera.org:8080/#/c/12172/1/be/src/exec/parquet/parquet-bool-decoder-test.cc File be/src/exec/parquet/parquet-bool-decoder-test.cc: http://gerrit.cloudera.org:8080/#/c/12172/1/be/src/exec/parquet/parquet-bool-decoder-test.cc@75 PS1, Line 75: 100 falses, 100 trues, 100 alternat > Sorry, I am still not 100% ok with the testdata, as it will only use repeat Added 100 alternating values http://gerrit.cloudera.org:8080/#/c/12172/3/be/src/exec/parquet/parquet-bool-decoder-test.cc File be/src/exec/parquet/parquet-bool-decoder-test.cc: http://gerrit.cloudera.org:8080/#/c/12172/3/be/src/exec/parquet/parquet-bool-decoder-test.cc@73 PS3, Line 73: TestDecodeAnd > The name seems a bit misleading, as the test checks both plain and RLE enco Renamed this test http://gerrit.cloudera.org:8080/#/c/12172/3/be/src/util/bit-stream-utils-test.cc File be/src/util/bit-stream-utils-test.cc: http://gerrit.cloudera.org:8080/#/c/12172/3/be/src/util/bit-stream-utils-test.cc@17 PS3, Line 17: > optional: I am unsure whether we should do it in this change, but it would For now I'll it here. I don't know if we could re-use some of these stuff for ORC or some other formats, probably not. http://gerrit.cloudera.org:8080/#/c/12172/3/be/src/util/bit-stream-utils-test.cc@91 PS3, Line 91: // against a reader that doesn't skip. > Might be worth adding a brief comment explaining the testing strategy, e.g. Done http://gerrit.cloudera.org:8080/#/c/12172/3/be/src/util/rle-test.cc File be/src/util/rle-test.cc: http://gerrit.cloudera.org:8080/#/c/12172/3/be/src/util/rle-test.cc@301 PS3, Line 301: vector values{1, 0}; > Yeah, I like randomised tests. I don't think this code is so complex that i Yeah, I was also thinking about some fuzz testing, but wasn't sure how to output the failed test cases in order to let users reproduce the issue for debugging. I extended the string 'description' in ValidateRleSkip() to output all of its parameters, including the values as well. Other approach would be to not use a random seed, so failures would be deterministic and reproducible. Anyway, I kept this test case also. -- To view, visit http://gerrit.cloudera.org:8080/12172 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib848f1bd71735fe84e8064daf700417b32589f57 Gerrit-Change-Number: 12172 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 18 Jan 2019 16:49:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7731: Add TxRatio counter to instances and query profile
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/12229 ) Change subject: IMPALA-7731: Add TxRatio counter to instances and query profile .. Patch Set 2: Code-Review+1 (2 comments) http://gerrit.cloudera.org:8080/#/c/12229/2/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/12229/2/be/src/runtime/coordinator.cc@794 PS2, Line 794: nit: extra line http://gerrit.cloudera.org:8080/#/c/12229/2/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/12229/2/tests/query_test/test_observability.py@391 PS2, Line 391: lineitem This query takes 3.4 secs on my machine, which is far from extreme, but I would prefer to choose a cheaper one if it is enough for the test. -- To view, visit http://gerrit.cloudera.org:8080/12229 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648 Gerrit-Change-Number: 12229 Gerrit-PatchSet: 2 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Fri, 18 Jan 2019 16:52:35 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN(S)
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/12181 ) Change subject: IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN(S) .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/12181/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/12181/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2048 PS4, Line 2048: Column col = tbl.getColumn(column.getColumnName()); : if (ifNotExists && col != null) { : > Not sure if we can skip the ifNotExists check. It is possible that the msTb You're right. Done. -- To view, visit http://gerrit.cloudera.org:8080/12181 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I60ed22c8a8eefa10e94ad3dedf32fe67c16642d9 Gerrit-Change-Number: 12181 Gerrit-PatchSet: 7 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Comment-Date: Fri, 18 Jan 2019 16:47:58 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7979: Enhance decoders to support value-skipping
Hello Tim Armstrong, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12172 to look at the new patch set (#4). Change subject: IMPALA-7979: Enhance decoders to support value-skipping .. IMPALA-7979: Enhance decoders to support value-skipping This commit adds value-skipping functionality to the decoders. Value-skipping is useful when we know that we won't need the following N values, so instead of decoding and dropping them, we can just "jump" through them. This feature is a prerequisite for Parquet page skipping (IMPALA-5843). I added backend tests for all the decoders. Backed tests related to bitpacking are moved to the newly created bit-stream-utils-test. Change-Id: Ib848f1bd71735fe84e8064daf700417b32589f57 --- M be/src/exec/parquet/CMakeLists.txt A be/src/exec/parquet/parquet-bool-decoder-test.cc M be/src/exec/parquet/parquet-bool-decoder.cc M be/src/exec/parquet/parquet-bool-decoder.h M be/src/util/CMakeLists.txt A be/src/util/bit-stream-utils-test.cc M be/src/util/bit-stream-utils.h M be/src/util/bit-stream-utils.inline.h M be/src/util/dict-encoding.h M be/src/util/dict-test.cc M be/src/util/rle-encoding.h M be/src/util/rle-test.cc 12 files changed, 735 insertions(+), 130 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/12172/4 -- To view, visit http://gerrit.cloudera.org:8080/12172 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib848f1bd71735fe84e8064daf700417b32589f57 Gerrit-Change-Number: 12172 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN(S)
Fredy Wijaya has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/12181 ) Change subject: IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN(S) .. IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN(S) This patch adds IF NOT EXISTS support in ALTER TABLE ADD COLUMN and ALTER TABLE ADD COLUMNS. If IF NOT EXISTS is specified and a column already exists with this name, no error is thrown. If IF NOT EXISTS is specified for multiple columns and a column already exists, no error is thrown and a new column that does not exist will be added. Syntax: ALTER TABLE tbl ADD COLUMN [IF NOT EXISTS] i int ALTER TABLE tbl ADD [IF NOT EXISTS] COLUMNS (i int, j int) Testing: - Added new FE tests - Ran all FE tests - Updated E2E DDL tests - Ran all E2E DDL tests Change-Id: I60ed22c8a8eefa10e94ad3dedf32fe67c16642d9 --- M common/thrift/JniCatalog.thrift M fe/src/main/cup/sql-parser.cup R fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java C fe/src/main/java/org/apache/impala/analysis/AlterTableReplaceColsStmt.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java M fe/src/test/java/org/apache/impala/analysis/AuthorizationStmtTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M testdata/workloads/functional-query/queries/QueryTest/alter-table.test 9 files changed, 395 insertions(+), 179 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/12181/7 -- To view, visit http://gerrit.cloudera.org:8080/12181 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I60ed22c8a8eefa10e94ad3dedf32fe67c16642d9 Gerrit-Change-Number: 12181 Gerrit-PatchSet: 7 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 21: (14 comments) Patch code flow and logic lgtm generally. Just had some nits/clarifications. Publishing these while I'm looking at the tests. http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@307 PS21, Line 307: BackendConfig.INSTANCE.getHMSPollingIntervalInSeconds() eventPollingInterval http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@319 PS21, Line 319: LOG.error("Unable to fetch the current notification event id from metastore." : + "Metastore event processing will be disabled.", : e); nit: Multiple places in the code, try to format in fewer lines. Something like LOG.error("..." + ".", e) http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1206 PS21, Line 1206: // Even though we get the current notification event id before stopping the event clarification: reset() is equivalent to invalidating everything and fetching from scratch from HMS. In that case, do we still restart the event processing from where we left off (before reset()) or can we start with the latest event ID from HMS? http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1804 PS21, Line 1804: Table incompleteTable = IncompleteTable.createUninitializedTable(db, tblName); move it after L1807? http://gerrit.cloudera.org:8080/#/c/12118/13/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/12118/13/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@195 PS13, Line 195: > curious to understand why do you think this method has a race? nvm missed the synchronized part. http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java: http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@155 PS21, Line 155: event %d of type %s on table %s I think it is cleaner to define a debugString(NotificationEvent event) {} that formats it cleanly and consistently across all invocations. We can use it to dump event state in multiple places. http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@160 PS21, Line 160: if (!event.getEventType().equalsIgnoreCase("CREATE_DATABASE") : && !event.getEventType().equalsIgnoreCase("DROP_DATABASE")) { : tblName_ = Preconditions.checkNotNull(event.getTableName()); : } else { : tblName_ = null; : } nit: Reformat to the following? (fewer branches and use enum) tblName_ = null; if (eventType_ == MetaStoreEventType.CREATE_TABLE) { tblName_ = ... } http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@181 PS21, Line 181: tblName_ checkNotNull()? http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@198 PS21, Line 198:private final String dbName_; : private final String tblName_; aren't these set in the parent c'tor? http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@347 PS21, Line 347: droppedTable_ isn't this more like tblTobeDropped_ ? http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@434 PS21, Line 434: // nit: remove, you are already in a comment block. http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@310 PS21, Line 310: if (currentEventId <= lastSyncedEventId_) { : return Collections.emptyList(); : } nit: single line
[Impala-ASF-CR] IMPALA-6503: Support reading complex types from ORC format files
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12168 ) Change subject: IMPALA-6503: Support reading complex types from ORC format files .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1813/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12168 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I244dc9d2b3e425393f90e45632cb8cdbea6cf790 Gerrit-Change-Number: 12168 Gerrit-PatchSet: 8 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 Jan 2019 14:41:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6503: Support reading complex types from ORC format files
Hello Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12168 to look at the new patch set (#8). Change subject: IMPALA-6503: Support reading complex types from ORC format files .. IMPALA-6503: Support reading complex types from ORC format files We’ve supported reading primitive types from ORC files (IMPALA-5717). In this patch we add support for complex types (struct/array/map). In IMPALA-5717, we depend on the ORC lib to read ORC binaries. The ORC lib can materialize ORC column binaries into its representation (orc::ColumnVectorBatch), so we don’t need to do anything about decoding/decompression in hdfs-orc-scanner. Since it already supports complex types, we’ll still depend on it. What we need to add in IMPALA-6503 are two things: 1. Specify which nested columns we need to the ORC lib 2. Transform outputs of ORC lib (nested orc::ColumnVectorBatch) into Impala’s representation To format the materialization, we implement several ORC column readers used in hdfs-orc-scanner. Each kind of reader treats a column type. Don’t like the Parquet readers (used in hdfs-parquet-scanner) which materializes Parquet column binaries into tuple values directly, the ORC readers (in hdfs-orc-scanner) just need to transform outputs of the ORC lib into tuple/slot values. Tests: * Enable existing tests for complex types (test_nested_types.py, test_tpch_nested_queries.py) for ORC. Change-Id: I244dc9d2b3e425393f90e45632cb8cdbea6cf790 --- M be/src/exec/CMakeLists.txt M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/hdfs-orc-scanner.h A be/src/exec/orc-column-readers.cc A be/src/exec/orc-column-readers.h A be/src/exec/orc-metadata-utils.cc A be/src/exec/orc-metadata-utils.h M be/src/runtime/descriptors.cc M be/src/runtime/descriptors.h M fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java A testdata/ComplexTypesTbl/README A testdata/ComplexTypesTbl/nonnullable.orc A testdata/ComplexTypesTbl/nullable.orc M testdata/bin/create-load-data.sh M testdata/bin/load_nested.py M testdata/datasets/functional/functional_schema_template.sql M testdata/datasets/functional/schema_constraints.csv M testdata/workloads/functional-planner/queries/PlannerTest/complex-types-file-formats.test M testdata/workloads/functional-query/queries/QueryTest/max-nesting-depth.test M testdata/workloads/functional-query/queries/QueryTest/nested-types-subplan.test M testdata/workloads/functional-query/queries/QueryTest/nested-types-tpch-limit.test M testdata/workloads/functional-query/queries/QueryTest/nested-types-tpch-mem-limit.test M testdata/workloads/functional-query/queries/QueryTest/nested-types-tpch.test M testdata/workloads/tpch_nested/tpch_nested_core.csv M testdata/workloads/tpch_nested/tpch_nested_dimensions.csv M testdata/workloads/tpch_nested/tpch_nested_exhaustive.csv M testdata/workloads/tpch_nested/tpch_nested_pairwise.csv M tests/query_test/test_nested_types.py M tests/query_test/test_tpch_nested_queries.py 29 files changed, 1,695 insertions(+), 449 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/12168/8 -- To view, visit http://gerrit.cloudera.org:8080/12168 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I244dc9d2b3e425393f90e45632cb8cdbea6cf790 Gerrit-Change-Number: 12168 Gerrit-PatchSet: 8 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6664: Tag log statements with fragment or query ids.
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/12129 ) Change subject: IMPALA-6664: Tag log statements with fragment or query ids. .. Patch Set 5: (3 comments) http://gerrit.cloudera.org:8080/#/c/12129/2/be/src/common/thread-debug-info.h File be/src/common/thread-debug-info.h: http://gerrit.cloudera.org:8080/#/c/12129/2/be/src/common/thread-debug-info.h@a116 PS2, Line 116: > Cool. I'll take a look at impala-gdb.py and fix it appropriately. Printing Thanks! Do you plan to fix it in this change, or in a follow-up commit? http://gerrit.cloudera.org:8080/#/c/12129/5/be/src/service/impala-hs2-server.cc File be/src/service/impala-hs2-server.cc: http://gerrit.cloudera.org:8080/#/c/12129/5/be/src/service/impala-hs2-server.cc@126 PS5, Line 126: ScopedThreadContext(GetThreadDebugInfo(), query_ctx.query_id); Currently it creates a temporary object that is deleted promptly. You need to create a variable so the TDI object only gets reset at the end of the function. http://gerrit.cloudera.org:8080/#/c/12129/5/be/src/service/impala-internal-service.cc File be/src/service/impala-internal-service.cc: http://gerrit.cloudera.org:8080/#/c/12129/5/be/src/service/impala-internal-service.cc@49 PS5, Line 49: ScopedThreadContext(GetThreadDebugInfo(), params.query_ctx.query_id); It creates a temporary that is deleted promptly. -- To view, visit http://gerrit.cloudera.org:8080/12129 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6634ef9d1a7346339f24f2d40a7a3aa36a535da8 Gerrit-Change-Number: 12129 Gerrit-PatchSet: 5 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Fri, 18 Jan 2019 13:48:41 + Gerrit-HasComments: Yes