[Impala-ASF-CR] IMPALA-9053: DDLs should generate lineage graphs.
Anurag Mantripragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/14458 ) Change subject: IMPALA-9053: DDLs should generate lineage graphs. .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/14458/1/testdata/workloads/functional-query/queries/QueryTest/lineage.test File testdata/workloads/functional-query/queries/QueryTest/lineage.test: PS1: > Also add a test for CTAS as a sanity check? CTAS queries are already covered later in this .test file. -- To view, visit http://gerrit.cloudera.org:8080/14458 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia6c7ed9fe3265fd777fe93590cf4eb2d9ba0dd1e Gerrit-Change-Number: 14458 Gerrit-PatchSet: 1 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Thu, 17 Oct 2019 04:55:42 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Fix webserver's use of sq printf/sq write
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14471 ) Change subject: Fix webserver's use of sq_printf/sq_write .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/14471 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I343ae83d6324bc710e4cf96d66975a7c9694706f Gerrit-Change-Number: 14471 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 17 Oct 2019 04:21:18 + Gerrit-HasComments: No
[Impala-ASF-CR] Fix webserver's use of sq printf/sq write
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14471 ) Change subject: Fix webserver's use of sq_printf/sq_write .. Fix webserver's use of sq_printf/sq_write Previously, the webserver used multiple calls to sq_printf and sq_write when sending most responses. This can lead to a bad interaction between Nagle's algorithm and TCP delayed acks which can add significant latency to RTT. This patch modifies the SendResponse() function to buffer the entire response and send it in a single sq_write call. Testing: - Ran all existing webserver tests. Change-Id: I343ae83d6324bc710e4cf96d66975a7c9694706f Reviewed-on: http://gerrit.cloudera.org:8080/14471 Reviewed-by: Adar Dembo Reviewed-by: Thomas Tauber-Marshall Tested-by: Impala Public Jenkins --- M be/src/util/webserver.cc 1 file changed, 16 insertions(+), 9 deletions(-) Approvals: Adar Dembo: Looks good to me, but someone else must approve Thomas Tauber-Marshall: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/14471 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I343ae83d6324bc710e4cf96d66975a7c9694706f Gerrit-Change-Number: 14471 Gerrit-PatchSet: 3 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7506: support global INVALIDATE METADATA in local catalog mode
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14307 ) Change subject: IMPALA-7506: support global INVALIDATE METADATA in local catalog mode .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/14307/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/14307/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1104 PS5, Line 1104: if (ctx.collectFullUpdates || tbl.getCatalogVersion() <= ctx.toVersion) { > For your concerns, I think it's ok since we already have the logic to force I think the concerns of concurrent DDLs can be fixed by IMPALA-9062, that we don't need to acquire table locks in minimal topic mode, since we just need the db name, table name and catalog version. -- To view, visit http://gerrit.cloudera.org:8080/14307 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib61a7ab1ffa062620ffbc2dadc34bd7a8ca9e549 Gerrit-Change-Number: 14307 Gerrit-PatchSet: 5 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 17 Oct 2019 03:10:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Fix webserver's use of sq printf/sq write
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14471 ) Change subject: Fix webserver's use of sq_printf/sq_write .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/4811/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14471 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I343ae83d6324bc710e4cf96d66975a7c9694706f Gerrit-Change-Number: 14471 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 17 Oct 2019 00:41:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8571: harden QueryEventHook execution
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/13748 ) Change subject: IMPALA-8571: harden QueryEventHook execution .. Patch Set 28: (25 comments) Hi Radford. As Bharath and Fredy have disappeared I'll review this now :-) Code looks good. Like you I'm slightly worried about the timeout queue. It's somehow wasteful to keep all that state around when the common case is that the tasks completed quickly. http://gerrit.cloudera.org:8080/#/c/13748/28//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/13748/28//COMMIT_MSG@19 PS28, Line 19: See Java Thread.setDaemon(boolean) for what a "daemon thread" entails. How can a user decide whether to use daemon threads? http://gerrit.cloudera.org:8080/#/c/13748/28//COMMIT_MSG@24 PS28, Line 24: flag `query_event_hook_timeout_s`, which specified a timeout value Nit: specifies http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java File fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java: http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@47 PS28, Line 47: * {@link QueryEventHook}s that utilizes a fixed thread pool pulling fixed capacity thread pool? http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@62 PS28, Line 62: * submission, you will have no indication that rejection has taken place. "there is no indication" i.e. passive voice may be clearer http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@81 PS28, Line 81: * For example, suppose you have 2 hook tasks that take roughly 3 seconds to This example is great, thanks http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@82 PS28, Line 82: * execute. You then create an executor with a thread pool of size 1 and a hook You can consider this a nit :-) , but I find the use of 'you' jarring http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@182 PS28, Line 182: final ArrayBlockingQueue boundedQueue = Can you use BlockingQueue ? http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@290 PS28, Line 290: t); Does LOG.error work when there is an extra Throwable argument? Looks a bit weird to me http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@315 PS28, Line 315: // a timeout-check task is scheduled for every hook task that is Nit: capitalize 'a' http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@318 PS28, Line 318: // task queue to grow unbounded if hook tasks are scheduled at an interval This approach seems clever, and the code is neat and tidy... So suppose I have problems getting my hooks to always execute. I know they usually take 5 seconds but just in case I will set the timeout to 30 mins. Now my timeout queue gets quite full I think? How much garbage does the timeout queue keep from being collected? http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@327 PS28, Line 327: protected void checkHookTimeout( private http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@354 PS28, Line 354: private static CompletableFuture failedFuture(Throwable e) { Maybe explain what is intended to happen http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@365 PS28, Line 365: private static String mname(String... names) { Maybe mName is clearer? http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@370 PS28, Line 370: Counter getInterruptedCounter(QueryEventHook hook, String method) { Unused? http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java File fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java: http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java@75 PS28, Line 75: // constants to share with the executor so that no mismatches Nit: Constants http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java@78 PS28, Line 78: // it'd be nice if we could get the flag names from the be Maybe instead say "The must be kept in sync with..."
[Impala-ASF-CR] IMPALA-8999: make union scheduling work with mt dop
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14384 ) Change subject: IMPALA-8999: make union scheduling work with mt_dop .. Patch Set 11: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/4810/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14384 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0d2e9c86b530da3053e49d42b837dca0b1348ff2 Gerrit-Change-Number: 14384 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 17 Oct 2019 00:23:20 + Gerrit-HasComments: No
[Impala-ASF-CR] Fix webserver's use of sq printf/sq write
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14471 ) Change subject: Fix webserver's use of sq_printf/sq_write .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5100/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/14471 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I343ae83d6324bc710e4cf96d66975a7c9694706f Gerrit-Change-Number: 14471 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 16 Oct 2019 23:59:55 + Gerrit-HasComments: No
[Impala-ASF-CR] Fix webserver's use of sq printf/sq write
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/14471 ) Change subject: Fix webserver's use of sq_printf/sq_write .. Patch Set 2: Code-Review+2 carrying forward -- To view, visit http://gerrit.cloudera.org:8080/14471 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I343ae83d6324bc710e4cf96d66975a7c9694706f Gerrit-Change-Number: 14471 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 16 Oct 2019 23:59:36 + Gerrit-HasComments: No
[Impala-ASF-CR] Fix webserver's use of sq printf/sq write
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14471 ) Change subject: Fix webserver's use of sq_printf/sq_write .. Patch Set 2: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/14471 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I343ae83d6324bc710e4cf96d66975a7c9694706f Gerrit-Change-Number: 14471 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 16 Oct 2019 23:59:06 + Gerrit-HasComments: No
[Impala-ASF-CR] Fix webserver's use of sq printf/sq write
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/14471 ) Change subject: Fix webserver's use of sq_printf/sq_write .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/14471/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14471/1//COMMIT_MSG@12 PS1, Line 12: This can lead to a bad interaction between Nagle's algorithm and TCP : delayed acks which can add significant latency to RTT. > This manifested in Kudu most noticeably in webserver-test, specifically in No, we don't have any such tests http://gerrit.cloudera.org:8080/#/c/14471/1/be/src/util/webserver.cc File be/src/util/webserver.cc: http://gerrit.cloudera.org:8080/#/c/14471/1/be/src/util/webserver.cc@185 PS1, Line 185: // Buffer the output and send it in a single call to sq_write in order to avoid > Could add a comment here explaining why we're buffering in the ostringstrea Done -- To view, visit http://gerrit.cloudera.org:8080/14471 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I343ae83d6324bc710e4cf96d66975a7c9694706f Gerrit-Change-Number: 14471 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 16 Oct 2019 23:58:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Fix webserver's use of sq printf/sq write
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14471 ) Change subject: Fix webserver's use of sq_printf/sq_write .. Patch Set 1: Code-Review+2 Same comment as Adar about including a brief explanation, but LGTM -- To view, visit http://gerrit.cloudera.org:8080/14471 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I343ae83d6324bc710e4cf96d66975a7c9694706f Gerrit-Change-Number: 14471 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 16 Oct 2019 23:57:00 + Gerrit-HasComments: No
[Impala-ASF-CR] Fix webserver's use of sq printf/sq write
Hello Adar Dembo, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14471 to look at the new patch set (#2). Change subject: Fix webserver's use of sq_printf/sq_write .. Fix webserver's use of sq_printf/sq_write Previously, the webserver used multiple calls to sq_printf and sq_write when sending most responses. This can lead to a bad interaction between Nagle's algorithm and TCP delayed acks which can add significant latency to RTT. This patch modifies the SendResponse() function to buffer the entire response and send it in a single sq_write call. Testing: - Ran all existing webserver tests. Change-Id: I343ae83d6324bc710e4cf96d66975a7c9694706f --- M be/src/util/webserver.cc 1 file changed, 16 insertions(+), 9 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/14471/2 -- To view, visit http://gerrit.cloudera.org:8080/14471 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I343ae83d6324bc710e4cf96d66975a7c9694706f Gerrit-Change-Number: 14471 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9053: DDLs should generate lineage graphs.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14458 ) Change subject: IMPALA-9053: DDLs should generate lineage graphs. .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/14458/1/testdata/workloads/functional-query/queries/QueryTest/lineage.test File testdata/workloads/functional-query/queries/QueryTest/lineage.test: PS1: Also add a test for CTAS as a sanity check? -- To view, visit http://gerrit.cloudera.org:8080/14458 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia6c7ed9fe3265fd777fe93590cf4eb2d9ba0dd1e Gerrit-Change-Number: 14458 Gerrit-PatchSet: 1 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Wed, 16 Oct 2019 23:49:32 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Fix webserver's use of sq printf/sq write
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14471 ) Change subject: Fix webserver's use of sq_printf/sq_write .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/4809/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14471 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I343ae83d6324bc710e4cf96d66975a7c9694706f Gerrit-Change-Number: 14471 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 16 Oct 2019 23:43:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8999: make union scheduling work with mt dop
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14384 ) Change subject: IMPALA-8999: make union scheduling work with mt_dop .. Patch Set 11: I'm running exhaustive tests still, but this should be ready for review. -- To view, visit http://gerrit.cloudera.org:8080/14384 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0d2e9c86b530da3053e49d42b837dca0b1348ff2 Gerrit-Change-Number: 14384 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 16 Oct 2019 23:40:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8999: make union scheduling work with mt dop
Hello Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14384 to look at the new patch set (#11). Change subject: IMPALA-8999: make union scheduling work with mt_dop .. IMPALA-8999: make union scheduling work with mt_dop This change unifies mt_dop scheduling between the union and scan cases. Testing: Manually checked that fragments with unions get parallelised to the correct degree, both as a result of scans within the fragment and input fragments. Extend TestMtDopAdmissionSlots (renamed to TestMtDopScheduling) to confirm that queries that were not parallelised before are now parallelised. These tests verify the number of instances of each operator using the ExecSummary embedded in the profile. Change-Id: I0d2e9c86b530da3053e49d42b837dca0b1348ff2 --- M be/src/scheduling/query-schedule.cc M be/src/scheduling/scheduler-test.cc M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M fe/src/main/java/org/apache/impala/planner/UnionNode.java D testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet-admission-slots.test A testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet-scheduling.test M tests/query_test/test_mt_dop.py 8 files changed, 372 insertions(+), 213 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/14384/11 -- To view, visit http://gerrit.cloudera.org:8080/14384 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0d2e9c86b530da3053e49d42b837dca0b1348ff2 Gerrit-Change-Number: 14384 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] Fix webserver's use of sq printf/sq write
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14471 ) Change subject: Fix webserver's use of sq_printf/sq_write .. Patch Set 1: Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/14471/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14471/1//COMMIT_MSG@12 PS1, Line 12: This can lead to a bad interaction between Nagle's algorithm and TCP : delayed acks which can add significant latency to RTT. This manifested in Kudu most noticeably in webserver-test, specifically in those tests that issued many HTTP requests with connection reuse (i.e. after keep-alive was enabled). We saw a 40ms delay between each request. Are there any Impala tests that works similarly? http://gerrit.cloudera.org:8080/#/c/14471/1/be/src/util/webserver.cc File be/src/util/webserver.cc: http://gerrit.cloudera.org:8080/#/c/14471/1/be/src/util/webserver.cc@185 PS1, Line 185: std::ostringstream oss; Could add a comment here explaining why we're buffering in the ostringstream. -- To view, visit http://gerrit.cloudera.org:8080/14471 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I343ae83d6324bc710e4cf96d66975a7c9694706f Gerrit-Change-Number: 14471 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 16 Oct 2019 23:04:29 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Fix webserver's use of sq printf/sq write
Thomas Tauber-Marshall has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14471 Change subject: Fix webserver's use of sq_printf/sq_write .. Fix webserver's use of sq_printf/sq_write Previously, the webserver used multiple calls to sq_printf and sq_write when sending most responses. This can lead to a bad interaction between Nagle's algorithm and TCP delayed acks which can add significant latency to RTT. This patch modifies the SendResponse() function to buffer the entire response and send it in a single sq_write call. Testing: - Ran all existing webserver tests. Change-Id: I343ae83d6324bc710e4cf96d66975a7c9694706f --- M be/src/util/webserver.cc 1 file changed, 14 insertions(+), 9 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/14471/1 -- To view, visit http://gerrit.cloudera.org:8080/14471 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I343ae83d6324bc710e4cf96d66975a7c9694706f Gerrit-Change-Number: 14471 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-8998: admission control accounting for mt dop
Tim Armstrong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14357 ) Change subject: IMPALA-8998: admission control accounting for mt_dop .. IMPALA-8998: admission control accounting for mt_dop This integrates mt_dop with the "slots" mechanism that's used for non-default executor groups. The idea is simple - the degree of parallelism on a backend determines the number of slots consumed. The effective degree of parallelism is used, not the raw mt_dop setting. E.g. if the query only has a single input split and executes only a single fragment instance on a host, we don't want to count the full mt_dop value for admission control. --admission_control_slots is added as a new flag that replaces --max_concurrent_queries, since the name better reflects the concept. --max_concurrent_queries is kept for backwards compatibility and has the same meaning as --admission_control_slots. The admission control logic is extended to take this into account. We also add an immediate rejection code path since it is now possible for queries to not be admittable based on the # of available slots. We only factor in the "width" of the plan - i.e. the number of instances of fragments. We don't account for the number of distinct fragments, since they may not actually execute in parallel with each other because of dependencies. This number is added to the per-host profile as the "AdmissionSlots" counter. Testing: Added unit tests for rejection and queue/admit checks. Also includes a fix for IMPALA-9054 where we increase the timeout. Added end-to-end tests: * test_admission_slots in test_mt_dop.py that checks the admission slot calculation via the profile. * End-to-end admission test that exercises the admit immediately and queueing code paths. Added checks to test_verify_metrics (which runs after end-to-end tests) to ensure that the per-backend slots in use goes to 0 when the cluster is quiesced. Change-Id: I7b6b6262ef238df26b491352656a26e4163e46e5 Reviewed-on: http://gerrit.cloudera.org:8080/14357 Tested-by: Impala Public Jenkins Reviewed-by: Tim Armstrong --- M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/scheduling/admission-controller-test.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/scheduling/query-schedule.h M be/src/scheduling/scheduler.cc M be/src/service/impala-http-handler.cc M be/src/service/impala-server.cc M common/thrift/StatestoreService.thrift A testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet-admission-slots.test M tests/custom_cluster/test_admission_controller.py M tests/custom_cluster/test_executor_groups.py M tests/query_test/test_cancellation.py M tests/query_test/test_mt_dop.py M tests/util/auto_scaler.py M tests/verifiers/metric_verifier.py M tests/verifiers/test_verify_metrics.py M www/backends.tmpl 20 files changed, 416 insertions(+), 88 deletions(-) Approvals: Impala Public Jenkins: Verified Tim Armstrong: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/14357 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I7b6b6262ef238df26b491352656a26e4163e46e5 Gerrit-Change-Number: 14357 Gerrit-PatchSet: 22 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-8998: admission control accounting for mt dop
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14357 ) Change subject: IMPALA-8998: admission control accounting for mt_dop .. Patch Set 21: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/14357 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7b6b6262ef238df26b491352656a26e4163e46e5 Gerrit-Change-Number: 14357 Gerrit-PatchSet: 21 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 16 Oct 2019 20:26:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8998: admission control accounting for mt dop
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14357 ) Change subject: IMPALA-8998: admission control accounting for mt_dop .. Patch Set 21: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/14357 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7b6b6262ef238df26b491352656a26e4163e46e5 Gerrit-Change-Number: 14357 Gerrit-PatchSet: 21 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 16 Oct 2019 20:18:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9053: DDLs should generate lineage graphs.
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/14458 ) Change subject: IMPALA-9053: DDLs should generate lineage graphs. .. Patch Set 1: Code-Review+1 I've touched lineage-related code in the past but have to admit that I don't really have much domain knowledge. That said, the changes look good to me -- To view, visit http://gerrit.cloudera.org:8080/14458 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia6c7ed9fe3265fd777fe93590cf4eb2d9ba0dd1e Gerrit-Change-Number: 14458 Gerrit-PatchSet: 1 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Wed, 16 Oct 2019 19:48:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8704: ISO:SQL:2016 datetime patterns - Milestone 2
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/14291 ) Change subject: IMPALA-8704: ISO:SQL:2016 datetime patterns - Milestone 2 .. Patch Set 6: (2 comments) http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-iso-sql-format-tokenizer.cc File be/src/runtime/datetime-iso-sql-format-tokenizer.cc: http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@240 PS6, Line 240: strncmp(current_pos, "\\\"", 2) == 0) > It is zero-terminated You're correct, it is null-terminated for iso-sql format strings, because the DateTimeFormatContext instance is always created using a null-terminated string (in CastFormatExpr::OpenEvaluator()). On the other hand, DateTimeFormatContext class does not guarantee that DateTimeFormatContext::fmt is null-terminated. E.g. TimestampFunctions::UnixAndFromUnixPrepare() creates a DateTimeFormatContext instance where fmt is not null-terminated. For this reason, we are very careful in SimpleDateFormatTokenizer::Tokenize() not to assume that format strings will end with a null-byte. Maybe add a comment at the beginning of this file to explain that iso-sql format strings are always null-terminated? http://gerrit.cloudera.org:8080/#/c/14291/6/tests/query_test/test_cast_with_format.py File tests/query_test/test_cast_with_format.py: http://gerrit.cloudera.org:8080/#/c/14291/6/tests/query_test/test_cast_with_format.py@738 PS6, Line 738: select cast("1985-\a\b\f\n\r\t\v\'12-09" as ''' : r'''date format '-"\a\b\f\n\r\t\v\'"MM-DD') This test is about special escape sequences but non-special chars in the text token are escaped too. Is that intentional? -- To view, visit http://gerrit.cloudera.org:8080/14291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d2f6656054371476aaa8bd0d51f572b9369855 Gerrit-Change-Number: 14291 Gerrit-PatchSet: 6 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 16 Oct 2019 18:02:04 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8742: Switch to ScanRange::bytes to read() instead of len()
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14348 ) Change subject: IMPALA-8742: Switch to ScanRange::bytes_to_read() instead of len() .. IMPALA-8742: Switch to ScanRange::bytes_to_read() instead of len() IMPALA-7543 introduced sub-ranges in scan ranges. These are smaller parts of the scan ranges that actually need to be read, other parts of the scan range can be skipped. Currently sub-ranges are only used in the Parquet scanner during page filtering. With sub-ranges the scan range has a new field 'bytes_to_read_', that is the sum of the lengths of the sub-ranges. Or, if there are no sub-ranges, 'bytes_to_read_' equals to field 'len_' which is the length of the whole scan range. At some parts of Impala ScanRange::len() is being used instead of ScanRange::bytes_to_read(). It doesn't cause a bug because only the Parquet scanner uses sub-ranges, i.e. bytes_to_read() usually equals to len(). The Parquet scanner also doesn't hit the bug because it tracks which pages it reads. However, it can be a potential source of bugs in the future to leave the invocations of len() instead of bytes_to_read(). Also, the scanners might allocate more memory than needed. At couple of places we still need to invoke len(), e.g. when we test scan-range containment (for local splits), or when we test whether a scan range contains the mid-point of a Parquet row group. Testing: Added a scanner reservation test. Ran the exhaustive tests. Change-Id: Ie896db3f4b5f3e2272d81c2d360049af09c41d9c Reviewed-on: http://gerrit.cloudera.org:8080/14348 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/exec/base-sequence-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/scanner-context.h M be/src/runtime/io/disk-io-mgr.cc M be/src/runtime/io/request-context.cc M be/src/runtime/io/scan-range.cc M testdata/workloads/functional-query/queries/QueryTest/scanner-reservation.test 8 files changed, 24 insertions(+), 8 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/14348 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ie896db3f4b5f3e2272d81c2d360049af09c41d9c Gerrit-Change-Number: 14348 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-8742: Switch to ScanRange::bytes to read() instead of len()
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14348 ) Change subject: IMPALA-8742: Switch to ScanRange::bytes_to_read() instead of len() .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/14348 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie896db3f4b5f3e2272d81c2d360049af09c41d9c Gerrit-Change-Number: 14348 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 16 Oct 2019 17:57:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8998: admission control accounting for mt dop
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14357 ) Change subject: IMPALA-8998: admission control accounting for mt_dop .. Patch Set 19: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/4808/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14357 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7b6b6262ef238df26b491352656a26e4163e46e5 Gerrit-Change-Number: 14357 Gerrit-PatchSet: 19 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 16 Oct 2019 16:44:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8998: admission control accounting for mt dop
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14357 ) Change subject: IMPALA-8998: admission control accounting for mt_dop .. Patch Set 21: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5099/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/14357 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7b6b6262ef238df26b491352656a26e4163e46e5 Gerrit-Change-Number: 14357 Gerrit-PatchSet: 21 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 16 Oct 2019 16:03:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8704: ISO:SQL:2016 datetime patterns - Milestone 2
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/14291 ) Change subject: IMPALA-8704: ISO:SQL:2016 datetime patterns - Milestone 2 .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/14291/6/tests/query_test/test_cast_with_format.py File tests/query_test/test_cast_with_format.py: http://gerrit.cloudera.org:8080/#/c/14291/6/tests/query_test/test_cast_with_format.py@768 PS6, Line 768: covered > Don't see the difference. This seems nit of the nits :D "covered" and "surrounded" mean very different things: https://www.thefreedictionary.com/covered https://www.thefreedictionary.com/surrounded :) -- To view, visit http://gerrit.cloudera.org:8080/14291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d2f6656054371476aaa8bd0d51f572b9369855 Gerrit-Change-Number: 14291 Gerrit-PatchSet: 6 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 16 Oct 2019 16:08:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8998: admission control accounting for mt dop
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14357 ) Change subject: IMPALA-8998: admission control accounting for mt_dop .. Patch Set 20: Ran into IMPALA-9054, including a fix -- To view, visit http://gerrit.cloudera.org:8080/14357 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7b6b6262ef238df26b491352656a26e4163e46e5 Gerrit-Change-Number: 14357 Gerrit-PatchSet: 20 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 16 Oct 2019 16:03:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8998: admission control accounting for mt dop
Hello Andrew Sherman, Lars Volker, Abhishek Rawat, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14357 to look at the new patch set (#20). Change subject: IMPALA-8998: admission control accounting for mt_dop .. IMPALA-8998: admission control accounting for mt_dop This integrates mt_dop with the "slots" mechanism that's used for non-default executor groups. The idea is simple - the degree of parallelism on a backend determines the number of slots consumed. The effective degree of parallelism is used, not the raw mt_dop setting. E.g. if the query only has a single input split and executes only a single fragment instance on a host, we don't want to count the full mt_dop value for admission control. --admission_control_slots is added as a new flag that replaces --max_concurrent_queries, since the name better reflects the concept. --max_concurrent_queries is kept for backwards compatibility and has the same meaning as --admission_control_slots. The admission control logic is extended to take this into account. We also add an immediate rejection code path since it is now possible for queries to not be admittable based on the # of available slots. We only factor in the "width" of the plan - i.e. the number of instances of fragments. We don't account for the number of distinct fragments, since they may not actually execute in parallel with each other because of dependencies. This number is added to the per-host profile as the "AdmissionSlots" counter. Testing: Added unit tests for rejection and queue/admit checks. Also includes a fix for IMPALA-9054 where we increase the timeout. Added end-to-end tests: * test_admission_slots in test_mt_dop.py that checks the admission slot calculation via the profile. * End-to-end admission test that exercises the admit immediately and queueing code paths. Added checks to test_verify_metrics (which runs after end-to-end tests) to ensure that the per-backend slots in use goes to 0 when the cluster is quiesced. Change-Id: I7b6b6262ef238df26b491352656a26e4163e46e5 --- M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/scheduling/admission-controller-test.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/scheduling/query-schedule.h M be/src/scheduling/scheduler.cc M be/src/service/impala-http-handler.cc M be/src/service/impala-server.cc M common/thrift/StatestoreService.thrift A testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet-admission-slots.test M tests/custom_cluster/test_admission_controller.py M tests/custom_cluster/test_executor_groups.py M tests/query_test/test_cancellation.py M tests/query_test/test_mt_dop.py M tests/util/auto_scaler.py M tests/verifiers/metric_verifier.py M tests/verifiers/test_verify_metrics.py M www/backends.tmpl 20 files changed, 416 insertions(+), 88 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/14357/20 -- To view, visit http://gerrit.cloudera.org:8080/14357 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7b6b6262ef238df26b491352656a26e4163e46e5 Gerrit-Change-Number: 14357 Gerrit-PatchSet: 20 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-8998: admission control accounting for mt dop
Hello Andrew Sherman, Lars Volker, Abhishek Rawat, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14357 to look at the new patch set (#19). Change subject: IMPALA-8998: admission control accounting for mt_dop .. IMPALA-8998: admission control accounting for mt_dop This integrates mt_dop with the "slots" mechanism that's used for non-default executor groups. The idea is simple - the degree of parallelism on a backend determines the number of slots consumed. The effective degree of parallelism is used, not the raw mt_dop setting. E.g. if the query only has a single input split and executes only a single fragment instance on a host, we don't want to count the full mt_dop value for admission control. --admission_control_slots is added as a new flag that replaces --max_concurrent_queries, since the name better reflects the concept. --max_concurrent_queries is kept for backwards compatibility and has the same meaning as --admission_control_slots. The admission control logic is extended to take this into account. We also add an immediate rejection code path since it is now possible for queries to not be admittable based on the # of available slots. We only factor in the "width" of the plan - i.e. the number of instances of fragments. We don't account for the number of distinct fragments, since they may not actually execute in parallel with each other because of dependencies. This number is added to the per-host profile as the "AdmissionSlots" counter. Testing: Added unit tests for rejection and queue/admit checks. Added end-to-end tests: * test_admission_slots in test_mt_dop.py that checks the admission slot calculation via the profile. * End-to-end admission test that exercises the admit immediately and queueing code paths. Added checks to test_verify_metrics (which runs after end-to-end tests) to ensure that the per-backend slots in use goes to 0 when the cluster is quiesced. Change-Id: I7b6b6262ef238df26b491352656a26e4163e46e5 --- M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/scheduling/admission-controller-test.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/scheduling/query-schedule.h M be/src/scheduling/scheduler.cc M be/src/service/impala-http-handler.cc M be/src/service/impala-server.cc M common/thrift/StatestoreService.thrift A testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet-admission-slots.test M tests/custom_cluster/test_admission_controller.py M tests/custom_cluster/test_executor_groups.py M tests/query_test/test_cancellation.py M tests/query_test/test_mt_dop.py M tests/util/auto_scaler.py M tests/verifiers/metric_verifier.py M tests/verifiers/test_verify_metrics.py M www/backends.tmpl 20 files changed, 416 insertions(+), 88 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/14357/19 -- To view, visit http://gerrit.cloudera.org:8080/14357 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7b6b6262ef238df26b491352656a26e4163e46e5 Gerrit-Change-Number: 14357 Gerrit-PatchSet: 19 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-8704: ISO:SQL:2016 datetime patterns - Milestone 2
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14291 ) Change subject: IMPALA-8704: ISO:SQL:2016 datetime patterns - Milestone 2 .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/4807/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d2f6656054371476aaa8bd0d51f572b9369855 Gerrit-Change-Number: 14291 Gerrit-PatchSet: 8 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 16 Oct 2019 14:47:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8571: harden QueryEventHook execution
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13748 ) Change subject: IMPALA-8571: harden QueryEventHook execution .. Patch Set 28: (1 comment) added some detail to the comment that needs attention http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java File fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java: http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@313 PS28, Line 313: // TODO: IMPALA-XXX > In this latest approach, I schedule a task that will cancel the hook task i Note that this potential for unbounded growth is possible even though the hook task executor has a bounded work queue. Consider the following scenario: (1) user configures a hook timeout of 3 seconds (2) hook tasks take 1 second to execute (3) hook tasks are submitted every 2 seconds (due to query execution) Because (2) and (3), hook tasks will never fill the bounded queue and we can say that a hook task will be submitted every 2 seconds. Every hook task submitted induces a timeout task to begin after 3 seconds. Because timeout tasks are effectively submitted every 2 seconds but only run after 3 seconds, this will cause the timeout task queue to grow. -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 28 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Wed, 16 Oct 2019 14:34:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8571: harden QueryEventHook execution
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13748 ) Change subject: IMPALA-8571: harden QueryEventHook execution .. Patch Set 28: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/4806/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 28 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Wed, 16 Oct 2019 14:23:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8704: ISO:SQL:2016 datetime patterns - Milestone 2
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/14291 ) Change subject: IMPALA-8704: ISO:SQL:2016 datetime patterns - Milestone 2 .. Patch Set 6: (11 comments) http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-iso-sql-format-parser.cc File be/src/runtime/datetime-iso-sql-format-parser.cc: http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-iso-sql-format-parser.cc@225 PS6, Line 225: switch (**format) { : case 'b': return '\b'; : case 'n': return '\n'; : case 'r': return '\r'; : case 't': return '\t'; : } > Please check what escape sequences are supported by ANSI SQL standard. Well, I'm not convinced that ANSI SQL is relevant here as according to this page it doesn't allow escaping quotes and double quotes. https://www.ibm.com/support/knowledgecenter/en/SSGU8G_12.1.0/com.ibm.esqlc.doc/ids_esqlc_0015.htm This doc about MySql is kind of in line with my implementation: https://www.oreilly.com/library/view/mysql-cookbook/0596001452/ch04s02.html It also says that 2 sequential double quotes works as an escaped double quote same as with a backslash. However, I don't feel the need for this. Additionally MySql has \0 as a special char (ANSII NULL) but again, don't find it important. http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-iso-sql-format-tokenizer.cc File be/src/runtime/datetime-iso-sql-format-tokenizer.cc: http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@109 PS6, Line 109: *current_pos != nullptr > nit: current_pos != nullptr && *current_pos != nullptr Done http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@112 PS6, Line 112: *current_pos < str_end > nit: str_begin <= *current_pos && *current_pos < str_end Done http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@240 PS6, Line 240: strncmp(current_pos, "\\\"", 2) == 0) > This is safe only if 'current_pos' is zero-terminated, but I don't think it It is zero-terminated http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@273 PS6, Line 273: strncmp(*current_pos, "\\\"", 2) > same as L240. see above http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@277 PS6, Line 277: strncmp(*current_pos, "\\\"", 4) > same as L240. see above http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@282 PS6, Line 282: strncmp(*current_pos, "", 2) > same as L240. see above http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-parser-common.cc File be/src/runtime/datetime-parser-common.cc: http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-parser-common.cc@201 PS6, Line 201: continue > nit: can be removed, here and below. Done http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-parser-common.cc@202 PS6, Line 202:} else if (!tok.is_double_escaped && strncmp(text_it, "\\\"", 2) == 0) { : result.append("\""); : ++text_it; : continue; : } else if (strncmp(text_it, "", 2) == 0) { : result.append("\\"); : ++text_it; : continue; : } else if (strncmp(text_it, "\\b", 2) == 0) { : result.append("\b"); : ++text_it; : continue; : } else if (strncmp(text_it, "\\n", 2) == 0) { : result.append("\n"); : ++text_it; : continue; : } else if (strncmp(text_it, "\\r", 2) == 0) { : result.append("\r"); : ++text_it; : continue; : } else if (strncmp(text_it, "\\t", 2) == 0) { : result.append("\t"); : ++text_it; : continue; : } > Are these the only escape sequences supported by ANSI SQL? See the answer in datetime-iso-sql-format-parser.cc http://gerrit.cloudera.org:8080/#/c/14291/6/tests/query_test/test_cast_with_format.py File tests/query_test/test_cast_with_format.py: http://gerrit.cloudera.org:8080/#/c/14291/6/tests/query_test/test_cast_with_format.py@663 PS6, Line 663:# Format part is surrounded by double quotes so the quotes indicating the start and : # end of the text token has to be escaped. > I did some testing around when the format string is surrounded by ' and the If the format is surrounded by single quotes but the text double quotes surrounding the text token is escaped is a valid scenario. From the code it's not possible to decide whether a
[Impala-ASF-CR] IMPALA-8704: ISO:SQL:2016 datetime patterns - Milestone 2
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14291 ) Change subject: IMPALA-8704: ISO:SQL:2016 datetime patterns - Milestone 2 .. Patch Set 8: (4 comments) http://gerrit.cloudera.org:8080/#/c/14291/8/tests/query_test/test_cast_with_format.py File tests/query_test/test_cast_with_format.py: http://gerrit.cloudera.org:8080/#/c/14291/8/tests/query_test/test_cast_with_format.py@743 PS8, Line 743: assert result.data == [r'''1985 text tab used for whitespace http://gerrit.cloudera.org:8080/#/c/14291/8/tests/query_test/test_cast_with_format.py@747 PS8, Line 747: assert result.data == [r'''1985 text tab used for whitespace http://gerrit.cloudera.org:8080/#/c/14291/8/tests/query_test/test_cast_with_format.py@751 PS8, Line 751: assert result.data == [r'''1985 text tab used for whitespace http://gerrit.cloudera.org:8080/#/c/14291/8/tests/query_test/test_cast_with_format.py@755 PS8, Line 755: assert result.data == [r'''1985 text tab used for whitespace -- To view, visit http://gerrit.cloudera.org:8080/14291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d2f6656054371476aaa8bd0d51f572b9369855 Gerrit-Change-Number: 14291 Gerrit-PatchSet: 8 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 16 Oct 2019 14:02:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8704: ISO:SQL:2016 datetime patterns - Milestone 2
Hello Attila Jeges, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14291 to look at the new patch set (#8). Change subject: IMPALA-8704: ISO:SQL:2016 datetime patterns - Milestone 2 .. IMPALA-8704: ISO:SQL:2016 datetime patterns - Milestone 2 This patch adds additional datetime format tokens on top of what was introduced with Milestone 1 (IMPALA-8703). The tokens introduced: - Free text token: Surrounded by double quotes, a free text section can be given in the format where the same text is expected in the input without the surrounding double quotes. - FX modifier: This modifier has to be given at the beginning of the format and is valid for the whole format. In a string to datetime conversion this forces strict separator matching and expects all the tokens in the input to have the same length as their maximum length. E.g. A month has to be of length 2 prefixed by zero if needed. This is the default in a datetime to string conversion. - FM modifier: This modifier affects only the following token and overrides the FX modifier. Using this the value of a token can be shorter than the max length if followed by a separator. E.g. 1-digit month, less than 4-digit year, etc. This is the default behaviour in a string to datetime conversion. Change-Id: I30d2f6656054371476aaa8bd0d51f572b9369855 --- M be/src/runtime/date-parse-util.cc M be/src/runtime/datetime-iso-sql-format-parser.cc M be/src/runtime/datetime-iso-sql-format-parser.h M be/src/runtime/datetime-iso-sql-format-tokenizer.cc M be/src/runtime/datetime-iso-sql-format-tokenizer.h M be/src/runtime/datetime-parser-common.cc M be/src/runtime/datetime-parser-common.h M be/src/runtime/timestamp-parse-util.cc M tests/query_test/test_cast_with_format.py 9 files changed, 632 insertions(+), 43 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/14291/8 -- To view, visit http://gerrit.cloudera.org:8080/14291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I30d2f6656054371476aaa8bd0d51f572b9369855 Gerrit-Change-Number: 14291 Gerrit-PatchSet: 8 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-8571: harden QueryEventHook execution
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13748 ) Change subject: IMPALA-8571: harden QueryEventHook execution .. Patch Set 28: (1 comment) I've implemented a simpler approach as suggested by Bharath, but it comes with a caveat. Please carefully review where I've commented (PS28, FixedCapacityQueryHookExecutor.java:313) http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java File fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java: http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@313 PS28, Line 313: // TODO: IMPALA-XXX In this latest approach, I schedule a task that will cancel the hook task if it is still running after `hookTimeout` duration. This makes the code simpler (fewer threads) and easier to reason about (straight-forward use of executors). However, it comes at a cost, described in the comment. I've pre-emptively placed a TODO comment here to draw attention to it -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 28 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Wed, 16 Oct 2019 13:45:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8742: Switch to ScanRange::bytes to read() instead of len()
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14348 ) Change subject: IMPALA-8742: Switch to ScanRange::bytes_to_read() instead of len() .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5098/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/14348 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie896db3f4b5f3e2272d81c2d360049af09c41d9c Gerrit-Change-Number: 14348 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 16 Oct 2019 13:42:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8742: Switch to ScanRange::bytes to read() instead of len()
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14348 ) Change subject: IMPALA-8742: Switch to ScanRange::bytes_to_read() instead of len() .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/14348 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie896db3f4b5f3e2272d81c2d360049af09c41d9c Gerrit-Change-Number: 14348 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 16 Oct 2019 13:42:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8571: harden QueryEventHook execution
Hello Bharath Vissapragada, Andrew Sherman, Fredy Wijaya, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13748 to look at the new patch set (#28). Change subject: IMPALA-8571: harden QueryEventHook execution .. IMPALA-8571: harden QueryEventHook execution This commit hardens guarantees around QueryEventHook execution by adding the following features: *hook execution* Query event hooks are still executed in a thread-pool, but you can now configure the type of (Java) threads used to execute them via the newly-added backend flag `query_event_hook_use_daemon_threads`, which takes a true/false value. See Java Thread.setDaemon(boolean) for what a "daemon thread" entails. *hook timeout/cancellation* A timeout for hook execution can be configured through the backend flag `query_event_hook_timeout_s`, which specified a timeout value in seconds. If a hook has not completed execution within this timeout (measured from hook submission, not execution) then the hook task will be cancelled in order to free up resources. *hook rejection* The hook execution engine now has a fixed-capacity work queue whose capacity can be configured through the backend flag `query_event_hook_queue_capacity`. This queue is used to store hook tasks that are submitted when there are no free threads available for hook execution. All hook tasks submitted when the queue is at capacity will be rejected and logged without affecting the result of the query. *hook performance metrics* Hook performance metrics are captured by the frontend, but not yet published to the backend and are therefore not user-visible. See IMPALA-8914 for implementation of this feature. Testing: - added unit tests for new features - re-ran existing E2E tests Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 --- M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java 11 files changed, 814 insertions(+), 114 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/28 -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 28 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8704: ISO:SQL:2016 datetime patterns - Milestone 2
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/14291 ) Change subject: IMPALA-8704: ISO:SQL:2016 datetime patterns - Milestone 2 .. Patch Set 7: (7 comments) http://gerrit.cloudera.org:8080/#/c/14291/7/be/src/runtime/datetime-iso-sql-format-parser.h File be/src/runtime/datetime-iso-sql-format-parser.h: http://gerrit.cloudera.org:8080/#/c/14291/7/be/src/runtime/datetime-iso-sql-format-parser.h@78 PS7, Line 78: format text token inside the format http://gerrit.cloudera.org:8080/#/c/14291/7/be/src/runtime/datetime-iso-sql-format-parser.h@81 PS7, Line 81: format text token http://gerrit.cloudera.org:8080/#/c/14291/7/be/src/runtime/datetime-iso-sql-format-parser.h@81 PS7, Line 81: and moves : // '*format' to 'a'. I don't think part this is true. In GetNextCharFromTextToken() *format is moved to the last character of the escape sequence (to '"' in the example given above). http://gerrit.cloudera.org:8080/#/c/14291/7/be/src/runtime/datetime-iso-sql-format-parser.cc File be/src/runtime/datetime-iso-sql-format-parser.cc: http://gerrit.cloudera.org:8080/#/c/14291/7/be/src/runtime/datetime-iso-sql-format-parser.cc@48 PS7, Line 48: if (tok->type == SEPARATOR && !dt_ctx.fx_modifier) { : bool res = ProcessSeparatorSequence(_pos, end_pos, dt_ctx, ); : if (!res || current_pos == end_pos) return res; : DCHECK(i < dt_ctx.toks.size()); : // Next token, following the separator sequence. : tok = _ctx.toks[i]; : } I'd prefer to handle strict separator matching (tok->type == SEPARATOR && dt_ctx.fx_modifier) right after this, instead of L183. http://gerrit.cloudera.org:8080/#/c/14291/7/be/src/runtime/datetime-iso-sql-format-parser.cc@56 PS7, Line 56: const char* token_end_pos = FindEndOfToken(current_pos, end_pos - current_pos, *tok); : if (token_end_pos == nullptr) return false; : int token_len = token_end_pos - current_pos; : : if (dt_ctx.fx_modifier && !tok->fm_modifier && tok->type != TEXT && : token_len != tok->len) { : return false; : } for TEXT tokens, 'token_end_pos' and 'token_len' aren't set here. 'token_end_pos' is set in L170. For clarity I'd move the code in L170-182 right before L56, and execute L56-193 only if tok->type != TEXT. http://gerrit.cloudera.org:8080/#/c/14291/7/be/src/runtime/datetime-iso-sql-format-parser.cc@185 PS7, Line 185: token_len == 1 dt_ctx.fx_modifier && token_len == 1 http://gerrit.cloudera.org:8080/#/c/14291/7/be/src/runtime/datetime-parser-common.h File be/src/runtime/datetime-parser-common.h: http://gerrit.cloudera.org:8080/#/c/14291/7/be/src/runtime/datetime-parser-common.h@134 PS7, Line 134: then that -- To view, visit http://gerrit.cloudera.org:8080/14291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d2f6656054371476aaa8bd0d51f572b9369855 Gerrit-Change-Number: 14291 Gerrit-PatchSet: 7 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 16 Oct 2019 12:11:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9052: Events processor should skip blacklisted database and table
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14457 ) Change subject: IMPALA-9052: Events processor should skip blacklisted database and table .. IMPALA-9052: Events processor should skip blacklisted database and table IMPALA-8797 introduced a configuration which can blacklist certain database and tables so that catalog never loads them. This is useful to skip the Hive's internal book-keeping tables which are unreadable by Impala (like Information schema and sys tables). However, if there are events generated on such tables, the events processor goes into error state because the database does not exist. This patch adds support for ignoring blacklisted database and tables in the events processor. Testing: 1. Added a new test case which makes sure that events processor is not in error state after receiving events on blacklisted tables and dbs. 2. Ran MetastoreEventsProcessorTest Change-Id: Ic5a53b722e6225e9cad9954f447f821c2c677e60 Reviewed-on: http://gerrit.cloudera.org:8080/14457 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M tests/custom_cluster/test_event_processing.py 2 files changed, 42 insertions(+), 2 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/14457 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ic5a53b722e6225e9cad9954f447f821c2c677e60 Gerrit-Change-Number: 14457 Gerrit-PatchSet: 6 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-9052: Events processor should skip blacklisted database and table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14457 ) Change subject: IMPALA-9052: Events processor should skip blacklisted database and table .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/14457 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic5a53b722e6225e9cad9954f447f821c2c677e60 Gerrit-Change-Number: 14457 Gerrit-PatchSet: 5 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 16 Oct 2019 11:53:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8704: ISO:SQL:2016 datetime patterns - Milestone 2
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/14291 ) Change subject: IMPALA-8704: ISO:SQL:2016 datetime patterns - Milestone 2 .. Patch Set 7: (5 comments) http://gerrit.cloudera.org:8080/#/c/14291/7/tests/query_test/test_cast_with_format.py File tests/query_test/test_cast_with_format.py: http://gerrit.cloudera.org:8080/#/c/14291/7/tests/query_test/test_cast_with_format.py@782 PS7, Line 782: # Strict separator matching. : result = self.client.execute("select cast('2001-03-02 03:10:15' as timestamp format" : "'FX MM-DD HH12:MI:SS')") : assert result.data == ["NULL"] : : result = self.client.execute("select cast('2001-03-03 03:10:15' as timestamp format" : "'FX-MM-DD HH12::MI:SS')") : assert result.data == ["NULL"] : : result = self.client.execute("select cast('2001-03-04' as timestamp format" : "'FX-MM-DD ')") : assert result.data == ["NULL"] : : # Strict token length matching. : result = self.client.execute("select cast('2001-3-05' as timestamp format " : "'FX-MM-DD')") : assert result.data == ["NULL"] Maybe add positive tests too for these. e.g after L783: select cast('2001 03-02 03:10:15' as timestamp format 'FX MM-DD HH12:MI:SS') http://gerrit.cloudera.org:8080/#/c/14291/7/tests/query_test/test_cast_with_format.py@866 PS7, Line 866: doesn't don't http://gerrit.cloudera.org:8080/#/c/14291/7/tests/query_test/test_cast_with_format.py@872 PS7, Line 872: result = self.client.execute("select cast(cast('2001-03-09' as date) " : "as string format '-FMMM-FMDD')") : assert result.data == ["2001-3-9"] Maybe add a test were year token is FM-modified e.g.: select cast(date'0001-03-09' as string format 'FM-FMMM-FMDD') select cast(date'0001-03-09' as string format 'FMYY-FMMM-FMDD') http://gerrit.cloudera.org:8080/#/c/14291/7/tests/query_test/test_cast_with_format.py@882 PS7, Line 882:result = self.client.execute("select cast(cast('2001-04-09' as date) " : "as string format 'FX-FMMM-FMDD')") Test FM-modified year token here too. http://gerrit.cloudera.org:8080/#/c/14291/7/tests/query_test/test_cast_with_format.py@1045 PS7, Line 1045: "select cast('1985-11-21text' as timestamp format '-MM-DD\"text')" Again, please use raw python strings, here and below. -- To view, visit http://gerrit.cloudera.org:8080/14291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d2f6656054371476aaa8bd0d51f572b9369855 Gerrit-Change-Number: 14291 Gerrit-PatchSet: 7 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 16 Oct 2019 09:47:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9052: Events processor should skip blacklisted database and table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14457 ) Change subject: IMPALA-9052: Events processor should skip blacklisted database and table .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/14457 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic5a53b722e6225e9cad9954f447f821c2c677e60 Gerrit-Change-Number: 14457 Gerrit-PatchSet: 5 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 16 Oct 2019 07:39:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9052: Events processor should skip blacklisted database and table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14457 ) Change subject: IMPALA-9052: Events processor should skip blacklisted database and table .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5097/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/14457 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic5a53b722e6225e9cad9954f447f821c2c677e60 Gerrit-Change-Number: 14457 Gerrit-PatchSet: 5 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 16 Oct 2019 07:39:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9052: Events processor should skip blacklisted database and table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14457 ) Change subject: IMPALA-9052: Events processor should skip blacklisted database and table .. Patch Set 4: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5096/ -- To view, visit http://gerrit.cloudera.org:8080/14457 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic5a53b722e6225e9cad9954f447f821c2c677e60 Gerrit-Change-Number: 14457 Gerrit-PatchSet: 4 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 16 Oct 2019 06:13:19 + Gerrit-HasComments: No