[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/11103 )

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..

IMPALA-7096: restore scanner thread memory heuristics

This restores some of the heuristics removed in IMPALA-4835 that can
help prevent scans from hitting OOM conditions. The heuristics are
implemented at the query level rather than in each scan node in
isolation.

Introduce a ScannerMemLimiter class that belongs to the QueryState and
tracks the amount of memory estimated to be consumed by all scanner
threads running for the query on the current backend.

Also check soft memory limits to see if scanner threads should be
started or the current scanner thread should stop.

The long-term plan is to switch to the MT scan node implementations.
When that happens this code can be removed. In the meantime this code
is imperfect but will help avoid OOM in many scenarios.

Testing:
Added regression tests for HDFS and Kudu where we previously could run
out of memory with a low mem_limit.

Manual testing:
* Ran query tests with --thread_creation_fault_injection=true for a bit,
  confirmed no crashes.
* Ran single-node stress test for Kudu and Parquet for 10-20 min each.

Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Reviewed-on: http://gerrit.cloudera.org:8080/11103
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins
---
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/kudu-scan-node-base.h
M be/src/exec/kudu-scan-node.cc
M be/src/exec/kudu-scan-node.h
M be/src/exec/scan-node.cc
M be/src/exec/scan-node.h
M be/src/runtime/CMakeLists.txt
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
A be/src/runtime/scanner-mem-limiter.cc
A be/src/runtime/scanner-mem-limiter.h
A testdata/workloads/functional-query/queries/QueryTest/hdfs-scanner-thread-mem-scaling.test
A testdata/workloads/functional-query/queries/QueryTest/kudu-scan-mem-usage.test
M tests/query_test/test_mem_usage_scaling.py
15 files changed, 444 insertions(+), 49 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/11103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 13
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Bikramjeet Vig
Gerrit-Reviewer: Dan Hecht
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong
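The query-level tracking described in the commit message can be pictured with a small sketch. This is only an illustration under stated assumptions, not the code in be/src/runtime/scanner-mem-limiter.cc: the class name ScannerMemLimiterSketch, the methods TryClaim() and Release(), and the fixed byte budget are all hypothetical, and the real limiter works with the QueryState and Impala's memory tracking rather than a plain counter.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical, simplified stand-in for a per-query limiter. It sums the
// estimated memory of all running scanner threads for one query on one
// backend and refuses a new thread that would push the sum over a soft budget.
class ScannerMemLimiterSketch {
 public:
  explicit ScannerMemLimiterSketch(int64_t soft_limit_bytes)
    : soft_limit_bytes_(soft_limit_bytes) {}

  // Returns true and records the claim if the new thread fits under the budget.
  bool TryClaim(int64_t estimated_mem_per_thread) {
    std::lock_guard<std::mutex> l(lock_);
    if (claimed_bytes_ + estimated_mem_per_thread > soft_limit_bytes_) return false;
    claimed_bytes_ += estimated_mem_per_thread;
    return true;
  }

  // Gives back a previously claimed estimate when a scanner thread exits.
  void Release(int64_t estimated_mem_per_thread) {
    std::lock_guard<std::mutex> l(lock_);
    claimed_bytes_ -= estimated_mem_per_thread;
  }

  // Total bytes currently claimed by running scanner threads.
  int64_t claimed_bytes() {
    std::lock_guard<std::mutex> l(lock_);
    return claimed_bytes_;
  }

 private:
  std::mutex lock_;
  const int64_t soft_limit_bytes_;
  int64_t claimed_bytes_ = 0;
};
```

Because one such object is shared by every scan node of the query, a thread started by one scan node reduces the headroom seen by all the others, which is the point of moving the heuristic out of the individual scan nodes.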
[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11103 ) Change subject: IMPALA-7096: restore scanner thread memory heuristics .. Patch Set 12: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/11103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b Gerrit-Change-Number: 11103 Gerrit-PatchSet: 12 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 16 Aug 2018 21:24:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11103 ) Change subject: IMPALA-7096: restore scanner thread memory heuristics .. Patch Set 11: Hit a timeout fetching from mvn repo: 23:09:57 [ERROR] Failed to execute goal on project impala-frontend: Could not resolve dependencies for project org.apache.impala:impala-frontend:jar:0.1-SNAPSHOT: Could not transfer artifact org.apache.sentry:sentry-core-model-db:jar:2.0.0-cdh6.x-20180808.083811-517354 from/to impala.cdh.repo (https://native-toolchain.s3.amazonaws.com/build/cdh_components/517354/maven): Connect to native-toolchain.s3.amazonaws.com:443 [native-toolchain.s3.amazonaws.com/52.219.28.30] failed: Connection timed out (Connection timed out) -> [Help 1] -- To view, visit http://gerrit.cloudera.org:8080/11103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b Gerrit-Change-Number: 11103 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 16 Aug 2018 18:07:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11103 ) Change subject: IMPALA-7096: restore scanner thread memory heuristics .. Patch Set 12: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3025/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/11103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b Gerrit-Change-Number: 11103 Gerrit-PatchSet: 12 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 16 Aug 2018 18:08:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11103 ) Change subject: IMPALA-7096: restore scanner thread memory heuristics .. Patch Set 11: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/3016/ -- To view, visit http://gerrit.cloudera.org:8080/11103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b Gerrit-Change-Number: 11103 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 16 Aug 2018 00:19:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11103 ) Change subject: IMPALA-7096: restore scanner thread memory heuristics .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/359/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b Gerrit-Change-Number: 11103 Gerrit-PatchSet: 10 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 15 Aug 2018 23:12:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11103 ) Change subject: IMPALA-7096: restore scanner thread memory heuristics .. Patch Set 11: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3016/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/11103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b Gerrit-Change-Number: 11103 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 15 Aug 2018 22:40:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11103 ) Change subject: IMPALA-7096: restore scanner thread memory heuristics .. Patch Set 11: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/11103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b Gerrit-Change-Number: 11103 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 15 Aug 2018 22:40:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics
Hello Bikramjeet Vig, Impala Public Jenkins, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11103 to look at the new patch set (#10). Change subject: IMPALA-7096: restore scanner thread memory heuristics ..
Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b --- M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scan-node.h M be/src/exec/kudu-scan-node-base.h M be/src/exec/kudu-scan-node.cc M be/src/exec/kudu-scan-node.h M be/src/exec/scan-node.cc M be/src/exec/scan-node.h M be/src/runtime/CMakeLists.txt M be/src/runtime/query-state.cc M be/src/runtime/query-state.h A be/src/runtime/scanner-mem-limiter.cc A be/src/runtime/scanner-mem-limiter.h A testdata/workloads/functional-query/queries/QueryTest/hdfs-scanner-thread-mem-scaling.test A testdata/workloads/functional-query/queries/QueryTest/kudu-scan-mem-usage.test M tests/query_test/test_mem_usage_scaling.py 15 files changed, 444 insertions(+), 49 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/11103/10 -- To view, visit http://gerrit.cloudera.org:8080/11103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b Gerrit-Change-Number: 11103 Gerrit-PatchSet: 10 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11103 ) Change subject: IMPALA-7096: restore scanner thread memory heuristics .. Patch Set 9: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/3014/ -- To view, visit http://gerrit.cloudera.org:8080/11103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b Gerrit-Change-Number: 11103 Gerrit-PatchSet: 9 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 15 Aug 2018 22:02:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11103 ) Change subject: IMPALA-7096: restore scanner thread memory heuristics .. Patch Set 8: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/357/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/11103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b Gerrit-Change-Number: 11103 Gerrit-PatchSet: 8 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 15 Aug 2018 21:49:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/11103 )

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..

Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11103/8/be/src/runtime/scanner-mem-limiter.cc
File be/src/runtime/scanner-mem-limiter.cc:

http://gerrit.cloudera.org:8080/#/c/11103/8/be/src/runtime/scanner-mem-limiter.cc@32
PS8, Line 32: ScanNode* const node;
it might be clearer to remove that now (so there's no question as to whether this equals the map key or not).

--
To view, visit http://gerrit.cloudera.org:8080/11103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Bikramjeet Vig
Gerrit-Reviewer: Dan Hecht
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong
Gerrit-Comment-Date: Wed, 15 Aug 2018 21:29:24 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11103 ) Change subject: IMPALA-7096: restore scanner thread memory heuristics .. Patch Set 9: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3014/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/11103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b Gerrit-Change-Number: 11103 Gerrit-PatchSet: 9 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 15 Aug 2018 21:21:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11103 )

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..

Patch Set 9: Code-Review+2

(6 comments)

Carry +2

http://gerrit.cloudera.org:8080/#/c/11103/7/be/src/runtime/scanner-mem-limiter.h
File be/src/runtime/scanner-mem-limiter.h:

http://gerrit.cloudera.org:8080/#/c/11103/7/be/src/runtime/scanner-mem-limiter.h@32
PS7, Line 32: limiting the aggregate memory consumpt
> is it to limit the number of scanner threads, or the aggregate memory consu
Done

http://gerrit.cloudera.org:8080/#/c/11103/7/be/src/runtime/scanner-mem-limiter.h@42
PS7, Line 42: as this objec
> garbled
Done

http://gerrit.cloudera.org:8080/#/c/11103/7/be/src/runtime/scanner-mem-limiter.h@43
PS7, Line 43: (i.e. as long as the below methods
> is that requirement because the instance of this class happens to also be a
Done

http://gerrit.cloudera.org:8080/#/c/11103/7/be/src/runtime/scanner-mem-limiter.h@71
PS7, Line 71: /// ClaimMemoryForScannerThread() will not be called.
> do we need that? why not just iterate over the map? do we want ordering but
I started off with the vector then kept it as an optimisation to allow more efficient iteration in ClaimMemoryForScannerThread. Now that I look at it again I doubt there's a significant enough difference between this and unordered_map to justify the complexity.

http://gerrit.cloudera.org:8080/#/c/11103/7/be/src/runtime/scanner-mem-limiter.cc
File be/src/runtime/scanner-mem-limiter.cc:

http://gerrit.cloudera.org:8080/#/c/11103/7/be/src/runtime/scanner-mem-limiter.cc@75
PS7, Line 75: e a crude heuristic of guessing that the scan
            : // will
> what code? before this change or before the change that removed the origina
Added a more concrete description of which commits added/removed the code.

http://gerrit.cloudera.org:8080/#/c/11103/7/be/src/runtime/scanner-mem-limiter.cc@81
PS7, Line 81: addtl_consumption += static_cast((consumption * 1.5) / num_threads);
            : }
> I don't really understand that. Why would adding this thread increase consu
Yeah exactly. Tried to improve the comment.

--
To view, visit http://gerrit.cloudera.org:8080/11103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 9
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Bikramjeet Vig
Gerrit-Reviewer: Dan Hecht
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong
Gerrit-Comment-Date: Wed, 15 Aug 2018 21:20:58 +
Gerrit-HasComments: Yes
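The arithmetic in the quoted line at scanner-mem-limiter.cc@81 can be worked through in isolation. The sketch below is one reading of that single line and nothing more: the function name EstimateAddtlConsumption is hypothetical, the int64_t result type is an assumption (the cast's type argument is missing from the quote), and the surrounding logic of the real function is not shown in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper mirroring the quoted expression: scale the scan's
// current consumption by 1.5 and divide by its number of scanner threads,
// i.e. 1.5x the current average per-thread consumption.
// The int64_t return type is an assumption.
int64_t EstimateAddtlConsumption(int64_t consumption, int num_threads) {
  assert(num_threads > 0);  // the division is only meaningful with running threads
  return static_cast<int64_t>((consumption * 1.5) / num_threads);
}
```

For example, a scan consuming 400 bytes across 4 threads averages 100 bytes per thread, so the expression yields 150 bytes.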
[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics
Hello Bikramjeet Vig, Impala Public Jenkins, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11103 to look at the new patch set (#8). Change subject: IMPALA-7096: restore scanner thread memory heuristics ..
Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b --- M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scan-node.h M be/src/exec/kudu-scan-node-base.h M be/src/exec/kudu-scan-node.cc M be/src/exec/kudu-scan-node.h M be/src/exec/scan-node.cc M be/src/exec/scan-node.h M be/src/runtime/CMakeLists.txt M be/src/runtime/query-state.cc M be/src/runtime/query-state.h A be/src/runtime/scanner-mem-limiter.cc A be/src/runtime/scanner-mem-limiter.h A testdata/workloads/functional-query/queries/QueryTest/hdfs-scanner-thread-mem-scaling.test A testdata/workloads/functional-query/queries/QueryTest/kudu-scan-mem-usage.test M tests/query_test/test_mem_usage_scaling.py 15 files changed, 446 insertions(+), 49 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/11103/8 -- To view, visit http://gerrit.cloudera.org:8080/11103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b Gerrit-Change-Number: 11103 Gerrit-PatchSet: 8 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/11103 )

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..

Patch Set 7: Code-Review+2

(6 comments)

http://gerrit.cloudera.org:8080/#/c/11103/7/be/src/runtime/scanner-mem-limiter.h
File be/src/runtime/scanner-mem-limiter.h:

http://gerrit.cloudera.org:8080/#/c/11103/7/be/src/runtime/scanner-mem-limiter.h@32
PS7, Line 32: limiting the number of scanner threads
is it to limit the number of scanner threads, or the aggregate memory consumption by scanner threads?

http://gerrit.cloudera.org:8080/#/c/11103/7/be/src/runtime/scanner-mem-limiter.h@42
PS7, Line 42: as long until
garbled

http://gerrit.cloudera.org:8080/#/c/11103/7/be/src/runtime/scanner-mem-limiter.h@43
PS7, Line 43: tears down all control structures.
is that requirement because the instance of this class happens to also be a QueryState control structure? If so, maybe clearer to just say node must outlive the lifetime of this object? (i.e. to indicate that this object is gonna keep a reference to it). Since it's not really dictated by this abstraction who owns the instance of '*this'.

http://gerrit.cloudera.org:8080/#/c/11103/7/be/src/runtime/scanner-mem-limiter.h@71
PS7, Line 71: std::vector> registered_scans_;
do we need that? why not just iterate over the map? do we want ordering but don't want to use map<>?

http://gerrit.cloudera.org:8080/#/c/11103/7/be/src/runtime/scanner-mem-limiter.cc
File be/src/runtime/scanner-mem-limiter.cc:

http://gerrit.cloudera.org:8080/#/c/11103/7/be/src/runtime/scanner-mem-limiter.cc@75
PS7, Line 75: This is carried over from old versions of the
            : // code.
what code? before this change or before the change that removed the original heuristic? only because this is so arbitrary, it might help to be more specific in this reference.

http://gerrit.cloudera.org:8080/#/c/11103/7/be/src/runtime/scanner-mem-limiter.cc@81
PS7, Line 81: // Add the expected increase in consumption for existing threads.
            : addtl_consumption += static_cast(consumption * 0.5);
I don't really understand that. Why would adding this thread increase consumption of the other threads? oh, maybe this is just saying that we're guessing in the future the threads may happen to grow by 50% over their current usage?

--
To view, visit http://gerrit.cloudera.org:8080/11103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Bikramjeet Vig
Gerrit-Reviewer: Dan Hecht
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong
Gerrit-Comment-Date: Wed, 15 Aug 2018 18:03:26 +
Gerrit-HasComments: Yes
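The reading Dan arrives at in the last comment (existing threads are guessed to grow by 50% over their current usage) can be written out as a small worked example. Everything here is a sketch under assumptions: the function names, the int64_t types, and the comparison against a limit are hypothetical and are not taken from the patch; only the 0.5 factor comes from the quoted PS7 line.

```cpp
#include <cassert>
#include <cstdint>

// Guess that the existing scanner threads will grow by 50% over their
// current consumption (the 0.5 factor is from the quoted PS7 line).
int64_t ExpectedGrowthOfExistingThreads(int64_t consumption) {
  assert(consumption >= 0);
  return static_cast<int64_t>(consumption * 0.5);
}

// Hypothetical use of that guess: a new thread fits only if current usage,
// the guessed growth, and the new thread's own estimate stay under 'limit'.
bool FitsUnderLimit(int64_t consumption, int64_t est_new_thread_mem, int64_t limit) {
  int64_t addtl_consumption = est_new_thread_mem;
  addtl_consumption += ExpectedGrowthOfExistingThreads(consumption);
  return consumption + addtl_consumption <= limit;
}
```

With 100 bytes in use and a 200-byte limit, a thread estimated at 40 bytes fits (100 + 50 + 40 = 190), while one estimated at 60 bytes does not (100 + 50 + 60 = 210).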
[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11103 ) Change subject: IMPALA-7096: restore scanner thread memory heuristics .. Patch Set 7: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/11103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b Gerrit-Change-Number: 11103 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 09 Aug 2018 01:32:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11103 ) Change subject: IMPALA-7096: restore scanner thread memory heuristics .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/256/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b Gerrit-Change-Number: 11103 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 08 Aug 2018 22:40:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11103 ) Change subject: IMPALA-7096: restore scanner thread memory heuristics .. Patch Set 7: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2961/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/11103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b Gerrit-Change-Number: 11103 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 08 Aug 2018 22:17:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11103 ) Change subject: IMPALA-7096: restore scanner thread memory heuristics .. Patch Set 7: Code-Review+1 carry -- To view, visit http://gerrit.cloudera.org:8080/11103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b Gerrit-Change-Number: 11103 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 08 Aug 2018 22:17:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics
Hello Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11103 to look at the new patch set (#7). Change subject: IMPALA-7096: restore scanner thread memory heuristics ..
Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b --- M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scan-node.h M be/src/exec/kudu-scan-node-base.h M be/src/exec/kudu-scan-node.cc M be/src/exec/kudu-scan-node.h M be/src/exec/scan-node.cc M be/src/exec/scan-node.h M be/src/runtime/CMakeLists.txt M be/src/runtime/query-state.cc M be/src/runtime/query-state.h A be/src/runtime/scanner-mem-limiter.cc A be/src/runtime/scanner-mem-limiter.h A testdata/workloads/functional-query/queries/QueryTest/hdfs-scanner-thread-mem-scaling.test A testdata/workloads/functional-query/queries/QueryTest/kudu-scan-mem-usage.test M tests/query_test/test_mem_usage_scaling.py 15 files changed, 446 insertions(+), 49 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/11103/7 -- To view, visit http://gerrit.cloudera.org:8080/11103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b Gerrit-Change-Number: 11103 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11103 )

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..

Patch Set 6:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/exec/hdfs-scan-node.cc@55
PS6, Line 55: he
> nit: The
Done

http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/exec/kudu-scan-node.h
File be/src/exec/kudu-scan-node.h:

http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/exec/kudu-scan-node.h@77
PS6, Line 77: GetEstimatedMemPerThread
> nit: maybe have the same name here as in hdfs-scan node
Done

http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/exec/kudu-scan-node.cc
File be/src/exec/kudu-scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/exec/kudu-scan-node.cc@159
PS6, Line 159: / Cases 5, 6 and 7.
> nit: copy-paste error
Not sure how I didn't see that when reading through the patch.

http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/runtime/scanner-mem-limiter.h
File be/src/runtime/scanner-mem-limiter.h:

http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/runtime/scanner-mem-limiter.h@30
PS6, Line 30: /// Class to keep track of the global state of scanner threads and how much memory
> nit: I know it's implied, but maybe just mention explicitly that it is used
Good point - "global" was a bad choice

http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/runtime/scanner-mem-limiter.h@42
PS6, Line 42: Each 'node' can only be registered once.
> maybe add a dcheck for that
Done

--
To view, visit http://gerrit.cloudera.org:8080/11103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Bikramjeet Vig
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong
Gerrit-Comment-Date: Wed, 08 Aug 2018 22:17:15 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/11103 ) Change subject: IMPALA-7096: restore scanner thread memory heuristics .. Patch Set 6: Code-Review+1 (5 comments) Looks good, just a few nits. http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/exec/hdfs-scan-node.cc File be/src/exec/hdfs-scan-node.cc: http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/exec/hdfs-scan-node.cc@55 PS6, Line 55: he nit: The http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/exec/kudu-scan-node.h File be/src/exec/kudu-scan-node.h: http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/exec/kudu-scan-node.h@77 PS6, Line 77: GetEstimatedMemPerThread nit: maybe have the same name here as in hdfs-scan-node http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/exec/kudu-scan-node.cc File be/src/exec/kudu-scan-node.cc: http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/exec/kudu-scan-node.cc@159 PS6, Line 159: / Cases 5, 6 and 7. nit: copy-paste error http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/runtime/scanner-mem-limiter.h File be/src/runtime/scanner-mem-limiter.h: http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/runtime/scanner-mem-limiter.h@30 PS6, Line 30: /// Class to keep track of the global state of scanner threads and how much memory nit: I know it's implied, but maybe just mention explicitly that it is used to keep track of the scanner threads at a per-query, per-host level. The first time I read it, without looking at the whole patch, it seemed like it kept track of all scanner threads running on the host. http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/runtime/scanner-mem-limiter.h@42 PS6, Line 42: Each 'node' can only be registered once.
maybe add a dcheck for that -- To view, visit http://gerrit.cloudera.org:8080/11103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b Gerrit-Change-Number: 11103 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Aug 2018 21:23:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11103 ) Change subject: IMPALA-7096: restore scanner thread memory heuristics .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/204/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b Gerrit-Change-Number: 11103 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 06 Aug 2018 20:05:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11103 ) Change subject: IMPALA-7096: restore scanner thread memory heuristics .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/11103/4/be/src/exec/hdfs-scan-node.cc File be/src/exec/hdfs-scan-node.cc: http://gerrit.cloudera.org:8080/#/c/11103/4/be/src/exec/hdfs-scan-node.cc@59 PS4, Line 59: const int SCANNER_THREAD_MEM_USAGE = 32 * 1024 * 1024; > Make configurable? Done http://gerrit.cloudera.org:8080/#/c/11103/4/be/src/runtime/scanner-mem-limiter.h File be/src/runtime/scanner-mem-limiter.h: http://gerrit.cloudera.org:8080/#/c/11103/4/be/src/runtime/scanner-mem-limiter.h@26 PS4, Line 26: #include "common/atomic.h" > Not needed Done -- To view, visit http://gerrit.cloudera.org:8080/11103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b Gerrit-Change-Number: 11103 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 06 Aug 2018 19:09:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics
Hello Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11103 to look at the new patch set (#6). Change subject: IMPALA-7096: restore scanner thread memory heuristics .. IMPALA-7096: restore scanner thread memory heuristics This restores some of the heuristics removed in IMPALA-4835 that can help scans from hitting OOM conditions. The heuristics are implemented at the query level rather than in each scan node in isolation. Introduce a ScannerMemLimiter class that belongs to the QueryState that tracks the amount of memory estimated to be consumed for all scanner threads running for the query on the current backend. Also check soft memory limits to see if scanner threads should be started or the current scanner thread should stop. The long-term plan is to switch to the MT scan node implementations. When that happens this code can be removed. In the meantime this code is imperfect but will help avoid OOM in many scenarios. Testing: Added regression tests for HDFS and Kudu where we previously could run out of memory with a low mem_limit. Manual testing: * Ran query tests with --thread_creation_fault_injection=true for a bit, confirmed no crashes. * ran single-node stress test for Kudu and Parquet for 10-20 min each. 
Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b --- M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scan-node.h M be/src/exec/kudu-scan-node-base.h M be/src/exec/kudu-scan-node.cc M be/src/exec/kudu-scan-node.h M be/src/exec/scan-node.cc M be/src/exec/scan-node.h M be/src/runtime/CMakeLists.txt M be/src/runtime/query-state.cc M be/src/runtime/query-state.h A be/src/runtime/scanner-mem-limiter.cc A be/src/runtime/scanner-mem-limiter.h A testdata/workloads/functional-query/queries/QueryTest/hdfs-scanner-thread-mem-scaling.test A testdata/workloads/functional-query/queries/QueryTest/kudu-scan-mem-usage.test M tests/query_test/test_mem_usage_scaling.py 15 files changed, 441 insertions(+), 49 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/11103/6 -- To view, visit http://gerrit.cloudera.org:8080/11103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b Gerrit-Change-Number: 11103 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
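The query-level accounting described in the commit message can be illustrated with a small sketch. This is not the ScannerMemLimiter from the patch: the class name ScannerMemLimiterSketch, the methods TryStartThread and ThreadFinished, and the policy of always admitting the first thread are assumptions made for illustration. It only covers the "should another scanner thread be started" decision, not the check for stopping a running thread.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical per-query, per-backend limiter; names and policy are assumptions.
class ScannerMemLimiterSketch {
 public:
  explicit ScannerMemLimiterSketch(int64_t soft_mem_limit_bytes)
    : soft_limit_(soft_mem_limit_bytes) {}

  // Returns true and records the estimate if one more scanner thread fits under
  // the soft limit. The first thread is always admitted so that the scan can
  // make progress (an assumption of this sketch).
  bool TryStartThread(int64_t estimated_per_thread_mem) {
    std::lock_guard<std::mutex> l(lock_);
    if (num_threads_ > 0 &&
        estimated_total_ + estimated_per_thread_mem > soft_limit_) {
      return false;
    }
    estimated_total_ += estimated_per_thread_mem;
    ++num_threads_;
    return true;
  }

  // Gives back the estimate when a scanner thread exits.
  void ThreadFinished(int64_t estimated_per_thread_mem) {
    std::lock_guard<std::mutex> l(lock_);
    assert(num_threads_ > 0);
    estimated_total_ -= estimated_per_thread_mem;
    --num_threads_;
  }

  // Sum of the estimates of all scanner threads currently accounted for.
  int64_t estimated_total() const {
    std::lock_guard<std::mutex> l(lock_);
    return estimated_total_;
  }

 private:
  mutable std::mutex lock_;
  const int64_t soft_limit_;
  int64_t estimated_total_ = 0;
  int num_threads_ = 0;
};
```

With a 100 MB soft limit and the 32 MB per-thread figure mentioned in the review of patch set 4, this sketch admits three threads (96 MB estimated) and turns away a fourth until one of the running threads finishes.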
[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11103 ) Change subject: IMPALA-7096: restore scanner thread memory heuristics .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/176/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b Gerrit-Change-Number: 11103 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 03 Aug 2018 16:17:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics
Hello Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11103 to look at the new patch set (#5). Change subject: IMPALA-7096: restore scanner thread memory heuristics .. IMPALA-7096: restore scanner thread memory heuristics This restores some of the heuristics removed in IMPALA-4835 that can help scans from hitting OOM conditions. The heuristics are implemented at the query level rather than in each scan node in isolation. Introduce a ScannerMemLimiter class that belongs to the QueryState that tracks the amount of memory estimated to be consumed for all scanner threads running for the query on the current backend. Also check soft memory limits to see if scanner threads should be started or the current scanner thread should stop. The long-term plan is to switch to the MT scan node implementations. When that happens this code can be removed. In the meantime this code is imperfect but will help avoid OOM in many scenarios. Testing: Added regression tests for HDFS and Kudu where we previously could run out of memory with a low mem_limit. 
Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b --- M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scan-node.h M be/src/exec/kudu-scan-node-base.h M be/src/exec/kudu-scan-node.cc M be/src/exec/kudu-scan-node.h M be/src/exec/scan-node.cc M be/src/exec/scan-node.h M be/src/runtime/CMakeLists.txt M be/src/runtime/query-state.cc M be/src/runtime/query-state.h A be/src/runtime/scanner-mem-limiter.cc A be/src/runtime/scanner-mem-limiter.h A testdata/workloads/functional-query/queries/QueryTest/hdfs-scanner-thread-mem-scaling.test A testdata/workloads/functional-query/queries/QueryTest/kudu-scan-mem-usage.test M tests/query_test/test_mem_usage_scaling.py 15 files changed, 439 insertions(+), 49 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/11103/5 -- To view, visit http://gerrit.cloudera.org:8080/11103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b Gerrit-Change-Number: 11103 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11103 ) Change subject: IMPALA-7096: restore scanner thread memory heuristics .. Patch Set 4: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/172/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/11103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b Gerrit-Change-Number: 11103 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 03 Aug 2018 02:10:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics
Tim Armstrong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11103 ) Change subject: IMPALA-7096: restore scanner thread memory heuristics .. IMPALA-7096: restore scanner thread memory heuristics This restores some of the heuristics removed in IMPALA-4835 that can help scans from hitting OOM conditions. The heuristics are implemented at the query level rather than in each scan node in isolation. Introduce a ScannerMemLimiter class that belongs to the QueryState that tracks the amount of memory estimated to be consumed for all scanner threads running for the query on the current backend. Also check soft memory limits to see if scanner threads should be started or the current scanner thread should stop. The long-term plan is to switch to the MT scan node implementations. When that happens this code can be removed. In the meantime this code is imperfect but will help avoid OOM in many scenarios. Testing: Added regression tests for HDFS and Kudu where we previously could run out of memory with a low mem_limit.
Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b --- M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scan-node.h M be/src/exec/kudu-scan-node-base.h M be/src/exec/kudu-scan-node.cc M be/src/exec/kudu-scan-node.h M be/src/exec/scan-node.cc M be/src/exec/scan-node.h M be/src/runtime/CMakeLists.txt M be/src/runtime/query-state.cc M be/src/runtime/query-state.h A be/src/runtime/scanner-mem-limiter.cc A be/src/runtime/scanner-mem-limiter.h A testdata/workloads/functional-query/queries/QueryTest/hdfs-scanner-thread-mem-scaling.test A testdata/workloads/functional-query/queries/QueryTest/kudu-scan-mem-usage.test M tests/query_test/test_mem_usage_scaling.py 15 files changed, 444 insertions(+), 49 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/11103/4 -- To view, visit http://gerrit.cloudera.org:8080/11103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b Gerrit-Change-Number: 11103 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig