Riza Suminto has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/18126 )
Change subject: IMPALA-11068: Add query option to reduce scanner thread launch.
......................................................................

IMPALA-11068: Add query option to reduce scanner thread launch.

Under heavy decompression workload, Impala running with scanner thread
parallelism (MT_DOP=0) can still hit an OOM error due to launching too
many threads too soon. We have logic in ScannerMemLimiter to limit the
number of scanner threads by calculating each thread's memory
requirement and estimating the memory growth rate of all threads.
However, it does not prevent a scanner node from quickly launching many
threads and immediately reaching the memtracker's spare capacity. Even
after ScannerMemLimiter rejects a new thread launch, some existing
threads might continue increasing their non-reserved memory for
decompression work until the memory limit is exceeded.

IMPALA-7096 added the hdfs_scanner_thread_max_estimated_bytes flag as a
heuristic to account for non-reserved memory growth. Increasing this
flag value can help reduce the thread count, but might severely regress
other queries that do not have heavy decompression characteristics. The
same applies to lowering the NUM_SCANNER_THREADS query option.

This patch adds one more query option as an alternative to mitigate
OOM, called HDFS_SCANNER_NON_RESERVED_BYTES. This option is intended to
offer the same control as hdfs_scanner_thread_max_estimated_bytes, but
as a query option, so that tuning can be done at per-query granularity.
If this query option is not set, or is set to 0 or a negative value,
the backend falls back to the value of the
hdfs_scanner_thread_max_estimated_bytes flag.

Testing:
- Add test case in query-options-test.cc and
  TestScanMemLimit::test_hdfs_scanner_thread_mem_scaling.
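
The fallback rule described above can be sketched as follows. This is a
minimal illustration, not the code from the patch: the function name
EffectiveNonReservedBytes and its parameters are hypothetical, and the
flag value is passed in as a plain integer rather than read from the
real startup flag.

```cpp
#include <cstdint>

// Hypothetical helper (not taken from the patch): picks the per-thread
// non-reserved memory estimate used when deciding on scanner thread launches.
//
// query_option_bytes: value of HDFS_SCANNER_NON_RESERVED_BYTES for the query;
//                     unset is modeled here as 0 (an assumption).
// flag_bytes:         value of the hdfs_scanner_thread_max_estimated_bytes
//                     flag, supplied by the caller.
constexpr int64_t EffectiveNonReservedBytes(int64_t query_option_bytes,
                                            int64_t flag_bytes) {
  // A positive query option wins; 0 or a negative value falls back to the flag.
  return query_option_bytes > 0 ? query_option_bytes : flag_bytes;
}

// Compile-time checks of the three cases named in the commit message.
static_assert(EffectiveNonReservedBytes(64, 32) == 64, "positive option is used");
static_assert(EffectiveNonReservedBytes(0, 32) == 32, "zero falls back to flag");
static_assert(EffectiveNonReservedBytes(-1, 32) == 32, "negative falls back to flag");
```

With a rule of this shape, a single decompression-heavy query could
raise the estimate through its own query option while every other query
keeps using the cluster-wide flag value.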

Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466
Reviewed-on: http://gerrit.cloudera.org:8080/18126
Reviewed-by: Csaba Ringhofer <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
---
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M testdata/workloads/functional-query/queries/QueryTest/hdfs-scanner-thread-mem-scaling.test
8 files changed, 89 insertions(+), 6 deletions(-)

Approvals:
  Csaba Ringhofer: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/18126
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466
Gerrit-Change-Number: 18126
Gerrit-PatchSet: 8
Gerrit-Owner: Riza Suminto <[email protected]>
Gerrit-Reviewer: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Kurt Deschler <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Wenzhe Zhou <[email protected]>
