[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics

2018-08-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/11103 )

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..

IMPALA-7096: restore scanner thread memory heuristics

This restores some of the heuristics removed in IMPALA-4835 that can
help prevent scans from hitting OOM conditions. The heuristics are implemented
at the query level rather than in each scan node in isolation.

Introduce a ScannerMemLimiter class that belongs to the QueryState that
tracks the amount of memory estimated to be consumed for all scanner
threads running for the query on the current backend.

Also check soft memory limits to see if scanner threads should be
started or the current scanner thread should stop.
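
The query-level accounting described above can be sketched roughly as follows. This is only an illustration, not the code from the patch: the class name ScannerMemLimiterSketch, the methods ClaimMemory and ReleaseMemory, and the fixed soft_limit_bytes value are hypothetical stand-ins, and the real ScannerMemLimiter presumably works with Impala's own memory tracking rather than a plain counter.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical sketch of query-level accounting for scanner thread memory.
// One instance would be shared by all scan nodes of a query on a backend.
class ScannerMemLimiterSketch {
 public:
  explicit ScannerMemLimiterSketch(int64_t soft_limit_bytes)
    : soft_limit_bytes_(soft_limit_bytes) {}

  // Try to reserve 'estimated_bytes' for a new scanner thread. Returns false
  // if the claim would push the estimated total over the soft limit, in
  // which case the caller should not start the thread.
  bool ClaimMemory(int64_t estimated_bytes) {
    std::lock_guard<std::mutex> l(lock_);
    if (claimed_bytes_ + estimated_bytes > soft_limit_bytes_) return false;
    claimed_bytes_ += estimated_bytes;
    return true;
  }

  // Give back a previous claim when a scanner thread exits.
  void ReleaseMemory(int64_t estimated_bytes) {
    std::lock_guard<std::mutex> l(lock_);
    assert(claimed_bytes_ >= estimated_bytes);
    claimed_bytes_ -= estimated_bytes;
  }

  // Total bytes currently claimed by all scanner threads of the query.
  int64_t claimed_bytes() {
    std::lock_guard<std::mutex> l(lock_);
    return claimed_bytes_;
  }

 private:
  const int64_t soft_limit_bytes_;
  std::mutex lock_;  // Scan nodes may call in from different threads.
  int64_t claimed_bytes_ = 0;
};
```

Keeping a single counter per query, rather than one per scan node, is what lets a claim from one scan node take the other scan nodes' threads into account.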

The long-term plan is to switch to the MT scan node implementations.
When that happens this code can be removed. In the meantime this
code is imperfect but will help avoid OOM in many scenarios.

Testing:
Added regression tests for HDFS and Kudu where we previously could
run out of memory with a low mem_limit.

Manual testing:
* Ran query tests with --thread_creation_fault_injection=true for a
  bit, confirmed no crashes.
* Ran single-node stress test for Kudu and Parquet for 10-20 min each.

Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Reviewed-on: http://gerrit.cloudera.org:8080/11103
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/kudu-scan-node-base.h
M be/src/exec/kudu-scan-node.cc
M be/src/exec/kudu-scan-node.h
M be/src/exec/scan-node.cc
M be/src/exec/scan-node.h
M be/src/runtime/CMakeLists.txt
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
A be/src/runtime/scanner-mem-limiter.cc
A be/src/runtime/scanner-mem-limiter.h
A testdata/workloads/functional-query/queries/QueryTest/hdfs-scanner-thread-mem-scaling.test
A testdata/workloads/functional-query/queries/QueryTest/kudu-scan-mem-usage.test
M tests/query_test/test_mem_usage_scaling.py
15 files changed, 444 insertions(+), 49 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/11103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 13
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics

2018-08-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11103 )

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..


Patch Set 12: Verified+1


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 12
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 16 Aug 2018 21:24:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics

2018-08-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11103 )

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..


Patch Set 11:

Hit a timeout fetching from mvn repo:

23:09:57 [ERROR] Failed to execute goal on project impala-frontend: Could not 
resolve dependencies for project 
org.apache.impala:impala-frontend:jar:0.1-SNAPSHOT: Could not transfer artifact 
org.apache.sentry:sentry-core-model-db:jar:2.0.0-cdh6.x-20180808.083811-517354 
from/to impala.cdh.repo 
(https://native-toolchain.s3.amazonaws.com/build/cdh_components/517354/maven): 
Connect to native-toolchain.s3.amazonaws.com:443 
[native-toolchain.s3.amazonaws.com/52.219.28.30] failed: Connection timed out 
(Connection timed out) -> [Help 1]


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 16 Aug 2018 18:07:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics

2018-08-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11103 )

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..


Patch Set 12:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3025/ 
DRY_RUN=false


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 12
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 16 Aug 2018 18:08:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics

2018-08-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11103 )

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..


Patch Set 11: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/3016/


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 16 Aug 2018 00:19:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics

2018-08-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11103 )

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..


Patch Set 10:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/359/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 10
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 15 Aug 2018 23:12:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics

2018-08-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11103 )

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..


Patch Set 11:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3016/ 
DRY_RUN=false


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 15 Aug 2018 22:40:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics

2018-08-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11103 )

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..


Patch Set 11: Code-Review+2


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 15 Aug 2018 22:40:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics

2018-08-15 Thread Tim Armstrong (Code Review)
Hello Bikramjeet Vig, Impala Public Jenkins, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11103

to look at the new patch set (#10).

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..

IMPALA-7096: restore scanner thread memory heuristics

This restores some of the heuristics removed in IMPALA-4835 that can
help prevent scans from hitting OOM conditions. The heuristics are implemented
at the query level rather than in each scan node in isolation.

Introduce a ScannerMemLimiter class that belongs to the QueryState that
tracks the amount of memory estimated to be consumed for all scanner
threads running for the query on the current backend.

Also check soft memory limits to see if scanner threads should be
started or the current scanner thread should stop.

The long-term plan is to switch to the MT scan node implementations.
When that happens this code can be removed. In the meantime this
code is imperfect but will help avoid OOM in many scenarios.

Testing:
Added regression tests for HDFS and Kudu where we previously could
run out of memory with a low mem_limit.

Manual testing:
* Ran query tests with --thread_creation_fault_injection=true for a
  bit, confirmed no crashes.
* Ran single-node stress test for Kudu and Parquet for 10-20 min each.

Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
---
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/kudu-scan-node-base.h
M be/src/exec/kudu-scan-node.cc
M be/src/exec/kudu-scan-node.h
M be/src/exec/scan-node.cc
M be/src/exec/scan-node.h
M be/src/runtime/CMakeLists.txt
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
A be/src/runtime/scanner-mem-limiter.cc
A be/src/runtime/scanner-mem-limiter.h
A testdata/workloads/functional-query/queries/QueryTest/hdfs-scanner-thread-mem-scaling.test
A testdata/workloads/functional-query/queries/QueryTest/kudu-scan-mem-usage.test
M tests/query_test/test_mem_usage_scaling.py
15 files changed, 444 insertions(+), 49 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/11103/10
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 10
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics

2018-08-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11103 )

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..


Patch Set 9: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/3014/


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 9
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 15 Aug 2018 22:02:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics

2018-08-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11103 )

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..


Patch Set 8:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/357/ : Initial code 
review checks failed. See linked job for details on the failure.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 15 Aug 2018 21:49:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics

2018-08-15 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11103 )

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11103/8/be/src/runtime/scanner-mem-limiter.cc
File be/src/runtime/scanner-mem-limiter.cc:

http://gerrit.cloudera.org:8080/#/c/11103/8/be/src/runtime/scanner-mem-limiter.cc@32
PS8, Line 32:   ScanNode* const node;
it might be clearer to remove that now (so there's no question as to whether 
this equals the map key or not).



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 15 Aug 2018 21:29:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics

2018-08-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11103 )

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..


Patch Set 9:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3014/ 
DRY_RUN=false


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 9
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 15 Aug 2018 21:21:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics

2018-08-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11103 )

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..


Patch Set 9: Code-Review+2

(6 comments)

Carry +2

http://gerrit.cloudera.org:8080/#/c/11103/7/be/src/runtime/scanner-mem-limiter.h
File be/src/runtime/scanner-mem-limiter.h:

http://gerrit.cloudera.org:8080/#/c/11103/7/be/src/runtime/scanner-mem-limiter.h@32
PS7, Line 32: limiting the aggregate memory consumpt
> is it to limit the number of scanner threads, or the aggregate memory consu
Done


http://gerrit.cloudera.org:8080/#/c/11103/7/be/src/runtime/scanner-mem-limiter.h@42
PS7, Line 42: as this objec
> garbled
Done


http://gerrit.cloudera.org:8080/#/c/11103/7/be/src/runtime/scanner-mem-limiter.h@43
PS7, Line 43: (i.e. as long as the below methods
> is that requirement because the instance of this class happens to also be a
Done


http://gerrit.cloudera.org:8080/#/c/11103/7/be/src/runtime/scanner-mem-limiter.h@71
PS7, Line 71:   /// ClaimMemoryForScannerThread() will not be called.
> do we need that? why not just iterate over the map? do we want ordering but
I started off with the vector then kept it as an optimisation to allow more 
efficient iteration in ClaimMemoryForScannerThread. Now that I look at it again 
I doubt there's a significant enough difference between this and unordered_map 
to justify the complexity.


http://gerrit.cloudera.org:8080/#/c/11103/7/be/src/runtime/scanner-mem-limiter.cc
File be/src/runtime/scanner-mem-limiter.cc:

http://gerrit.cloudera.org:8080/#/c/11103/7/be/src/runtime/scanner-mem-limiter.cc@75
PS7, Line 75: e a crude heuristic of guessing that the scan
:   // will
> what code? before this change or before the change that removed the origina
Added a more concrete description of which commits added/removed the code.


http://gerrit.cloudera.org:8080/#/c/11103/7/be/src/runtime/scanner-mem-limiter.cc@81
PS7, Line 81: addtl_consumption += static_cast((consumption * 1.5) / num_threads);
:   }
> I don't really understand that. Why would adding this thread increase consu
Yeah exactly. Tried to improve the comment.



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 9
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 15 Aug 2018 21:20:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics

2018-08-15 Thread Tim Armstrong (Code Review)
Hello Bikramjeet Vig, Impala Public Jenkins, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11103

to look at the new patch set (#8).

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..

IMPALA-7096: restore scanner thread memory heuristics

This restores some of the heuristics removed in IMPALA-4835 that can
help prevent scans from hitting OOM conditions. The heuristics are implemented
at the query level rather than in each scan node in isolation.

Introduce a ScannerMemLimiter class that belongs to the QueryState that
tracks the amount of memory estimated to be consumed for all scanner
threads running for the query on the current backend.

Also check soft memory limits to see if scanner threads should be
started or the current scanner thread should stop.

The long-term plan is to switch to the MT scan node implementations.
When that happens this code can be removed. In the meantime this
code is imperfect but will help avoid OOM in many scenarios.

Testing:
Added regression tests for HDFS and Kudu where we previously could
run out of memory with a low mem_limit.

Manual testing:
* Ran query tests with --thread_creation_fault_injection=true for a
  bit, confirmed no crashes.
* Ran single-node stress test for Kudu and Parquet for 10-20 min each.

Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
---
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/kudu-scan-node-base.h
M be/src/exec/kudu-scan-node.cc
M be/src/exec/kudu-scan-node.h
M be/src/exec/scan-node.cc
M be/src/exec/scan-node.h
M be/src/runtime/CMakeLists.txt
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
A be/src/runtime/scanner-mem-limiter.cc
A be/src/runtime/scanner-mem-limiter.h
A testdata/workloads/functional-query/queries/QueryTest/hdfs-scanner-thread-mem-scaling.test
A testdata/workloads/functional-query/queries/QueryTest/kudu-scan-mem-usage.test
M tests/query_test/test_mem_usage_scaling.py
15 files changed, 446 insertions(+), 49 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/11103/8
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics

2018-08-15 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11103 )

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..


Patch Set 7: Code-Review+2

(6 comments)

http://gerrit.cloudera.org:8080/#/c/11103/7/be/src/runtime/scanner-mem-limiter.h
File be/src/runtime/scanner-mem-limiter.h:

http://gerrit.cloudera.org:8080/#/c/11103/7/be/src/runtime/scanner-mem-limiter.h@32
PS7, Line 32: limiting the number of scanner threads
is it to limit the number of scanner threads, or the aggregate memory 
consumption by scanner threads?


http://gerrit.cloudera.org:8080/#/c/11103/7/be/src/runtime/scanner-mem-limiter.h@42
PS7, Line 42: as long until
garbled


http://gerrit.cloudera.org:8080/#/c/11103/7/be/src/runtime/scanner-mem-limiter.h@43
PS7, Line 43: tears down all control structures.
is that requirement because the instance of this class happens to also be a 
QueryState control structure? If so, maybe it's clearer to just say the node 
must outlive this object (i.e. to indicate that this object is going to keep a 
reference to it), since this abstraction doesn't really dictate who owns the 
instance of '*this'.


http://gerrit.cloudera.org:8080/#/c/11103/7/be/src/runtime/scanner-mem-limiter.h@71
PS7, Line 71:   std::vector> registered_scans_;
do we need that? why not just iterate over the map? do we want ordering but 
don't want to use map<>?


http://gerrit.cloudera.org:8080/#/c/11103/7/be/src/runtime/scanner-mem-limiter.cc
File be/src/runtime/scanner-mem-limiter.cc:

http://gerrit.cloudera.org:8080/#/c/11103/7/be/src/runtime/scanner-mem-limiter.cc@75
PS7, Line 75: This is carried over from old versions of the
:   // code.
what code? before this change or before the change that removed the original 
heuristic? only because this is so arbitrary, it might help to be more specific 
in this reference.


http://gerrit.cloudera.org:8080/#/c/11103/7/be/src/runtime/scanner-mem-limiter.cc@81
PS7, Line 81:   // Add the expected increase in consumption for existing threads.
:   addtl_consumption += static_cast(consumption * 0.5);
I don't really understand that. Why would adding this thread increase 
consumption of the other threads?
oh, maybe this is just saying that we're guessing in the future the threads may 
happen to grow by 50% over their current usage?
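
A rough sketch of the heuristic as read in the comment above, where the estimate for a new thread is padded by a guess that existing threads may grow by 50% over their current usage. The function names, parameter names, the int64_t cast type, and the comparison against spare capacity are assumptions made for illustration; only the 0.5 factor comes from the quoted line.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: estimate how much extra memory the query would need
// if one more scanner thread were started.
//   est_new_thread_bytes: estimated footprint of the new thread
//   current_consumption_bytes: memory currently consumed by existing threads
int64_t EstimateAdditionalConsumption(
    int64_t est_new_thread_bytes, int64_t current_consumption_bytes) {
  int64_t addtl_consumption = est_new_thread_bytes;
  // Guess that existing threads may still grow by 50% over current usage.
  addtl_consumption += static_cast<int64_t>(current_consumption_bytes * 0.5);
  return addtl_consumption;
}

// Start the thread only if the padded estimate fits in the spare capacity.
// (Assumed policy, not taken from the patch.)
bool ShouldStartThread(int64_t est_new_thread_bytes,
    int64_t current_consumption_bytes, int64_t spare_capacity_bytes) {
  return EstimateAdditionalConsumption(
      est_new_thread_bytes, current_consumption_bytes) <= spare_capacity_bytes;
}
```

With a 100-byte estimate for the new thread and 200 bytes of current consumption, the padded estimate is 100 + 100 = 200 bytes, so the thread would be started only if at least 200 bytes are spare.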



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 15 Aug 2018 18:03:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics

2018-08-08 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11103 )

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..


Patch Set 7: Verified+1


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 09 Aug 2018 01:32:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics

2018-08-08 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11103 )

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/256/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Aug 2018 22:40:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics

2018-08-08 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11103 )

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2961/ 
DRY_RUN=true


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Aug 2018 22:17:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics

2018-08-08 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11103 )

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..


Patch Set 7: Code-Review+1

carry


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Aug 2018 22:17:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics

2018-08-08 Thread Tim Armstrong (Code Review)
Hello Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11103

to look at the new patch set (#7).

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..

IMPALA-7096: restore scanner thread memory heuristics

This restores some of the heuristics removed in IMPALA-4835 that can
help prevent scans from hitting OOM conditions. The heuristics are implemented
at the query level rather than in each scan node in isolation.

Introduce a ScannerMemLimiter class that belongs to the QueryState that
tracks the amount of memory estimated to be consumed for all scanner
threads running for the query on the current backend.

Also check soft memory limits to see if scanner threads should be
started or the current scanner thread should stop.

The long-term plan is to switch to the MT scan node implementations.
When that happens this code can be removed. In the meantime this
code is imperfect but will help avoid OOM in many scenarios.

Testing:
Added regression tests for HDFS and Kudu where we previously could
run out of memory with a low mem_limit.

Manual testing:
* Ran query tests with --thread_creation_fault_injection=true for a
  bit, confirmed no crashes.
* Ran single-node stress test for Kudu and Parquet for 10-20 min each.

Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
---
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/kudu-scan-node-base.h
M be/src/exec/kudu-scan-node.cc
M be/src/exec/kudu-scan-node.h
M be/src/exec/scan-node.cc
M be/src/exec/scan-node.h
M be/src/runtime/CMakeLists.txt
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
A be/src/runtime/scanner-mem-limiter.cc
A be/src/runtime/scanner-mem-limiter.h
A testdata/workloads/functional-query/queries/QueryTest/hdfs-scanner-thread-mem-scaling.test
A testdata/workloads/functional-query/queries/QueryTest/kudu-scan-mem-usage.test
M tests/query_test/test_mem_usage_scaling.py
15 files changed, 446 insertions(+), 49 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/11103/7
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics

2018-08-08 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11103 )

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..


Patch Set 6:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/exec/hdfs-scan-node.cc@55
PS6, Line 55: he
> nit: The
Done


http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/exec/kudu-scan-node.h
File be/src/exec/kudu-scan-node.h:

http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/exec/kudu-scan-node.h@77
PS6, Line 77: GetEstimatedMemPerThread
> nit: maybe have the same name here as in hdfs-scan node
Done


http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/exec/kudu-scan-node.cc
File be/src/exec/kudu-scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/exec/kudu-scan-node.cc@159
PS6, Line 159: / Cases 5, 6 and 7.
> nit: copy-paste error
Not sure how I didn't see that when reading through the patch.


http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/runtime/scanner-mem-limiter.h
File be/src/runtime/scanner-mem-limiter.h:

http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/runtime/scanner-mem-limiter.h@30
PS6, Line 30: /// Class to keep track of the global state of scanner threads and how much memory
> nit: I know its implied, but maybe just mention explicitly that it is used
Good point - "global" was a bad choice


http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/runtime/scanner-mem-limiter.h@42
PS6, Line 42: Each 'node' can only be registered once.
> maybe add a dcheck for that
Done
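
For illustration only, a "registered once" check of the kind discussed
above could look roughly like the sketch below. All names here are
hypothetical and not taken from the patch, and the standard assert()
stands in for a debug-only check such as DCHECK.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <unordered_set>

// Hypothetical placeholder for a scan node; not the real Impala class.
struct ScanNodeSketch {};

// Hypothetical registry that remembers which scan nodes were registered.
class ScanRegistrySketch {
 public:
  // Each 'node' may only be registered once; a second registration trips
  // the assertion in debug builds.
  void RegisterScan(const ScanNodeSketch* node) {
    std::lock_guard<std::mutex> l(lock_);
    bool inserted = registered_.insert(node).second;
    assert(inserted && "scan node registered twice");
    (void)inserted;  // Avoid an unused-variable warning when NDEBUG is set.
  }

  std::size_t num_registered() {
    std::lock_guard<std::mutex> l(lock_);
    return registered_.size();
  }

 private:
  std::mutex lock_;
  std::unordered_set<const ScanNodeSketch*> registered_;
};
```

Keying the set on the node pointer keeps the check cheap and avoids
requiring any identifier on the node itself.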



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Aug 2018 22:17:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics

2018-08-07 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11103 )

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..


Patch Set 6: Code-Review+1

(5 comments)

looks good. just a few nits

http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/exec/hdfs-scan-node.cc@55
PS6, Line 55: he
nit: The


http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/exec/kudu-scan-node.h
File be/src/exec/kudu-scan-node.h:

http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/exec/kudu-scan-node.h@77
PS6, Line 77: GetEstimatedMemPerThread
nit: maybe have the same name here as in hdfs-scan node


http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/exec/kudu-scan-node.cc
File be/src/exec/kudu-scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/exec/kudu-scan-node.cc@159
PS6, Line 159: / Cases 5, 6 and 7.
nit: copy-paste error


http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/runtime/scanner-mem-limiter.h
File be/src/runtime/scanner-mem-limiter.h:

http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/runtime/scanner-mem-limiter.h@30
PS6, Line 30: /// Class to keep track of the global state of scanner threads and how much memory
nit: I know it's implied, but maybe just mention explicitly that it is used to 
keep track of the scanner threads at a per-query, per-host level.
The first time I read it, without looking at the whole patch, it seemed like it 
kept track of all scanner threads running on the host.


http://gerrit.cloudera.org:8080/#/c/11103/6/be/src/runtime/scanner-mem-limiter.h@42
PS6, Line 42: Each 'node' can only be registered once.
maybe add a dcheck for that



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Aug 2018 21:23:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics

2018-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11103 )

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/204/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 06 Aug 2018 20:05:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics

2018-08-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11103 )

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11103/4/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/11103/4/be/src/exec/hdfs-scan-node.cc@59
PS4, Line 59: const int SCANNER_THREAD_MEM_USAGE = 32 * 1024 * 1024;
> Make configurable?
Done


http://gerrit.cloudera.org:8080/#/c/11103/4/be/src/runtime/scanner-mem-limiter.h
File be/src/runtime/scanner-mem-limiter.h:

http://gerrit.cloudera.org:8080/#/c/11103/4/be/src/runtime/scanner-mem-limiter.h@26
PS4, Line 26: #include "common/atomic.h"
> Not needed
Done



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 06 Aug 2018 19:09:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics

2018-08-06 Thread Tim Armstrong (Code Review)
Hello Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11103

to look at the new patch set (#6).

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..

IMPALA-7096: restore scanner thread memory heuristics

This restores some of the heuristics removed in IMPALA-4835 that can
help prevent scans from hitting OOM conditions. The heuristics are
implemented at the query level rather than in each scan node in isolation.

Introduce a ScannerMemLimiter class that belongs to the QueryState that
tracks the amount of memory estimated to be consumed for all scanner
threads running for the query on the current backend.

Also check soft memory limits to see if scanner threads should be
started or the current scanner thread should stop.
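
As a rough sketch of the idea described above (not the actual patch),
a per-query limiter could track the estimated memory claimed by running
scanner threads and refuse a new thread once the estimate would exceed
a limit. The class name, method names, and the single byte limit below
are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical, simplified stand-in for a per-query scanner memory limiter.
// Names and signatures are illustrative and not taken from Impala.
class ScannerMemLimiterSketch {
 public:
  explicit ScannerMemLimiterSketch(int64_t limit_bytes)
    : limit_bytes_(limit_bytes) {}

  // Reserves 'estimated_bytes' for a new scanner thread if the total
  // estimate stays within the limit. Returns false if the thread should
  // not be started.
  bool TryClaimThread(int64_t estimated_bytes) {
    assert(estimated_bytes >= 0);
    std::lock_guard<std::mutex> l(lock_);
    if (claimed_bytes_ + estimated_bytes > limit_bytes_) return false;
    claimed_bytes_ += estimated_bytes;
    return true;
  }

  // Returns the estimate for a scanner thread that has stopped.
  void ReleaseThread(int64_t estimated_bytes) {
    std::lock_guard<std::mutex> l(lock_);
    claimed_bytes_ -= estimated_bytes;
    assert(claimed_bytes_ >= 0);
  }

  int64_t claimed_bytes() {
    std::lock_guard<std::mutex> l(lock_);
    return claimed_bytes_;
  }

 private:
  const int64_t limit_bytes_;
  std::mutex lock_;
  int64_t claimed_bytes_ = 0;  // Sum of estimates for running threads.
};
```

Because one such object would be shared by every scan node in the query
on a backend, the accounting is per query rather than per scan node,
which is the distinction the description draws.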

The long-term plan is to switch to the MT scan node implementations.
When that happens this code can be removed. In the meantime this
code is imperfect but will help avoid OOM in many scenarios.

Testing:
Added regression tests for HDFS and Kudu where we previously could
run out of memory with a low mem_limit.

Manual testing:
* Ran query tests with --thread_creation_fault_injection=true for a
  bit, confirmed no crashes.
* Ran the single-node stress test for Kudu and Parquet for 10-20 min each.

Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
---
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/kudu-scan-node-base.h
M be/src/exec/kudu-scan-node.cc
M be/src/exec/kudu-scan-node.h
M be/src/exec/scan-node.cc
M be/src/exec/scan-node.h
M be/src/runtime/CMakeLists.txt
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
A be/src/runtime/scanner-mem-limiter.cc
A be/src/runtime/scanner-mem-limiter.h
A testdata/workloads/functional-query/queries/QueryTest/hdfs-scanner-thread-mem-scaling.test
A testdata/workloads/functional-query/queries/QueryTest/kudu-scan-mem-usage.test
M tests/query_test/test_mem_usage_scaling.py
15 files changed, 441 insertions(+), 49 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/11103/6
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics

2018-08-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11103 )

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/176/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 03 Aug 2018 16:17:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics

2018-08-03 Thread Tim Armstrong (Code Review)
Hello Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11103

to look at the new patch set (#5).

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..

IMPALA-7096: restore scanner thread memory heuristics

This restores some of the heuristics removed in IMPALA-4835 that can
help prevent scans from hitting OOM conditions. The heuristics are
implemented at the query level rather than in each scan node in isolation.

Introduce a ScannerMemLimiter class that belongs to the QueryState that
tracks the amount of memory estimated to be consumed for all scanner
threads running for the query on the current backend.

Also check soft memory limits to see if scanner threads should be
started or the current scanner thread should stop.

The long-term plan is to switch to the MT scan node implementations.
When that happens this code can be removed. In the meantime this
code is imperfect but will help avoid OOM in many scenarios.

Testing:
Added regression tests for HDFS and Kudu where we previously could
run out of memory with a low mem_limit.

Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
---
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/kudu-scan-node-base.h
M be/src/exec/kudu-scan-node.cc
M be/src/exec/kudu-scan-node.h
M be/src/exec/scan-node.cc
M be/src/exec/scan-node.h
M be/src/runtime/CMakeLists.txt
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
A be/src/runtime/scanner-mem-limiter.cc
A be/src/runtime/scanner-mem-limiter.h
A testdata/workloads/functional-query/queries/QueryTest/hdfs-scanner-thread-mem-scaling.test
A testdata/workloads/functional-query/queries/QueryTest/kudu-scan-mem-usage.test
M tests/query_test/test_mem_usage_scaling.py
15 files changed, 439 insertions(+), 49 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/11103/5
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics

2018-08-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11103 )

Change subject: IMPALA-7096: restore scanner thread memory heuristics
..


Patch Set 4:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/172/ : Initial code 
review checks failed. See linked job for details on the failure.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 03 Aug 2018 02:10:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7096: restore scanner thread memory heuristics

2018-08-02 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11103 )


Change subject: IMPALA-7096: restore scanner thread memory heuristics
..

IMPALA-7096: restore scanner thread memory heuristics

This restores some of the heuristics removed in IMPALA-4835 that can
help prevent scans from hitting OOM conditions. The heuristics are
implemented at the query level rather than in each scan node in isolation.

Introduce a ScannerMemLimiter class that belongs to the QueryState that
tracks the amount of memory estimated to be consumed for all scanner
threads running for the query on the current backend.

Also check soft memory limits to see if scanner threads should be
started or the current scanner thread should stop.

The long-term plan is to switch to the MT scan node implementations.
When that happens this code can be removed. In the meantime this
code is imperfect but will help avoid OOM in many scenarios.

Testing:
Added regression tests for HDFS and Kudu where we previously could
run out of memory with a low mem_limit.

Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
---
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/kudu-scan-node-base.h
M be/src/exec/kudu-scan-node.cc
M be/src/exec/kudu-scan-node.h
M be/src/exec/scan-node.cc
M be/src/exec/scan-node.h
M be/src/runtime/CMakeLists.txt
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
A be/src/runtime/scanner-mem-limiter.cc
A be/src/runtime/scanner-mem-limiter.h
A testdata/workloads/functional-query/queries/QueryTest/hdfs-scanner-thread-mem-scaling.test
A testdata/workloads/functional-query/queries/QueryTest/kudu-scan-mem-usage.test
M tests/query_test/test_mem_usage_scaling.py
15 files changed, 444 insertions(+), 49 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/11103/4
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib9907fa8c4d2b0b85f67f4f160899c1c258ad82b
Gerrit-Change-Number: 11103
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig