[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-12-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..

IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation

PlanRootSink can fail silently if result spooling is enabled and
maxMemReservationBytes is less than 2 * MAX_ROW_SIZE. This happens
because results are spilled using a SpillableRowBatchQueue which needs 2
buffer (read and write) with at least MAX_ROW_SIZE bytes per buffer.
This patch fixes this by setting a lower bound of 2 * MAX_ROW_SIZE while
computing the min reservation for the PlanRootSink.

Testing:
- Pass exhaustive tests.
- Add e2e TestResultSpoolingMaxReservation.
- Lower MAX_ROW_SIZE on tests where MAX_RESULT_SPOOLING_MEM is set to
  extremely low value. Also verify that PLAN_ROOT_SINK's ReservationLimit
  remain unchanged after lowering the MAX_ROW_SIZE.

Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Reviewed-on: http://gerrit.cloudera.org:8080/16765
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/runtime/buffered-tuple-stream.h
M be/src/runtime/spillable-row-batch-queue.cc
M fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
M tests/custom_cluster/test_query_retries.py
M tests/query_test/test_result_spooling.py
5 files changed, 118 insertions(+), 9 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 9
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-12-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..


Patch Set 8: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 8
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Fri, 04 Dec 2020 23:55:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-12-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..


Patch Set 8: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 8
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Fri, 04 Dec 2020 18:23:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-12-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..


Patch Set 8:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6729/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 8
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Fri, 04 Dec 2020 18:23:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-12-04 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..


Patch Set 7: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 7
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Fri, 04 Dec 2020 17:46:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-12-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7774/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 7
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Fri, 04 Dec 2020 05:15:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-12-03 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..


Patch Set 7:

(3 comments)

Thanks Bikram,

http://gerrit.cloudera.org:8080/#/c/16765/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16765/6//COMMIT_MSG@14
PS6, Line 14: computing the min reservation for the PlanRootS
> nit: i think this description is slightly opaque since the reader would hav
Done


http://gerrit.cloudera.org:8080/#/c/16765/6/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java:

http://gerrit.cloudera.org:8080/#/c/16765/6/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@90
PS6, Line 90: bufferSize, queryOptions.getMax_row_size());
> nit: can you add a small comment on why we need 2 maxRowBufferSize, either
I edited the method documentation a bit to explain this. Hope it is clear.


http://gerrit.cloudera.org:8080/#/c/16765/6/tests/query_test/test_result_spooling.py
File tests/query_test/test_result_spooling.py:

http://gerrit.cloudera.org:8080/#/c/16765/6/tests/query_test/test_result_spooling.py@426
PS6, Line 426: DEBUG_ACTION_V
> nit: switch to ALL_CAPS to be consistent with rest of the codebase.
Done



--
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 7
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Fri, 04 Dec 2020 04:55:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-12-03 Thread Riza Suminto (Code Review)
Hello Quanlong Huang, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16765

to look at the new patch set (#7).

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..

IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation

PlanRootSink can fail silently if result spooling is enabled and
maxMemReservationBytes is less than 2 * MAX_ROW_SIZE. This happens
because results are spilled using a SpillableRowBatchQueue which needs 2
buffer (read and write) with at least MAX_ROW_SIZE bytes per buffer.
This patch fixes this by setting a lower bound of 2 * MAX_ROW_SIZE while
computing the min reservation for the PlanRootSink.

Testing:
- Pass exhaustive tests.
- Add e2e TestResultSpoolingMaxReservation.
- Lower MAX_ROW_SIZE on tests where MAX_RESULT_SPOOLING_MEM is set to
  extremely low value. Also verify that PLAN_ROOT_SINK's ReservationLimit
  remain unchanged after lowering the MAX_ROW_SIZE.

Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
---
M be/src/runtime/buffered-tuple-stream.h
M be/src/runtime/spillable-row-batch-queue.cc
M fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
M tests/custom_cluster/test_query_retries.py
M tests/query_test/test_result_spooling.py
5 files changed, 118 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/16765/7
--
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 7
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-12-03 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..


Patch Set 6:

(3 comments)

looks good, will +2 once the nits are resolved

http://gerrit.cloudera.org:8080/#/c/16765/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16765/6//COMMIT_MSG@14
PS6, Line 14: minMemReservationBytes as 2 * maxRowBufferSize.
nit: i think this description is slightly opaque since the reader would have to 
know how maxRowBufferSize is computed. I think something like this would 
suffice:

This patch fixes this by setting a lower bound of 2 * MAX_ROW_SIZE while 
computing the min reservation for the PlanRootSink


http://gerrit.cloudera.org:8080/#/c/16765/6/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java:

http://gerrit.cloudera.org:8080/#/c/16765/6/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@90
PS6, Line 90: long minMemReservationBytes = 2 * maxRowBufferSize;
nit: can you add a small comment on why we need 2 maxRowBufferSize, either here 
or in the method comment above


http://gerrit.cloudera.org:8080/#/c/16765/6/tests/query_test/test_result_spooling.py
File tests/query_test/test_result_spooling.py:

http://gerrit.cloudera.org:8080/#/c/16765/6/tests/query_test/test_result_spooling.py@426
PS6, Line 426: _debug_actions
nit: switch to ALL_CAPS to be consistent with rest of the codebase.
DEBUG_ACTION_VALUES



--
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Fri, 04 Dec 2020 01:08:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-12-02 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16765/5/be/src/runtime/spillable-row-batch-queue.cc
File be/src/runtime/spillable-row-batch-queue.cc:

http://gerrit.cloudera.org:8080/#/c/16765/5/be/src/runtime/spillable-row-batch-queue.cc@97
PS5, Line 97: // have been set.
> Good point! I didin't realize IMPALA-9851 when adding that DebugString() in
Done.
Filed as IMPALA-10374.



--
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 02 Dec 2020 18:34:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-12-01 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..


Patch Set 6: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16765/5/be/src/runtime/spillable-row-batch-queue.cc
File be/src/runtime/spillable-row-batch-queue.cc:

http://gerrit.cloudera.org:8080/#/c/16765/5/be/src/runtime/spillable-row-batch-queue.cc@97
PS5, Line 97: // have been set.
> Added row size and batch_queue_->DebugString() info in the log.
Good point! I didin't realize IMPALA-9851 when adding that DebugString() in 
GroupingAggregator::Partition::SerializeStreamForSpilling(). Please file a 
separate JIRA for it.



--
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 02 Dec 2020 04:07:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-11-30 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..


Patch Set 6:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/16765/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16765/3//COMMIT_MSG@26
PS3, Line 26:
> A better test is using SET_DENY_RESERVATION_PROBABILITY to verify the Buffe
Added debug action SET_DENY_RESERVATION_PROBABILITY@1.0 in 
TestResultSpoolingMaxReservation.
test_spilling_large_rows already exercise SET_DENY_RESERVATION_PROBABILITY with 
mt_dop=1.


http://gerrit.cloudera.org:8080/#/c/16765/5/be/src/runtime/spillable-row-batch-queue.cc
File be/src/runtime/spillable-row-batch-queue.cc:

http://gerrit.cloudera.org:8080/#/c/16765/5/be/src/runtime/spillable-row-batch-queue.cc@97
PS5, Line 97: // have been set.
> It'd be helpful to log the row size and batch_queue_->DebugString() here.
Added row size and batch_queue_->DebugString() info in the log.

However, this reminds me of IMPALA-9851.
BufferedTupleStream::DebugString() iterate std::list that can
potentially grow very large. While most references to this method are
behind DCHEK logging (thus stripped in release build), one of them is
used to build Status message through call to
GroupingAggregator::Partition::DebugString() at this line:
https://github.com/apache/impala/blob/5530b62/be/src/exec/grouping-aggregator-partition.cc#L155

IMPALA-9851 already cap error message to 128kb at most, but that does
not eliminate the cost to iterate page list in
BufferedTupleStream::DebugString(). Should I file this as separate Jira?


http://gerrit.cloudera.org:8080/#/c/16765/3/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java:

http://gerrit.cloudera.org:8080/#/c/16765/3/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@90
PS3, Line 90:   long minMemReservationBytes = 2 * maxRowBufferSize;
> We should be able to work using the minimal reservation. So I think the pro
Make sense.
Bump up minMemReservation instead of maxMemReservationBytes in patch set #6.


http://gerrit.cloudera.org:8080/#/c/16765/3/tests/query_test/test_result_spooling.py
File tests/query_test/test_result_spooling.py:

http://gerrit.cloudera.org:8080/#/c/16765/3/tests/query_test/test_result_spooling.py@68
PS3, Line 68: exec_options['max_row_size'] = 16 * 1024
> test_spilling and test_query_retries need to force spill by lowering spill
Done


http://gerrit.cloudera.org:8080/#/c/16765/3/tests/query_test/test_result_spooling.py@106
PS3, Line 106:   assert re.search(plan_root_sink_reservation_limit, profile)
> Thanks for the context above, makes sense to keep this too.
Done



--
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Mon, 30 Nov 2020 22:32:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-11-30 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7753/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Mon, 30 Nov 2020 22:29:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-11-30 Thread Riza Suminto (Code Review)
Hello Quanlong Huang, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16765

to look at the new patch set (#6).

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..

IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation

PlanRootSink can fail silently if result spooling is enabled and
maxMemReservationBytes is less than 2 * MAX_ROW_SIZE. This happens
because results are spilled using a SpillableRowBatchQueue which needs 2
buffer (read and write) with at least MAX_ROW_SIZE bytes per buffer.
This patch fixes this by setting the PlanRootSink's
minMemReservationBytes as 2 * maxRowBufferSize.

Testing:
- Pass exhaustive tests.
- Add e2e TestResultSpoolingMaxReservation.
- Lower MAX_ROW_SIZE on tests where MAX_RESULT_SPOOLING_MEM is set to
  extremely low value. Also verify that PLAN_ROOT_SINK's ReservationLimit
  remain unchanged after lowering the MAX_ROW_SIZE.

Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
---
M be/src/runtime/buffered-tuple-stream.h
M be/src/runtime/spillable-row-batch-queue.cc
M fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
M tests/custom_cluster/test_query_retries.py
M tests/query_test/test_result_spooling.py
5 files changed, 115 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/16765/6
--
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-11-30 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..


Patch Set 5: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16765/3/tests/query_test/test_result_spooling.py
File tests/query_test/test_result_spooling.py:

http://gerrit.cloudera.org:8080/#/c/16765/3/tests/query_test/test_result_spooling.py@106
PS3, Line 106:   assert re.search(plan_root_sink_reservation_limit, profile)
> This assertion, however, is OK to delete, because it is not the focus of th
Thanks for the context above, makes sense to keep this too.



--
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Mon, 30 Nov 2020 19:52:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-11-26 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16765/5/be/src/runtime/spillable-row-batch-queue.cc
File be/src/runtime/spillable-row-batch-queue.cc:

http://gerrit.cloudera.org:8080/#/c/16765/5/be/src/runtime/spillable-row-batch-queue.cc@97
PS5, Line 97: DCHECK(false) << "Rows should be added in unpinned mode 
unless an error occurred";
It'd be helpful to log the row size and batch_queue_->DebugString() here.



--
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 26 Nov 2020 09:14:09 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-11-25 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16765/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16765/3//COMMIT_MSG@26
PS3, Line 26:   remain unchanged after lowering the MAX_ROW_SIZE.
A better test is using SET_DENY_RESERVATION_PROBABILITY to verify the 
BufferedPlanRootSink can work in minimal reservation. There are some examples 
in query_test/test_spilling.py::TestSpillingDebugActionDimensions. Maybe we can 
add a test in test_spilling_large_rows.


http://gerrit.cloudera.org:8080/#/c/16765/3/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java:

http://gerrit.cloudera.org:8080/#/c/16765/3/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@90
PS3, Line 90:   long minMemReservationBytes = 2 * bufferSize;
We should be able to work using the minimal reservation. So I think the problem 
is minMemReservationBytes should be 2 * maxRowBufferSize here, i.e. reserve mem 
for one large read page and one large write page.

The minMemReservation calculation of the final aggregation node is an example:
https://github.com/apache/impala/blob/8ea49e9b026d48b46e9fbd98dc5286f3e6dfa93d/fe/src/main/java/org/apache/impala/planner/AggregationNode.java#L582.
 The final AggregationNode needs n pages, then the min mem reservation is 
bufferSize * (n-2) + maxRowBufferSize * 2.

Here the BufferedPlanRootSink has a SpillableRowBatchQueue which is backed by a 
BufferedTupleStream. The stream is a read-write stream so may pin one read page 
and one write page at the same time, and they could both be a large page. So we 
should at least reserve 2 * maxRowBufferSize of mem for it. With enough 
reservation, batch_queue_->AddRow won't fail and hit the DCHECK:
https://github.com/apache/impala/blob/eea617b/be/src/runtime/spillable-row-batch-queue.cc#L97



--
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 26 Nov 2020 07:33:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-11-25 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..


Patch Set 5:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/16765/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16765/3//COMMIT_MSG@10
PS3, Line 10: This happens
: because results are spilled using a SpillableRowBatchQueue which 
needs 2
: buffer (read and write) with at least MAX_ROW_SIZE bytes per 
buffer.
: This patch fixes this by s
> nit: how about :
Done


http://gerrit.cloudera.org:8080/#/c/16765/3/tests/query_test/test_result_spooling.py
File tests/query_test/test_result_spooling.py:

http://gerrit.cloudera.org:8080/#/c/16765/3/tests/query_test/test_result_spooling.py@68
PS3, Line 68: exec_options['max_row_size'] = 16 * 1024
> I think you can remove this change and the one in test_query_retries since
test_spilling and test_query_retries need to force spill by lowering spill 
related query options.
As consequence of this patch, MAX_ROW_SIZE now also need to be lowered in order 
to force spill.

The default MAX_ROW_SIZE is too high (512 KB) for the context of the tests.
With changes introduced in PlanRootSink.java, default MAX_ROW_SIZE will now 
contribute to pushing maxMemReservationBytes to 1 MB, which in turn will fit 
all rows in memory (no spill happens).

I have double check by commenting the MAX_ROW_SIZE option, and these tests 
failed without it.


http://gerrit.cloudera.org:8080/#/c/16765/3/tests/query_test/test_result_spooling.py@106
PS3, Line 106:   assert re.search(plan_root_sink_reservation_limit, profile)
This assertion, however, is OK to delete, because it is not the focus of the 
test.
I just add this to verify that ReservationLimit stay the same as before when 
MAX_ROW_SIZE does not contribute to maxMemReservationBytes.
Similarly with the one I added in test_query_retries.
Let me know what you think of.


http://gerrit.cloudera.org:8080/#/c/16765/3/tests/query_test/test_result_spooling.py@419
PS3, Line 419:   """These tests verify that while calculating max_reservation 
for spooling these query
> nit: you can add a comment here saying that these tests verify that while c
Done


http://gerrit.cloudera.org:8080/#/c/16765/3/tests/query_test/test_result_spooling.py@479
PS3, Line 479: ")
> nit: spills
Done



--
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 26 Nov 2020 04:22:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-11-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7739/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 26 Nov 2020 04:21:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-11-25 Thread Riza Suminto (Code Review)
Hello Quanlong Huang, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16765

to look at the new patch set (#5).

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..

IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation

PlanRootSink can fail silently if result spooling is enabled and
maxMemReservationBytes is less than 2 * MAX_ROW_SIZE. This happens
because results are spilled using a SpillableRowBatchQueue which needs 2
buffer (read and write) with at least MAX_ROW_SIZE bytes per buffer.
This patch fixes this by setting the PlanRootSink's
maxMemReservationBytes as:

  max(minMemReservationBytes, MAX_RESULT_SPOOLING_MEM, 2 * MAX_ROW_SIZE)

minMemReservationBytes itself remain unchanged as:

  2 * DEFAULT_SPILLABLE_BUFFER_SIZE

Testing:
- Pass exhaustive tests.
- Add e2e TestResultSpoolingMaxReservation.
- Lower MAX_ROW_SIZE on tests where MAX_RESULT_SPOOLING_MEM is set to
  extremely low value. Also verify that PLAN_ROOT_SINK's ReservationLimit
  remain unchanged after lowering the MAX_ROW_SIZE.

Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
---
M fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
M tests/custom_cluster/test_query_retries.py
M tests/query_test/test_result_spooling.py
3 files changed, 95 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/16765/5
--
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-11-25 Thread Riza Suminto (Code Review)
Hello Quanlong Huang, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16765

to look at the new patch set (#4).

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..

IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation

PlanRootSink can fail silently if result spooling is enabled and
maxMemReservationBytes is less than 2 * MAX_ROW_SIZE. Underneath, a
SpillableRowBatchQueue need 2 buffer (read and write) each to fit at
least MAX_ROW_SIZE bytes. This patch change the ResourceProfile's
maxMemReservationBytes as:

  max(minMemReservationBytes, MAX_RESULT_SPOOLING_MEM, 2 * MAX_ROW_SIZE)

minMemReservationBytes itself remain unchanged as:

  2 * DEFAULT_SPILLABLE_BUFFER_SIZE

Testing:
- Pass exhaustive tests.
- Add e2e TestResultSpoolingMaxReservation.
- Lower MAX_ROW_SIZE on tests where MAX_RESULT_SPOOLING_MEM is set to
  extremely low value. Also verify that PLAN_ROOT_SINK's ReservationLimit
  remain unchanged after lowering the MAX_ROW_SIZE.

Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
---
M fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
M tests/custom_cluster/test_query_retries.py
M tests/query_test/test_result_spooling.py
3 files changed, 95 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/16765/4
--
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-11-25 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/16765/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16765/3//COMMIT_MSG@10
PS3, Line 10: Underneath, a
: SpillableRowBatchQueue need 2 buffer (read and write) each to fit 
at
: least MAX_ROW_SIZE bytes. This patch change the ResourceProfile's
: maxMemReservationBytes as:
nit: how about :
This happens because results are spilled using a SpillableRowBatchQueue which 
needs 2 buffer (read and write) with at least MAX_ROW_SIZE bytes per buffer. 
This patch fixes this by setting the PlanRootSink's maxMemReservationBytes as: 
max(minMemReservationBytes, MAX_RESULT_SPOOLING_MEM, 2 * MAX_ROW_SIZE)


http://gerrit.cloudera.org:8080/#/c/16765/3/tests/query_test/test_result_spooling.py
File tests/query_test/test_result_spooling.py:

http://gerrit.cloudera.org:8080/#/c/16765/3/tests/query_test/test_result_spooling.py@68
PS3, Line 68: exec_options['max_row_size'] = 16 * 1024
I think you can remove this change and the one in test_query_retries since you 
have added a self contained test to verify max_row_size's contribution and 
these other tests are more concerned with testing other functionality.


http://gerrit.cloudera.org:8080/#/c/16765/3/tests/query_test/test_result_spooling.py@419
PS3, Line 419:
nit: you can add a comment here saying that these tests verify that while 
calculating max_reservation for spooling these query options are taken into 
account: MAX_ROW_SIZE, MAX_RESULT_SPOOLING_MEM and 
DEFAULT_SPILLABLE_BUFFER_SIZE.


http://gerrit.cloudera.org:8080/#/c/16765/3/tests/query_test/test_result_spooling.py@479
PS3, Line 479: spill
nit: spills



--
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 26 Nov 2020 02:11:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-11-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7735/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 25 Nov 2020 19:13:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-11-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7734/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 25 Nov 2020 19:02:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-11-25 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16765/1/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java:

http://gerrit.cloudera.org:8080/#/c/16765/1/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@87
PS1, Line 87: getDefault_spillable_buffer_size
> got it, thanks for the explanation
I was wrong. Large DEFAULT_SPILLABLE_BUFFER_SIZE can wins calculation for 
maxMemReservationBytes by increasing minMemReservationBytes. The test case I 
mention earlier, however, stays true.

I adjust the commit message and add 
TestResultSpoolingMaxReservation::test_high_default_spillable_buffer to reflect 
this.


http://gerrit.cloudera.org:8080/#/c/16765/1/tests/custom_cluster/test_query_retries.py
File tests/custom_cluster/test_query_retries.py:

http://gerrit.cloudera.org:8080/#/c/16765/1/tests/custom_cluster/test_query_retries.py@589
PS1, Line 589: 'max_row_size': 8 * 1024,
> can you add a test that confirms that changing this will have an effect. So
I add verification to check that query ReservationLimit remains unchanged after 
addition of MAX_ROW_SIZE query option.

I also add TestResultSpoolingMaxReservation to verify that all three of 
MAX_ROW_SIZE, MAX_RESULT_SPOOLING_MEM, and DEFAULT_SPILLABLE_BUFFER_SIZE query 
options can contribute towards increasing PLAN_ROOT_SINK's ReservationLimit.



--
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 25 Nov 2020 19:03:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-11-25 Thread Riza Suminto (Code Review)
Hello Quanlong Huang, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16765

to look at the new patch set (#3).

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..

IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation

PlanRootSink can fail silently if result spooling is enabled and
maxMemReservationBytes is less than 2 * MAX_ROW_SIZE. Underneath, a
SpillableRowBatchQueue need 2 buffer (read and write) each to fit at
least MAX_ROW_SIZE bytes. This patch change the ResourceProfile's
maxMemReservationBytes as:

  max(minMemReservationBytes, MAX_RESULT_SPOOLING_MEM, 2 * MAX_ROW_SIZE)

minMemReservationBytes itself remain unchanged as:

  2 * DEFAULT_SPILLABLE_BUFFER_SIZE

Testing:
- Pass exhaustive tests.
- Add e2e TestResultSpoolingMaxReservation.
- Lower MAX_ROW_SIZE on tests where MAX_RESULT_SPOOLING_MEM is set to
  extremely low value. Also verify that PLAN_ROOT_SINK's ReservationLimit
  remain unchanged after lowering the MAX_ROW_SIZE.

Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
---
M fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
M tests/custom_cluster/test_query_retries.py
M tests/query_test/test_result_spooling.py
3 files changed, 92 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/16765/3
--
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-11-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16765/2/tests/query_test/test_result_spooling.py
File tests/query_test/test_result_spooling.py:

http://gerrit.cloudera.org:8080/#/c/16765/2/tests/query_test/test_result_spooling.py@447
PS2, Line 447: q
flake8: F821 undefined name 'query'


http://gerrit.cloudera.org:8080/#/c/16765/2/tests/query_test/test_result_spooling.py@493
PS2, Line 493:
flake8: W391 blank line at end of file



--
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 25 Nov 2020 18:43:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-11-25 Thread Riza Suminto (Code Review)
Hello Quanlong Huang, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16765

to look at the new patch set (#2).

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..

IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation

PlanRootSink can fail silently if result spooling is enabled and
maxMemReservationBytes is less than 2 * MAX_ROW_SIZE. Underneath, a
SpillableRowBatchQueue need 2 buffer (read and write) each to fit at
least MAX_ROW_SIZE bytes. This patch change the ResourceProfile's
maxMemReservationBytes as:

  max(minMemReservationBytes, MAX_RESULT_SPOOLING_MEM, 2 * MAX_ROW_SIZE)

minMemReservationBytes itself remain unchanged as:

  2 * DEFAULT_SPILLABLE_BUFFER_SIZE

Testing:
- Pass exhaustive tests.
- Add e2e TestResultSpoolingMaxReservation.
- Lower MAX_ROW_SIZE on tests where MAX_RESULT_SPOOLING_MEM is set to
  extremely low value. Also verify that PLAN_ROOT_SINK's ReservationLimit
  remain unchanged after lowering the MAX_ROW_SIZE.

Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
---
M fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
M tests/custom_cluster/test_query_retries.py
M tests/query_test/test_result_spooling.py
3 files changed, 92 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/16765/2
--
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-11-24 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16765/1/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java:

http://gerrit.cloudera.org:8080/#/c/16765/1/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@87
PS1, Line 87: getDefault_spillable_buffer_size
> My understanding is that DEFAULT_SPILLABLE_BUFFER_SIZE control the minimal
got it, thanks for the explanation


http://gerrit.cloudera.org:8080/#/c/16765/1/tests/custom_cluster/test_query_retries.py
File tests/custom_cluster/test_query_retries.py:

http://gerrit.cloudera.org:8080/#/c/16765/1/tests/custom_cluster/test_query_retries.py@589
PS1, Line 589: 'max_row_size': 4 * 1024,
> It intended to keep the test the same, keeping the maxMemReservationBytes t
can you add a test that confirms that changing this will have an effect. 
Something like have a test that would fail without using this and then setting 
this would let it run. Or maybe something that you can confirm, a metric 
perhaps that changes  when you set this



--
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Tue, 24 Nov 2020 21:11:07 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-11-23 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16765/1/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java:

http://gerrit.cloudera.org:8080/#/c/16765/1/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@87
PS1, Line 87: getDefault_spillable_buffer_size
> Is there any relation to this and the max_row_Size?
My understanding is that DEFAULT_SPILLABLE_BUFFER_SIZE control the minimal unit 
of page size when increasing memory reservation. It is intended to tune 
performance in spill operation (larger buffer sizes result in Impala issuing 
larger I/O requests to storage devices, which might result in higher 
throughput) rather than in-memory allocation.
it contribute to calculation of maxRowBufferSize, but not to 
maxMemReservationBytes.

Let say user only set MAX_ROW_SIZE to 256MB, and leave 
DEFAULT_SPILLABLE_BUFFER_SIZE and MAX_RESULT_SPOOLING_MEM as default (2MB and 
100 MB accordingly). Without the patch, the result is the following:
bufferSize = 2MB
maxRowBufferSize = 256MB (rounded up to nearest power of two)
minMemReservationBytes = 4MB
maxMemReservationBytes = 100MB

maxMemReservationBytes should be at least 2 * MAX_ROW_SIZE for reservation of 
read+write page to be satisfied, which is not true in this case.


http://gerrit.cloudera.org:8080/#/c/16765/1/tests/custom_cluster/test_query_retries.py
File tests/custom_cluster/test_query_retries.py:

http://gerrit.cloudera.org:8080/#/c/16765/1/tests/custom_cluster/test_query_retries.py@589
PS1, Line 589: 'max_row_size': 4 * 1024,
> is this having any effect on the query? the max_result_spooling_mem is alre
It intended to keep the test the same, keeping the maxMemReservationBytes to 
8KB.

Default MAX_ROW_SIZE is 512KB. With the proposed changes in 
PlannerRootSink.java and without explicitly lowering MAX_ROW_SIZE to 4KB, 
MAX_RESULT_SPOOLING_MEM will be ignored and maxMemReservationBytes will be set 
to 1MB instead of intended 8KB.



--
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Tue, 24 Nov 2020 01:28:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-11-23 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16765/1/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java:

http://gerrit.cloudera.org:8080/#/c/16765/1/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@87
PS1, Line 87: getDefault_spillable_buffer_size
Is there any relation to this and the max_row_Size?
should it be able to hold the full row size as well?


http://gerrit.cloudera.org:8080/#/c/16765/1/tests/custom_cluster/test_query_retries.py
File tests/custom_cluster/test_query_retries.py:

http://gerrit.cloudera.org:8080/#/c/16765/1/tests/custom_cluster/test_query_retries.py@589
PS1, Line 589: 'max_row_size': 4 * 1024,
is this having any effect on the query? the max_result_spooling_mem is already 
set to 8 and this is set to 4 which will result in the maxMemReservationBytes 
set to 8 in either.
Similar thing for the other test



--
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Tue, 24 Nov 2020 00:15:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-11-21 Thread Riza Suminto (Code Review)
Riza Suminto has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16765


Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..

IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation

PlanRootSink can fail silently if result spooling is enabled and
maxMemReservationBytes is less than 2 * MAX_ROW_SIZE. Underneath, a
SpillableRowBatchQueue need 2 buffer (read and write) each to fit at
least MAX_ROW_SIZE bytes. This patch change the ResourceProfile's
maxMemReservationBytes as:

max(minMemReservationBytes, MAX_RESULT_SPOOLING_MEM, 2 * MAX_ROW_SIZE)

Testing:
- Pass exhaustive tests.
- Lower MAX_ROW_SIZE on tests where MAX_RESULT_SPOOLING_MEM is set to
  extremely low value.

Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
---
M fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
M tests/custom_cluster/test_query_retries.py
M tests/query_test/test_result_spooling.py
3 files changed, 4 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/16765/1
--
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto