Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13883 )

Change subject: IMPALA-8779, IMPALA-8780: RowBatchQueue re-factoring and 
BufferedPRS impl
......................................................................


Patch Set 11:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/13883/11/be/src/exec/buffered-plan-root-sink.h
File be/src/exec/buffered-plan-root-sink.h:

http://gerrit.cloudera.org:8080/#/c/13883/11/be/src/exec/buffered-plan-root-sink.h@64
PS11, Line 64: //
nit: ///


http://gerrit.cloudera.org:8080/#/c/13883/11/be/src/exec/buffered-plan-root-sink.cc
File be/src/exec/buffered-plan-root-sink.cc:

http://gerrit.cloudera.org:8080/#/c/13883/11/be/src/exec/buffered-plan-root-sink.cc@27
PS11, Line 27: 10
This should be a static constant


http://gerrit.cloudera.org:8080/#/c/13883/11/be/src/exec/buffered-plan-root-sink.cc@54
PS11, Line 54:     while (batch_queue_->IsFull()) {
             :       is_full_.Wait(l);
             :       RETURN_IF_CANCELLED(state);
             :     }
May hang for a while if the sink is already cancelled when we get here, it may 
be more safer to do:

   while (!state->cancelled() && batch_queue->IsFull()) {
        is_full_.Wait(l);
   }
   RETURN_IF_CANCELLED(state);


http://gerrit.cloudera.org:8080/#/c/13883/11/be/src/exec/buffered-plan-root-sink.cc@103
PS11, Line 103:   rows_available_.NotifyOne();
              :   consumer_eos_.NotifyOne();
              :   is_full_.NotifyOne();
I think keeping NotifyAll() make sense for the cancellation path. Same for 
Close(). NotifyOne() makes assumption about the number of waiters on these 
condition variables, which seems dicey for the cancellation path and it doesn't 
hurt to do a NotifyAll() for the slow path.


http://gerrit.cloudera.org:8080/#/c/13883/11/be/src/exec/buffered-plan-root-sink.cc@123
PS11, Line 123:       // For now, if num_results < batch->num_rows(), we 
terminate returning results
              :       // early.
              :       if (num_results > 0 && num_results < batch->num_rows()) {
Not sure I understand this part ? Shouldn't this function still need to insert 
up to 'num_results' rows into 'results' in this case ?

In other words, if the row batch at the front of the queue is not completely 
consumed, don't you need to track how many rows have been consumed from it ?


http://gerrit.cloudera.org:8080/#/c/13883/11/be/src/exec/buffered-plan-root-sink.cc@127
PS11, Line 127: consumer_eos_.NotifyOne();
May need to do is_full_.NotifyOne() here in case the sender was blocked. Seems 
better to refactor this code (if possible) to always return at line 147.


http://gerrit.cloudera.org:8080/#/c/13883/11/be/src/exec/scan-node.cc
File be/src/exec/scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/13883/11/be/src/exec/scan-node.cc@262
PS11, Line 262:           ADD_TIMER(parent->runtime_profile(), 
"RowBatchQueueGetWaitTime"),
              :           ADD_TIMER(parent->runtime_profile(), 
"RowBatchQueuePutWaitTime"))
The code seems easier to follow if we keep the ADD_TIMER() macro at the old 
call sites and simply reference those timers here.


http://gerrit.cloudera.org:8080/#/c/13883/11/be/src/util/blocking-queue.h
File be/src/util/blocking-queue.h:

http://gerrit.cloudera.org:8080/#/c/13883/11/be/src/util/blocking-queue.h@108
PS11, Line 108: get_wait_timer_)
get_wait_timer_ != nullptr



--
To view, visit http://gerrit.cloudera.org:8080/13883
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b1bb4b9c6f6e92c70e8fbee6ccdf48c2f85b7be
Gerrit-Change-Number: 13883
Gerrit-PatchSet: 11
Gerrit-Owner: Sahil Takiar <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Sahil Takiar <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Fri, 26 Jul 2019 02:31:40 +0000
Gerrit-HasComments: Yes

Reply via email to