Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13883 )

Change subject: IMPALA-8779, IMPALA-8780: RowBatchQueue interface and 
BufferedPRS impl
......................................................................


Patch Set 4:

(4 comments)

Unifying the RowBatchQueue implementation and cleaning up the interface makes 
sense to me. Please see replies below for a suggestion to do it without 
splitting it into two classes in this patch. Also, some of the clean ups in 
KRPC code can be done in a follow-up patch.

http://gerrit.cloudera.org:8080/#/c/13883/3/be/src/exec/buffered-plan-root-sink.h
File be/src/exec/buffered-plan-root-sink.h:

http://gerrit.cloudera.org:8080/#/c/13883/3/be/src/exec/buffered-plan-root-sink.h@29
PS3, Line 29: The blocking behavior follows
            : /// the same semantics as BlockingPlanRootSink.
> Yeah, it probably makes more sense once we change BufferedPlanRootSink to u
Yes, I guess at some point soon, we may want to consolidate on the 
implementation but I am fine with keeping the two classes for now.


http://gerrit.cloudera.org:8080/#/c/13883/3/be/src/exec/buffered-plan-root-sink.h@42
PS3, Line 42: RowBatchQueue* batch_queue)
> Dependency injection. I'm not sure if this is the right way to do this in C
Thanks for the link. I can see the advantage being that different users of this 
class can pass in different implementation of RowBatchQueue interface for their 
own purposes. However, given the rather rigid use case of PlanRootSink right 
now, it seems like an unnecessary complication until the need for this arises. 
Also, please see comments elsewhere in which this ctor is invoked. The 
RowBatchQueue object seems to be leaked right now as the code stands.

Given the limited charter of BufferedPlanRootSink, the code seems simpler if 
the batch_queue is owned by BufferedPlanRootSink instead and that also makes 
the reasoning of the lifetime of RowBatchQueue object clearer (i.e. it won't 
outlive that of the owning BufferedPlanRootSink).


http://gerrit.cloudera.org:8080/#/c/13883/4/be/src/exec/buffered-plan-root-sink.cc
File be/src/exec/buffered-plan-root-sink.cc:

http://gerrit.cloudera.org:8080/#/c/13883/4/be/src/exec/buffered-plan-root-sink.cc@45
PS4, Line 45:  is_full_.Wait(l);
> The existing BlockingQueue doesn't expose its lock, which makes the synchro
I guess the answer to my question is that we access the BlockingQueue with 
'lock_' held so we really need a non-blocking interface so we won't block other 
threads from consuming the row batch. The 'lock_' is necessary for 
synchronization multiple threads calling GetNext() / Close() / Send() 
concurrently.

I suppose using RowBatchQueue::TryAddBatch() will fit the purpose, right ?


http://gerrit.cloudera.org:8080/#/c/13883/3/be/src/runtime/non-blocking-row-batch-queue.h
File be/src/runtime/non-blocking-row-batch-queue.h:

http://gerrit.cloudera.org:8080/#/c/13883/3/be/src/runtime/non-blocking-row-batch-queue.h@32
PS3, Line 32:  NonBlockingRowBatchQueue
> Adding TryAddBatch() makes sense to me.
Yes, I agree the clean up of the RowBatchQueue interface is a good thing and 
you seem to also agree that refactoring RowBatchQueue into two classes may be 
overkill. How about you keep the necessary clean-up for the RowBatchQueue 
interface but instead of splitting it into two classes, we can keep the 
original RowBatchQueue instead. Like your TODO suggested above, we can hide the 
implementation of RowBatchQueue by instantiating a BlockingQueue object in 
RowBatchQueue class instead of inheriting it from BlockingQueue.
In addition, a new interface called TryAddBatch() will be added to support the 
non-blocking behavior. Some modification may be needed in the BlockingQueue 
class to support the non-blocking insert behavior.

So, the end result is:
- we will still have a single RowBatchQueue class for this patch albeit with a 
better defined interface.

- when we get around to implement a version of the queue backed by BTS, we can 
do the refactoring like this patch and RowBatchQueue can naturally become 
BlockingRowBatchQueue.

The cleanup of KRPC sender / receiver can be done as a follow-on patch.

What do you think ?



--
To view, visit http://gerrit.cloudera.org:8080/13883
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b1bb4b9c6f6e92c70e8fbee6ccdf48c2f85b7be
Gerrit-Change-Number: 13883
Gerrit-PatchSet: 4
Gerrit-Owner: Sahil Takiar <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Sahil Takiar <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Tue, 23 Jul 2019 19:34:49 +0000
Gerrit-HasComments: Yes

Reply via email to