Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9527 )

Change subject: IMPALA-6609: Fix ownership of class members in 
KrpcDataStreamRecvr
......................................................................


Patch Set 1:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/9527/1/be/src/runtime/krpc-data-stream-recvr.h
File be/src/runtime/krpc-data-stream-recvr.h:

http://gerrit.cloudera.org:8080/#/c/9527/1/be/src/runtime/krpc-data-stream-recvr.h@163
PS1, Line 163:   /// Row schema.
> Add "Not owned."
Done


http://gerrit.cloudera.org:8080/#/c/9527/1/be/src/runtime/krpc-data-stream-recvr.h@182
PS1, Line 182: client_
> Not your change, but I think we should we rename this to buffer_pool_client
Renamed to buffer_pool_client_.


http://gerrit.cloudera.org:8080/#/c/9527/1/be/src/runtime/krpc-data-stream-recvr.h@201
PS1, Line 201: own most of the
             :   /// counters below
> if the only exception is the one you marked "Not owned", how about saying "
Done


http://gerrit.cloudera.org:8080/#/c/9527/1/be/src/runtime/krpc-data-stream-recvr.h@201
PS1, Line 201: own most of the
             :   /// counters below
> Add "These profiles are guaranteed to live as long as their parent 'profile
Done


http://gerrit.cloudera.org:8080/#/c/9527/1/be/src/runtime/krpc-data-stream-recvr.cc
File be/src/runtime/krpc-data-stream-recvr.cc:

http://gerrit.cloudera.org:8080/#/c/9527/1/be/src/runtime/krpc-data-stream-recvr.cc@a636
PS1, Line 636:
> Maybe I'm missing something, but I'm not able to understand why this was re
TakeOverEarlySender() may access 'mgr_'.

I was thinking that there may be a race between Close() and 
TakeOverEarlySender(). However, the race should be impossible given both 
TakeOverEarlySenders() and Close() are supposed to be called from the fragment 
execution thread context.

Reverted this line in PS2. Added a DCHECK() to validate the assumption.


http://gerrit.cloudera.org:8080/#/c/9527/1/be/src/runtime/krpc-data-stream-recvr.cc@411
PS1, Line 411: recvr_->deferred_rpc_tracker()
> We access the deferred_rpc_tracker() here too. Should we check for cancella
Another subtle point which warrants a comment. See line 399 for the DCHECK. The 
contract is that the queue 'deferred_rpcs_' is drained during cancellation and 
no thread should add to 'deferred_rpcs_' after Cancel() is called. So, line 398 
implicitly handles the case of cancellation. Note the DCHECK() at line 399. 
Comments needed there.


http://gerrit.cloudera.org:8080/#/c/9527/1/be/src/runtime/krpc-data-stream-recvr.cc@451
PS1, Line 451: Only update the 'deferred_rpc_tracker_' if the queue is still 
open.
> I think a "why" rather than "what" would be more useful here. Something lik
It's true that a queue may be cancelled but not yet closed if a fragment 
instance is cancelled but not yet closed. We want to avoid consuming from 
'deferred_rpc_trackers_' as all deferred RPCs are drained in Cancel() so 
deferred_rpc_trackers_'s consumption should go to zero after Cancel() has been 
called on all sender queues.

There is a window in KrpcDataStreamMgr::CreateRecvr() between FindRecvr() 
returns and when TakeOverEarlySender is called. In that window, the fragment 
instance can be cancelled by the coordinator. That's how a thread may end up 
calling TakeOverEarlySender() after it has been cancelled. This receiver 
shouldn't be closed at this point as this function is called in the context of 
the fragment execution thread and KrpcDataStreamReceiver::Close() is also 
supposed to be called from fragment execution thread only.

Added some explanation and DCHECK for it.


http://gerrit.cloudera.org:8080/#/c/9527/1/be/src/runtime/krpc-data-stream-recvr.cc@568
PS1, Line 568: Note that the parent profile doesn't hold any ownership to these
             :   // profiles.
> I think this will be hard to decifer, is there a more direct thing we can s
This is trying to point out that 'profile_' won't actually free these two 
profiles so they are safe to access even after 'profile_' is deleted. Added the 
comments you suggested.


http://gerrit.cloudera.org:8080/#/c/9527/1/be/src/runtime/krpc-data-stream-recvr.cc@644
PS1, Line 644: deferred_rpc_tracker_->Close();
> This seems a little brittle, although it would work for us. In the header,
Yes, I believe a more clear explanation about the contract is warranted. In 
essence, it's assuming that all threads which try to consume from / insert into 
sender queues will hold the sender queue's lock and check for 'is_cancelled_' 
flag which acts as a surrogate on whether the queue is still open for business. 
If 'is_cancelled_' is true, it may mean Close() has been called on it already. 
So past line 642 above, no threads should really be consuming from 
deferred_rpc_tracker_ so the main thread here can safely close it without 
holding the lock.



--
To view, visit http://gerrit.cloudera.org:8080/9527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3378496e2201b16c662b9daa634333480705c61a
Gerrit-Change-Number: 9527
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sail...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Thu, 08 Mar 2018 04:13:54 +0000
Gerrit-HasComments: Yes

Reply via email to