[Impala-ASF-CR] IMPALA-6214: Determine and warn about stuck fragment instances.

Todd Lipcon (Code Review) Tue, 24 Jul 2018 16:37:08 -0700

Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11021 )


Change subject: IMPALA-6214: Determine and warn about stuck fragment instances.
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11021/2/be/src/runtime/krpc-data-stream-recvr.cc
File be/src/runtime/krpc-data-stream-recvr.cc:

http://gerrit.cloudera.org:8080/#/c/11021/2/be/src/runtime/krpc-data-stream-recvr.cc@238
PS2, Line 238:       VLOG_QUERY << "wait arrival fragment_instance_id="
VLOG_QUERY is on by default, so logging here once per wait would be very
noisy, no? I think we'd only want to log if the CV wait below has hit a
timeout.

Also, I don't know the context of this code quite well enough, but isn't it
normal to sometimes wait for minutes on a sender? For example, if the upstream
node is a full sort, or a join with a lot of slow parents, then the receiver
side may block for minutes or even hours before making progress. In that case,
I can see surfacing this kind of information in the profile or in some
query-scoped log, but maybe not in the global impalad log?

I don't think I understand the end goal of this JIRA well enough to evaluate
whether this change is a net help. Why doesn't the existing data_wait_timer
already tell us that this node is the blocking culprit? From there we can just
look at the fragment graph to understand what was blocking it.



--
To view, visit http://gerrit.cloudera.org:8080/11021
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I260a1d0a3477e5c6a46094e664500c3e2ed7de62
Gerrit-Change-Number: 11021
Gerrit-PatchSet: 2
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Comment-Date: Tue, 24 Jul 2018 23:36:23 +0000
Gerrit-HasComments: Yes
