[ https://issues.apache.org/jira/browse/IMPALA-6818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16522500#comment-16522500 ]
Dan Hecht commented on IMPALA-6818:
-----------------------------------
I forgot that the UpdateStatus() RPC gets a CANCELLED response after all
results are returned, so in all cases the backend will ultimately be
cancelled. So I don't think this is blocked. Actually, maybe we should always
respond to UpdateStatus() with CANCELLED once query execution has terminated,
whatever the reason. That way, if a cancel RPC is dropped, the backend will
still cancel the next time it tries to send a report. I've filed IMPALA-7205
for this improvement.

IMPALA-5119 is more about avoiding work in the RPC handler itself, so I don't
think it's relevant here. Maybe you were thinking of IMPALA-6984, but I don't
think that's blocking either, for the reason stated above.
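
To make the proposal above concrete, here is a minimal Python sketch of the
idea: answer every status report with CANCELLED once the query has
terminated, so a backend that missed the cancel RPC still stops on its next
report. This is a toy model, not Impala code. The Coordinator and Backend
classes, the Status enum, and the method names are all hypothetical, and it
assumes the coordinator is the side answering the report.

```python
from enum import Enum


class Status(Enum):
    OK = "OK"
    CANCELLED = "CANCELLED"


class Coordinator:
    """Hypothetical coordinator; tracks only whether the query has terminated."""

    def __init__(self):
        self.terminated = False

    def terminate(self):
        # Query finished, failed, or was cancelled; the reason does not matter.
        self.terminated = True

    def update_status(self, backend_id):
        # Proposed rule: any report arriving after termination gets CANCELLED.
        return Status.CANCELLED if self.terminated else Status.OK


class Backend:
    """Hypothetical backend that sends periodic status reports."""

    def __init__(self, backend_id, coordinator):
        self.backend_id = backend_id
        self.coordinator = coordinator
        self.cancelled = False

    def send_report(self):
        # Cancel locally if the coordinator answers the report with CANCELLED.
        if self.coordinator.update_status(self.backend_id) is Status.CANCELLED:
            self.cancelled = True
        return self.cancelled


coord = Coordinator()
backend = Backend("be-1", coord)
backend.send_report()  # query still running, so the backend keeps going
coord.terminate()      # the explicit cancel RPC is assumed to be lost here
backend.send_report()  # the next report is answered with CANCELLED
```

In this model the backend never needs to receive the cancel RPC itself; the
response to its own report is enough to stop it.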
> Rethink data-stream sender/receiver startup sequencing
> ------------------------------------------------------
>
> Key: IMPALA-6818
> URL: https://issues.apache.org/jira/browse/IMPALA-6818
> Project: IMPALA
> Issue Type: Sub-task
> Components: Distributed Exec
> Reporter: Dan Hecht
> Assignee: Michael Ho
> Priority: Major
>
> IMPALA-1599 introduced parallel fragment startup, which is good for startup
> latency. However, it meant that data-stream senders can start before
> receivers, and there is a timeout to handle the case when the receiver never
> shows up:
> {code:java}
> Sender timed out waiting for receiver fragment instance{code}
> We see this timeout fairly regularly (e.g. when a host has a spike in load
> and does not process the exec RPC for a while). Let's rethink how this works
> to see if we can make it robust while being careful not to sacrifice startup
> time too much.
>