[ https://issues.apache.org/jira/browse/IMPALA-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16905594#comment-16905594 ]

Michael Ho commented on IMPALA-8845:
------------------------------------

{quote} I haven't traced the issue exactly, but what I think is happening is 
that the MERGING-EXCHANGE operator in the coordinator fragment hits eos 
whenever it has received enough rows to reach the limit defined in the query, 
which could occur before the DATASTREAM SINK sends all the rows from the TopN / 
Scan Node fragment. {quote}

If I understand the above correctly, your observation is that the
Merging-Exchange has already been closed while the other fragment instance is
stuck in an RPC call. Usually, when the receiver is closed, it is put into a
"closed receiver cache". Incoming traffic will probe against this cache, notice
that the receiver is already closed, and short-circuit the reply to the
DataStreamSender, at which point the DataStreamSender should skip issuing the
RPC (see [code here|https://github.com/apache/impala/blob/master/be/src/runtime/krpc-data-stream-sender.cc#L410-L411]).
However, there is an expiration time (5 minutes) for entries in the cache, so
expired entries are eventually removed. Traffic arriving for that receiver
after its entry has expired may be stuck for {{--datastream_sender_timeout_ms}}
before returning with an error.
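
Roughly, the interaction looks like the sketch below. This is a simplified
illustration of the mechanism as described above, not the actual code in the
backend; names such as ClosedReceiverCache, OnTransmitData and RpcOutcome are
made up for illustration.

{code:cpp}
// Simplified sketch of the "closed receiver cache" interaction described
// above. Class and function names (ClosedReceiverCache, OnTransmitData, etc.)
// are illustrative, not Impala's actual identifiers.
#include <chrono>
#include <cstdint>
#include <unordered_map>

using Clock = std::chrono::steady_clock;
using ReceiverId = int64_t;

// Receivers that already hit eos (e.g. the limit was reached) and were closed
// are remembered for a bounded time (~5 minutes in Impala) so that late
// TransmitData() RPCs can be answered immediately.
class ClosedReceiverCache {
 public:
  explicit ClosedReceiverCache(std::chrono::seconds ttl) : ttl_(ttl) {}

  void MarkClosed(ReceiverId id) { closed_[id] = Clock::now() + ttl_; }

  // True if the receiver is known to be closed and the entry has not expired.
  bool IsClosed(ReceiverId id) {
    auto it = closed_.find(id);
    if (it == closed_.end()) return false;
    if (Clock::now() > it->second) {  // entry expired; forget it
      closed_.erase(it);
      return false;
    }
    return true;
  }

 private:
  std::chrono::seconds ttl_;
  std::unordered_map<ReceiverId, Clock::time_point> closed_;
};

// Outcome of an incoming TransmitData() RPC, per the description above.
enum class RpcOutcome {
  kAccepted,       // a live receiver consumed the row batch
  kAlreadyClosed,  // short-circuited reply; the sender skips further RPCs
  kParked          // receiver neither live nor cached (e.g. entry expired);
                   // the sender eventually fails after
                   // --datastream_sender_timeout_ms
};

RpcOutcome OnTransmitData(ClosedReceiverCache& cache, ReceiverId dest,
                          bool receiver_is_live) {
  if (receiver_is_live) return RpcOutcome::kAccepted;
  if (cache.IsClosed(dest)) return RpcOutcome::kAlreadyClosed;
  return RpcOutcome::kParked;
}
{code}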

That said, if the DataStreamSender manages to 



> Close ExecNode tree prior to calling FlushFinal in FragmentInstanceState
> ------------------------------------------------------------------------
>
>                 Key: IMPALA-8845
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8845
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Backend
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>            Priority: Major
>
> While testing IMPALA-8818, I found that IMPALA-8780 does not always cause all 
> non-coordinator fragments to shut down. In certain setups, TopN queries 
> ({{select * from [table] order by [col] limit [limit]}}) still keep 
> non-coordinator fragments alive even after all results are successfully spooled.
> The issue is that sometimes the {{DATASTREAM SINK}} for the TopN <-- Scan 
> Node fragment ends up blocked waiting for a response to a {{TransmitData()}} 
> RPC. This prevents the fragment from shutting down.
> I haven't traced the issue exactly, but what I *think* is happening is that 
> the {{MERGING-EXCHANGE}} operator in the coordinator fragment hits {{eos}} 
> whenever it has received enough rows to reach the limit defined in the query, 
> which could occur before the {{DATASTREAM SINK}} sends all the rows from the 
> TopN / Scan Node fragment.
> So the TopN / Scan Node fragments end up hanging until they are explicitly 
> closed.
> The fix is to close the {{ExecNode}} tree in {{FragmentInstanceState}} as 
> eagerly as possible. Moving the close call to before the call to 
> {{DataSink::FlushFinal}} fixes the issue. It has the added benefit that it 
> shuts down and releases all {{ExecNode}} resources as soon as it can. When 
> result spooling is enabled, this is particularly important because 
> {{FlushFinal}} might block until the consumer reads all rows.
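
For reference, the ordering change proposed in the description amounts to
something like the following simplified sketch. The types and names here are
toy stand-ins, not the real FragmentInstanceState / ExecNode / DataSink code.

{code:cpp}
// Toy illustration of the proposed ordering: close the exec-node tree before
// the sink's FlushFinal(), so a FlushFinal() that blocks on a slow consumer
// (e.g. with result spooling) no longer pins ExecNode resources or keeps
// upstream fragments alive. Names are illustrative, not Impala's classes.
#include <iostream>

struct ExecTree {
  bool GetNext(bool* eos) { *eos = true; return true; }  // pretend: one batch
  void Close() { std::cout << "exec tree closed, resources released\n"; }
};

struct DataSink {
  bool Send() { return true; }
  // With result spooling enabled, this may block until the consumer reads all rows.
  bool FlushFinal() { std::cout << "flushed final\n"; return true; }
};

bool ExecFragment(ExecTree& tree, DataSink& sink) {
  bool eos = false;
  while (!eos) {
    if (!tree.GetNext(&eos)) return false;
    if (!sink.Send()) return false;
  }
  // Proposed order: close the tree first...
  tree.Close();
  // ...then flush; even if this blocks, the fragment no longer holds ExecNode
  // resources or keeps the upstream TopN/scan fragment's sender waiting.
  return sink.FlushFinal();
}

int main() {
  ExecTree tree;
  DataSink sink;
  return ExecFragment(tree, sink) ? 0 : 1;
}
{code}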


