[ https://issues.apache.org/jira/browse/IMPALA-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906607#comment-16906607 ]

Michael Ho commented on IMPALA-8845:
------------------------------------

I spent more time looking into the issue after managing to reproduce it 
locally. The problem is that once the merging-exchange hits eos, it stops 
dequeuing from the data stream receiver. At that point some of the senders' 
batches may have been placed in the receiver's deferred queue because the 
receiver's capacity limit was reached. Since the merging-exchange has stopped 
dequeuing, those deferred batches are never dequeued until the receiver is 
cancelled and closed, so the sender blocks in the {{TransmitData()}} RPC 
forever.
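
To make the failure mode concrete, here is a minimal standalone sketch of the 
interaction. All names ({{Receiver}}, {{TransmitData()}}, {{GetBatch()}}) are 
hypothetical stand-ins and the queueing is greatly simplified; this is not 
Impala's actual code:

{code:cpp}
// Standalone sketch of the hang; all names are hypothetical stand-ins.
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <iostream>
#include <mutex>
#include <thread>

// Stand-in for the data stream receiver: batches beyond 'capacity' are parked
// in a deferred queue, and the sender's TransmitData() RPC is not answered
// until its batch leaves that queue (or the receiver is closed).
class Receiver {
 public:
  explicit Receiver(std::size_t capacity) : capacity_(capacity) {}

  // Sender side: models one TransmitData() RPC. Blocks while the batch is
  // deferred, i.e. the RPC response is withheld.
  void TransmitData(int batch) {
    std::unique_lock<std::mutex> l(lock_);
    if (ready_.size() < capacity_) {
      ready_.push_back(batch);
      return;  // RPC answered immediately.
    }
    deferred_.push_back(batch);
    std::cout << "sender: batch " << batch << " deferred, RPC pending\n";
    // Deferred batches are only promoted by GetBatch() or released by Close().
    cv_.wait(l, [&] { return closed_ || Promoted(batch); });
    std::cout << "sender: RPC for batch " << batch << " answered\n";
  }

  // Consumer side: the exchange node dequeues one batch; a deferred batch, if
  // any, is promoted and its pending RPC answered.
  bool GetBatch(int* batch) {
    std::unique_lock<std::mutex> l(lock_);
    if (ready_.empty()) return false;
    *batch = ready_.front();
    ready_.pop_front();
    if (!deferred_.empty()) {
      ready_.push_back(deferred_.front());
      deferred_.pop_front();
      cv_.notify_all();
    }
    return true;
  }

  // Cancelling/closing the receiver answers all pending RPCs.
  void Close() {
    std::lock_guard<std::mutex> l(lock_);
    closed_ = true;
    cv_.notify_all();
  }

 private:
  // True once 'batch' is no longer sitting in the deferred queue.
  bool Promoted(int batch) {
    for (int b : deferred_) {
      if (b == batch) return false;
    }
    return true;
  }

  const std::size_t capacity_;
  std::mutex lock_;
  std::condition_variable cv_;
  std::deque<int> ready_, deferred_;
  bool closed_ = false;
};

int main() {
  Receiver recvr(/*capacity=*/1);
  std::thread sender([&] {
    for (int b = 0; b < 3; ++b) recvr.TransmitData(b);
  });

  // Merging-exchange stand-in: reaches its limit (eos) after a single batch
  // and stops dequeuing, so a later deferred batch's RPC stays pending.
  int batch;
  while (!recvr.GetBatch(&batch)) {}  // consume one batch, then hit "eos"
  std::this_thread::sleep_for(std::chrono::seconds(1));  // sender is stuck

  recvr.Close();  // only receiver teardown unblocks the sender
  sender.join();
  return 0;
}
{code}

In the sketch, the consumer stops calling {{GetBatch()}} after one batch 
(modelling eos), so a deferred batch sits in the queue with its RPC pending 
until {{Close()}} runs, which is exactly the behavior observed above.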

So it's not an instance of IMPALA-3990, as far as I can tell.

> Close ExecNode tree prior to calling FlushFinal in FragmentInstanceState
> ------------------------------------------------------------------------
>
>                 Key: IMPALA-8845
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8845
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Backend
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>            Priority: Major
>
> While testing IMPALA-8818, I found that IMPALA-8780 does not always cause all 
> non-coordinator fragments to shut down. In certain setups, TopN queries 
> ({{select * from [table] order by [col] limit [limit]}}) keep non-coordinator 
> fragments alive even after all results have been successfully spooled.
> The issue is that the {{DATASTREAM SINK}} of the TopN <-- Scan Node fragment 
> sometimes ends up blocked waiting for a response to a {{TransmitData()}} RPC, 
> which prevents the fragment from shutting down.
> I haven't traced the issue exactly, but what I *think* is happening is that 
> the {{MERGING-EXCHANGE}} operator in the coordinator fragment hits {{eos}} as 
> soon as it has received enough rows to reach the limit defined in the query, 
> which can happen before the {{DATASTREAM SINK}} has sent all the rows from the 
> TopN / Scan Node fragment.
> So the TopN / Scan Node fragments end up hanging until they are explicitly 
> closed.
> The fix is to close the {{ExecNode}} tree in {{FragmentInstanceState}} as 
> eagerly as possible: moving the close call to before the call to 
> {{DataSink::FlushFinal}} fixes the issue (see the sketch below). It has the 
> added benefit of shutting down and releasing all {{ExecNode}} resources as 
> early as possible. This is particularly important when result spooling is 
> enabled, because {{FlushFinal}} might block until the consumer reads all rows.
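> To illustrate the reordering, here is a minimal compilable sketch using 
> hypothetical stand-in types ({{ExecNodeTree}}, {{DataSink}}, 
> {{ExecFragment}}); the real {{FragmentInstanceState}} code has different 
> signatures:
> {code:cpp}
> // Hypothetical stand-ins; these only illustrate the ordering change, not
> // Impala's actual classes.
> struct ExecNodeTree {
>   void Close() { /* release TopN/scan resources */ }
> };
> struct DataSink {
>   // With result spooling, this may block until the consumer reads all rows.
>   void FlushFinal() {}
>   void Close() {}
> };
>
> void ExecFragment(ExecNodeTree* tree, DataSink* sink) {
>   // ... all row batches have already been pushed into the sink here ...
>   // Old order: sink->FlushFinal() ran first, so a blocked flush kept the
>   // whole ExecNode tree (and its resources) alive.
>   tree->Close();       // the fix: release ExecNode resources eagerly
>   sink->FlushFinal();  // may still block, but holds no exec-tree resources
>   sink->Close();
> }
>
> int main() {
>   ExecNodeTree tree;
>   DataSink sink;
>   ExecFragment(&tree, &sink);
>   return 0;
> }
> {code}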


