[
https://issues.apache.org/jira/browse/IMPALA-6214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564064#comment-16564064
]
Lars Volker commented on IMPALA-6214:
-------------------------------------
One place for this check could be in or around the loop in
FragmentInstanceState::ExecInternal(). We need to come up with a way to expose
this on a per-query basis, either on a status page or (preferably) in the
profile, so that other tools can parse this more easily. In addition, we should
think about how we can prevent false positives, i.e. only report an instance
that is blocking, but is not itself blocked by any of its inputs.
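
As a rough illustration of the false-positive filter described above, here is a
minimal sketch in Python. It is not Impala code: the Instance record, its
"stalled" flag, and the root_stalled() helper are all hypothetical names made
up for this example, and it assumes that some other mechanism has already
decided which instances stopped producing rows.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    # Hypothetical record for illustration; not an Impala data structure.
    name: str
    stalled: bool  # assumed: no rows produced during the last check interval
    inputs: list = field(default_factory=list)  # names of upstream instances


def root_stalled(instances):
    """Return names of stalled instances none of whose inputs are stalled."""
    by_name = {inst.name: inst for inst in instances}
    roots = []
    for inst in instances:
        if not inst.stalled:
            continue
        # An instance waiting on a stalled input is a victim, not a cause.
        blocked_by_input = any(
            by_name[n].stalled for n in inst.inputs if n in by_name
        )
        if not blocked_by_input:
            roots.append(inst.name)
    return roots


# Made-up example: a scan feeds a join build, which feeds the probe side.
instances = [
    Instance("scan_a", stalled=True),
    Instance("build_join", stalled=True, inputs=["scan_a"]),
    Instance("probe_join", stalled=True, inputs=["build_join"]),
    Instance("scan_b", stalled=False),
]
print(root_stalled(instances))  # only scan_a is reported
```

With this filter, the two join instances are not reported even though they
make no progress, because each of them has a stalled input.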
> Determine and warn about stuck fragment instances
> -------------------------------------------------
>
> Key: IMPALA-6214
> URL: https://issues.apache.org/jira/browse/IMPALA-6214
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Reporter: Lars Volker
> Assignee: Pranay Singh
> Priority: Major
> Labels: observability, supportability
>
> It would be great to have a programmatic way to determine if a fragment
> instance is hung by checking if it’s producing rows periodically. A fragment
> instance can appear to be not making progress because its input operator /
> fragment may be hung (e.g. the probe side of a join will not be able to make
> much progress until the build side is done and the build side itself could be
> another chain of joins). It'd be much easier to resolve this dependency chain
> programmatically to find the root of the cascade of delay.
> Details of the algorithm are still unclear. It may be easier to include exec node
> states in query profile and analyze those, but this probably requires taking
> multiple snapshots of the query profiles over time.
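
The snapshot idea in the description could look roughly like the sketch below.
It is only an illustration under stated assumptions: each snapshot is assumed
to be a plain dict mapping a fragment instance name to its rows-produced
counter, and stalled_between() is a hypothetical helper, not an existing
Impala or profile API.

```python
def stalled_between(prev, curr, finished=()):
    """Return sorted names whose row counter did not advance between snapshots.

    prev, curr: dicts of instance name -> rows produced so far (assumed shape).
    finished:   names of instances known to be done; they are never reported.
    """
    done = set(finished)
    return sorted(
        name
        for name, rows in curr.items()
        if name in prev and name not in done and rows == prev[name]
    )


# Made-up counters taken at two points in time.
snapshot_t0 = {"F0": 100, "F1": 5000, "F2": 0}
snapshot_t1 = {"F0": 100, "F1": 7200, "F2": 0}

print(stalled_between(snapshot_t0, snapshot_t1))
print(stalled_between(snapshot_t0, snapshot_t1, finished=["F0"]))
```

Comparing only two snapshots is the simplest case; a real check would likely
need a longer window so that slow but healthy instances are not flagged.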
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)