[
https://issues.apache.org/jira/browse/IMPALA-7551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16921735#comment-16921735
]
Sahil Takiar commented on IMPALA-7551:
--------------------------------------
I started digging deeper into this. A few things to note about this issue:
For any query with an exchange node in the Coordinator fragment, this shouldn't
be a major issue: IMPALA-924 changed the {{ExchangeNode}} so that {{Open}}
blocks until rows are actually available. IMPALA-924 also added some test
coverage for this in {{test_rows_availability.py}}, but it seems all the
queries there run with {{num_nodes=0}}, so there is no coverage for
{{num_nodes=1}}, which I think is where this issue was seen. That would make
sense, since there is no exchange node when {{num_nodes=1}}.
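To make the difference concrete, here is a toy model in Python, not Impala
code. All class and function names are hypothetical, the clock is simulated,
and the "exchange" simply wraps a local source in place of a remote sender. It
only illustrates why the time at which {{Open}} returns matches the time of the
first row when an exchange-like node is the root, and not otherwise.

```python
class SimClock:
    """Simulated clock so the example is deterministic (no real sleeping)."""
    def __init__(self):
        self.now = 0.0

    def advance(self, seconds):
        self.now += seconds


class SlowNode:
    """Hypothetical node: open() is cheap, all the work happens in get_next()."""
    def __init__(self, clock, seconds_to_first_row):
        self.clock = clock
        self.seconds_to_first_row = seconds_to_first_row

    def open(self):
        pass  # returns immediately, before any row exists

    def get_next(self):
        self.clock.advance(self.seconds_to_first_row)
        return ["row"]


class BlockingExchange:
    """Hypothetical stand-in for an exchange whose open() waits for rows."""
    def __init__(self, source):
        self.source = source
        self.buffered = None

    def open(self):
        self.source.open()
        # Block until the first batch has arrived, then hold on to it.
        self.buffered = self.source.get_next()

    def get_next(self):
        return self.buffered


def timeline(root, clock):
    """Return (time open() returned, time the first row was produced)."""
    root.open()
    open_returned_at = clock.now
    root.get_next()
    first_row_at = clock.now
    return open_returned_at, first_row_at


def run(with_exchange, seconds_to_first_row=3600.0):
    clock = SimClock()
    node = SlowNode(clock, seconds_to_first_row)
    root = BlockingExchange(node) if with_exchange else node
    return timeline(root, clock)
```

In this model, {{run(with_exchange=False)}} reports that {{open()}} returned an
hour before the first row existed, while {{run(with_exchange=True)}} reports
the same time for both.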
Another thing to note in the context of result spooling:
* The Coordinator emits a "First Batch Sent" event after
{{PlanRootSink::Send}} is called
** When result spooling is disabled, this is the time that the client actually
fetched the first batch
** When result spooling is enabled, this is the time that the first batch was
spooled
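As a rough sketch of the two meanings above (hypothetical function and
parameter names, not the actual Coordinator logic), the timestamp of the event
could be modeled like this, assuming that without spooling {{Send}} does not
complete until the client fetches:

```python
def first_batch_sent_time(batch_ready_at, client_fetch_at, spooling_enabled):
    """Toy model of when the "First Batch Sent" event would be recorded.

    batch_ready_at:  time the first batch is produced (seconds, assumed)
    client_fetch_at: time the client issues its first fetch (seconds, assumed)
    """
    if spooling_enabled:
        # The batch is spooled as soon as it is ready; no client is involved.
        return batch_ready_at
    # Assumption: without spooling, the send completes only once the batch
    # exists and the client has fetched it.
    return max(batch_ready_at, client_fetch_at)
```

With a batch ready at 2s and a client that fetches at 10s, this model gives 2s
with spooling and 10s without, so the same event name describes two different
moments.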
I'm not sure if there is any documentation about query timeline events, but I
think we should add some, especially if the meaning changes depending on the
Impala configuration.
> Inaccurate timeline for "Rows Available"
> -----------------------------------------
>
> Key: IMPALA-7551
> URL: https://issues.apache.org/jira/browse/IMPALA-7551
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Affects Versions: Impala 3.1.0
> Reporter: Pooja Nilangekar
> Assignee: Sahil Takiar
> Priority: Major
> Labels: observability, query-lifecycle, ramp-up
>
> While debugging IMPALA-6932, it was noticed that the "Rows Available" metric
> in the query profile was a short duration (~ 1 second) for a long running
> limit 1 query (~ 1 hour).
> Currently, it tracks when Open() on the top-most node in the plan returns,
> not when the first row is actually produced. This can be misleading. A more
> accurate timeline would record the event when the first non-empty batch is
> added to the PlanRootSink.
> We should consider changing the definition of the FINISHED state accordingly
> as well, so that we don't transition to FINISHED until a row is actually
> available to fetch immediately.