[ 
https://issues.apache.org/jira/browse/IMPALA-5555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig updated IMPALA-5555:
-----------------------------------
    Description: 
Hue added a feature where after a user runs a query in Impala, we check the 
Query Profile (from the ImpalaD Web UI) for the RowsProduced statistic (from 
the Coordinator Fragment) and report that back as the total rows returned.

We're noticing that for some long running queries, the RowsProduced will be 
incorrect (reporting 4 despite getting 198 rows) *right* after the query is 
complete, but will be correct a few seconds later (validated by checking the 
query profile manually). We discovered that by adding a latency of a few 
seconds, we can usually get the correct RowsProduced.
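
A minimal sketch of that latency workaround, written as a bounded poll instead of a single fixed sleep. Everything here is an assumption for illustration: `fetch_profile` is a hypothetical caller-supplied function that returns the profile text, and the `RowsProduced: 198 (198)` line shape is assumed, not confirmed. Two consecutive identical reads are only a heuristic and do not guarantee the value is final.

```python
import re
import time
from typing import Callable, Optional

# Assumed counter shapes: "RowsProduced: 198 (198)" or "RowsProduced: 4".
# When a parenthesized exact value is present, it is preferred.
_ROWS_PRODUCED = re.compile(r"RowsProduced:\s*(?:[\d.]+[KMB]?\s*\((\d+)\)|(\d+))")


def parse_rows_produced(profile_text: str) -> Optional[int]:
    """Return the first RowsProduced value found in the profile text, or None."""
    match = _ROWS_PRODUCED.search(profile_text)
    if match is None:
        return None
    return int(match.group(1) or match.group(2))


def poll_rows_produced(
    fetch_profile: Callable[[], str],  # hypothetical: returns the profile as text
    attempts: int = 5,
    delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[int]:
    """Re-read the profile until two consecutive reads agree (heuristic only)."""
    previous: Optional[int] = None
    for attempt in range(attempts):
        current = parse_rows_produced(fetch_profile())
        if current is not None and current == previous:
            return current
        previous = current
        if attempt < attempts - 1:
            sleep(delay_s)
    # Attempts exhausted: return the last value seen, which may still be stale.
    return previous
```

This still only bounds the wait; it cannot tell a stale value that happens to repeat from a final one, which is why an explicit signal in the profile would be preferable.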

But I was wondering if there's something smarter we can do, by checking either 
a value in the query profile itself, or somewhere else. We tried checking the 
hasResults value on the Thrift result handle as well as the status of the 
operation handle, but unfortunately these don't seem to have any effect (i.e., 
they can be True or SUCCESSFUL even though the query profile doesn't have the 
right RowsProduced number).

Can something be added to the Query Profile itself to indicate that the 
RowsProduced is correct?

EDIT: As documented in the discussion below, relying on profile finalization 
is not the right way to guarantee that the value of RowsProduced is final. It 
still makes sense, however, to add a profile finalization counter to indicate 
that the final update has been received from the last fragment.

  was:
Hue added a feature where after a user runs a query in Impala, we check the 
Query Profile (from the ImpalaD Web UI) for the RowsProduced statistic (from 
the Coordinator Fragment) and report that back as the total rows returned.

We're noticing that for some long running queries, the RowsProduced will be 
incorrect (reporting 4 despite getting 198 rows) *right* after the query is 
complete, but will be correct a few seconds later (validated by checking the 
query profile manually). We discovered that by adding a latency of a few 
seconds, we can usually get the correct RowsProduced.

But I was wondering if there's something smarter we can do, by checking either 
a value in the query profile itself, or somewhere else. We tried checking the 
hasResults value on the Thrift result handle as well as the status of the 
operation handle, but unfortunately these don't seem to have any effect (i.e. - 
they can be True or SUCCESSFUL even though the query profile doesn't have the 
right RowsProduced number).

Can something be added to the Query Profile itself to indicate that the 
RowsProduced is correct?

EDIT: Even though the original intent was to 


> Add timeline event to query profile to indicate that it is finished
> -------------------------------------------------------------------
>
>                 Key: IMPALA-5555
>                 URL: https://issues.apache.org/jira/browse/IMPALA-5555
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 2.8.0
>            Reporter: Jenny Kim
>            Assignee: Ethan
>            Priority: Major
>              Labels: observability, supportability
>
> Hue added a feature where after a user runs a query in Impala, we check the 
> Query Profile (from the ImpalaD Web UI) for the RowsProduced statistic (from 
> the Coordinator Fragment) and report that back as the total rows returned.
> We're noticing that for some long running queries, the RowsProduced will be 
> incorrect (reporting 4 despite getting 198 rows) *right* after the query is 
> complete, but will be correct a few seconds later (validated by checking the 
> query profile manually). We discovered that by adding a latency of a few 
> seconds, we can usually get the correct RowsProduced.
> But I was wondering if there's something smarter we can do, by checking 
> either a value in the query profile itself, or somewhere else. We tried 
> checking the hasResults value on the Thrift result handle as well as the 
> status of the operation handle, but unfortunately these don't seem to have 
> any effect (i.e., they can be True or SUCCESSFUL even though the query 
> profile doesn't have the right RowsProduced number).
> Can something be added to the Query Profile itself to indicate that the 
> RowsProduced is correct?
> EDIT: As documented in the discussion below, relying on profile finalization 
> is not the right way to guarantee that the value of RowsProduced is final. It 
> still makes sense, however, to add a profile finalization counter to indicate 
> that the final update has been received from the last fragment.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
