[
https://issues.apache.org/jira/browse/IMPALA-5397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681952#comment-16681952
]
Tim Armstrong commented on IMPALA-5397:
---------------------------------------
[~tmgstev] the topic of this JIRA is purely about adjusting the duration
calculation to produce more useful numbers in this scenario, but you're asking
for something else.
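To make "adjusting the duration calculation" concrete, here is a rough sketch of one possible approach. It is not Impala's actual code: the function name and the `end` and `last_active` parameters are hypothetical, and it assumes the server records when the client last did something with the query.

```python
from datetime import datetime, timedelta
from typing import Optional


def reported_duration(start: datetime,
                      now: datetime,
                      end: Optional[datetime] = None,
                      last_active: Optional[datetime] = None) -> timedelta:
    """Hypothetical duration calculation (an assumption, not Impala's code).

    end:         when the query was closed or unregistered, if it has been.
    last_active: when the client last touched the query (for example,
                 when it fetched the final row), if that is known.
    """
    if end is not None:
        # Query was closed: the duration is fixed.
        return end - start
    if last_active is not None:
        # Query is still open but idle: stop the clock at the last activity
        # instead of letting it grow for as long as the session stays alive.
        return last_active - start
    # No activity recorded yet: fall back to wall-clock time since start.
    return now - start
```

With the numbers from the description, a query that returned its last row 1s640ms after starting would keep reporting 1s640ms under this sketch, rather than 4m32s, for as long as it is left open.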
I'm generally opposed to adding more band-aids to this area of the code (like
additional layers of timeouts) instead of fixing the actual problem - adding
more complexity is going to make it more confusing rather than less. We do plan
to work on some of the underlying problems but there's not a quick fix. There's
really only so much we can do if clients don't close queries or sessions when
they're done with them. Unregistering things on the server side automatically
is pretty disruptive because, from the client's point of view, the query just
disappeared and it can't report exactly what happened or clean up in a
controlled way. If that's what you want, --idle_session_timeout does achieve
that.
So ultimately we want to:
* make improvements to be more robust to slow or misbehaving clients (keeping
queries open but reducing the amount of resources that are retained by the
queries) - IMPALA-1575, IMPALA-4268, etc
* continue to clean up query lifecycle behaviour
* get clients to close queries promptly when they're done with the results.
This will be easier with query lifecycle improvements since we'll be able to
better pin-point queries that were left dangling by the client.
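As an illustration of the last point, this is the client-side pattern I have in mind, sketched with Python's generic DB-API. `sqlite3` is only a stand-in so the example runs anywhere, and `count_rows` is a made-up helper; the assumption is that an Impala DB-API driver maps `cursor.close()` to closing the query on the server.

```python
import sqlite3
from contextlib import closing


def count_rows(conn, table):
    # closing() guarantees cur.close() runs as soon as we are done with the
    # results, even if execute() or fetchone() raises.
    with closing(conn.cursor()) as cur:
        cur.execute(f"SELECT COUNT(*) FROM {table}")
        return cur.fetchone()[0]


# Stand-in usage: the connection is also closed promptly when the block ends.
with closing(sqlite3.connect(":memory:")) as conn:
    conn.execute("CREATE TABLE inventory (id INTEGER)")
    conn.executemany("INSERT INTO inventory VALUES (?)", [(1,), (2,), (3,)])
    print(count_rows(conn, "inventory"))
```

The point is only that cleanup is tied to the scope that consumes the results, so nothing is left dangling if the caller forgets an explicit close.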
If you're seeing a problem in a specific scenario with a specific client I'd
encourage you to file a bug against that client (or explain the scenario here)
and we can try to improve that behaviour or get it fixed in the client. We've
seen behaviour in various clients that leads to problems like this and is a
clear-cut bug.
Idle query timeout also does exactly what it says in the docs as far as I can
see - it cancels the queries. Do you have an example where docs say otherwise?
> Queries/sessions that are left idle after executing a query report incorrect
> duration
> --------------------------------------------------------------------------------------
>
> Key: IMPALA-5397
> URL: https://issues.apache.org/jira/browse/IMPALA-5397
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 2.9.0
> Reporter: Mostafa Mokhtar
> Priority: Major
> Labels: query-lifecycle
>
> When queries are executed from Hue and the session is then left idle, an
> incorrect query duration is reported.
> Because the session stays alive, the query duration keeps going up even
> though the query state is FINISHED.
> The queries below finished in 1s640ms, but the reported duration is much
> longer.
> |User||Default Db||Statement||Query Type||Start Time||Waiting Time||Duration||Scan Progress||State||Last Event||# rows fetched||Resource Pool||Details||Action|
> |hue/[email protected]|tpcds_1000_parquet|select count(*) from tpcds_1000_parquet.inventory|QUERY|2017-05-31 09:38:20.472804000|4m27s|4m32s|261 / 261 ( 100%)|FINISHED|First row fetched|1|root.default|Details|Close|
> |hue/[email protected]|tpcds_1000_parquet|select count(*) from tpcds_1000_parquet.inventory|QUERY|2017-05-31 08:38:52.780237000|2017-05-31 09:38:20.289582000|59m27s|261 / 261 ( 100%)|FINISHED|1|root.default|Details|