[ 
https://issues.apache.org/jira/browse/IMPALA-9225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated IMPALA-9225:
---------------------------------
    Description: 
If query retries are enabled, a query should not return any results to the 
client until all results are spooled. The issue is that once a query starts 
returning results, retrying the query becomes increasingly complex and is not 
supported in the initial version of IMPALA-9124. Retrying a query while 
returning results could cause incorrect results, especially for 
non-deterministic queries (e.g. when the results are not ordered).

Since a query can fail at any time while results are being produced, transparent 
retries are most effective if they can happen at any point during query 
execution.

The one edge case is what happens if all query results cannot be contained in 
the allocated result spooling memory (including unpinned memory). In this case, 
retries for the query should be transparently disabled.

We should consider making this configurable, in case it leads to performance 
degradation. That said, I'm inclined to turn the flag on by default (i.e. always 
spool all results before returning them); otherwise, depending on the query, 
query retries won't always be helpful.
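
A rough sketch of the intended behaviour, in Python rather than Impala's own 
code, may help make the cases concrete. Everything named here is hypothetical 
and chosen only for illustration: QueryFailed, fetch_results, the spool_all 
flag, capacity_rows (a row count standing in for the result spooling memory 
limit) and the flaky_query demo helper. It is a toy model of the proposal 
under those assumptions, not the actual implementation.

```python
class QueryFailed(Exception):
    """Hypothetical stand-in for a retryable execution failure."""


def fetch_results(execute, capacity_rows, spool_all=True, max_attempts=2):
    """Yield rows to the client, retrying only while nothing has been returned.

    execute       -- callable that starts the query and returns an iterable of rows
    capacity_rows -- toy stand-in for the result spooling memory limit
    spool_all     -- hypothetical flag: spool every row before returning any
    """
    attempt = 0
    while True:
        attempt += 1
        rows = iter(execute())
        spool = []
        overflowed = False
        try:
            for row in rows:
                spool.append(row)
                # Spool is full (or full spooling is off): stop buffering.
                if not spool_all or len(spool) > capacity_rows:
                    overflowed = True
                    break
        except QueryFailed:
            if attempt < max_attempts:
                continue  # safe to retry: the client has seen no rows yet
            raise
        # From here on rows reach the client, so retries are disabled.
        yield from spool
        if overflowed:
            yield from rows  # a failure here propagates to the client
        return


def flaky_query(fail_after, total):
    """Demo helper: the first execution fails after `fail_after` rows."""
    calls = {"n": 0}

    def execute():
        calls["n"] += 1
        first = calls["n"] == 1

        def gen():
            for i in range(total):
                if first and i == fail_after:
                    raise QueryFailed("simulated failure")
                yield (i,)

        return gen()

    return execute, calls
```

With a spool large enough for the whole result set, the simulated failure is 
hidden by a retry and the client sees only the rows of the second attempt. 
With a spool that is too small, rows start flowing to the client once it 
overflows, and a later failure surfaces as an error instead of a retry.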

  was:
If query retries are enabled, a query should not return any results to the 
client until all results are spooled. The issue is that once a query starts 
returning results, retrying the query becomes increasingly complex and is not 
supported in the initial version of IMPALA-9124. Retrying a query while 
returning results could cause incorrect results, especially for 
non-deterministic queries (e.g. when the results are not ordered).

Since a query can fail anytime while results are being produced, transparent 
retries are most effective if they can be done during any point of query 
execution.

The one edge case is what happens if all query results cannot be contained in 
the allocated result spooling memory (including unpinned memory). In this case, 
retries for the query should be transparently disabled.

We should consider making this configurable, in case it leads to performance 
degradation. Although, I'm included to turn the flag on by default (e.g. always 
spool all returns before returning them), otherwise (depending on the query) 
query retries won't always be helpful.


> Retryable queries should spool all results before returning any to the client
> -----------------------------------------------------------------------------
>
>                 Key: IMPALA-9225
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9225
>             Project: IMPALA
>          Issue Type: Sub-task
>            Reporter: Sahil Takiar
>            Priority: Major
>
> If query retries are enabled, a query should not return any results to the 
> client until all results are spooled. The issue is that once a query starts 
> returning results, retrying the query becomes increasingly complex and is not 
> supported in the initial version of IMPALA-9124. Retrying a query while 
> returning results could cause incorrect results, especially for 
> non-deterministic queries (e.g. when the results are not ordered).
> Since a query can fail at any time while results are being produced, transparent 
> retries are most effective if they can happen at any point during query 
> execution.
> The one edge case is what happens if all query results cannot be contained in 
> the allocated result spooling memory (including unpinned memory). In this 
> case, retries for the query should be transparently disabled.
> We should consider making this configurable, in case it leads to performance 
> degradation. That said, I'm inclined to turn the flag on by default (i.e. 
> always spool all results before returning them); otherwise, depending on the 
> query, query retries won't always be helpful.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)