[ 
https://issues.apache.org/jira/browse/IMPALA-8803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16895456#comment-16895456
 ] 

Sahil Takiar commented on IMPALA-8803:
--------------------------------------

CC: [[email protected]], [~kwho]

A few other points:
* Wondering whether this is useful outside of the result spooling context; 
there may be queries where certain backends complete earlier than others. My 
first guess would be hash joins: the dimension table scan of a hash join should 
complete earlier than the rest of the query, assuming the dimension table scan 
fragment is running on its own backend.
* Considering the following approach for batching:
** Batching will be driven by the completion of {{BackendStates}}: the 
{{Coordinator}} will internally buffer a list of {{BackendStates}} that need to 
be released. When each backend completes, the {{Coordinator}} will check a set 
of conditions and, if those conditions are met, release the admitted memory for 
all buffered {{BackendStates}}
*** This is in contrast to adding a blocking queue of backends that need to be 
released, and adding a scheduled thread to periodically read from the queue and 
release the resources of any buffered backends
** Considering the following two conditions that would trigger the coordinator 
to release all of its buffered {{BackendStates}} (both would need to be true to 
trigger a call to admission control):
*** If "x" backends are buffered, this condition returns true; "x" is initially 
set to num backends / 2 and decays exponentially, so even for a query running 
on 1000 nodes there is an upper limit of {{log(num backends)}} calls to the 
admission controller
*** If "y" milliseconds have passed since the last time we released the 
buffered backends, this condition returns true ("y" can initially be set to 
1000); this avoids extra overhead for queries that complete relatively quickly 
and so may not benefit much from result spooling
** If a query is cancelled or hits an error, all remaining running backends 
will be released at once
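
To make the batching proposal above concrete, here is a rough sketch in Python (not Impala code). {{BatchedReleaser}}, {{release_fn}}, and the method names are hypothetical, and it assumes one particular reading of "exponentially decaying": "x" is half of the backends whose memory has not been released yet, rounded up. The real {{Coordinator}} may well end up looking different.

```python
from typing import Callable, List


class BatchedReleaser:
    """Hypothetical stand-in for the Coordinator's buffering of completed backends."""

    def __init__(self, num_backends: int,
                 release_fn: Callable[[List[str]], None],
                 min_interval_ms: float = 1000.0,
                 start_ms: float = 0.0) -> None:
        self.unreleased = num_backends          # backends whose memory is still admitted
        self.release_fn = release_fn            # stand-in for the admission control call
        self.min_interval_ms = min_interval_ms  # "y" from the comment above
        self.last_release_ms = start_ms
        self.buffered: List[str] = []           # completed, but not yet released

    def threshold(self) -> int:
        # "x": assumed to be half of the still-unreleased backends, rounded up,
        # so it shrinks geometrically as batches are released.
        return max(1, (self.unreleased + 1) // 2)

    def on_backend_complete(self, backend: str, now_ms: float) -> None:
        self.buffered.append(backend)
        enough_buffered = len(self.buffered) >= self.threshold()
        enough_time = now_ms - self.last_release_ms >= self.min_interval_ms
        # Both conditions must hold before calling into admission control.
        if enough_buffered and enough_time:
            self._flush(now_ms)

    def on_cancel_or_error(self, running: List[str], now_ms: float) -> None:
        # Cancellation or error: release everything that is left in one call.
        self.buffered.extend(running)
        if self.buffered:
            self._flush(now_ms)

    def _flush(self, now_ms: float) -> None:
        batch = list(self.buffered)
        self.buffered.clear()
        self.unreleased -= len(batch)
        self.last_release_ms = now_ms
        self.release_fn(batch)
```

Under this reading, every release covers at least half of the backends that are still unreleased, which is where the logarithmic bound on admission controller calls comes from. A query that finishes within "y" milliseconds never triggers a batched release and falls through to the single release at the end.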

> Coordinator should release admitted memory per-backend rather than per-query
> ----------------------------------------------------------------------------
>
>                 Key: IMPALA-8803
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8803
>             Project: IMPALA
>          Issue Type: Sub-task
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>            Priority: Major
>
> When {{SPOOL_QUERY_RESULTS}} is true, the coordinator backend may be long 
> lived, even though all other backends for the query have completed. 
> Currently, the Coordinator only releases admitted memory when the entire 
> query has completed (including the coordinator fragment) - 
> https://github.com/apache/impala/blob/72c9370856d7436885adbee3e8da7e7d9336df15/be/src/runtime/coordinator.cc#L562
> In order to more aggressively return admitted memory, the coordinator should 
> release memory when each backend for a query completes, rather than waiting 
> for the entire query to complete.
> Releasing memory per backend should be batched because releasing admitted 
> memory in the admission controller requires obtaining a global lock and 
> refreshing the internal stats of the admission controller. Batching will help 
> mitigate any additional overhead from releasing admitted memory per backend.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

