Adriano created IMPALA-7496:
-------------------------------

             Summary: Schedule query taking in account the mem available on the 
impalad nodes
                 Key: IMPALA-7496
                 URL: https://issues.apache.org/jira/browse/IMPALA-7496
             Project: IMPALA
          Issue Type: New Feature
          Components: Backend
            Reporter: Adriano


Environment description: cluster scale (50/100/150 nodes and terabyte of ram 
available) - Admission Control enabled.

Issue description:
Despite the coordinator chosen (with data and statistics unchanged) a query 
will be planned always in the same way based on the metainfo that the 
coordinator have.

The query will be scheduled always on the same nodes if the memory requirements 
for the admission are satisfied:
https://github.com/cloudera/Impala/blob/cdh5-2.7.0_5.9.1/be/src/scheduling/admission-controller.cc#L307-L333

Equal queries are planned/scheduled always in the same way (to hit always the 
same nodes). 
This often lead to queue the queries that are hitting the same nodes are queued 
(not admitted) as on those nodes there's no more memory available within its 
process limit despite the pool have lot of free memory and the overall cluster 
load is low.

When the plan is finished and the query can be evaluated to be admitted often 
happen that the admission is denied because one of the node have not enough 
memory to run the query operation (and the query is moved in the pool queue) 
despite the cluster have 50/100/150 nodes and terabyte of ram available.

Why the scheduler does not take in consideration the memory available on the 
nodes involved in the query before to buid the schedule, (maybe preferring a 
remote read/operation on a free memory node instead to include in the plan 
always the same nodes that will end to be:
1- overloaded
2- the query will be not immediately admitted, risking to be timedout in the 
pool queue

Since 2.7 the REPLICA_PREFERENCE can possibly help, but it's not good enough as 
it does not prevent the scheduler to choose busy nodes (with the same potential 
effect: query queued for lack of resource on specific node despite there are 
terabytes of free memory).


Feature Request:
It would be good if Impala had an option to execute queries (even with worse 
performance) excluding the nodes overloaded and including different nodes in 
order to get the query immediately admitted and executed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to