We checked response times while running a sample query 50 times
concurrently, and we see that the reduce operation (as shown in the Shark
Stages dashboard) runs on the same machine every time, even when that
machine is already busy. We think better performance for concurrent
queries could be achieved by selecting a Spark worker at random instead
of in round-robin fashion.

The resourceOffers code in ClusterScheduler.scala seems to be what causes
the round-robin assignment of tasks. We would like to hear whether we are
headed in the right direction.

-- 
Thanks,
Praveen R
