Instead of setting spark.locality.wait globally, try setting the
per-level locality waits individually.

Namely, set spark.locality.wait.process to a high value, so that
process-local tasks are always scheduled whenever the task set has
process-local tasks.
Set spark.locality.wait.node and spark.locality.wait.rack to a low
value, so that when a task set has no process-local tasks, both
node-local and rack-local tasks are scheduled as soon as possible.
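
Something along these lines is a minimal sketch (the app name and the
concrete values are only placeholders, and the waits here are plain
millisecond values):

  import org.apache.spark.{SparkConf, SparkContext}

  val conf = new SparkConf()
    .setAppName("locality-wait-tuning")
    // Wait a long time before giving up on PROCESS_LOCAL scheduling, so
    // task sets that do have process-local tasks keep running them
    // process-local.
    .set("spark.locality.wait.process", "1800000")  // 30 minutes, in ms
    // Fall through NODE_LOCAL and RACK_LOCAL almost immediately, so task
    // sets without process-local tasks are scheduled as soon as possible.
    .set("spark.locality.wait.node", "1000")        // 1 second, in ms
    .set("spark.locality.wait.rack", "1000")        // 1 second, in ms

  val sc = new SparkContext(conf)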

From your description, this will alleviate the problem you mentioned.
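
To spell out why the 30-minute stall happens: delay scheduling only
relaxes a task set from its current locality level to the next one
after the configured wait for that level expires. A highly simplified
sketch of that behaviour (illustrative only; the real TaskSetManager
logic is more involved, and all names below are made up for the
example):

  object DelaySchedulingSketch {
    sealed trait Level
    case object ProcessLocal extends Level
    case object NodeLocal    extends Level
    case object RackLocal    extends Level
    case object AnyLevel     extends Level

    // waitMs maps a level to its configured wait, i.e. the values of
    // spark.locality.wait.{process,node,rack}.
    def nextAllowedLevel(current: Level, elapsedMs: Long,
                         waitMs: Level => Long): Level =
      current match {
        case ProcessLocal if elapsedMs >= waitMs(ProcessLocal) => NodeLocal
        case NodeLocal    if elapsedMs >= waitMs(NodeLocal)    => RackLocal
        case RackLocal    if elapsedMs >= waitMs(RackLocal)    => AnyLevel
        case other                                             => other
      }
  }

With spark.locality.wait.node at 30 minutes, the allowed level only
relaxes from NODE_LOCAL to RACK_LOCAL after that wait expires, so the
rack-local tasks sit idle; with a small node/rack wait they launch
almost immediately.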


Kay's comment, IMO, is slightly more general in nature - and I suspect
that unless we overhaul how preferred locality is specified, and allow
for task-set-specific scheduling hints, we can't resolve that.


Regards,
Mridul



On Thu, Nov 13, 2014 at 1:25 PM, MaChong <machon...@sina.com> wrote:
> Hi,
>
> We are running a time-sensitive application with 70 partitions, each about
> 800MB in size. The application first loads data from a database in a different
> cluster, then applies a filter, caches the filtered data, then applies a map
> and a reduce, and finally collects the results.
> The application finishes in 20 seconds if we set spark.locality.wait to a
> large value (30 minutes), but takes 100 seconds if we set spark.locality.wait
> to a small value (less than 10 seconds).
> We have analysed the driver log and found a lot of NODE_LOCAL and RACK_LOCAL
> level tasks; normally a PROCESS_LOCAL task only takes 15 seconds, but a
> NODE_LOCAL or RACK_LOCAL task takes 70 seconds.
>
> So we decided to set spark.locality.wait to a large value (30 minutes), which
> worked until we ran into this problem:
>
> Now our application loads data from HDFS in the same Spark cluster, so it
> gets NODE_LOCAL and RACK_LOCAL level tasks during the loading stage. If the
> tasks in the loading stage all have the same locality level, either NODE_LOCAL
> or RACK_LOCAL, it works fine.
> But if the tasks in the loading stage have mixed locality levels, such as 3
> NODE_LOCAL tasks and 2 RACK_LOCAL tasks, then the TaskSetManager of the
> loading stage submits the 3 NODE_LOCAL tasks as soon as resources are offered
> and then waits for spark.locality.wait.node, which we set to 30 minutes, so
> the 2 RACK_LOCAL tasks wait 30 minutes even though resources are available.
>
>
> Has anyone else met this problem? Do you have a nice solution?
>
>
> Thanks
>
>
>
>
> Ma chong

