Hi Mridul,
I tried your method, and it works fine for this case. I set locality.process to 30 minutes, and locality.node and locality.rack to 3 seconds. The loading stage's RACK-level tasks were submitted after a 3-second wait, and after the loading stage only PROCESS-level tasks were submitted.
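For reference, the settings described above might look roughly like this in spark-defaults.conf. This is only a sketch: it assumes the full property names are spark.locality.wait.process, spark.locality.wait.node and spark.locality.wait.rack, and that values are given in milliseconds, as older Spark releases expect (newer releases also accept time strings such as 30min or 3s).

```properties
# Assumed spark-defaults.conf fragment; values are in milliseconds.
# Wait up to 30 minutes for a process-local slot before falling back.
spark.locality.wait.process  1800000
# Wait 3 seconds at the node-local and rack-local levels.
spark.locality.wait.node     3000
spark.locality.wait.rack     3000
```

With the long process-level wait, tasks that have cached data are held back until a process-local slot frees up, while the short node and rack waits let the loading stage, which has no cached data yet, start after a few seconds.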
Hi Kay,
Our case is exactly the same as what you describe. I agree that this is an issue that needs to be fixed. In my opinion, locality.wait is a simple way to handle the locality problem, but it is not good enough. In fact, I always disable the timeout in my application to get better performance. Every time I set the timeout to a lower value, it runs into a loop: some tasks slow -> locality wait timeout -> tasks run with poor locality -> network and memory overload -> more tasks become slow. So I think a better locality strategy may be needed, for example one determined by cache size or by the locality that is actually usable.
Thanks for your help, guys.
--Ma Chong
--------------------------------
----- Original Message -----
From: Mridul Muralidharan <mri...@gmail.com>
To: Kay Ousterhout <k...@eecs.berkeley.edu>
Cc: MaChong <machon...@sina.com>, dev <dev@spark.apache.org>
Subject: Re: Problems with spark.locality.wait
Date: November 14, 2014, 04:53