[GitHub] [spark] cloud-fan commented on issue #26696: [WIP][SPARK-18886][CORE] Only reset scheduling delay timer if allocated slots are fully utilized

2019-12-03 Thread GitBox
cloud-fan commented on issue #26696: [WIP][SPARK-18886][CORE] Only reset 
scheduling delay timer if allocated slots are fully utilized
URL: https://github.com/apache/spark/pull/26696#issuecomment-561220688
 
 
   The per-slot timer sounds promising to me. I'll think more about it over 
the next few days.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] cloud-fan commented on issue #26696: [WIP][SPARK-18886][CORE] Only reset scheduling delay timer if allocated slots are fully utilized

2019-12-03 Thread GitBox
cloud-fan commented on issue #26696: [WIP][SPARK-18886][CORE] Only reset 
scheduling delay timer if allocated slots are fully utilized
URL: https://github.com/apache/spark/pull/26696#issuecomment-561062101
 
 
   @viirya The locality wait time is a global config, not per job/stage. Even 
if it's per job/stage, I'm not sure how to set an optimal value.





[GitHub] [spark] cloud-fan commented on issue #26696: [WIP][SPARK-18886][CORE] Only reset scheduling delay timer if allocated slots are fully utilized

2019-12-02 Thread GitBox
cloud-fan commented on issue #26696: [WIP][SPARK-18886][CORE] Only reset 
scheduling delay timer if allocated slots are fully utilized
URL: https://github.com/apache/spark/pull/26696#issuecomment-561039532
 
 
   This problem needs more discussion. AFAIK, the issue with delay scheduling 
is that it keeps one timer per task set manager, and the timer is reset as 
soon as any task from that task set manager gets scheduled on a preferred 
location.
   
   As a result, a stage may keep waiting for locality and never use the other 
available nodes in the cluster if its task duration is shorter than the 
locality wait time (3 seconds by default).
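
   To make the starvation concrete, here is a minimal sketch in Python. It is a toy model built only from the description above, not Spark's actual `TaskSetManager` code: the function name, the two-slot cluster, and the millisecond values are all assumptions chosen for illustration.

```python
def simulate_delay_scheduling(num_tasks, task_ms, locality_wait_ms):
    """Toy model (hypothetical, not Spark code): one preferred slot, one
    non-preferred slot, and a single locality timer for the whole task set
    that is reset whenever a task launches on the preferred slot."""
    pending = num_tasks
    now = 0
    timer_start = 0                      # last time the locality timer was reset
    free_at = {"preferred": 0, "other": 0}
    launched = {"preferred": 0, "other": 0}
    finish = 0

    while pending > 0:
        # A free preferred slot always takes a task and resets the timer.
        if free_at["preferred"] <= now:
            pending -= 1
            launched["preferred"] += 1
            free_at["preferred"] = now + task_ms
            finish = max(finish, free_at["preferred"])
            timer_start = now

        # The non-preferred slot is used only after the locality wait expires.
        if (pending > 0 and free_at["other"] <= now
                and now - timer_start >= locality_wait_ms):
            pending -= 1
            launched["other"] += 1
            free_at["other"] = now + task_ms
            finish = max(finish, free_at["other"])

        # Jump to the next moment at which something can change.
        candidates = [free_at["preferred"], free_at["other"],
                      timer_start + locality_wait_ms]
        now = min(t for t in candidates if t > now)

    return finish, launched


if __name__ == "__main__":
    # Tasks shorter than the wait: the timer is reset before it can expire.
    print(simulate_delay_scheduling(10, task_ms=1000, locality_wait_ms=3000))
    # Tasks longer than the wait: the idle slot is eventually used.
    print(simulate_delay_scheduling(4, task_ms=5000, locality_wait_ms=3000))
```

   In this model, ten 1-second tasks with a 3-second wait all run one after another on the preferred slot while the other slot stays idle, because each launch resets the timer before it can expire. With 5-second tasks the timer does expire and the idle slot gets used.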
   
   A simple solution is to never reset the timer: once a stage has waited 
long enough for locality, it stops waiting for locality altogether. However, 
this may hurt performance if the last task is scheduled on a non-preferred 
location, a preferred location becomes available right after that, and 
locality would have brought a 50x speedup.
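
   The cost of that worst case can be estimated with simple arithmetic. The sketch below is illustrative only: the 50x factor comes from the comment above, while the function name, the 1-second local task duration, and the 100 ms gap are hypothetical values.

```python
def last_task_finish_ms(local_ms, slowdown, preferred_free_in_ms):
    """Compare two choices for the last pending task (hypothetical model).

    local_ms:             task duration on a preferred location
    slowdown:             how many times slower the task runs without locality
    preferred_free_in_ms: how long until a preferred slot becomes free
    """
    # Option 1: launch immediately on the non-preferred location.
    run_now = local_ms * slowdown
    # Option 2: keep waiting, then run at full speed on the preferred location.
    wait_for_preferred = preferred_free_in_ms + local_ms
    return {"run_now_non_local": run_now,
            "wait_for_preferred": wait_for_preferred}


if __name__ == "__main__":
    print(last_task_finish_ms(local_ms=1000, slowdown=50,
                              preferred_free_in_ms=100))
```

   Under these assumed numbers, launching immediately finishes at 50,000 ms while waiting another 100 ms finishes at 1,100 ms, which is why a timer that is never reset can be costly for locality-sensitive tasks.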
   
   I don't have a good solution yet. cc @JoshRosen @tgravescs @vanzin @jiangxb1987 





[GitHub] [spark] cloud-fan commented on issue #26696: [WIP][SPARK-18886][CORE] Only reset scheduling delay timer if allocated slots are fully utilized

2019-12-02 Thread GitBox
cloud-fan commented on issue #26696: [WIP][SPARK-18886][CORE] Only reset 
scheduling delay timer if allocated slots are fully utilized
URL: https://github.com/apache/spark/pull/26696#issuecomment-561029067
 
 
   ok to test

