gatorsmile commented on issue #26633: [SPARK-29994][CORE] Add WILDCARD task 
location
URL: https://github.com/apache/spark/pull/26633#issuecomment-558888608
 
 
   @tgravescs Changing the default value of `spark.locality.wait` is a very 
important topic. We need to collect more feedback from the community, instead 
of making the decision among ourselves. 3 seconds is just a magic number. Does anybody 
know the history? Why did we choose 3 seconds instead of 1 second or 0.5 seconds?
   
   Also, performance depends on the environment and the workload patterns 
(e.g., the cost of shuffling data, the current workload sizing, and the 
cluster's resource availability). When running a short or streaming query on an 
idle local cluster, setting it to zero might not be a bad idea. When running it 
in a cloud environment, I do not know which value is best. We really 
need to do more performance testing with common workloads to find the next 
magic number. 
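   
   For illustration, a minimal sketch (the app name and local master are hypothetical, for a short job on an idle local setup) of overriding `spark.locality.wait` per application instead of changing the global default:
   
   ```scala
   import org.apache.spark.sql.SparkSession
   
   // Hypothetical example: disable locality wait for a short/streaming job
   // on an otherwise idle local cluster, so tasks are scheduled immediately
   // on any available executor instead of waiting up to 3s for locality.
   val spark = SparkSession.builder()
     .appName("LowLatencyStreamingJob")   // hypothetical app name
     .master("local[*]")                  // local setup for illustration
     .config("spark.locality.wait", "0s") // per-application override
     .getOrCreate()
   ```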
   
   Normally, we should be really careful when introducing any performance-related 
change. The decisions we make will impact many end users. When I worked for a 
commercial database vendor, any **larger than 5%** perf regression for a single 
query [from the perf benchmark] was not acceptable. 
   
   
   
   
