Github user cleaton commented on the pull request:

    https://github.com/apache/spark/pull/3868#issuecomment-71982990
  
    Thank you for the comments @tdas. I think the diff for this PR has become a bit misleading: it looks like there is more change than there actually is. Here is a summary of the PR in its current state.
    
    1. Propagate the graceful-shutdown flag down to the stop functions of the receiverTracker / receiverExecutors (see the sketch below).
    2. Wrap the whole shutdown sequence in a Future.
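    
    A minimal sketch of point 1, using hypothetical simplified classes rather than the real Spark internals, just to show the flag being threaded through:
    
    ```scala
    // Hypothetical simplified classes (not the actual Spark internals):
    // the graceful flag is propagated so the executors know whether to
    // let receivers drain in-flight data before terminating.
    class ReceiverExecutors {
      def stop(graceful: Boolean): Unit =
        if (graceful) println("waiting for receivers to finish buffered work")
        else println("stopping receivers immediately")
    }
    
    class ReceiverTracker(executors: ReceiverExecutors) {
      // stop() now receives the graceful flag instead of deciding on its own
      def stop(graceful: Boolean): Unit = executors.stop(graceful)
    }
    ```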
    
    The second part is needed because "spark.streaming.gracefulStopTimeout" was previously used as the timeout for the "jobScheduler.receiverTracker.hasUnallocatedBlocks" check inside JobGenerator.scala. Only that timeout is removed from the while loop in jobGenerator; the gracefulStopTimeout configuration is instead used as the timeout of the wrapping Future.
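    
    In code, the wrapping looks roughly like this (illustrative names and values, not the exact PR code):
    
    ```scala
    import scala.concurrent.{Await, Future}
    import scala.concurrent.ExecutionContext.Implicits.global
    import scala.concurrent.duration._
    import java.util.concurrent.TimeoutException
    
    // Illustrative value; in the PR it comes from the
    // spark.streaming.gracefulStopTimeout configuration.
    val gracefulStopTimeout = 10.seconds
    
    val shutdown = Future {
      // the whole existing shutdown sequence: stop the receivers, wait for
      // queued blocks/jobs to be processed, stop the job generator, ...
    }
    
    try {
      Await.ready(shutdown, gracefulStopTimeout)
    } catch {
      case _: TimeoutException =>
        // graceful shutdown took too long; fall back to a forced stop
    }
    ```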
    
    Other than that I have not modified the original shutdown logic; it just shows up as removed and re-added in the diff.
    
    My purpose with this patch has always been to achieve something like your PR1. But because of the "spark.streaming.gracefulStopTimeout" variable it is not so easy to add another place to wait gracefully without modifying other parts a little.
    
    ----
    
    About the default timeout change from 10x batch duration to 100x batch duration: don't you think 10x is quite small when the batch duration is 1 second or less? 30 seconds might not be enough if a receiver is slow to shut down for some reason.
    
    I think the whole idea of having a "spark.streaming.gracefulStopTimeout" configuration option outside the code is a bit weird. I had no idea there was such an option before I started looking into this code, and I cannot find it mentioned in the official documentation. I believe most developers are unaware of this timeout and just assume graceful == "possibly wait forever and have to manually force-kill if too slow" (at least that was my initial thought).
    
    Isn't it better to expose this timeout through the ssc.stop() function instead, making it more visible to developers and giving them the option to actually wait forever if necessary?
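    
    Something like this hypothetical signature (not the actual API, just to illustrate the idea):
    
    ```scala
    import scala.concurrent.duration.{Duration, DurationInt}
    
    object StreamingContextSketch {
      // Hypothetical extension of ssc.stop(): the timeout is visible at the
      // call site, and Duration.Inf lets a caller genuinely wait forever.
      def stop(stopSparkContext: Boolean = true,
               stopGracefully: Boolean = false,
               gracefulStopTimeout: Duration = Duration.Inf): Unit = {
        // ... existing shutdown logic, bounded by gracefulStopTimeout ...
      }
    }
    
    // e.g. ssc.stop(stopGracefully = true, gracefulStopTimeout = 1.minute)
    ```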
    
    
    When does the merge window for Spark 1.3 close?
    
    Thanks

