GitHub user tdas commented on the pull request:

    https://github.com/apache/spark/pull/3868#issuecomment-72101801
  
    1. The merge window for features closes this weekend. However, since this 
is a bug fix, it can still go in early next week. But the change has to be 
surgical. 
    
    2. Default timeout - if batch sizes are less than 1 second, then I see your 
point. Ideally, the receivers should shut down immediately and stop receiving 
data, so only a few batches should be generated and executed after the graceful 
shutdown command has been given. In practice that is probably a wrong 
assumption. However, waiting for 100x batches is also confusing. Most people 
run ~10s batches, which means the system will wait for 1000 seconds. That's 
kind of confusing as well, since the system can appear stuck for a long while 
even with the default configuration. So I guess the most desirable thing to do, 
balancing both short and long batches, is:
    1. Default timeout max(10x batch size, 1 minute)
    2. Log warning if there is a timeout.
    3. Expose the timeout configuration
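
    As a rough illustration of points 1 and 2, here is a minimal Python sketch. 
It is not the patch itself: the function names, the logger name, and the use of 
a threading.Event to signal "all batches done" are assumptions made only for 
this example.

```python
import logging
import threading

log = logging.getLogger("graceful_stop_sketch")  # hypothetical logger name

ONE_MINUTE_MS = 60_000


def default_stop_timeout_ms(batch_interval_ms: int) -> int:
    """Proposed default: max(10x batch interval, 1 minute), in milliseconds."""
    return max(10 * batch_interval_ms, ONE_MINUTE_MS)


def await_graceful_stop(done: threading.Event, timeout_ms: int) -> bool:
    """Wait for `done` to be set; log a warning if the timeout expires first."""
    finished = done.wait(timeout_ms / 1000.0)
    if not finished:
        log.warning("Graceful stop timed out after %d ms", timeout_ms)
    return finished
```

    With 10s batches this waits at most 100 seconds instead of 1000, and with 
500ms batches the 1 minute floor applies.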
    
    For 3, I do not want to add another parameter to stop(). There is already 
a large combination of optional parameters in stop(), and each addition gets 
harder to maintain in the Java and Python APIs. I want to just expose the 
SparkConf parameter. 
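
    A sketch of what "config only, no new stop() argument" could look like, 
with a plain mapping standing in for SparkConf. The key name below is 
hypothetical and chosen only for this example; the helper name is also an 
assumption.

```python
from typing import Mapping

# Hypothetical key name, used only for illustration.
STOP_TIMEOUT_KEY = "example.streaming.gracefulStopTimeoutMs"


def resolve_stop_timeout_ms(conf: Mapping[str, str], batch_interval_ms: int) -> int:
    """Use the configured timeout if set, else max(10x batch interval, 1 minute)."""
    default_ms = max(10 * batch_interval_ms, 60_000)
    raw = conf.get(STOP_TIMEOUT_KEY)
    return int(raw) if raw is not None else default_ms
```

    The signature of stop() stays as it is; users who need a different wait 
set the value in the configuration.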
    
    Regarding this patch, I agree that the change is probably not large, but 
since this behavior of the code is not well tested, I am unsure about doing 
this so late in the release cycle for 1.3. So for this release, it would be 
great if we can do a surgical fix that solves the bug and exposes the timeout, 
and have a separate PR that cleans this up with refactoring and proper tests 
covering all aspects of the behavior.
    



