Github user tdas commented on the pull request:
https://github.com/apache/spark/pull/4467#issuecomment-73824469
I realized that this is tricky to fix while maintaining the stopGracefully
semantics. A graceful stop must ensure that any receivers that have already
started are stopped and that all the received data is processed before
stopping completely. But what happens to the receivers that are still starting
and have not registered yet? We have to wait for all of them to start, because
if we don't, they may already have started and pulled data, which may lead to
losing data. This is not good.
So, to solve this correctly, we probably need a Starting state as well, and
stopGracefully must start the stopping process only after the system has
reached the Started state. It has to wait for all the receivers to have
started; otherwise it is hard to guarantee that all the receivers are
correctly stopped.
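
To make the idea concrete, here is a rough, language-neutral sketch in Python
of a tracker with a Starting state whose graceful stop waits for the Started
state. It is not Spark code: the class name, method names, state names other
than Starting/Started, and the timeout parameter are all assumptions made for
illustration.

```python
import threading
from enum import Enum


class TrackerState(Enum):
    INITIALIZED = "initialized"
    STARTING = "starting"  # receivers launched, not all registered yet
    STARTED = "started"    # every expected receiver has registered
    STOPPING = "stopping"
    STOPPED = "stopped"


class ReceiverTrackerSketch:
    """Hypothetical tracker; illustrates the state handling only."""

    def __init__(self, expected_receivers):
        self._expected = expected_receivers
        self._registered = set()
        self._state = TrackerState.INITIALIZED
        self._cond = threading.Condition()

    @property
    def state(self):
        with self._cond:
            return self._state

    def start(self):
        with self._cond:
            # With nothing to wait for, the tracker is Started immediately.
            if self._expected == 0:
                self._state = TrackerState.STARTED
            else:
                self._state = TrackerState.STARTING
            self._cond.notify_all()

    def register_receiver(self, receiver_id):
        """Accept a registration only while Starting (an assumption)."""
        with self._cond:
            if self._state is not TrackerState.STARTING:
                return False
            self._registered.add(receiver_id)
            if len(self._registered) == self._expected:
                self._state = TrackerState.STARTED
                self._cond.notify_all()
            return True

    def stop_gracefully(self, timeout=None):
        """Wait for Started, then stop; returns the ids that were stopped."""
        with self._cond:
            if self._state is TrackerState.STARTING:
                started = self._cond.wait_for(
                    lambda: self._state is TrackerState.STARTED, timeout
                )
                if not started:
                    raise TimeoutError("receivers did not all start in time")
            if self._state is not TrackerState.STARTED:
                return []  # never started or already stopped: nothing to do
            self._state = TrackerState.STOPPING
            stopped = sorted(self._registered)  # stand-in for real shutdown
            self._registered.clear()
            self._state = TrackerState.STOPPED
            return stopped
```

The point of the sketch is only that a stop requested during Starting is
deferred rather than dropped or executed early, so no receiver can register
and pull data after the stop has begun.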
Also, this behavior must be properly unit tested with the different state
transitions, etc. Even before that, I would like to see the ideal behavior
spelled out for each state:
* if state = X, then allow register,
* if state = Y, do not start stopping,
etc.
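
One way to write such a specification down so that it can be unit tested is a
table keyed by (state, action). The sketch below is in Python for brevity and
is purely illustrative: the state names other than Starting and Started, the
action names, the outcome labels, and every entry in the table are
assumptions, not the behavior decided for this pull request.

```python
from itertools import product

STATES = ("Initialized", "Starting", "Started", "Stopping", "Stopped")
ACTIONS = ("register", "stop_gracefully")

# Hypothetical policy: "allow" runs the action now, "defer" waits for a
# later state, "reject" refuses it. Each entry is an assumption.
POLICY = {
    ("Initialized", "register"): "reject",
    ("Initialized", "stop_gracefully"): "allow",  # nothing to stop yet
    ("Starting", "register"): "allow",
    ("Starting", "stop_gracefully"): "defer",     # wait for Started
    ("Started", "register"): "reject",
    ("Started", "stop_gracefully"): "allow",
    ("Stopping", "register"): "reject",
    ("Stopping", "stop_gracefully"): "reject",
    ("Stopped", "register"): "reject",
    ("Stopped", "stop_gracefully"): "reject",
}


def decide(state, action):
    """Look up the outcome for an action requested in a given state."""
    return POLICY[(state, action)]


def missing_entries():
    """Return every (state, action) pair the table does not cover."""
    return [pair for pair in product(STATES, ACTIONS) if pair not in POLICY]
```

A unit test can then assert that `missing_entries()` is empty, so that no
transition is left unspecified, and check individual entries such as
`decide("Starting", "stop_gracefully")`.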