GitHub user andrewor14 commented on the pull request:
https://github.com/apache/spark/pull/8714#issuecomment-159137863
@jerryshao this change looks pretty good. I think I have a good
understanding of the problem. The behavior before this patch was:
1. Master launches an executor and sends a message to both the driver and
the worker
2. Driver responds with `RUNNING`, while worker responds with `LOADING`
3. Driver's response arrives at the master first and the worker's `LOADING`
then overwrites it, so the executor is now `LOADING` forever
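
To make the ordering problem concrete, here is a toy model of the three steps above. It is not Spark's actual code: `RacyMaster` and `on_state_changed` are hypothetical names, and the sketch assumes the master simply keeps whichever state it was told most recently, regardless of who sent it.

```python
class RacyMaster:
    """Hypothetical master that accepts state updates from any sender."""

    def __init__(self):
        self.executor_state = {}  # executor id -> last reported state
        self.history = []         # (sender, state) in arrival order

    def on_state_changed(self, sender, exec_id, state):
        # Last write wins: no check on who is reporting.
        self.history.append((sender, state))
        self.executor_state[exec_id] = state


master = RacyMaster()
# Step 3: the driver's RUNNING arrives first ...
master.on_state_changed("driver", "exec-0", "RUNNING")
# ... and the worker's LOADING arrives afterwards and replaces it.
master.on_state_changed("worker", "exec-0", "LOADING")
final_state = master.executor_state["exec-0"]
```

With two senders and last-write-wins, the final state depends only on arrival order, which is the behavior described in step 3.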
In (2), it's odd for the driver to report the executor state to the master
at all. That should be the job of the worker. I think the new behavior
we want is:
1. Master launches an executor and sends a message to both the driver and
the worker
2. Worker responds with `RUNNING`
3. Master forwards the state change to the driver
This guarantees that there is only one source of executor state change
(i.e. the worker), and the driver only receives information. I think this patch
in its current state is pretty close. Thanks for catching this issue.
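
As a rough sketch of the proposed flow, under the assumption that the master accepts state changes only from the worker and pushes each one to the driver: the class and method names below (`ForwardingMaster`, `ListeningDriver`, `on_worker_state_changed`, `on_executor_updated`) are hypothetical and are not taken from the patch.

```python
class ListeningDriver:
    """Hypothetical driver that only receives executor state."""

    def __init__(self):
        self.seen = []  # (executor id, state) updates forwarded by the master

    def on_executor_updated(self, exec_id, state):
        # The driver records what it is told and never reports state itself.
        self.seen.append((exec_id, state))


class ForwardingMaster:
    """Hypothetical master with the worker as the only source of state."""

    def __init__(self, driver):
        self.driver = driver
        self.executor_state = {}

    def on_worker_state_changed(self, exec_id, state):
        # Step 2: record the worker's report.
        self.executor_state[exec_id] = state
        # Step 3: forward the state change to the driver.
        self.driver.on_executor_updated(exec_id, state)


driver = ListeningDriver()
master = ForwardingMaster(driver)
master.on_worker_state_changed("exec-0", "RUNNING")
```

Because the driver has no way to write state in this sketch, the master and driver cannot disagree about which update came last.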