GitHub user zsxwing opened a pull request:
https://github.com/apache/spark/pull/16345
[SPARK-17755][Core] Use workerRef to send RegisterWorkerResponse to avoid
the race condition
## What changes were proposed in this pull request?
The root cause of this issue is that RegisterWorkerResponse and
LaunchExecutor are sent via two different channels (TCP connections) and their
order is not guaranteed.
This PR changes the master and worker code to use `workerRef` to send
RegisterWorkerResponse, so that RegisterWorkerResponse and LaunchExecutor are
sent via the same connection.
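
The ordering problem can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Spark's actual RPC code: `ToyWorker`, `deliver`, and the `schedule` argument are hypothetical names, and the model assumes that a worker drops a LaunchExecutor request that arrives before it has processed RegisterWorkerResponse. Each channel is FIFO on its own, but nothing orders one channel relative to another.

```python
from collections import deque

REGISTER = ("RegisterWorkerResponse", None)
LAUNCH = ("LaunchExecutor", "exec-0")


class ToyWorker:
    """Hypothetical worker: ignores launch requests until it is registered."""

    def __init__(self):
        self.registered = False
        self.launched = []
        self.dropped = []

    def receive(self, message):
        kind, payload = message
        if kind == "RegisterWorkerResponse":
            self.registered = True
        elif kind == "LaunchExecutor":
            # Assumption: an unregistered worker drops the request.
            target = self.launched if self.registered else self.dropped
            target.append(payload)


def deliver(channels, schedule):
    """Each channel is FIFO; `schedule` lists which channel delivers next."""
    worker = ToyWorker()
    queues = [deque(channel) for channel in channels]
    for index in schedule:
        worker.receive(queues[index].popleft())
    return worker


# Two connections: nothing orders one channel relative to the other.
lucky = deliver([[REGISTER], [LAUNCH]], schedule=[0, 1])
racy = deliver([[REGISTER], [LAUNCH]], schedule=[1, 0])

# One connection: FIFO delivery puts the registration response first.
fixed = deliver([[REGISTER, LAUNCH]], schedule=[0, 0])

print("two channels, lucky order:", lucky.launched, lucky.dropped)
print("two channels, racy order: ", racy.launched, racy.dropped)
print("single channel:           ", fixed.launched, fixed.dropped)
```

With two channels the outcome depends on which connection happens to deliver first; with a single channel the only possible delivery order is the send order, which is the property this PR relies on.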
## How was this patch tested?
Jenkins
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/zsxwing/spark SPARK-17755
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/16345.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #16345
----
commit b4b55528edc5e9c92f28cf81ea81e72748790100
Author: Shixiong Zhu <[email protected]>
Date: 2016-12-19T23:44:05Z
Use workerRef to send RegisterWorkerResponse to avoid the race condition
----