GitHub user zsxwing opened a pull request:
https://github.com/apache/spark/pull/8340
[SPARK-10137][Streaming]Avoid to restart receivers if scheduleReceivers
returns balanced results
This PR fixes the following cases for `ReceiverSchedulingPolicy`.
1) Assume there are 4 executors: host1, host2, host3, host4, and 5
receivers: r1, r2, r3, r4, r5. Then
`ReceiverSchedulingPolicy.scheduleReceivers` will return (r1 -> host1, r2 ->
host2, r3 -> host3, r4 -> host4, r5 -> host1).
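The assignment above is what a plain round-robin distribution produces. A minimal sketch, assuming no preferred locations; `RoundRobinSketch` and `assign` are hypothetical names for illustration, not the actual `scheduleReceivers` implementation:

```scala
object RoundRobinSketch {
  // Hypothetical helper: hand out receivers to executors in round-robin order.
  // This ignores preferred locations and only illustrates the balanced result.
  def assign(receivers: Seq[String], executors: Seq[String]): Map[String, String] = {
    require(executors.nonEmpty, "need at least one executor")
    receivers.zipWithIndex.map { case (receiver, i) =>
      receiver -> executors(i % executors.size)
    }.toMap
  }
}
```

With 5 receivers and 4 executors, the fifth receiver wraps around to the first executor, so `host1` ends up with both r1 and r5.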
Suppose r1 starts first on `host1`, as `scheduleReceivers` suggested, and
tries to register with ReceiverTracker. The previous
`ReceiverSchedulingPolicy.rescheduleReceiver` returns (host2, host3, host4)
based on the current executor weights (host1 -> 1.0, host2 -> 0.5, host3 ->
0.5, host4 -> 0.5), so ReceiverTracker rejects `r1`. This is unexpected,
since r1 is starting exactly where `scheduleReceivers` suggested.
This case can be fixed by ignoring the entry in `receiverTrackingInfoMap`
for the receiver that is being rescheduled.
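The effect of ignoring the rescheduling receiver's own entry can be sketched as follows. This is an illustration under a simplified weight model (each assigned receiver adds 1.0 to its host), which is an assumption and does not reproduce the 1.0/0.5 weights quoted above; `IgnoreSelfSketch` and `executorWeights` are hypothetical names:

```scala
object IgnoreSelfSketch {
  // Simplified model (assumption): an executor's weight is the number of
  // receivers currently assigned to it. `ignore` names the receiver that is
  // being rescheduled, whose own entry should not count against its host.
  def executorWeights(
      assignments: Map[String, String], // receiver id -> host
      executors: Seq[String],
      ignore: Option[String] = None): Map[String, Double] = {
    val counted = assignments.filter { case (receiver, _) => !ignore.contains(receiver) }
    executors.map { host =>
      host -> counted.count { case (_, h) => h == host }.toDouble
    }.toMap
  }
}
```

In this model, counting r1's own entry makes `host1` look heavier than the other hosts, so it drops out of the candidate list. Skipping that entry leaves all four hosts with equal weight, and `host1` stays a valid location for r1.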
2) Assume there are 3 executors (host1, host2, host3), each with 3 cores,
and 3 receivers: r1, r2, r3. Assume r1 is running on host1. When r2
restarts, the previous `ReceiverSchedulingPolicy.rescheduleReceiver` always
returns (host1, host2, host3), so TaskScheduler may schedule r2 on host1.
The same applies to r3. In the end, all 3 receivers may be running on
host1 while host2 and host3 are idle.
This issue can be fixed by returning only the executors that have the minimum
weight, rather than returning at least 3 executors.
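Selecting only the least-loaded executors might look like the sketch below. It assumes the weights have already been computed as a host-to-weight map; `MinWeightSketch` and `leastLoaded` are hypothetical names, not the code in this PR:

```scala
object MinWeightSketch {
  // Return only the hosts whose weight equals the minimum weight,
  // sorted by name so the result is deterministic.
  def leastLoaded(weights: Map[String, Double]): Seq[String] = {
    if (weights.isEmpty) {
      Seq.empty
    } else {
      val minWeight = weights.values.min
      weights.collect { case (host, w) if w == minWeight => host }.toSeq.sorted
    }
  }
}
```

For case 2, with r1 running on `host1` (weight 1.0) and the other two hosts idle (weight 0.0), only `host2` and `host3` are returned, so a restarting r2 cannot land on `host1`.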
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/zsxwing/spark fix-receiver-scheduling
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/8340.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #8340
----
commit 5fba8d46db62118711ff965efc6c6772ba1ddd1b
Author: zsxwing <[email protected]>
Date: 2015-08-20T10:25:54Z
Avoid to restart receivers if scheduleReceivers returns balanced results
----