GitHub user navsan opened a pull request: https://github.com/apache/incubator-quickstep/pull/33
BugFix: Update NumQueuedWorkOrders to fix scheduling

Quickstep scheduling is currently broken (since PR #14). The foreman only schedules work for one worker, leaving all other workers idle. This PR fixes that bug.

The foreman maintains the number of queued work orders for each worker in the WorkerDirectory. This count was not being incremented when WorkOrders were dispatched, and not being decremented when WorkOrders were completed. The search for LeastLoadedWorker would therefore always pick the first worker, resulting in serial execution of the entire query.

In this PR, the Foreman increments the number of queued work orders when it dispatches messages. When a WorkOrderCompletion message arrives, this number must be decremented. However, the worker's thread ID is not available to the Foreman, since PR #14 moved the message deserialization into the PolicyEnforcer. So, in this PR, I've added a pointer to the WorkerDirectory in the PolicyEnforcer. The PolicyEnforcer decrements the number of queued work orders while processing WorkOrderCompletion messages.

The WorkerDirectory is not thread-safe, so the decrement should only be done in the Foreman thread. Since the PolicyEnforcer is part of the Foreman thread, this change should be fine.

Tested on a CloudLab machine (40 workers) with a few example queries.

[A big thanks to @rogersjeffreyl for helping me debug this!]
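To illustrate the bookkeeping described above, here is a minimal, single-threaded sketch. It is not the code from this PR: SketchWorkerDirectory, dispatchWorkOrder and onWorkOrderCompletion are hypothetical names, and the method names are only modeled on the terms used in the description. It assumes at least one worker and that ties go to the lowest worker ID. It shows why a count that is never updated makes the least-loaded search return the first worker every time, and how incrementing on dispatch and decrementing on completion spreads the work.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for the WorkerDirectory (an assumption,
// not Quickstep's actual class). Not thread-safe: call from one thread only.
class SketchWorkerDirectory {
 public:
  explicit SketchWorkerDirectory(std::size_t num_workers)
      : num_queued_work_orders_(num_workers, 0) {
    assert(num_workers > 0);
  }

  void incrementNumQueuedWorkOrders(std::size_t worker_id) {
    assert(worker_id < num_queued_work_orders_.size());
    ++num_queued_work_orders_[worker_id];
  }

  void decrementNumQueuedWorkOrders(std::size_t worker_id) {
    assert(worker_id < num_queued_work_orders_.size());
    assert(num_queued_work_orders_[worker_id] > 0);
    --num_queued_work_orders_[worker_id];
  }

  std::size_t getNumQueuedWorkOrders(std::size_t worker_id) const {
    assert(worker_id < num_queued_work_orders_.size());
    return num_queued_work_orders_[worker_id];
  }

  // Returns the worker with the fewest queued work orders.
  // Ties go to the lowest worker ID, so if the counts never change,
  // this always returns worker 0.
  std::size_t getLeastLoadedWorker() const {
    std::size_t best = 0;
    for (std::size_t id = 1; id < num_queued_work_orders_.size(); ++id) {
      if (num_queued_work_orders_[id] < num_queued_work_orders_[best]) {
        best = id;
      }
    }
    return best;
  }

 private:
  std::vector<std::size_t> num_queued_work_orders_;
};

// Dispatch path: pick the least-loaded worker and record the queued order.
std::size_t dispatchWorkOrder(SketchWorkerDirectory *directory) {
  const std::size_t worker_id = directory->getLeastLoadedWorker();
  directory->incrementNumQueuedWorkOrders(worker_id);
  return worker_id;
}

// Completion path: the worker that finished has one fewer queued order.
void onWorkOrderCompletion(SketchWorkerDirectory *directory,
                           std::size_t worker_id) {
  directory->decrementNumQueuedWorkOrders(worker_id);
}
```

With three workers, three consecutive calls to dispatchWorkOrder select workers 0, 1 and 2 in turn. If the increment were omitted, every call would return worker 0, which matches the serial execution described in this PR.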
You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/incubator-quickstep fix_scheduler

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-quickstep/pull/33.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #33

----
commit 74e49fa7f91ab33e20d488ef9923b285214bc04e
Author: Navneet Potti <nav...@apache.org>
Date:   2016-06-15T02:52:25Z

    BugFix: Update NumQueuedWorkOrders to fix scheduling

----