[
https://issues.apache.org/jira/browse/CRUNCH-172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chao Shi updated CRUNCH-172:
----------------------------
Attachment: crunch-172.patch
Remove the background thread from CrunchJobControl and let it be called by the
monitor thread in MRExecutor.
Besides this, there are some small changes:
- Use exponential backoff when querying job status. This makes local integration
tests run much faster on hadoop2.
- Remove suspend/resume support, because it is currently not used and makes
synchronization complex.
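
For illustration, a minimal sketch of a monitor loop that drives the control
itself and backs off exponentially between status polls. This is not the code
in the patch: the PollableControl interface, its method names, nextBackoff and
the interval values are all hypothetical, and only the killSignal/allFinished
loop shape comes from the issue description.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class BackoffMonitorSketch {

  /** Hypothetical stand-in for a job control that owns no thread of its own. */
  interface PollableControl {
    void pollJobStatusAndStartNewOnes();
    boolean allFinished();
  }

  /** Doubles the wait after each poll, capped at maxMillis. */
  static long nextBackoff(long currentMillis, long maxMillis) {
    return Math.min(currentMillis * 2, maxMillis);
  }

  /** Runs on the caller's thread until all jobs finish or killSignal fires; returns the poll count. */
  static int monitorLoop(PollableControl control, CountDownLatch killSignal,
                         long initialMillis, long maxMillis) {
    long wait = initialMillis;
    int polls = 0;
    while (killSignal.getCount() > 0 && !control.allFinished()) {
      control.pollJobStatusAndStartNewOnes();
      polls++;
      if (!control.allFinished()) {
        try {
          // Returns early if the kill signal is counted down while waiting.
          killSignal.await(wait, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();  // preserve the interrupt and stop monitoring
          break;
        }
        wait = nextBackoff(wait, maxMillis);
      }
    }
    return polls;
  }

  public static void main(String[] args) {
    // Fake control that reports completion after three polls.
    final int[] remaining = {3};
    PollableControl fake = new PollableControl() {
      public void pollJobStatusAndStartNewOnes() { remaining[0]--; }
      public boolean allFinished() { return remaining[0] <= 0; }
    };
    int polls = monitorLoop(fake, new CountDownLatch(1), 10, 1000);
    System.out.println("polls=" + polls);  // prints "polls=3"
  }
}
```

Short jobs are noticed after a few milliseconds because the first waits are
small, while long jobs are polled at most once per maxMillis.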
> Refine synchronization mechanism in CrunchJobControl
> ----------------------------------------------------
>
> Key: CRUNCH-172
> URL: https://issues.apache.org/jira/browse/CRUNCH-172
> Project: Crunch
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.6.0
> Reporter: Chao Shi
> Assignee: Josh Wills
> Attachments: crunch-172.patch
>
>
> Currently CrunchJobControl uses a runnerState field to synchronize its
> background loop with client calls (e.g. stop). This is not sufficient. Jenkins
> reported a failure after CRUNCH-156 was checked in.
> MRExecutor does the following in its monitorLoop:
> {code}
> Thread controlThread = new Thread(control);
> controlThread.start();
> while (killSignal.getCount() > 0 && !control.allFinished()) {
>   killSignal.await(1, TimeUnit.SECONDS);
> }
> control.stop();
> {code}
> And this is how CrunchJobControl works:
> {code}
> public void stop() {
>   this.runnerState = ThreadState.STOPPING;
> }
> public void run() {
>   this.runnerState = ThreadState.RUNNING;
>   while (true) {
>     ...
>   }
> }
> {code}
> So stop() can be called before run() has started in the other thread. In that
> case run() overwrites STOPPING with RUNNING and the stop request is lost.
> MRExecutor then assumes everything has stopped and starts its cleanup work,
> while CrunchJobControl continues to submit new jobs. Because the cleanup has
> already been done, the newly submitted jobs fail with FileNotFound.
> I think a solution is to remove the background thread in CrunchJobControl and
> let MRExecutor call it periodically.
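
The interleaving described above can be reproduced deterministically with a
simplified model. This is only a sketch: RacyControl and PolledControl are
hypothetical classes, not the real CrunchJobControl, and the READY state and
the submitted counter are assumptions added for illustration. Calling stop()
and then run() on one thread stands in for the case where the monitor thread
wins the race.

```java
public class StopBeforeRunSketch {

  // READY is an assumed initial state; the issue only shows RUNNING and STOPPING.
  enum ThreadState { READY, RUNNING, STOPPING }

  /** Simplified model of the behaviour described in the issue. */
  static class RacyControl {
    volatile ThreadState runnerState = ThreadState.READY;
    int submitted = 0;

    void stop() {
      runnerState = ThreadState.STOPPING;
    }

    /** Resets the state unconditionally, then "submits" until told to stop. */
    void run(int maxIterations) {
      runnerState = ThreadState.RUNNING;  // overwrites an earlier STOPPING
      for (int i = 0; i < maxIterations && runnerState != ThreadState.STOPPING; i++) {
        submitted++;
      }
    }
  }

  /** Thread-free variant: the caller polls, and a stop request is never overwritten. */
  static class PolledControl {
    private boolean stopped = false;
    int submitted = 0;

    synchronized void stop() {
      stopped = true;
    }

    synchronized void poll() {
      if (!stopped) {
        submitted++;
      }
    }
  }

  public static void main(String[] args) {
    RacyControl racy = new RacyControl();
    racy.stop();   // monitor thread calls stop() first
    racy.run(3);   // background loop starts afterwards and ignores the stop
    System.out.println("racy submitted after stop: " + racy.submitted);     // prints 3

    PolledControl polled = new PolledControl();
    polled.stop();
    polled.poll();
    System.out.println("polled submitted after stop: " + polled.submitted); // prints 0
  }
}
```

With no thread of its own, the polled variant has no start-up step that could
reset the state, so a stop request holds regardless of ordering.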