[ 
https://issues.apache.org/jira/browse/CRUNCH-172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Shi updated CRUNCH-172:
----------------------------

    Attachment: crunch-172.patch

Remove the background thread from CrunchJobControl and let it be called by the 
monitor thread in MRExecutor.
    
Besides this, there are some small changes:
- Use exponential backoff when querying job status. This makes local integration 
tests run much faster on hadoop2.
- Remove suspend/resume support, because it is currently not used and makes 
synchronization complex.
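
To illustrate the backoff idea from the first bullet, here is a minimal sketch. It is
not the code in crunch-172.patch: the class BackoffPoller, its methods, and the
delay values are hypothetical names chosen for illustration. The assumption is
that the delay starts small, so short local jobs are noticed quickly, and doubles
up to a cap, so long-running jobs are not polled too often.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical helper, not part of Crunch: polls a status check with exponential backoff. */
final class BackoffPoller {
  private final long initialMillis;
  private final long maxMillis;

  BackoffPoller(long initialMillis, long maxMillis) {
    if (initialMillis <= 0 || maxMillis < initialMillis) {
      throw new IllegalArgumentException("need 0 < initialMillis <= maxMillis");
    }
    this.initialMillis = initialMillis;
    this.maxMillis = maxMillis;
  }

  /** Delay before the next poll for a 0-based attempt: initial * 2^attempt, capped at max. */
  long delayMillis(int attempt) {
    long delay = initialMillis;
    for (int i = 0; i < attempt && delay < maxMillis; i++) {
      // Double the delay, guarding against overflow and against exceeding the cap.
      delay = (delay > maxMillis / 2) ? maxMillis : delay * 2;
    }
    return Math.min(delay, maxMillis);
  }

  /**
   * Polls allFinished until it returns true or killSignal reaches zero.
   * Returns the number of status checks that were made.
   */
  int pollUntilDone(BooleanSupplier allFinished, CountDownLatch killSignal)
      throws InterruptedException {
    int polls = 0;
    while (killSignal.getCount() > 0) {
      polls++;
      if (allFinished.getAsBoolean()) {
        break;
      }
      // Waiting on the latch (instead of sleeping) lets a kill request end the wait early.
      killSignal.await(delayMillis(polls - 1), TimeUnit.MILLISECONDS);
    }
    return polls;
  }
}
```

With an initial delay of 10 ms and a cap of 100 ms, the waits between polls would
be 10, 20, 40, 80, 100, 100, ... milliseconds.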

                
> Refine synchronization mechanism in CrunchJobControl
> ----------------------------------------------------
>
>                 Key: CRUNCH-172
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-172
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6.0
>            Reporter: Chao Shi
>            Assignee: Josh Wills
>         Attachments: crunch-172.patch
>
>
> Currently CrunchJobControl uses a runnerState to synchronize its background 
> loop and client calls (e.g. stop). This is not sufficient. Jenkins reported a 
> failure after CRUNCH-156 was checked in.
> MRExecutor does the following in its monitorLoop:
> {code}
>       Thread controlThread = new Thread(control);
>       controlThread.start();
>       while (killSignal.getCount() > 0 && !control.allFinished()) {
>         killSignal.await(1, TimeUnit.SECONDS);
>       }
>       control.stop();
> {code}
> And this is how CrunchJobControl works:
> {code}
>   public void stop() {
>     this.runnerState = ThreadState.STOPPING;
>   }
>   public void run() {
>     this.runnerState = ThreadState.RUNNING;
>     while (true) {
>       ...
>     }
>   }
> {code}
> So it is possible for stop() to be called before run() is called in the other 
> thread. run() then overwrites runnerState with RUNNING, so the stop request is 
> lost. MRExecutor thinks everything has been stopped and starts its cleanup 
> work, while CrunchJobControl continues to submit new jobs. Because the cleanup 
> has already been done, the newly submitted jobs fail with FileNotFound.
> I think a solution is to remove the background thread from CrunchJobControl 
> and let MRExecutor call it periodically.
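
The proposed fix can be sketched as follows. This is an illustration only, not the
code in the patch: PolledJobControl, step(), and MonitorLoop are hypothetical
names, and jobs are modelled as strings that "finish" as soon as they are
submitted. The point is that the job control has no thread of its own, so once
the monitor loop exits nothing can submit another job, and there is no
runnerState left to race on.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a job control that is driven entirely by its caller. */
final class PolledJobControl {
  private final Queue<String> waiting = new ArrayDeque<>();
  private final List<String> submitted = new ArrayList<>();

  PolledJobControl(List<String> jobs) {
    waiting.addAll(jobs);
  }

  /** Does one unit of work: submits the next waiting job, if any. */
  void step() {
    String next = waiting.poll();
    if (next != null) {
      submitted.add(next); // a real implementation would submit to the cluster here
    }
  }

  boolean allFinished() {
    return waiting.isEmpty();
  }

  List<String> submittedJobs() {
    return submitted;
  }
}

/** Hypothetical monitor loop: the only thread that ever touches the job control. */
final class MonitorLoop {
  static void run(PolledJobControl control, CountDownLatch killSignal)
      throws InterruptedException {
    while (killSignal.getCount() > 0 && !control.allFinished()) {
      control.step();
      killSignal.await(10, TimeUnit.MILLISECONDS);
    }
    // No other thread calls step(), so cleanup after this point cannot
    // race with a late job submission.
  }
}
```

If the kill signal has already fired when run() is entered, the loop body never
executes and no job is submitted, which is the case the original stop()/run()
ordering got wrong.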

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
