[
https://issues.apache.org/jira/browse/HBASE-13146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
stack updated HBASE-13146:
--------------------------
Resolution: Fixed
Hadoop Flags: Reviewed
Status: Resolved (was: Patch Available)
Pushed to branch-1+. Nice one [~Apache9]
> Race Condition in ScheduledChore and ChoreService
> -------------------------------------------------
>
> Key: HBASE-13146
> URL: https://issues.apache.org/jira/browse/HBASE-13146
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 2.0.0, 1.1.0
> Reporter: zhangduo
> Assignee: zhangduo
> Fix For: 2.0.0, 1.1.0
>
> Attachments: HBASE-13146.patch
>
>
> Here are my findings from addressing HBASE-13145.
> {code:title=ChoreService.java}
> public synchronized boolean scheduleChore(ScheduledChore chore) {
> ...
> ScheduledFuture<?> future =
> scheduler.scheduleAtFixedRate(chore, chore.getInitialDelay(),
> chore.getPeriod(),
> chore.getTimeUnit());
> chore.setChoreServicer(this);
> ...
> }
> {code}
> So we schedule the chore first, and only then set its chore servicer. For
> CompactionChecker, the initialDelay is 0, so the chore may run before its
> chore servicer has been set. Now look at this:
> {code:title=ScheduledChore.java}
> public void run() {
> ...
> else if (stopper.isStopped() || !isScheduled()) {
> cancel(false);
> cleanup();
> if (LOG.isInfoEnabled()) LOG.info("Chore: " + getName() + " was stopped");
> }
> ...
> }
> ...
> public synchronized boolean isScheduled() {
> return choreServicer != null && choreServicer.isChoreScheduled(this);
> }
> {code}
> So isScheduled() can return false, and we start to cancel the chore. If you
> insert a sleep between scheduling the chore and setting the chore servicer,
> you always get the log 'Chore: CompactionChecker was stopped'. But the chore
> is not always actually cancelled, because cancel() rechecks isScheduled() and
> calls cancelChore only when it returns true.
> {code:title=ScheduledChore.java}
> public synchronized void cancel(boolean mayInterruptIfRunning) {
> if (isScheduled()) choreServicer.cancelChore(this, mayInterruptIfRunning);
> choreServicer = null;
> }
> {code}
> So if you insert a sleep before cancel (remember to use a longer sleep time
> here), the test fails every time.
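
The ordering problem described above can be reproduced outside HBase with a plain ScheduledExecutorService. The following is only a minimal sketch, not the HBase code: ChoreRaceSketch, ToyChore, owner and firstRunSeesOwner are hypothetical names, and owner merely stands in for choreServicer. Instead of a sleep, the sketch waits for the first run before setting the owner, so the outcome is deterministic.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ChoreRaceSketch {

  /** Hypothetical stand-in for ScheduledChore; "owner" plays the role of choreServicer. */
  static final class ToyChore implements Runnable {
    private Object owner;
    private final CountDownLatch firstRun = new CountDownLatch(1);
    private volatile boolean scheduledOnFirstRun;

    synchronized void setOwner(Object owner) { this.owner = owner; }

    synchronized boolean isScheduled() { return owner != null; }

    @Override
    public void run() {
      // Record what isScheduled() returned on the very first execution only.
      if (firstRun.getCount() > 0) {
        scheduledOnFirstRun = isScheduled();
        firstRun.countDown();
      }
    }

    /** Blocks until the first run has happened, then reports what it observed. */
    boolean awaitFirstRun() {
      try {
        firstRun.await();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("interrupted while waiting for first run", e);
      }
      return scheduledOnFirstRun;
    }
  }

  /** Returns whether the chore's first run saw a non-null owner. */
  public static boolean firstRunSeesOwner(boolean setOwnerFirst) {
    ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
    try {
      ToyChore chore = new ToyChore();
      Object owner = new Object();
      if (setOwnerFirst) {
        chore.setOwner(owner); // owner is visible before the task can ever run
      }
      // Initial delay 0, as for CompactionChecker in the description above.
      scheduler.scheduleAtFixedRate(chore, 0, 10, TimeUnit.MILLISECONDS);
      boolean seen = chore.awaitFirstRun();
      if (!setOwnerFirst) {
        chore.setOwner(owner); // too late: the first run has already happened
      }
      return seen;
    } finally {
      scheduler.shutdownNow();
    }
  }

  public static void main(String[] args) {
    System.out.println("schedule, then set owner: " + firstRunSeesOwner(false));
    System.out.println("set owner, then schedule: " + firstRunSeesOwner(true));
  }
}
```

In this sketch the first call reports false and the second reports true, which suggests that setting the servicer before handing the chore to the scheduler closes the window. Whether that ordering is safe in the real ChoreService depends on details not shown here.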