[ https://issues.apache.org/jira/browse/HBASE-13007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Lawlor updated HBASE-13007:
------------------------------------
Attachment: HBASE-13007-v2.patch
- Removed unnecessary synchronization on ScheduledChore#run(). This
synchronization could potentially cause threads to hang in the event that
ScheduledChore#chore() is hanging and won't let go of the lock for the
ScheduledChore.
- Changed the behavior of ScheduledChore#cancel() and
ChoreService#cancelChore(). They now use mayInterruptIfRunning = true by
default (previously they used false).
- Removed ScheduledChore#resetState(), as the idea of resetting the state of a
ScheduledChore seems inappropriate (it was initially in place for testing).
- Removed unnecessary chores from the end of
TestChoreService#testCorePoolDecrease -- the effect of a core pool decrease can
be noticed without starting the chores that were removed. Starting those chores
made the test harder to understand.
If this patch passes the initial QA run, I will reattach it to trigger a
second run, to help ensure that a pass of TestChoreService is not a fluke.
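
To illustrate the mayInterruptIfRunning change described above, here is a minimal
sketch using only java.util.concurrent. It is not code from the patch: the class
CancelSketch and the method taskWasInterrupted are hypothetical names, and the
blocking latch is just a stand-in for a chore() body that hangs.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not part of HBase or of the attached patch.
class CancelSketch {

  /**
   * Schedules a task that blocks indefinitely, cancels it with the given flag,
   * and reports whether the task saw an interrupt within one second.
   */
  static boolean taskWasInterrupted(boolean mayInterruptIfRunning)
      throws InterruptedException {
    ScheduledThreadPoolExecutor pool = new ScheduledThreadPoolExecutor(1);
    CountDownLatch started = new CountDownLatch(1);
    CountDownLatch interrupted = new CountDownLatch(1);
    try {
      ScheduledFuture<?> future = pool.scheduleAtFixedRate(() -> {
        started.countDown();
        try {
          // Stand-in for a chore() body that never returns on its own.
          new CountDownLatch(1).await();
        } catch (InterruptedException e) {
          interrupted.countDown();
        }
      }, 0, 100, TimeUnit.MILLISECONDS);

      started.await();                       // wait until the task is running
      future.cancel(mayInterruptIfRunning);  // the flag under discussion
      return interrupted.await(1, TimeUnit.SECONDS);
    } finally {
      pool.shutdownNow();                    // always release the worker thread
    }
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("cancel(false) interrupted the task: " + taskWasInterrupted(false));
    System.out.println("cancel(true) interrupted the task: " + taskWasInterrupted(true));
  }
}
```

With cancel(false) the running task is left alone and keeps occupying its worker
thread; with cancel(true) the worker is interrupted, which gives a hanging task a
chance to exit.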
> Fix the test timeouts being caused by ChoreService
> ---------------------------------------------------
>
> Key: HBASE-13007
> URL: https://issues.apache.org/jira/browse/HBASE-13007
> Project: HBase
> Issue Type: Bug
> Reporter: Jonathan Lawlor
> Assignee: Jonathan Lawlor
> Fix For: 2.0.0, 1.1.0
>
> Attachments: HBASE-13007-v1.patch, HBASE-13007-v2.patch
>
>
> TestChoreService has been seen timing out in recent builds and the timeouts
> seem to be rooted in how the ChoreService cancels its chores when being
> shut down. The issue is that during shutdown, the ChoreService calls
> synchronized methods on the ScheduledChore which could cause indefinite
> blocking if the scheduled chore is hanging in a synchronized method. We
> should instead call the appropriate cancel method within the ChoreService and
> add logic into ScheduledChores that allows them to realize when they have
> been cancelled.
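
The blocking described in the issue summary can be reproduced in isolation. The
sketch below is only an illustration under assumed names: HangingChore,
synchronizedCancel, and lockFreeCancel are hypothetical and are not the HBase
ScheduledChore API. It shows a caller stalling on a synchronized method while
another thread hangs inside a synchronized method of the same object, and a
cancel path that avoids the monitor completing promptly.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class; not part of HBase or of the attached patch.
class SynchronizedHangSketch {

  /** Stand-in for a chore whose run method hangs while holding its monitor. */
  static class HangingChore {
    private final CountDownLatch entered = new CountDownLatch(1);
    private final CountDownLatch release = new CountDownLatch(1);
    private volatile boolean cancelled;

    synchronized void run() throws InterruptedException {
      entered.countDown();
      release.await();          // holds the object's monitor until released
    }

    // Needs the same monitor as run(), so it blocks while run() is hanging.
    synchronized void synchronizedCancel() {
      cancelled = true;
    }

    // Writes a volatile flag without taking the monitor.
    void lockFreeCancel() {
      cancelled = true;
    }

    boolean isCancelled() {
      return cancelled;
    }
  }

  /** Returns true if the chosen cancel call finishes within 500 ms. */
  static boolean cancelCompletes(boolean useSynchronizedCancel)
      throws InterruptedException {
    HangingChore chore = new HangingChore();
    Thread runner = new Thread(() -> {
      try {
        chore.run();
      } catch (InterruptedException ignored) {
        // nothing to clean up in this sketch
      }
    });
    runner.setDaemon(true);
    runner.start();
    chore.entered.await();      // run() now owns the monitor

    Runnable cancel = useSynchronizedCancel
        ? chore::synchronizedCancel
        : chore::lockFreeCancel;
    Thread canceller = new Thread(cancel);
    canceller.setDaemon(true);
    canceller.start();
    canceller.join(500);
    boolean completed = !canceller.isAlive();

    chore.release.countDown();  // let run() return so both threads can finish
    runner.join();
    canceller.join();
    return completed;
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("synchronized cancel completed: " + cancelCompletes(true));
    System.out.println("lock-free cancel completed: " + cancelCompletes(false));
  }
}
```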
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)