[ https://issues.apache.org/jira/browse/HBASE-6778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Lawlor updated HBASE-6778:
-----------------------------------
Attachment: HBASE_6778_v6.patch
Okay, I think I have it this time (though I may be jinxing it).
My claim is that all of these test failures are rooted in how the new
ScheduledChores handle exceptions within their run method versus how the old
Chore implementation handled them.
The issue is that ScheduledChores are canceling themselves whenever an
exception is thrown during the execution of chore(). This is incorrect; we
should instead catch the error, log it, and check whether the ScheduledChore
has been stopped. If it has been stopped, then we can cancel the chore.
The attached patch adds this behavior, and local testing shows that the
previously failing zombie test now passes.
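To illustrate the catch, log, then check-stopped behavior described above, here is a minimal sketch. It is not the code from the attached patch: SketchChore, FlakyChore, stop(), setFuture() and isCancelled() are hypothetical names made up for this example, and only the java.util.concurrent calls are real APIs.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ChoreSketch {

  /** Hypothetical stand-in for a scheduled chore; NOT the HBase ScheduledChore class. */
  public abstract static class SketchChore implements Runnable {
    private volatile boolean stopped = false;
    private volatile boolean cancelled = false;
    private volatile ScheduledFuture<?> future; // handle returned by the executor, if any

    /** The periodic work; implementations may throw. */
    protected abstract void chore() throws Exception;

    public void stop() { stopped = true; }
    public boolean isCancelled() { return cancelled; }
    public void setFuture(ScheduledFuture<?> f) { future = f; }

    @Override
    public void run() {
      try {
        chore();
      } catch (Throwable t) {
        // Catch and log instead of letting the failure end the schedule.
        System.err.println("Caught error in chore: " + t);
        // Only cancel if the chore has been asked to stop.
        if (stopped) {
          cancel();
        }
      }
    }

    private void cancel() {
      cancelled = true;
      ScheduledFuture<?> f = future;
      if (f != null) {
        f.cancel(false); // do not interrupt a run that is in progress
      }
    }
  }

  /** Hypothetical chore that fails on every run, to exercise the catch path. */
  public static class FlakyChore extends SketchChore {
    private final AtomicInteger runs = new AtomicInteger();

    @Override
    protected void chore() {
      runs.incrementAndGet();
      throw new IllegalStateException("simulated failure");
    }

    public int runs() { return runs.get(); }
  }

  public static void main(String[] args) throws InterruptedException {
    ScheduledExecutorService pool = Executors.newScheduledThreadPool(1);
    FlakyChore c = new FlakyChore();
    c.setFuture(pool.scheduleAtFixedRate(c, 10, 10, TimeUnit.MILLISECONDS));
    Thread.sleep(100); // the chore keeps running despite failing every time
    c.stop();          // the next failure after this cancels the chore
    Thread.sleep(50);
    pool.shutdownNow();
    System.out.println("runs=" + c.runs() + " cancelled=" + c.isCancelled());
  }
}
```

Catching inside run() also matters for the executor itself: with scheduleAtFixedRate, an exception that escapes the task suppresses all subsequent executions, so an uncaught failure would silently end the chore as well.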
> Deprecate Chore; its a thread per task when we should have one thread to do
> all tasks
> -------------------------------------------------------------------------------------
>
> Key: HBASE-6778
> URL: https://issues.apache.org/jira/browse/HBASE-6778
> Project: HBase
> Issue Type: Bug
> Reporter: stack
> Assignee: Jonathan Lawlor
> Fix For: 2.0.0, 1.1.0
>
> Attachments: AFTER_thread_dump.txt, BEFORE_thread_dump.txt,
> HBASE_6778_WIP_v1.patch, HBASE_6778_WIP_v2.patch, HBASE_6778_v1.patch,
> HBASE_6778_v2.patch, HBASE_6778_v3.patch, HBASE_6778_v3.patch,
> HBASE_6778_v4.patch, HBASE_6778_v5.patch, HBASE_6778_v6.patch,
> thread_dump_HMaster.local.out
>
>
> Should use something like ScheduledThreadPoolExecutor instead (Elliott said
> this first I think; J-D said something similar just now).