[
https://issues.apache.org/jira/browse/HBASE-19527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16329761#comment-16329761
]
stack commented on HBASE-19527:
-------------------------------
The Master or RegionServer threads determine whether we should go down or not.
If they are stopped or aborted, then all else should go down. Let's not make a
per-thread decision on when to go down (this gets really hard to do...
sometimes it's exit if the process is stopped, other times it is if the cluster
is up or down, and other combos...). If a worker thread is doing something that
it can't give up, that we cannot recover from, that's a problem; let's find it
sooner rather than later, given threads can exit any which way at any time.
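The change under discussion, marking executor threads as daemon, is what ties worker lifetime to the main threads in this way. A minimal sketch of the idea, assuming a hypothetical `DaemonThreadFactory` class and a made-up "worker" name prefix (neither is the actual HBase code):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical factory for illustration only: every thread it creates is a daemon. */
class DaemonThreadFactory implements ThreadFactory {
  private final String prefix;
  private final AtomicInteger seq = new AtomicInteger();

  DaemonThreadFactory(String prefix) {
    this.prefix = prefix;
  }

  @Override
  public Thread newThread(Runnable r) {
    Thread t = new Thread(r, prefix + "-" + seq.incrementAndGet());
    // Daemon threads do not keep the JVM alive once all non-daemon threads have exited.
    t.setDaemon(true);
    return t;
  }

  public static void main(String[] args) {
    ExecutorService pool =
        Executors.newFixedThreadPool(2, new DaemonThreadFactory("worker"));
    pool.submit(() -> {
      try {
        Thread.sleep(Long.MAX_VALUE); // stand-in for work that never finishes
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    // main returns here; the JVM can exit even though the worker is still sleeping.
  }
}
```

With a non-daemon factory, the same `main` would hang until the pool was shut down explicitly.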
{quote}... it'll crash end many background work like create table, merge
regions, and anything that we aim to build on top of proc framework - backup,
replication, etc
{quote}
Yeah. We pick up the work again when the Master comes back up.
{quote} - Pro: Reliably ending ongoing work at defined sync points{quote}
Finding all the combinations, the code paths that lead to an exit, and exits
concurrent with various combinations of operations, would be too much work;
we'd never achieve complete coverage – I suggest.
{quote}Start a ShutdownMonitor thread in HMaster.stop() (which should be Daemon
thread) and if it finds itself running for more than X seconds, then call
System.exit() (with a nice msg on why such abruptness of course).
{quote}
Extra complexity in my view. We have shutdown handlers and too many threads
already.
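For readers weighing the quoted proposal, here is a rough sketch of what such a monitor might look like. The class shape, the method names, and the timeout value are all assumptions (the proposal leaves "X seconds" unspecified), and the timeout action is passed in as a `Runnable` so that the caller decides whether it ends in `System.exit()`:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical watchdog for illustration only; not HBase code. */
class ShutdownMonitor {
  private final CountDownLatch done = new CountDownLatch(1);
  private final Thread thread;

  ShutdownMonitor(long timeoutMillis, Runnable onTimeout) {
    thread = new Thread(() -> {
      try {
        // await() returns false if the timeout elapses before shutdownComplete() is called.
        if (!done.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
          onTimeout.run();
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }, "ShutdownMonitor");
    // Daemon, as the proposal asks, so the monitor itself never blocks JVM exit.
    thread.setDaemon(true);
  }

  void start() {
    thread.start();
  }

  /** Signals that shutdown finished in time, so the timeout action is skipped. */
  void shutdownComplete() {
    done.countDown();
  }

  /** Waits up to millis for the monitor thread to finish; returns true if it has. */
  boolean awaitExit(long millis) {
    try {
      thread.join(millis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return !thread.isAlive();
  }
}
```

A caller wanting the abrupt exit described in the quote would pass something like `() -> { System.err.println("Shutdown timed out"); System.exit(1); }` as the timeout action.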
Suggest we try this and then watch the flakies a while... Can revert if a bad
idea.
> Make ExecutorService threads daemon=true.
> -----------------------------------------
>
> Key: HBASE-19527
> URL: https://issues.apache.org/jira/browse/HBASE-19527
> Project: HBase
> Issue Type: Sub-task
> Reporter: stack
> Assignee: stack
> Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19527.branch-2.001.patch,
> HBASE-19527.branch-2.002.patch, HBASE-19527.master.001.patch,
> HBASE-19527.master.001.patch, HBASE-19527.master.001.patch,
> HBASE-19527.master.002.patch
>
>
> Let me try this. ExecutorService runs OPENs, CLOSEs, etc. If Server is going
> down, no point in these threads sticking around (I think). Let me try this.