[jira] [Commented] (HBASE-19527) Make ExecutorService threads daemon=true.
[ https://issues.apache.org/jira/browse/HBASE-19527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16331608#comment-16331608 ] Hudson commented on HBASE-19527: FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4426 (See [https://builds.apache.org/job/HBase-Trunk_matrix/4426/]) HBASE-19527 Make ExecutorService threads daemon=true (stack: rev 646770dd51aefab1174ae949230e084402ea8336) * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/executor/ExecutorService.java * (edit) hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestRegionsOnMasterOptions.java > Make ExecutorService threads daemon=true. > - > > Key: HBASE-19527 > URL: https://issues.apache.org/jira/browse/HBASE-19527 > Project: HBase > Issue Type: Sub-task >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 2.0.0-beta-2 > > Attachments: HBASE-19527.branch-2.001.patch, > HBASE-19527.branch-2.002.patch, HBASE-19527.master.001.patch, > HBASE-19527.master.001.patch, HBASE-19527.master.001.patch, > HBASE-19527.master.002.patch > > > Let me try this. ExecutorService runs OPENs, CLOSE, etc. If Server is going > down, no point in these threads sticking around (I think). Let me try this. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19527) Make ExecutorService threads daemon=true.
[ https://issues.apache.org/jira/browse/HBASE-19527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16331032#comment-16331032 ] stack commented on HBASE-19527: --- Got a +1 too up on RB from [~Apache9] (Need to fix an import). Here is a case where this patch will help. TestRegionsOnMasterOptions.testRegionsOnAllServers is one of our flakies. Looking at a hang, a Pv2 Worker thread is stuck trying to set region status to CLOSE but the hbase:meta is not available – we are on our way out. Thread 2087 (ProcExecWrkr-8): State: TIMED_WAITING Blocked count: 6 Waited count: 11937 Stack: java.lang.Object.wait(Native Method) org.apache.hadoop.hbase.client.AsyncRequestFutureImpl.waitUntilDone(AsyncRequestFutureImpl.java:1228) org.apache.hadoop.hbase.client.AsyncRequestFutureImpl.waitUntilDone(AsyncRequestFutureImpl.java:1197) org.apache.hadoop.hbase.client.HTable.doBatchWithCallback(HTable.java:485) org.apache.hadoop.hbase.util.MultiHConnection.processBatchCallback(MultiHConnection.java:122) org.apache.hadoop.hbase.master.assignment.RegionStateStore.updateRegionLocation(RegionStateStore.java:222) org.apache.hadoop.hbase.master.assignment.RegionStateStore.updateUserRegionLocation(RegionStateStore.java:209) org.apache.hadoop.hbase.master.assignment.RegionStateStore.updateRegionLocation(RegionStateStore.java:149) org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1536) org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:179) org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:309) org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:85) org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:845) org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1456) org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1225) org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:78) org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1735) > Make ExecutorService threads daemon=true. > - > > Key: HBASE-19527 > URL: https://issues.apache.org/jira/browse/HBASE-19527 > Project: HBase > Issue Type: Sub-task >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 2.0.0-beta-2 > > Attachments: HBASE-19527.branch-2.001.patch, > HBASE-19527.branch-2.002.patch, HBASE-19527.master.001.patch, > HBASE-19527.master.001.patch, HBASE-19527.master.001.patch, > HBASE-19527.master.002.patch > > > Let me try this. ExecutorService runs OPENs, CLOSE, etc. If Server is going > down, no point in these threads sticking around (I think). Let me try this. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19527) Make ExecutorService threads daemon=true.
[ https://issues.apache.org/jira/browse/HBASE-19527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16329863#comment-16329863 ] Appy commented on HBASE-19527: -- We can't make production use case worse (abrupt endings on shutdown and unnecessary recoveries on restart) for the sake of extra testing. Former is far more important than latter. bq. The Master or RegionServer threads determine whether we should go down or not. If they are stopped or aborted, then all else should go down. Lets not be having to do a decision-per-thread on when to go down (this gets really hard to do... sometimes its exit if process is stopped, other times it is if cluster is up or down, and other combos...). I agree it's hard to do for everything, but worth doing (in this case, keeping what is) for few critical systems. ProcExecutor is critical and will only become more important with time as more sub-systems (replication, backup, etc) move to it. I don't mind the change in ExecutorService (mostly because i don't enough to make a case for it, nor have time to dig.) Among other thread pools, RPC executors for user requests are probably even lesser important and can go down randomly (not relevant here, but trying to think holistically to bring good points to table) bq. If a worker thread is doing something that it can't give up, that we cannot recover from, thats a problem; lets find it sooner rather than later given threads can exit any which way at any time. True we can use more fault tolerance testing. But the answer to that should be adding more fault tolerant testing, rather than making the system indeterministic. bq. Finding all the combinations, the code paths that lead to an exit, and exits concurrent with various combinations of operations, would be too much work; we'd never achieve complete coverage – I suggest. Yeah, we can never do that. If we could, we won't need the "guard". bq. Suggest we try this and the watch the flakies a while... Can revert if a bad idea. The alternate is having extra complexity for cleaner shutdown and restart. How will changes in flakies justify for/against that? Btw, all our ProcFramework's fault tolerance testing is doing join() on these threads, so making them daemon doesn't make those tests any better. > Make ExecutorService threads daemon=true. > - > > Key: HBASE-19527 > URL: https://issues.apache.org/jira/browse/HBASE-19527 > Project: HBase > Issue Type: Sub-task >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 2.0.0-beta-2 > > Attachments: HBASE-19527.branch-2.001.patch, > HBASE-19527.branch-2.002.patch, HBASE-19527.master.001.patch, > HBASE-19527.master.001.patch, HBASE-19527.master.001.patch, > HBASE-19527.master.002.patch > > > Let me try this. ExecutorService runs OPENs, CLOSE, etc. If Server is going > down, no point in these threads sticking around (I think). Let me try this. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19527) Make ExecutorService threads daemon=true.
[ https://issues.apache.org/jira/browse/HBASE-19527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16329850#comment-16329850 ] Hadoop QA commented on HBASE-19527: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} branch-2 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 39s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 4s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 26s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 5m 46s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 47s{color} | {color:green} branch-2 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s{color} | {color:green} The patch hbase-procedure passed checkstyle {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 8s{color} | {color:green} hbase-server: The patch generated 0 new + 21 unchanged - 1 fixed = 21 total (was 22) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 25s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 17m 6s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 11s{color} | {color:green} hbase-procedure in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 96m 1s{color} | {color:green} hbase-server in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 34s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}137m 45s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:9f2f2db | | JIRA Issue | HBASE-19527 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12906490/HBASE-19527.branch-2.001.patch | | Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux 46524693c5e5 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | branch-2 / af2d890055 | | maven
[jira] [Commented] (HBASE-19527) Make ExecutorService threads daemon=true.
[ https://issues.apache.org/jira/browse/HBASE-19527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16329761#comment-16329761 ] stack commented on HBASE-19527: --- The Master or RegionServer threads determine whether we should go down or not. If they are stopped or aborted, then all else should go down. Lets not be having to do a decision-per-thread on when to go down (this gets really hard to do... sometimes its exit if process is stopped, other times it is if cluster is up or down, and other combos...). If a worker thread is doing something that it can't give up, that we cannot recover from, thats a problem; lets find it sooner rather than later given threads can exit any which way at any time. {quote}... it'll crash end many background work like create table, merge regions, and anything that we aim to build on top of proc framework - backup, replication, etc {quote} Yeah. We pick up the work again when the Master comes back up. {quote} - Pro: Reliably ending ongoing work at defined sync points{quote} Finding all the combinations, the code paths that lead to an exit, and exits concurrent with various combinations of operations, would be too much work; we'd never achieve complete coverage – I suggest. {quote}Start a ShutdownMonitor thread in HMaster.stop() (which should be Daemon thread) and if it finds itself running for more than X seconds, then call System.exit() (with a nice msg on why such abruptness of course). {quote} Extra complexity in my view. We have shutdown handlers and too many threads already. Suggest we try this and the watch the flakies a while... Can revert if a bad idea. > Make ExecutorService threads daemon=true. > - > > Key: HBASE-19527 > URL: https://issues.apache.org/jira/browse/HBASE-19527 > Project: HBase > Issue Type: Sub-task >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 2.0.0-beta-2 > > Attachments: HBASE-19527.branch-2.001.patch, > HBASE-19527.branch-2.002.patch, HBASE-19527.master.001.patch, > HBASE-19527.master.001.patch, HBASE-19527.master.001.patch, > HBASE-19527.master.002.patch > > > Let me try this. ExecutorService runs OPENs, CLOSE, etc. If Server is going > down, no point in these threads sticking around (I think). Let me try this. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19527) Make ExecutorService threads daemon=true.
[ https://issues.apache.org/jira/browse/HBASE-19527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16329734#comment-16329734 ] Appy commented on HBASE-19527: -- Oh, and that thread should also do thread dump before quitting. My overall motivation is - Build well-defined and deterministic system, and put guards in place if things go sideways. > Make ExecutorService threads daemon=true. > - > > Key: HBASE-19527 > URL: https://issues.apache.org/jira/browse/HBASE-19527 > Project: HBase > Issue Type: Sub-task >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 2.0.0-beta-2 > > Attachments: HBASE-19527.branch-2.001.patch, > HBASE-19527.branch-2.002.patch, HBASE-19527.master.001.patch, > HBASE-19527.master.001.patch, HBASE-19527.master.001.patch, > HBASE-19527.master.002.patch > > > Let me try this. ExecutorService runs OPENs, CLOSE, etc. If Server is going > down, no point in these threads sticking around (I think). Let me try this. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19527) Make ExecutorService threads daemon=true.
[ https://issues.apache.org/jira/browse/HBASE-19527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16329729#comment-16329729 ] Appy commented on HBASE-19527: -- Idk what's good - make them deamon thread or not. Been trying to reason in my head for quite a while now, but can't answer for sure. Just adding some thoughts here. WorkerThread class in ProcExecutor can stop itself if it's idle and the server is going down. But what about the threads doing actual work? Making daemon thread - Pro: master shutdown will not get stuck in any case - Con: Our 'clean' shutdown isn't exactly clean, it'll crash end many background work like create table, merge regions, and anything that we aim to build on top of proc framework - backup, replication, etc Not making threads daemon: - Pro: Reliably ending ongoing work at defined sync points - Cons: Master shutdown can get stuck I personally hate landing in a place where our 'clean' shutdown will actually be like a crash. One alternative I have is: Don't make them daemon threads. Start a ShutdownMonitor thread in HMaster.stop() (which should be Daemon thread) and if it finds itself running for more than X seconds, then call System.exit() (with a nice msg on why such abruptness of course). > Make ExecutorService threads daemon=true. > - > > Key: HBASE-19527 > URL: https://issues.apache.org/jira/browse/HBASE-19527 > Project: HBase > Issue Type: Sub-task >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 2.0.0-beta-2 > > Attachments: HBASE-19527.branch-2.001.patch, > HBASE-19527.branch-2.002.patch, HBASE-19527.master.001.patch, > HBASE-19527.master.001.patch, HBASE-19527.master.001.patch, > HBASE-19527.master.002.patch > > > Let me try this. ExecutorService runs OPENs, CLOSE, etc. If Server is going > down, no point in these threads sticking around (I think). Let me try this. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19527) Make ExecutorService threads daemon=true.
[ https://issues.apache.org/jira/browse/HBASE-19527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16329667#comment-16329667 ] stack commented on HBASE-19527: --- 2.002 is retry. > Make ExecutorService threads daemon=true. > - > > Key: HBASE-19527 > URL: https://issues.apache.org/jira/browse/HBASE-19527 > Project: HBase > Issue Type: Sub-task >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 2.0.0-beta-2 > > Attachments: HBASE-19527.branch-2.001.patch, > HBASE-19527.branch-2.002.patch, HBASE-19527.master.001.patch, > HBASE-19527.master.001.patch, HBASE-19527.master.001.patch, > HBASE-19527.master.002.patch > > > Let me try this. ExecutorService runs OPENs, CLOSE, etc. If Server is going > down, no point in these threads sticking around (I think). Let me try this. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19527) Make ExecutorService threads daemon=true.
[ https://issues.apache.org/jira/browse/HBASE-19527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16329666#comment-16329666 ] stack commented on HBASE-19527: --- I'd prematurely applied this to master and branch. Had to revert in both places. Thats why above application failed. > Make ExecutorService threads daemon=true. > - > > Key: HBASE-19527 > URL: https://issues.apache.org/jira/browse/HBASE-19527 > Project: HBase > Issue Type: Sub-task >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 2.0.0-beta-2 > > Attachments: HBASE-19527.branch-2.001.patch, > HBASE-19527.branch-2.002.patch, HBASE-19527.master.001.patch, > HBASE-19527.master.001.patch, HBASE-19527.master.001.patch, > HBASE-19527.master.002.patch > > > Let me try this. ExecutorService runs OPENs, CLOSE, etc. If Server is going > down, no point in these threads sticking around (I think). Let me try this. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19527) Make ExecutorService threads daemon=true.
[ https://issues.apache.org/jira/browse/HBASE-19527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16329528#comment-16329528 ] Hadoop QA commented on HBASE-19527: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 4s{color} | {color:red} HBASE-19527 does not apply to branch-2. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.6.0/precommit-patchnames for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HBASE-19527 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12906490/HBASE-19527.branch-2.001.patch | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/11092/console | | Powered by | Apache Yetus 0.6.0 http://yetus.apache.org | This message was automatically generated. > Make ExecutorService threads daemon=true. > - > > Key: HBASE-19527 > URL: https://issues.apache.org/jira/browse/HBASE-19527 > Project: HBase > Issue Type: Sub-task >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 2.0.0-beta-2 > > Attachments: HBASE-19527.branch-2.001.patch, > HBASE-19527.master.001.patch, HBASE-19527.master.001.patch, > HBASE-19527.master.001.patch > > > Let me try this. ExecutorService runs OPENs, CLOSE, etc. If Server is going > down, no point in these threads sticking around (I think). Let me try this. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19527) Make ExecutorService threads daemon=true.
[ https://issues.apache.org/jira/browse/HBASE-19527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16329519#comment-16329519 ] stack commented on HBASE-19527: --- Back again with a more radical patch [~appy] It also sets the worker threads in PE to be daemon. I see up in failing tests that we can get stuck trying to update an hbase:meta that has gone away at the end of a test. It is described even at the tail of HBASE-19533. This patch removes the special-handling described in HBASE-19533 where we disable the balancer so the test will shutdown and then adds in all the setDaemons on currently non-daemon threads. > Make ExecutorService threads daemon=true. > - > > Key: HBASE-19527 > URL: https://issues.apache.org/jira/browse/HBASE-19527 > Project: HBase > Issue Type: Sub-task >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 2.0.0-beta-2 > > Attachments: HBASE-19527.branch-2.001.patch, > HBASE-19527.master.001.patch, HBASE-19527.master.001.patch, > HBASE-19527.master.001.patch > > > Let me try this. ExecutorService runs OPENs, CLOSE, etc. If Server is going > down, no point in these threads sticking around (I think). Let me try this. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19527) Make ExecutorService threads daemon=true.
[ https://issues.apache.org/jira/browse/HBASE-19527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16295820#comment-16295820 ] Appy commented on HBASE-19527: -- SG. Thanks > Make ExecutorService threads daemon=true. > - > > Key: HBASE-19527 > URL: https://issues.apache.org/jira/browse/HBASE-19527 > Project: HBase > Issue Type: Sub-task >Reporter: stack >Assignee: stack > Fix For: 2.0.0-beta-2 > > Attachments: HBASE-19527.master.001.patch, > HBASE-19527.master.001.patch, HBASE-19527.master.001.patch > > > Let me try this. ExecutorService runs OPENs, CLOSE, etc. If Server is going > down, no point in these threads sticking around (I think). Let me try this. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19527) Make ExecutorService threads daemon=true.
[ https://issues.apache.org/jira/browse/HBASE-19527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16295707#comment-16295707 ] stack commented on HBASE-19527: --- Thanks [~appy] I started to add comments and then decided to put this aside after making it a subissue of HBASE-19533. It needs a bit more research and testing and I think we need to look at the bigger HBASE-19533 picture. We will probably have to do it before hbase2 but don't think it critical for beta-1 so putting aside. Thanks for the input sir. > Make ExecutorService threads daemon=true. > - > > Key: HBASE-19527 > URL: https://issues.apache.org/jira/browse/HBASE-19527 > Project: HBase > Issue Type: Sub-task >Reporter: stack >Assignee: stack > Fix For: 2.0.0-beta-2 > > Attachments: HBASE-19527.master.001.patch, > HBASE-19527.master.001.patch, HBASE-19527.master.001.patch > > > Let me try this. ExecutorService runs OPENs, CLOSE, etc. If Server is going > down, no point in these threads sticking around (I think). Let me try this. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19527) Make ExecutorService threads daemon=true.
[ https://issues.apache.org/jira/browse/HBASE-19527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16294488#comment-16294488 ] Appy commented on HBASE-19527: -- Other cases are (looking for things calling ExecutorService#submit()): CompactedHFilesDischarger (chore), RegionReplicaFlushHandler, CloseRegionHandler, OpenMetaHandler, OpenPriorityRegionHandler, OpenRegionHandler. Am good with the idea of trying it. The effects of this one line change are much profound and we should address them too. For eg. - The check of if(rs.isStopping) in these handlers will be confusing. They will suggest that those are the only exit point in the flow, but in fact they are not. We should either remove them, but maybe they make sense (to avoid starting new work), in which case there should at least be a comment in each place stating the same. bq. I needed to keep a thread running until the last moment and didn't want to check stopped or aborted flags because these were being set too early in the shutdown process. Thats how I arrrived here. The thread that messages master about region changed state was ending abruptly as soon as the RS set 'stop' which it does early when a cluster is going down. Even after 'stop' is set, we still have the close of all regions to be done. With the messaging thread killed, the notice does not get to Master that a region has been closed. This gives a nice example to think about. Is the messaging thread daemon or not? (seems to me it is not) Is there a separate jira what you tackling this problem? Add a comment somewhere on this dependency of closing regions and messaging and how daemon threads help it? In summary, making these threads daemon makes sense. Although might break current assumptions and maybe logic. Although one line change, it has deeper effects, and the very least should document new assumptions in relevant places. > Make ExecutorService threads daemon=true. > - > > Key: HBASE-19527 > URL: https://issues.apache.org/jira/browse/HBASE-19527 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: stack > Attachments: HBASE-19527.master.001.patch, > HBASE-19527.master.001.patch, HBASE-19527.master.001.patch > > > Let me try this. ExecutorService runs OPENs, CLOSE, etc. If Server is going > down, no point in these threads sticking around (I think). Let me try this. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19527) Make ExecutorService threads daemon=true.
[ https://issues.apache.org/jira/browse/HBASE-19527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16293952#comment-16293952 ] stack commented on HBASE-19527: --- bq. They would abruptly end even for a clean server shutdown. Right. And not being daemon, they could get into some state where they would prevent the RS from going down. Perhaps though making them daemon makes it so they only half do a process ... but we should be able to recover anyways even if something only half-done (an open, a close, a master operation, etc. -- all the places an ExecutorService is used). I think I should push this. If it brings on an issue, its probably something that was hidden by the way we currently go down. I needed to keep a thread running until the last moment and didn't want to check stopped or aborted flags because these were being set too early in the shutdown process. Thats how I arrrived here. The thread that messages master about region changed state was ending abruptly as soon as the RS set 'stop' which it does early when a cluster is going down. Even after 'stop' is set, we still have the close of all regions to be done. With the messaging thread killed, the notice does not get to Master that a region has been closed. (Currently we don't make use of this message but maybe we should -- see HBASE-19533) Thanks for discussion [~appy] > Make ExecutorService threads daemon=true. > - > > Key: HBASE-19527 > URL: https://issues.apache.org/jira/browse/HBASE-19527 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: stack > Attachments: HBASE-19527.master.001.patch, > HBASE-19527.master.001.patch, HBASE-19527.master.001.patch > > > Let me try this. ExecutorService runs OPENs, CLOSE, etc. If Server is going > down, no point in these threads sticking around (I think). Let me try this. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19527) Make ExecutorService threads daemon=true.
[ https://issues.apache.org/jira/browse/HBASE-19527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16293683#comment-16293683 ] Hadoop QA commented on HBASE-19527: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 6s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 32s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 2s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 5m 30s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 38s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 18m 50s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0-beta1. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}105m 22s{color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}142m 40s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 | | JIRA Issue | HBASE-19527 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12902482/HBASE-19527.master.001.patch | | Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux 01a5908bc6a4 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / d276c3275e | | maven | version: Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) | | Default Java | 1.8.0_151 | | unit | https://builds.apache.org/job/PreCommit-HBASE-Build/10491/artifact/patchprocess/patch-unit-hbase-server.txt | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/10491/testReport/ | | modules | C: hbase-server U: hbase-server | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/10491/console | | Powered by | Apache Yetus 0.6.0 http://yetus.apache.org | This message was
[jira] [Commented] (HBASE-19527) Make ExecutorService threads daemon=true.
[ https://issues.apache.org/jira/browse/HBASE-19527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16293659#comment-16293659 ] Appy commented on HBASE-19527: -- bq. Making the threads it runs be daemon seems fine but there may be some assumptions we're breaking w/ this change That was my biggest concern too. If a clean shutdown meant letting worker threads check for server status at fixed safe/check points and quitting by themselves, that would no longer happen now. They would abruptly end even for a clean server shutdown. For the case of server crash, they would have abruptly ended anyways, so all necessary recovery should already be in place. So all boils down to: Is it fine to abruptly quit whatever work these threads are doing in case of a clean shutdown? You mentioned region open/close. Those are regulated by master and RS are supposed to explicitly ack back success. So abruptly ending them should be fine (since master would detect lack of ack and take actions). What other cases? > Make ExecutorService threads daemon=true. > - > > Key: HBASE-19527 > URL: https://issues.apache.org/jira/browse/HBASE-19527 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: stack > Attachments: HBASE-19527.master.001.patch, > HBASE-19527.master.001.patch, HBASE-19527.master.001.patch > > > Let me try this. ExecutorService runs OPENs, CLOSE, etc. If Server is going > down, no point in these threads sticking around (I think). Let me try this. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19527) Make ExecutorService threads daemon=true.
[ https://issues.apache.org/jira/browse/HBASE-19527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16293632#comment-16293632 ] Hadoop QA commented on HBASE-19527: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 2m 17s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 31s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 2s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 5m 33s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 33s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 18m 50s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0-beta1. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}100m 48s{color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}140m 32s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.master.balancer.TestRegionsOnMasterOptions | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 | | JIRA Issue | HBASE-19527 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12902472/HBASE-19527.master.001.patch | | Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux db228335800a 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / f9f869f60a | | maven | version: Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) | | Default Java | 1.8.0_151 | | unit | https://builds.apache.org/job/PreCommit-HBASE-Build/10487/artifact/patchprocess/patch-unit-hbase-server.txt | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/10487/testReport/ | | modules | C: hbase-server U: hbase-server | | Console output |
[jira] [Commented] (HBASE-19527) Make ExecutorService threads daemon=true.
[ https://issues.apache.org/jira/browse/HBASE-19527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16293468#comment-16293468 ] Hadoop QA commented on HBASE-19527: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 36s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 4s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 5m 34s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 38s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 19m 2s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0-beta1. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}105m 27s{color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}143m 23s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.regionserver.TestHRegion | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 | | JIRA Issue | HBASE-19527 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12902451/HBASE-19527.master.001.patch | | Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux 50d7bfe7ab21 3.13.0-133-generic #182-Ubuntu SMP Tue Sep 19 15:49:21 UTC 2017 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 20b42d2d70 | | maven | version: Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) | | Default Java | 1.8.0_151 | | unit | https://builds.apache.org/job/PreCommit-HBASE-Build/10483/artifact/patchprocess/patch-unit-hbase-server.txt | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/10483/testReport/ | | modules | C: hbase-server U: hbase-server | | Console output |
[jira] [Commented] (HBASE-19527) Make ExecutorService threads daemon=true.
[ https://issues.apache.org/jira/browse/HBASE-19527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16293408#comment-16293408 ] stack commented on HBASE-19527: --- A whole class to set a flag! Who ever wrote that was doing make work. Am setting flag on factory here Iirc so wouldn't work. Is it even used in Hadoop code base outside of the patch that committed it? > Make ExecutorService threads daemon=true. > - > > Key: HBASE-19527 > URL: https://issues.apache.org/jira/browse/HBASE-19527 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: stack > Attachments: HBASE-19527.master.001.patch > > > Let me try this. ExecutorService runs OPENs, CLOSE, etc. If Server is going > down, no point in these threads sticking around (I think). Let me try this. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19527) Make ExecutorService threads daemon=true.
[ https://issues.apache.org/jira/browse/HBASE-19527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16293404#comment-16293404 ] Appy commented on HBASE-19527: -- Came across this a while back. https://hadoop.apache.org/docs/r1.2.1/api/org/apache/hadoop/util/Daemon.html Nice syntactic sugar. Can we use it too? > Make ExecutorService threads daemon=true. > - > > Key: HBASE-19527 > URL: https://issues.apache.org/jira/browse/HBASE-19527 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: stack > Attachments: HBASE-19527.master.001.patch > > > Let me try this. ExecutorService runs OPENs, CLOSE, etc. If Server is going > down, no point in these threads sticking around (I think). Let me try this. -- This message was sent by Atlassian JIRA (v6.4.14#64029)