[jira] [Comment Edited] (HBASE-19954) ShutdownHook should check whether shutdown hook is tracked by ShutdownHookManager
[ https://issues.apache.org/jira/browse/HBASE-19954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16369611#comment-16369611 ]

Ted Yu edited comment on HBASE-19954 at 2/20/18 12:23 AM:
----------------------------------------------------------

Patch v2 adds an audience annotation for the ShutdownHookManager class.
hasShutdownHook() is exercised by TestBlockReorder against hadoop3.
If a specific scenario is needed to test hasShutdownHook(), let me know.

was (Author: yuzhih...@gmail.com):
Patch v2 adds an audience annotation for the ShutdownHookManager class.
hasShutdownHook() is exercised by TestBlockReorder against hadoop3.
If a specific scenario is desired to test hasShutdownHook(), let me know.

> ShutdownHook should check whether shutdown hook is tracked by ShutdownHookManager
> ---------------------------------------------------------------------------------
>
>                 Key: HBASE-19954
>                 URL: https://issues.apache.org/jira/browse/HBASE-19954
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>            Priority: Major
>             Fix For: 2.0.0-beta-2
>
>         Attachments: 19954.v1.txt, 19954.v2.txt
>
>
> Currently ShutdownHook#suppressHdfsShutdownHook() does the following:
> {code}
>       synchronized (fsShutdownHooks) {
>         boolean isFSCacheDisabled = fs.getConf().getBoolean("fs.hdfs.impl.disable.cache", false);
>         if (!isFSCacheDisabled && !fsShutdownHooks.containsKey(hdfsClientFinalizer)
>             && !ShutdownHookManager.deleteShutdownHook(hdfsClientFinalizer)) {
> {code}
> There is no check that ShutdownHookManager still tracks the shutdown hook,
> leading to a potential RuntimeException (as can be observed in the hadoop3 Jenkins job).

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
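For readers following the thread, here is a minimal, self-contained sketch of the kind of guard the issue title asks for: check whether the hook is still tracked before treating a failed removal as an error. It is not the attached patch. HookSuppressionSketch, HookTracker, Outcome, and suppress() are hypothetical names invented for illustration, and the tracker is a stand-in for a hook manager rather than the Hadoop or HBase ShutdownHookManager.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative sketch only; all names here are hypothetical. */
public class HookSuppressionSketch {

  /** Stand-in for a hook manager that remembers which hooks are registered. */
  public static final class HookTracker {
    private final Set<Runnable> hooks = ConcurrentHashMap.newKeySet();

    public void add(Runnable hook) { hooks.add(hook); }
    public boolean hasShutdownHook(Runnable hook) { return hooks.contains(hook); }
    public boolean remove(Runnable hook) { return hooks.remove(hook); }
    /** Mimics the "clearing hooks" step seen in the debugging log. */
    public void clear() { hooks.clear(); }
  }

  public enum Outcome { SUPPRESSED, NOT_TRACKED }

  /**
   * Removes the hook if it is still tracked. A hook that is no longer
   * tracked is reported as NOT_TRACKED instead of raising an exception.
   * The check and the removal are not atomic here; a real implementation
   * would need to hold whatever lock protects the hook set.
   */
  public static Outcome suppress(HookTracker tracker, Runnable hook) {
    if (!tracker.hasShutdownHook(hook)) {
      return Outcome.NOT_TRACKED;
    }
    if (!tracker.remove(hook)) {
      throw new RuntimeException("Failed suppression of fs shutdown hook: " + hook);
    }
    return Outcome.SUPPRESSED;
  }

  public static void main(String[] args) {
    HookTracker tracker = new HookTracker();
    Runnable finalizer = () -> { };

    tracker.add(finalizer);
    System.out.println(suppress(tracker, finalizer)); // hook was tracked

    tracker.add(finalizer);
    tracker.clear();                                  // hooks cleared elsewhere
    System.out.println(suppress(tracker, finalizer)); // hook no longer tracked
  }
}
```

Whether silently skipping an untracked hook is the right behavior is exactly the question raised in review, so the sketch returns a distinct outcome rather than hiding the case.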
[jira] [Comment Edited] (HBASE-19954) ShutdownHook should check whether shutdown hook is tracked by ShutdownHookManager
[ https://issues.apache.org/jira/browse/HBASE-19954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16368420#comment-16368420 ]

Ted Yu edited comment on HBASE-19954 at 2/18/18 1:54 AM:
---------------------------------------------------------

Did some debugging by installing hadoop-common of hadoop3, with additional logging, into the local maven repo.
{code}
2018-02-17 16:14:14,573 INFO [Time-limited test] util.ShutdownHookManager(286): clearing hooks
2018-02-17 16:14:14,588 INFO [Time-limited test] hbase.HBaseTestingUtility(1114): Minicluster is down
2018-02-17 16:14:14,627 INFO [Time-limited test] hbase.ResourceChecker(172): after: fs.TestBlockReorder#testBlockLocationReorder Thread=110 (was 8)
{code}
Note that the above is from the first test in TestBlockReorder, where the {{hooks}} Set of the hadoop ShutdownHookManager was cleared (first line).
The 'Failed suppression' exception happened in the second subtest, where the FileSystem$Cache$ClientFinalizer instance was no longer in the Set.
I dumped the contents of the {{hooks}} Set at the time of the exception and saw fsdataset.impl.BlockPoolSlice instances but no ClientFinalizer instance.

After poking around the hadoop ShutdownHookManager, I don't see a bug.

was (Author: yuzhih...@gmail.com):
Did some debugging by installing hadoop-common, with additional logging, into the local maven repo.
{code}
2018-02-17 16:14:14,573 INFO [Time-limited test] util.ShutdownHookManager(286): clearing hooks
2018-02-17 16:14:14,588 INFO [Time-limited test] hbase.HBaseTestingUtility(1114): Minicluster is down
2018-02-17 16:14:14,627 INFO [Time-limited test] hbase.ResourceChecker(172): after: fs.TestBlockReorder#testBlockLocationReorder Thread=110 (was 8)
{code}
Note that the above is from the first test in TestBlockReorder, where the {{hooks}} Set of the hadoop ShutdownHookManager was cleared (first line).
The 'Failed suppression' exception happened in the second subtest, where the FileSystem$Cache$ClientFinalizer instance was no longer in the Set.
I dumped the contents of the {{hooks}} Set at the time of the exception and saw fsdataset.impl.BlockPoolSlice instances but no ClientFinalizer instance.

After poking around the hadoop ShutdownHookManager, I don't see a bug.
[jira] [Comment Edited] (HBASE-19954) ShutdownHook should check whether shutdown hook is tracked by ShutdownHookManager
[ https://issues.apache.org/jira/browse/HBASE-19954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16368134#comment-16368134 ]

stack edited comment on HBASE-19954 at 2/17/18 7:55 AM:
--------------------------------------------------------

You don't say why it happens. Why in hadoop3 do we get this exception and not in hadoop2? Why is the hook removed earlier or not installed? My concern is that this patch just papers over a more substantial issue: a hook not being installed.

Looking at the patch, there is refactoring and no test. A test seems easy enough to add. Compound checks like the one done in ShutdownHook.java are easy to get wrong. Needs an audience annotation.

I tried it and it seems to address the below.

I see this exception when TestBlockReorder fails against hadoop3. The exception is:
{code}
java.lang.RuntimeException: Failed suppression of fs shutdown hook: org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer@771d03a8
    at org.apache.hadoop.hbase.regionserver.ShutdownHook.suppressHdfsShutdownHook(ShutdownHook.java:207)
    at org.apache.hadoop.hbase.regionserver.ShutdownHook.install(ShutdownHook.java:85)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:927)
    at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.runRegionServer(MiniHBaseCluster.java:187)
    at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.access$000(MiniHBaseCluster.java:133)
    at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer$1.run(MiniHBaseCluster.java:171)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:360)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1942)
    at org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:307)
    at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.run(MiniHBaseCluster.java:168)
    at java.lang.Thread.run(Thread.java:745)
{code}

was (Author: stack):
You don't say why it happens.
Why in hadoop3 do we get this exception and not in hadoop2? Why is the hook removed earlier or not installed?

Looking at the patch, there is refactoring and no test. A test seems easy enough to add. Compound checks like the one done in ShutdownHook.java are easy to get wrong. Needs an audience annotation.

I tried it and it seems to address the below.

I see this exception when TestBlockReorder fails against hadoop3. The exception is:
{code}
java.lang.RuntimeException: Failed suppression of fs shutdown hook: org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer@771d03a8
    at org.apache.hadoop.hbase.regionserver.ShutdownHook.suppressHdfsShutdownHook(ShutdownHook.java:207)
    at org.apache.hadoop.hbase.regionserver.ShutdownHook.install(ShutdownHook.java:85)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:927)
    at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.runRegionServer(MiniHBaseCluster.java:187)
    at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.access$000(MiniHBaseCluster.java:133)
    at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer$1.run(MiniHBaseCluster.java:171)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:360)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1942)
    at org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:307)
    at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.run(MiniHBaseCluster.java:168)
    at java.lang.Thread.run(Thread.java:745)
{code}