[ https://issues.apache.org/jira/browse/HADOOP-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HADOOP-2017:
--------------------------

    Attachment: trsa.patch

A patch w/ more logging and thread dumping to help show what is going on, and a mechanism that notices moved regions sooner.

{code}
HADOOP-2017 TestRegionServerAbort failure in patch build #903 and nightly #266
Notice moved META regions sooner. Also added more logging and
thread dumping once a minute when test starts to take too long so
can see where we are hung (if we are hung).

M src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestHStoreFile.java
    Inherit from HBaseTestCase.
M src/contrib/hbase/src/test/org/apache/hadoop/hbase/HBaseClusterTestCase.java
    (threadDumpingJoin): Added.
M src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestRegionServerAbort.java
    Run verification in its own thread so can concurrently thread dump
    if test is going on too long.
M src/contrib/hbase/src/test/org/apache/hadoop/hbase/DFSAbort.java
    Moved join up into parent class.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/Chore.java
    Remove unused import.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HMaster.java
    (MetaRegion.toString): Added.
    Added logging around assignment checking and log split.
    (MetaRegion.compareTo): Add consideration of server address.
    (numberOfMetaRegions, metaRegionsToScan, onlineMetaRegions): Put
    declaration and assignment together and made final.
    (scanOneMetaRegion): If the region is no longer in onlineMetaRegions,
    give up trying to scan.
    (unassignRootRegion): Added (Not yet finished).
{code}

> [hbase] TestRegionServerAbort failure in patch build #903 and nightly #266
> --------------------------------------------------------------------------
>
>                 Key: HADOOP-2017
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2017
>             Project: Hadoop
>          Issue Type: Bug
>          Components: contrib/hbase
>            Reporter: stack
>            Priority: Minor
>             Fix For: 0.15.0
>
>         Attachments: trsa.patch
>
>
> In patch build #903, the metascanner keeps trying to go to the downed server
> even though onlineMetaRegions has been updated w/ the new location, and then the
> metascanner just goes away (or hangs).
> In nightly build #266, it's a similar scenario, only the remaining region
> servers decide to shut down because they haven't been able to reach the
> master in 7 seconds.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
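
For reference, the idea behind the threadDumpingJoin helper listed in the patch notes (join a thread, and dump all thread stacks at a fixed interval while it is still running) could be sketched roughly as below. This is not the code from trsa.patch: the class name, the method signatures, the dump format, and the returned dump count are all assumptions made for illustration, and only standard JDK calls are used.

```java
import java.util.Map;

/** Hypothetical illustration only; not the helper added to HBaseClusterTestCase. */
final class ThreadDumpingJoin {
    private ThreadDumpingJoin() {}

    /** Returns the stack traces of all live threads as text. */
    static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e
                : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            sb.append('"').append(t.getName()).append("\" state=")
              .append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    /**
     * Waits for t to finish. Each time intervalMillis elapses and t is
     * still alive, writes a thread dump to stderr.
     * Returns the number of dumps written.
     */
    static int threadDumpingJoin(Thread t, long intervalMillis)
            throws InterruptedException {
        if (intervalMillis <= 0) {
            // join(0) would block forever, so reject it explicitly.
            throw new IllegalArgumentException("intervalMillis must be > 0");
        }
        int dumps = 0;
        while (t.isAlive()) {
            t.join(intervalMillis);
            if (t.isAlive()) {
                System.err.println("Still waiting on " + t.getName()
                    + "; thread dump follows:");
                System.err.println(dumpAllThreads());
                dumps++;
            }
        }
        return dumps;
    }
}
```

With an interval of 60000 ms this gives the "once a minute" behaviour described in the patch notes; a test would run its verification in a separate thread and call threadDumpingJoin on that thread instead of a plain join.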