I can reproduce it:

-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running org.apache.hadoop.hbase.master.TestMasterFailover
Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 90.954 sec <<< FAILURE!

Results :

Failed tests:
  testMasterFailoverWithMockedRITOnDeadRS(org.apache.hadoop.hbase.master.TestMasterFailover): region=enabledTable,bbb,1319241846089.6b022df3f7399ee977683c6c5e4be009.

Tests run: 4, Failures: 1, Errors: 0, Skipped: 0

-----Original Message-----
From: Ted Yu [mailto:[email protected]]
Sent: November 4, 2011 13:21
To: [email protected]; lars hofhansl
Subject: Re: TestMasterFailover#testMasterFailoverWithMockedRITOnDeadRS fails on Jenkins

Please run the test in a loop.
I can reproduce the failure on my MacBook.

Gary logged a jira about jmx exceptions. They're non-essential.

Cheers

On Thursday, November 3, 2011, lars hofhansl <[email protected]> wrote:
> When I run that locally (latest trunk) it passes:
>
> -------------------------------------------------------
>  T E S T S
> -------------------------------------------------------
> Running org.apache.hadoop.hbase.master.TestMasterFailover
> Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 69.721 sec
>
> Results :
>
> Tests run: 4, Failures: 0, Errors: 0, Skipped: 0
>
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESSFUL
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 2 minutes 29 seconds
> [INFO] Finished at: Thu Nov 03 22:06:25 PDT 2011
> [INFO] Final Memory: 58M/286M
> [INFO] ------------------------------------------------------------------------
>
>
> In the log I see some JMX related exceptions, but their timing did not
> suggest any potentially hanging threads.
>
> (Linux, OpenJDK 1.6 64 bit, needed to set umask to 022)
>
>
> -- Lars
>
>
>
> ----- Original Message -----
> From: Ted Yu <[email protected]>
> To: [email protected]
> Cc:
> Sent: Thursday, November 3, 2011 8:55 PM
> Subject: TestMasterFailover#testMasterFailoverWithMockedRITOnDeadRS fails on Jenkins
>
> Hi,
> Currently TestMasterFailover#testMasterFailoverWithMockedRITOnDeadRS <
> https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.92/105/testReport/org.apache.hadoop.hbase.master/TestMasterFailover/testMasterFailoverWithMockedRITOnDeadRS/ < https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.92/lastCompletedBuild/testReport/org.apache.hadoop.hbase.master/TestMasterFailover/testMasterFailoverWithMockedRITOnDeadRS/ >>
> consistently fails on 0.92 and TRUNK.
>
> I intended to log a JIRA but https://issues.apache.org is giving me 503
> error.
>
> I briefly went over the code.
> I think after each region is added to regionsThatShouldBeOnline, we should
> log the name of region:
>     // Region of enabled on dead server gets closed but not ack'd by master
>     region = enabledAndOnDeadRegions.remove(0);
>     regionsThatShouldBeOnline.add(region);
>     log("2. expecting " + region.toString() + " to be online: ");
>
> so that if the assertion below fails we know what type of scenario wasn't
> working:
>     for (HRegionInfo hri : regionsThatShouldBeOnline) {
>       assertTrue("region=" + hri.getRegionNameAsString(),
>           onlineRegions.contains(hri));
>     }
>
> From the above mentioned test output I saw a lot of:
>
> 2011-11-03 21:52:58,652 FATAL [Thread-558.logSyncer] wal.HLog(1106):
> Could not sync. Requesting close of hlog
> java.io.IOException: Reflection
>     at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.sync(SequenceFileLogWriter.java:225)
>     at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1090)
>     at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1194)
>     at org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.run(HLog.java:1056)
>     at java.lang.Thread.run(Thread.java:662)
> Caused by: java.lang.reflect.InvocationTargetException
>     at sun.reflect.GeneratedMethodAccessor25.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.sync(SequenceFileLogWriter.java:223)
>     ... 4 more
> Caused by: java.io.IOException: DFSOutputStream is closed
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.sync(DFSClient.java:3483)
>     at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:97)
>     at org.apache.hadoop.io.SequenceFile$Writer.syncFs(SequenceFile.java:944)
>     ... 8 more
>
> Maybe they have something to do with regions stuck in RIT.
>
> Cheers
>
>
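
For what it's worth, the logging idea from the original message can be tried out without the HBase test harness. Below is a rough, self-contained Java sketch, not the actual TestMasterFailover code: the class ExpectedOnlineRegions, its methods, and the region names and scenario labels are all hypothetical stand-ins, and plain strings replace HRegionInfo. It records a scenario label next to each region that should be online, so a failure reports which scenario broke rather than only the region name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; plain strings stand in for HRegionInfo.
public class ExpectedOnlineRegions {
    // Insertion-ordered map: region name -> scenario that put it on the list.
    private final Map<String, String> scenarioByRegion =
        new LinkedHashMap<String, String>();

    // Record the expectation and log it at the moment it is registered.
    public void expect(String regionName, String scenario) {
        scenarioByRegion.put(regionName, scenario);
        System.out.println("expecting " + regionName + " to be online: " + scenario);
    }

    // Describe every expected region that is absent from the online set.
    public List<String> missing(Set<String> onlineRegions) {
        List<String> result = new ArrayList<String>();
        for (Map.Entry<String, String> e : scenarioByRegion.entrySet()) {
            if (!onlineRegions.contains(e.getKey())) {
                result.add("region=" + e.getKey() + " scenario=" + e.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ExpectedOnlineRegions expected = new ExpectedOnlineRegions();
        // Made-up region names and labels, for illustration only.
        expected.expect("enabledTable,aaa", "opened on live server");
        expected.expect("enabledTable,bbb", "closed on dead server, not ack'd by master");

        Set<String> online = new HashSet<String>(Arrays.asList("enabledTable,aaa"));
        List<String> missing = expected.missing(online);
        if (!missing.isEmpty()) {
            System.out.println("not online: " + missing);
        }
    }
}
```

In the real test the same effect would presumably come from the log(...) call proposed above, or from adding the scenario to the assertTrue message; the map is just one way to keep the two together.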
