Hi Lars,

I was out of town and travelling for the last 2 days. I will check the reason for the testcase failures right away. Had I been around, I would have helped out earlier.
Sorry about that.

Regards
Ram

> -----Original Message-----
> From: lars hofhansl [mailto:[email protected]]
> Sent: Monday, October 08, 2012 2:07 AM
> To: [email protected]
> Subject: Re: State of the 0.94 tests
>
> After this change things look better. Apologies for the noise. Stay tuned for the next RC.
>
> -- Lars
>
> ________________________________
> From: lars hofhansl <[email protected]>
> To: "[email protected]" <[email protected]>
> Sent: Sunday, October 7, 2012 11:23 AM
> Subject: Re: State of the 0.94 tests
>
> I looked back through the failures. I had recently enabled all "ubuntu" build vms for the 0.94 builds.
> It turns out that most of the environment issues occur on ubuntu2. I excluded that from the build vms.
>
> -- Lars
>
> ________________________________
> From: Andrew Purtell <[email protected]>
> To: "[email protected]" <[email protected]>
> Sent: Sunday, October 7, 2012 1:36 AM
> Subject: Re: State of the 0.94 tests
>
> Too many open files usually is an environment issue.
>
> Lars, you should consider setting up a private Jenkins as a sanity check.
>
> On Oct 7, 2012, at 2:41 PM, lars hofhansl <[email protected]> wrote:
>
> > Looks like after all that whining I finally got a successful build.
> > But I lost confidence in the current 0.94 code line.
> >
> > Still, it is possible that all of these were environmental issues. If we can get a few more successful runs, it could be OK.
> >
> > -- Lars
> >
> > ________________________________
> > From: lars hofhansl <[email protected]>
> > To: hbase-dev <[email protected]>
> > Sent: Saturday, October 6, 2012 11:11 PM
> > Subject: State of the 0.94 tests
> >
> > I've been trying (essentially the entire day) to get a successful jenkins build for 0.94 (triggering the test run periodically from my phone). Not a *single* run succeeded.
> > This is clearly not acceptable. Something is off.
> >
> > The tests that fail most frequently are:
> > - TestSplitTransactionOnCluster.testShouldThrowIOExceptionIfStoreFileSizeIsEmptyAndSHouldSuccessfullyExecuteRollback
> > - TestSplitTransactionOnCluster.testShouldClearRITWhenNodeFoundInSplittingState
> > (The failure cause most of the time is too many open files, but they also fail because of unavailable regions.)
> >
> > Both tests were added recently (since 0.94.2RC2). See HBASE-6854 and HBASE-6853.
> >
> > Either there is something wrong with the tests, or we introduced some problems in the code base.
> >
> > Note that I am not dinging these two changes specifically. Both were fixes with a lot of thought and care behind them.
> >
> > There are also various timeout issues in other tests.
> >
> > These were all the fixes added since the last RC:
> > [HBASE-4565] - Maven HBase build broken on cygwin with copynativelib.sh call
> > [HBASE-6299] - RS starting region open while failing ack to HMaster.sendRegionOpen() causes inconsistency in HMaster's region state and a series of successive problems
> > [HBASE-6679] - RegionServer aborts due to race between compaction and split
> > [HBASE-6688] - folder referred by thrift demo app instructions is outdated
> > [HBASE-6854] - Deletion of SPLITTING node on split rollback should clear the region from RIT
> > [HBASE-6871] - HFileBlockIndex Write Error in HFile V2 due to incorrect split into intermediate index blocks
> > [HBASE-6888] - HBase scripts ignore any HBASE_OPTS set in the environment
> > [HBASE-6889] - Ignore source control files with apache-rat
> > [HBASE-6900] - RegionScanner.reseek() creates NPE when a flush or compaction happens before the reseek.
> > [HBASE-6901] - Store file compactSelection throws ArrayIndexOutOfBoundsException
> > [HBASE-6906] - TestHBaseFsck#testQuarantine* tests are flakey due to TableNotEnabledException
> > [HBASE-6912] - Filters are not properly applied in certain cases
> > [HBASE-6916] - HBA logs at info level errors that won't show in the shell
> > [HBASE-6920] - On timeout connecting to master, client can get stuck and never make progress
> > [HBASE-6927] - WrongFS using HRegionInfo.getTableDesc() and different fs for hbase.root and fs.defaultFS
> > [HBASE-6946] - JavaDoc missing from release tarballs
> > [HBASE-5582] - "No HServerInfo found for" should be a WARNING message
> > [HBASE-6914] - Scans/Gets/Mutations don't give a good error if the table is disabled.
> > [HBASE-6853] - IllegalArgument Exception is thrown when an empty region is spliitted.
> >
> > Unless somebody (Ram :) ) speaks up I will roll back HBASE-6854 and HBASE-6853 (and maybe HBASE-6299).
> >
> > I could also roll all of these back except HBASE-6920 (which is the one that sunk the last RC), and leave the rest for the next RC.
> >
> > Also, from now on, at least until 0.94.2 is released, please clear all 0.94 changes with me before you commit. There is clearly too much churn going into 0.94 too quickly, which prevents 0.94.2 from stabilizing.
> >
> > -- Lars
