Hmm... How about just adding to the contributor section that new tests should run reliably N times locally. N=10? N=20? N=100?
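
As a rough sketch of what "run reliably N times locally" could mean in practice, something like the script below would rerun one test command N times and count the failures. The helper name `count_failures` and the test class `TestMyNewFeature` are made up for illustration; `-Dtest=` is the usual Maven Surefire way to select a single test class. N is still open, as asked above.

```python
import subprocess
import sys


def count_failures(cmd, runs):
    """Run `cmd` (a list of arguments) `runs` times.

    Returns how many of those runs exited with a non-zero status.
    """
    failures = 0
    for i in range(1, runs + 1):
        # Discard the command's output; only the exit status matters here.
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            failures += 1
            print(f"run {i}/{runs}: FAILED (exit code {result.returncode})")
    return failures


# Hypothetical usage (TestMyNewFeature is a placeholder class name):
#
#   failed = count_failures(["mvn", "test", "-Dtest=TestMyNewFeature"], 20)
#   print(f"{failed} of 20 runs failed")
#   sys.exit(1 if failed else 0)
```

A test that fails even once in N runs would count as flaky under this check. It only catches flakiness that shows up on the local machine, so it would not replace watching the Jenkins runs.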
On Wed, Dec 26, 2012 at 12:02 PM, Enis Söztutar <[email protected]> wrote:

> Just a reference of some of the recent efforts that went in:
>
> HBASE-7432 TestHBaseFsck prevents testsuite from finishing
> HBASE-7431 TestSplitTransactionOnCluster tests still flaky
> HBASE-7417 Test patch, hopefully fixes TestReplication
> HBASE-7421 TestHFileCleaner->testHFileCleaning has an aggressive timeout
> HBASE-7398 [0.94 UNIT TESTS] TestAssignmentManager fails frequently on CentOS 5
> HBASE-7338 Fix flaky condition for org.apache.hadoop.hbase.TestRegionRebalancing.testRebalanceOnRegionServerNumberChange
> HBASE-6175 TestFSUtils flaky on hdfs getFileStatus method
> HBASE-7343 Fix flaky condition for TestDrainingServer (Himanshu)
> HBASE-7301 Force ipv4 for unit tests
> HBASE-7300 HbckTestingUtil needs to keep a static executor to lower the number of threads used
> HBASE-6206 Large tests fail with jdk1.7
> HBASE-7252 TestSizeBasedThrottler fails occasionally
> HBASE-7235 TestMasterObserver is flaky
> HBASE-7172 TestSplitLogManager.testVanishingTaskZNode() fails when run individually and is flaky
> HBASE-7177 TestZooKeeperScanPolicyObserver.testScanPolicyObserver is flaky
> HBASE-7166 TestSplitTransactionOnCluster tests are flaky
> HBASE-7165 TestSplitLogManager.testUnassignedTimeout is flaky
> HBASE-5984 TestLogRolling.testLogRollOnPipelineRestart failed with HADOOP 2.0.0
> HBASE-7142 TestSplitLogManager#testDeadWorker may fail because of hard limit on the TimeoutMonitor's timeout period (Himanshu)
> HBASE-7143 TestMetaMigrationRemovingHTD fails when used with Hadoop 0.23/2.x (Andrey Klochlov)
> HBASE-6958 TestAssignmentManager sometimes fails
> HBASE-6305 TestLocalHBaseCluster hangs with hadoop 2.0/0.23 builds. (Himanshu)
> HBASE-6796 ADDENDUM, remove spurious time limit from testHFileCleaning
> HBASE-6852, REVERT again, due to unexplained test failures that only occur on the jenkins machines
> HBASE-7077 ADDENDUM, add TestCategory
> HBASE-6733 TestReplication.queueFailover occasionally fails [Part-2]
> HBASE-6906 TestHBaseFsck#testQuarantine* tests are flakey due to TestNotEnabledException
> HBASE-6784 TestCoprocessorScanPolicy is sometimes flaky when run locally
> HBASE-6714 TestMultiSlaveReplication#testMultiSlaveReplication may fail
> HBASE-6715 TestFromClientSide.testCacheOnWriteEvictOnClose is flaky
>
> Please keep these in mind when you are writing a new test.
> Enis
>
> On Wed, Dec 26, 2012 at 10:03 AM, Stack <[email protected]> wrote:
>
> > I just added a section to the 'contributing' section on committers being
> > responsible for ensuring contributor's patches do not break build or tests.
> > St.Ack
> >
> > On Wed, Dec 26, 2012 at 9:08 AM, Stack <[email protected]> wrote:
> >
> > > Or there is a submitting patches section:
> > > http://hbase.apache.org/book.html#submitting.patches
> > > St.Ack
> > >
> > > On Wed, Dec 26, 2012 at 8:53 AM, Stack <[email protected]> wrote:
> > >
> > >> Thanks for doing the fixup "Iron Hand". +1 on these rules for a branch
> > >> or for any branch (We'll have to do the same for trunk when it becomes
> > >> 0.96 branch). Should we add something here:
> > >> http://hbase.apache.org/book.html#hbase.tests Or to the community
> > >> section: http://hbase.apache.org/book.html#community ? Or to the
> > >> developer section?
> > >>
> > >> St.Ack
> > >>
> > >> On Tue, Dec 25, 2012 at 11:57 AM, lars hofhansl <[email protected]> wrote:
> > >>
> > >>> During the past few days I spent some time to bring the 0.94 tests back
> > >>> into shape.
> > >>>
> > >>> GC issues, bad backports, hanging tests, memory issues, you name it.
> > >>> I do not want to ever have to do that again.
> > >>>
> > >>> The good news is: The 0.94 tests are back in shape now. Yeah!
> > >>>
> > >>> If you commit a patch it is your responsibility to make sure it passes
> > >>> the test suite.
> > >>> Either the tests should be fixed in a reasonable amount of time or the
> > >>> commit should be reverted.
> > >>> This is mainly for committers, contributors should also watch the test
> > >>> runs for their patches.
> > >>> No excuses. The tests are passing now.
> > >>> I do not care whether a test passes locally, or whether it fails rarely,
> > >>> or whether some tests failed previously, or whatever.
> > >>>
> > >>> Please, consider this a condition for me to continue as release manager
> > >>> for 0.94.
> > >>> (This is only for the 0.94 tests. I cannot speak for HadoopQA, or the
> > >>> regular trunk test suite, although eventually I assume we want similar
> > >>> guidelines there)
> > >>>
> > >>> I increased the retention time for past builds. I will find you :)
> > >>> I will publicly shame you. I will retroactively -1 the change and revert
> > >>> it, and then shame you again. :)
> > >>>
> > >>> Lastly, this is a function of the large amount of contributed patches.
> > >>> So it is a good problem to have.
> > >>> HBase is an actively maintained project and we certainly want to keep it
> > >>> this way, just with an acknowledgement that keeping the test suite passing
> > >>> is important.
> > >>>
> > >>> Thanks and Merry Christmas (to whoever celebrates that).
> > >>>
> > >>> -- Lars

--
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
