Re: BadApple report. Seems like I'm wasting my time.
I still think it's a mistake to try to use all the Jenkins results to drive ignoring tests. It needs to be an objective measure in a good env. We also should not be ignoring tests en masse without individual consideration. Critical test coverage should be treated differently than any random test, especially when stability is sometimes simple to achieve for that test. A decade+ of history says it's unlikely you get much consistent help digging out of a huge test ignore hell. Beasting in a known good environment and a few very interested parties is the only path out of this if you ask me. We need to get clean in a known good env and then automate beasting defense, using Jenkins to find issues in other environments. Unfortunately, not something I can help out with in the short term anymore.

Mark

On Wed, Aug 1, 2018 at 8:10 AM Erick Erickson wrote:
> Alexandre:
>
> Feel free! What I'm struggling with is not that someone checked in some code that all of a sudden started breaking things. Rather that a test that's been working perfectly will fail once, then won't reproducibly fail again, and does _not_ appear to be related to recent code changes.
>
> In fact that's the crux of the matter: it's difficult/impossible to tell at a glance when a test fails whether it is or is not related to a recent code change.
>
> Erick
>
> On Wed, Aug 1, 2018 at 8:05 AM, Alexandre Rafalovitch wrote:
> > Just a completely random thought that I do not have deep knowledge for (still learning my way around Solr tests).
> >
> > Is this something that Machine Learning could help with? The GitHub repo/history is a fantastic source of learning on who worked on which file, how often, etc. We certainly should be able to get some 'most significant developer' stats out of that.
> >
> > Regards,
> > Alex.
> >
> > On 1 August 2018 at 10:56, Erick Erickson wrote:
> >> Shawn:
> >>
> >> Trouble is there were 945 tests that failed at least once in the last 4 weeks. And the trend is all over the map on a weekly basis.
> >>
> >> e-mail-2018-06-11.txt: There were 989 unannotated tests that failed
> >> e-mail-2018-06-18.txt: There were 689 unannotated tests that failed
> >> e-mail-2018-06-25.txt: There were 555 unannotated tests that failed
> >> e-mail-2018-07-02.txt: There were 723 unannotated tests that failed
> >> e-mail-2018-07-09.txt: There were 793 unannotated tests that failed
> >> e-mail-2018-07-16.txt: There were 809 unannotated tests that failed
> >> e-mail-2018-07-23.txt: There were 953 unannotated tests that failed
> >> e-mail-2018-07-30.txt: There were 945 unannotated tests that failed
> >>
> >> I'm BadApple'ing tests that fail every week for the last 4 weeks on the theory that those are not temporary issues (hey, we all commit code that breaks something, then have to figure out why and fix).
> >>
> >> I also have the feeling that somewhere, somehow, our test framework is making some assumptions that are invalid. Or too strict. Or too fast. Or there's some fundamental issue with some of our classes. Or... The number of sporadic issues where the Object Tracker spits stuff out, for instance, screams that some assumption we're making, either in the code or in the test framework, is flawed.
> >>
> >> What I don't know is how to make visible progress. It's discouraging to fix something and then next week have more tests fail for unrelated reasons.
> >>
> >> Visibility is the issue to me. We have no good way of saying "these tests _just_ started failing for a reason." As a quick experiment, I extended the triage to 10 weeks (no attempt to ascertain if these tests even existed 10 weeks ago). Here are the tests that have _only_ failed in the last week, not the previous 9. BadApple'ing anything that's only failed once seems overkill.
> >>
> >> Although the test that failed 77 times does just stand out.
> >>
> >> week  pct   runs  fails  test
> >> 0     0.2   460   1      CloudSolrClientTest.testVersionsAreReturned
> >> 0     0.2   466   1      ComputePlanActionTest.testSelectedCollections
> >> 0     0.2   464   1      ConfusionMatrixGeneratorTest.testGetConfusionMatrixWithBM25NB
> >> 0     8.1   37    3      IndexSizeTriggerTest(suite)
> >> 0     0.2   454   1      MBeansHandlerTest.testAddedMBeanDiff
> >> 0     0.2   454   1      MBeansHandlerTest.testDiff
> >> 0     0.2   455   1      MetricTriggerTest.test
> >> 0     0.2   455   1      MetricsHandlerTest.test
> >> 0     0.2   455   1      MetricsHandlerTest.testKeyMetrics
> >> 0     0.2   453   1      RequestHandlersTest.testInitCount
> >> 0     0.2   453   1      RequestHandlersTest.testStatistics
> >> 0     0.2   453   1      ScheduledTriggerIntegrationTest(suite)
> >> 0     0.2   451   1      SearchRateTriggerTest.testWaitForElapsed
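Mark's "beasting" suggestion (hammering a single test many times in a known good environment until it either proves stable or fails reproducibly) can be sketched as a simple loop. This is an illustrative sketch only: the `run_once` callable is an assumption standing in for however a real invocation would be launched (e.g. something like the build's `ant beast` target, or a CI job).

```python
# Minimal sketch of a "beasting" loop: run one test repeatedly in a
# controlled environment and count passes/failures. `run_once` is a
# hypothetical callable returning True on pass, False on failure.

def beast(run_once, iterations=100):
    """Run a single test `iterations` times; return (passes, failures)."""
    passes = failures = 0
    for _ in range(iterations):
        if run_once():
            passes += 1
        else:
            failures += 1
    return passes, failures

if __name__ == "__main__":
    # Fake deterministic runner standing in for a real test invocation:
    # fails on every 10th run, simulating a flaky test.
    calls = {"n": 0}
    def fake_runner():
        calls["n"] += 1
        return calls["n"] % 10 != 0
    print(beast(fake_runner, 100))  # (90, 10)
```

The point of running in a known good environment first is that a nonzero failure count then points at the test or the code, not at a noisy Jenkins box.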
Re: BadApple report. Seems like I'm wasting my time.
Alexandre: Feel free! What I'm struggling with is not that someone checked in some code that all the sudden started breaking things. Rather that a test that's been working perfectly will fail once the won't reproducibly fail again and does _not_ appear to be related to recent code changes. In fact that's the crux of the matter, it's difficult/impossible to tell at a glance when a test fails whether it is or is not related to a recent code change. Erick On Wed, Aug 1, 2018 at 8:05 AM, Alexandre Rafalovitch wrote: > Just a completely random thought that I do not have deep knowledge for > (still learning my way around Solr tests). > > Is this something that Machine Learning could help with? The Github > repo/history is a fantastic source of learning on who worked on which > file, how often, etc. We certainly should be able to get some 'most > significant developer' stats out of that. > > Regards, >Alex. > > On 1 August 2018 at 10:56, Erick Erickson wrote: >> Shawn: >> >> Trouble is there were 945 tests that failed at least once in the last >> 4 weeks. And the trend is all over the map on a weekly basis. >> >> e-mail-2018-06-11.txt: There were 989 unannotated tests that failed >> e-mail-2018-06-18.txt: There were 689 unannotated tests that failed >> e-mail-2018-06-25.txt: There were 555 unannotated tests that failed >> e-mail-2018-07-02.txt: There were 723 unannotated tests that failed >> e-mail-2018-07-09.txt: There were 793 unannotated tests that failed >> e-mail-2018-07-16.txt: There were 809 unannotated tests that failed >> e-mail-2018-07-23.txt: There were 953 unannotated tests that failed >> e-mail-2018-07-30.txt: There were 945 unannotated tests that failed >> >> I'm BadApple'ing tests that fail every week for the last 4 weeks on >> the theory that those are not temporary issues (hey, we all commit >> code that breaks something then have to figure out why and fix). 
>> >> I also have the feeling that somewhere, somehow, our test framework is >> making some assumptions that are invalid. Or too strict. Or too fast. >> Or there's some fundamental issue with some of our classes. Or... The >> number of sporadic issues where the Object Tracker spits stuff out for >> instance screams that some assumption we're making, either in the code >> or in the test framework is flawed. >> >> What I don't know is how to make visible progress. It's discouraging >> to fix something and then next week have more tests fail for unrelated >> reasons. >> >> Visibility is the issue to me. We have no good way of saying "these >> tests _just started failing for a reason. As a quick experiment, I >> extended the triage to 10 weeks (no attempt to ascertain if these >> tests even existed 10 weeks ago). Here are the tests that have _only_ >> failed in the last week, not the previous 9. BadApple'ing anything >> that's only failed once seems overkill >> >> Although the test that failed 77 times does just stand out >> >> week pctruns failstest >> 00.2 460 1 >> CloudSolrClientTest.testVersionsAreReturned >> 00.2 466 1 >> ComputePlanActionTest.testSelectedCollections >> 00.2 464 1 >> ConfusionMatrixGeneratorTest.testGetConfusionMatrixWithBM25NB >> 08.1 37 3 IndexSizeTriggerTest(suite) >> 00.2 454 1 MBeansHandlerTest.testAddedMBeanDiff >> 00.2 454 1 MBeansHandlerTest.testDiff >> 00.2 455 1 MetricTriggerTest.test >> 00.2 455 1 MetricsHandlerTest.test >> 00.2 455 1 MetricsHandlerTest.testKeyMetrics >> 00.2 453 1 RequestHandlersTest.testInitCount >> 00.2 453 1 RequestHandlersTest.testStatistics >> 00.2 453 1 ScheduledTriggerIntegrationTest(suite) >> 00.2 451 1 >> SearchRateTriggerTest.testWaitForElapsed >> 00.2 425 1 >> SoftAutoCommitTest.testSoftCommitWithinAndHardCommitMaxTimeRapidAdds >> 0 14.7 525 77 >> StreamExpressionTest.testSignificantTermsStream >> 00.2 454 1 TestBadConfig(suite) >> 00.2 465 1 >> TestBlockJoin.testMultiChildQueriesOfDiffParentLevels >> 00.6 
462 3 >> TestCloudCollectionsListeners.testCollectionDeletion >> 00.2 456 1 TestInfoStreamLogging(suite) >> 00.2 456 1 TestLazyCores.testLazySearch >> 00.2 473 1 >> TestLucene70DocValuesFormat.testSortedSetAroundBlockSize >> 0 15.4 26 4 >> TestMockDirectoryWrapper.testThreadSafetyInListAll >> 00.2 454 1 TestNodeLostTrigger.testTrigger >> 00.2 453 1 TestRecovery.stressLogReplay >> 00.2 505 1 >>
Re: BadApple report. Seems like I'm wasting my time.
Just a completely random thought that I do not have deep knowledge for (still learning my way around Solr tests).

Is this something that Machine Learning could help with? The GitHub repo/history is a fantastic source of learning on who worked on which file, how often, etc. We certainly should be able to get some 'most significant developer' stats out of that.

Regards,
Alex.

On 1 August 2018 at 10:56, Erick Erickson wrote:
> Shawn:
>
> Trouble is there were 945 tests that failed at least once in the last 4 weeks. And the trend is all over the map on a weekly basis.
>
> e-mail-2018-06-11.txt: There were 989 unannotated tests that failed
> e-mail-2018-06-18.txt: There were 689 unannotated tests that failed
> e-mail-2018-06-25.txt: There were 555 unannotated tests that failed
> e-mail-2018-07-02.txt: There were 723 unannotated tests that failed
> e-mail-2018-07-09.txt: There were 793 unannotated tests that failed
> e-mail-2018-07-16.txt: There were 809 unannotated tests that failed
> e-mail-2018-07-23.txt: There were 953 unannotated tests that failed
> e-mail-2018-07-30.txt: There were 945 unannotated tests that failed
>
> I'm BadApple'ing tests that fail every week for the last 4 weeks on the theory that those are not temporary issues (hey, we all commit code that breaks something, then have to figure out why and fix).
>
> I also have the feeling that somewhere, somehow, our test framework is making some assumptions that are invalid. Or too strict. Or too fast. Or there's some fundamental issue with some of our classes. Or... The number of sporadic issues where the Object Tracker spits stuff out, for instance, screams that some assumption we're making, either in the code or in the test framework, is flawed.
>
> What I don't know is how to make visible progress. It's discouraging to fix something and then next week have more tests fail for unrelated reasons.
>
> Visibility is the issue to me. We have no good way of saying "these tests _just_ started failing for a reason." As a quick experiment, I extended the triage to 10 weeks (no attempt to ascertain if these tests even existed 10 weeks ago). Here are the tests that have _only_ failed in the last week, not the previous 9. BadApple'ing anything that's only failed once seems overkill.
>
> Although the test that failed 77 times does just stand out.
>
> week  pct   runs  fails  test
> 0     0.2   460   1      CloudSolrClientTest.testVersionsAreReturned
> 0     0.2   466   1      ComputePlanActionTest.testSelectedCollections
> 0     0.2   464   1      ConfusionMatrixGeneratorTest.testGetConfusionMatrixWithBM25NB
> 0     8.1   37    3      IndexSizeTriggerTest(suite)
> 0     0.2   454   1      MBeansHandlerTest.testAddedMBeanDiff
> 0     0.2   454   1      MBeansHandlerTest.testDiff
> 0     0.2   455   1      MetricTriggerTest.test
> 0     0.2   455   1      MetricsHandlerTest.test
> 0     0.2   455   1      MetricsHandlerTest.testKeyMetrics
> 0     0.2   453   1      RequestHandlersTest.testInitCount
> 0     0.2   453   1      RequestHandlersTest.testStatistics
> 0     0.2   453   1      ScheduledTriggerIntegrationTest(suite)
> 0     0.2   451   1      SearchRateTriggerTest.testWaitForElapsed
> 0     0.2   425   1      SoftAutoCommitTest.testSoftCommitWithinAndHardCommitMaxTimeRapidAdds
> 0     14.7  525   77     StreamExpressionTest.testSignificantTermsStream
> 0     0.2   454   1      TestBadConfig(suite)
> 0     0.2   465   1      TestBlockJoin.testMultiChildQueriesOfDiffParentLevels
> 0     0.6   462   3      TestCloudCollectionsListeners.testCollectionDeletion
> 0     0.2   456   1      TestInfoStreamLogging(suite)
> 0     0.2   456   1      TestLazyCores.testLazySearch
> 0     0.2   473   1      TestLucene70DocValuesFormat.testSortedSetAroundBlockSize
> 0     15.4  26    4      TestMockDirectoryWrapper.testThreadSafetyInListAll
> 0     0.2   454   1      TestNodeLostTrigger.testTrigger
> 0     0.2   453   1      TestRecovery.stressLogReplay
> 0     0.2   505   1      TestReplicationHandler.testRateLimitedReplication
> 0     0.2   425   1      TestSolrCloudWithSecureImpersonation.testForwarding
> 0     0.9   461   4      TestSolrDeletionPolicy1.testNumCommitsConfigured
> 0     0.2   454   1      TestSystemIdResolver(suite)
> 0     0.2   451   1      TestV2Request.testCloudSolrClient
> 0     0.2   451   1      TestV2Request.testHttpSolrClient
> 0     9.1   77    7      TestWithCollection.testDeleteWithCollection
> 0     3.9   77    3      TestWithCollection.testMoveReplicaWithCollection
>
> So I don't know what I'm going to do here, we'll see if I get more optimistic when the fog lifts.
Re: BadApple report. Seems like I'm wasting my time.
Shawn:

Trouble is there were 945 tests that failed at least once in the last 4 weeks. And the trend is all over the map on a weekly basis.

e-mail-2018-06-11.txt: There were 989 unannotated tests that failed
e-mail-2018-06-18.txt: There were 689 unannotated tests that failed
e-mail-2018-06-25.txt: There were 555 unannotated tests that failed
e-mail-2018-07-02.txt: There were 723 unannotated tests that failed
e-mail-2018-07-09.txt: There were 793 unannotated tests that failed
e-mail-2018-07-16.txt: There were 809 unannotated tests that failed
e-mail-2018-07-23.txt: There were 953 unannotated tests that failed
e-mail-2018-07-30.txt: There were 945 unannotated tests that failed

I'm BadApple'ing tests that fail every week for the last 4 weeks on the theory that those are not temporary issues (hey, we all commit code that breaks something, then have to figure out why and fix).

I also have the feeling that somewhere, somehow, our test framework is making some assumptions that are invalid. Or too strict. Or too fast. Or there's some fundamental issue with some of our classes. Or... The number of sporadic issues where the Object Tracker spits stuff out, for instance, screams that some assumption we're making, either in the code or in the test framework, is flawed.

What I don't know is how to make visible progress. It's discouraging to fix something and then next week have more tests fail for unrelated reasons.

Visibility is the issue to me. We have no good way of saying "these tests _just_ started failing for a reason." As a quick experiment, I extended the triage to 10 weeks (no attempt to ascertain if these tests even existed 10 weeks ago). Here are the tests that have _only_ failed in the last week, not the previous 9. BadApple'ing anything that's only failed once seems overkill.

Although the test that failed 77 times does just stand out.

week  pct   runs  fails  test
0     0.2   460   1      CloudSolrClientTest.testVersionsAreReturned
0     0.2   466   1      ComputePlanActionTest.testSelectedCollections
0     0.2   464   1      ConfusionMatrixGeneratorTest.testGetConfusionMatrixWithBM25NB
0     8.1   37    3      IndexSizeTriggerTest(suite)
0     0.2   454   1      MBeansHandlerTest.testAddedMBeanDiff
0     0.2   454   1      MBeansHandlerTest.testDiff
0     0.2   455   1      MetricTriggerTest.test
0     0.2   455   1      MetricsHandlerTest.test
0     0.2   455   1      MetricsHandlerTest.testKeyMetrics
0     0.2   453   1      RequestHandlersTest.testInitCount
0     0.2   453   1      RequestHandlersTest.testStatistics
0     0.2   453   1      ScheduledTriggerIntegrationTest(suite)
0     0.2   451   1      SearchRateTriggerTest.testWaitForElapsed
0     0.2   425   1      SoftAutoCommitTest.testSoftCommitWithinAndHardCommitMaxTimeRapidAdds
0     14.7  525   77     StreamExpressionTest.testSignificantTermsStream
0     0.2   454   1      TestBadConfig(suite)
0     0.2   465   1      TestBlockJoin.testMultiChildQueriesOfDiffParentLevels
0     0.6   462   3      TestCloudCollectionsListeners.testCollectionDeletion
0     0.2   456   1      TestInfoStreamLogging(suite)
0     0.2   456   1      TestLazyCores.testLazySearch
0     0.2   473   1      TestLucene70DocValuesFormat.testSortedSetAroundBlockSize
0     15.4  26    4      TestMockDirectoryWrapper.testThreadSafetyInListAll
0     0.2   454   1      TestNodeLostTrigger.testTrigger
0     0.2   453   1      TestRecovery.stressLogReplay
0     0.2   505   1      TestReplicationHandler.testRateLimitedReplication
0     0.2   425   1      TestSolrCloudWithSecureImpersonation.testForwarding
0     0.9   461   4      TestSolrDeletionPolicy1.testNumCommitsConfigured
0     0.2   454   1      TestSystemIdResolver(suite)
0     0.2   451   1      TestV2Request.testCloudSolrClient
0     0.2   451   1      TestV2Request.testHttpSolrClient
0     9.1   77    7      TestWithCollection.testDeleteWithCollection
0     3.9   77    3      TestWithCollection.testMoveReplicaWithCollection

So I don't know what I'm going to do here, we'll see if I get more optimistic when the fog lifts.

Erick

On Wed, Aug 1, 2018 at 7:15 AM, Shawn Heisey wrote:
> On 7/30/2018 11:52 AM, Erick Erickson wrote:
>>
>> Is anybody paying the least attention to this or should I just stop bothering?
>
> The job you're doing is thankless. That's the nature of the work. I'd love to have the time to really help you out. If only my employer didn't expect me to spend so much time *working*!
>
>> I'd hoped to get to a point where we could get at least semi-stable and start whittling away at the backlog. But with an additional 63 tests to BadApple (a little fudging here because of some issues with counting suite-level tests vs. individual tests) it doesn't seem like we're going in the right direction at all.
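Erick's 10-week triage, finding tests that failed only in the most recent week and in none of the previous weeks, is just a set difference over per-week failure sets. A minimal sketch, with illustrative (not real) rollup data:

```python
# Sketch of the "only failed in the last week" triage: given per-week sets of
# failing test names (most recent week first), report the tests absent from
# every earlier week. The sample data below is made up for illustration.

def new_failures(weekly_failures):
    """weekly_failures[0] is the most recent week's set of failing tests."""
    latest = weekly_failures[0]
    earlier = set().union(*weekly_failures[1:]) if len(weekly_failures) > 1 else set()
    return sorted(latest - earlier)

weeks = [
    {"MetricTriggerTest.test", "TestLazyCores.testLazySearch", "ShardSplitTest.test"},
    {"ShardSplitTest.test"},
    {"ShardSplitTest.test", "GraphTest(suite)"},
]
print(new_failures(weeks))  # ['MetricTriggerTest.test', 'TestLazyCores.testLazySearch']
```

The same function with 10 weekly sets reproduces the 10-week experiment; with 4 it matches the regular rollup window.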
Re: BadApple report. Seems like I'm wasting my time.
On 7/30/2018 11:52 AM, Erick Erickson wrote:
> Is anybody paying the least attention to this or should I just stop bothering?

The job you're doing is thankless. That's the nature of the work. I'd love to have the time to really help you out. If only my employer didn't expect me to spend so much time *working*!

> I'd hoped to get to a point where we could get at least semi-stable and start whittling away at the backlog. But with an additional 63 tests to BadApple (a little fudging here because of some issues with counting suite-level tests vs. individual tests) it doesn't seem like we're going in the right direction at all.
>
> Unless there's some value here, defined by people stepping up and at least looking (and once a week is not asking too much) at the names of the tests I'm going to BadApple to see if they ring any bells, I'll stop wasting my time.

Here's a crazy thought, which might be something you already considered: Try to figure out which tests pass consistently and BadApple *all the rest* of the Solr tests. If there are any Lucene tests that fail with some regularity, BadApple those too.

There are probably disadvantages to this approach, but here are the advantages I can think of:

1) The noise stops quickly.
2) Future heroic efforts will result in measurable progress -- to quote you, "whittling away at the backlog."

Thank you a million times over for all the care and effort you've put into this.

Shawn

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
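Shawn's inversion, keeping only the tests that passed consistently across recent rollups and BadApple'ing all the rest, is straightforward to compute from the same weekly failure sets. A sketch with made-up inputs:

```python
# Sketch of Shawn's "crazy thought": partition the test suite into tests that
# never failed across the recent rollups (keep running) and everything else
# (candidates to BadApple). Inputs are illustrative, not real suite contents.

def split_by_stability(all_tests, weekly_failures):
    """Return (keep, badapple): tests that never failed vs. all the rest."""
    ever_failed = set().union(*weekly_failures)
    keep = sorted(t for t in all_tests if t not in ever_failed)
    badapple = sorted(t for t in all_tests if t in ever_failed)
    return keep, badapple

all_tests = {"TestA", "TestB", "TestC", "TestD"}
weekly_failures = [{"TestB"}, {"TestB", "TestD"}]
keep, badapple = split_by_stability(all_tests, weekly_failures)
print(keep)      # ['TestA', 'TestC']
print(badapple)  # ['TestB', 'TestD']
```

The advantage Shawn names falls out directly: once `badapple` is annotated, any new failure in `keep` is signal rather than noise.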
Re: BadApple report. Seems like I'm wasting my time.
I was thinking of the challenge with sporadic/random failures the other day and what would help. I think more and smarter notifications of failures could help a lot.

(A) Using Git history, a Jenkins plugin could send an email to anyone who touched the failing test in the last 4 weeks. If that list is empty, then choose the most recent person. This notification does not go to the dev list. Rationale: people who most recently maintained the test in some way are likely to want to help keep it passing.

(B) (At fucit.com?) If a test has not failed in the 4 weeks prior, then notify the dev list with an email about just this test (in the subject). If "many" tests fail in a build, then those failures don't count for this tracking. Rationale: any active developer ought to take notice, as this may be caused by one of their commits. Note: if "many" tests fail in a build, then it's likely a reproducible recently-committed change with a wide blast radius that is going to be fixed soon and which will already be reported by standard Jenkins notifications.

These are just some ideas. I looked for a Jenkins plugin that did (A) but found none. It seems most build setups, including ours, aren't oriented around longitudinal tracking of individual tests, and are instead just overall pass/fail tracking of the entire suite. Hoss (& Mark?) have helped track tests longitudinally, but it's a separate system that one must manually look at; it's not integrated with Jenkins nor with notifications.

~ David

On Tue, Jul 31, 2018 at 3:00 AM Dawid Weiss wrote:
> Hi Erick,
>
> > Is anybody paying the least attention to this or should I just stop bothering?
>
> I think your effort is invaluable, although if not backed by actions to fix those bugs it's pointless. I'm paying attention to the Lucene part. As for Solr tests, I admit I gave up hope a longer while ago. I can't run past Solr tests on my machine anymore, no matter how many runs I try. Yes, this means I commit stuff back if I can run precommit and Lucene tests only -- it is terrible, but a fact.
>
> > But with an additional 63 tests to BadApple [...]
>
> Exactly. I don't see this situation getting any better, even with all your (and other people's) work put into fixing them. I don't have any ideas or solution for this, I'm afraid.
>
> Dawid

--
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book: http://www.solrenterprisesearchserver.com
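The recipient-selection rule in idea (A) above can be sketched as a small function. The commit data here is hard-coded for illustration; in a real plugin it would come from the test file's history (e.g. `git log --format='%ae %aI' -- <path>`), and the names and dates below are hypothetical.

```python
# Sketch of idea (A): email everyone who touched the failing test's file in
# the last 4 weeks; if nobody did, fall back to the most recent committer.
# Commits are (author, date) pairs; sample data is hypothetical.

from datetime import date, timedelta

def notification_targets(commits, today, window_days=28):
    """commits: list of (author, commit_date). Returns authors to email."""
    if not commits:
        return []
    cutoff = today - timedelta(days=window_days)
    recent = {author for author, d in commits if d >= cutoff}
    if recent:
        return sorted(recent)
    # Empty window: choose the single most recent person instead.
    return [max(commits, key=lambda c: c[1])[0]]

commits = [
    ("erick", date(2018, 7, 20)),
    ("steve", date(2018, 7, 10)),
    ("dawid", date(2018, 3, 1)),
]
print(notification_targets(commits, today=date(2018, 8, 1)))  # ['erick', 'steve']
```

Keeping this off the dev list, as proposed, means only the people with context get pinged, which is the whole point of the rationale given above.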
Re: BadApple report. Seems like I'm wasting my time.
Hi Erick,

> Is anybody paying the least attention to this or should I just stop bothering?

I think your effort is invaluable, although if not backed by actions to fix those bugs it's pointless. I'm paying attention to the Lucene part. As for Solr tests, I admit I gave up hope a longer while ago. I can't run past Solr tests on my machine anymore, no matter how many runs I try. Yes, this means I commit stuff back if I can run precommit and Lucene tests only -- it is terrible, but a fact.

> But with an additional 63 tests to BadApple [...]

Exactly. I don't see this situation getting any better, even with all your (and other people's) work put into fixing them. I don't have any ideas or solution for this, I'm afraid.

Dawid
Re: BadApple report. Seems like I'm wasting my time.
Steve:

OK, InfixSuggestersTest.testShutdownDuringBuild is in my "Do not annotate" list.

On Mon, Jul 30, 2018 at 7:33 PM, Steve Rowe wrote:
> Hi Erick,
>
> I think it's valuable to continue the BadApple process as you're currently running it. I'm guessing most people will not engage, but some will, myself included (though I don't claim to read the list every week).
>
> I'm working on fixing InfixSuggestersTest.testShutdownDuringBuild (SOLR-12606), so please don't BadApple it.
>
> Thanks,
>
> --
> Steve
> www.lucidworks.com
>
>> On Jul 30, 2018, at 1:52 PM, Erick Erickson wrote:
>>
>> Is anybody paying the least attention to this or should I just stop bothering?
>>
>> I'd hoped to get to a point where we could get at least semi-stable and start whittling away at the backlog. But with an additional 63 tests to BadApple (a little fudging here because of some issues with counting suite-level tests vs. individual tests) it doesn't seem like we're going in the right direction at all.
>>
>> Unless there's some value here, defined by people stepping up and at least looking (and once a week is not asking too much) at the names of the tests I'm going to BadApple to see if they ring any bells, I'll stop wasting my time.
>>
>> There are currently 100 BadApple tests. That number will increase by a hefty percentage _this week alone_.
>>
>> I suppose I'll just be the latest example of tilting at this windmill.
>>
>> Erick
>>
>> **Annotated tests/suites that didn't fail in the last 4 weeks.
>>
>> **Annotations will be removed from the following tests because they haven't failed in the last 4 rollups.
>>
>> **Methods: 12
>> AddReplicaTest.test
>> DeleteReplicaTest.deleteReplicaFromClusterState
>> LeaderVoteWaitTimeoutTest.testMostInSyncReplicasCanWinElection
>> MaxSizeAutoCommitTest
>> OverseerRolesTest.testOverseerRole
>> PeerSyncReplicationTest.test
>> RecoveryZkTest.test
>> RollingRestartTest.test
>> TestCloudConsistency.testOutOfSyncReplicasCannotBecomeLeader
>> TestCloudPivotFacet.test
>> TestLargeCluster.testSearchRate
>> TestPullReplicaErrorHandling.throws
>>
>> **Suites: 0
>>
>> Failures in Hoss' reports for the last 4 rollups.
>>
>> All tests that failed 4 weeks running will be BadApple'd unless there are objections.
>>
>> Failures in the last 4 reports..
>> Report  Pct   runs  fails  test
>> 0123    0.9   1689  27     AutoAddReplicasIntegrationTest.testSimple
>> 0123    1.1   1698  36     CdcrBootstrapTest.testConvertClusterToCdcrAndBootstrap
>> 0123    0.9   1430  30     ChaosMonkeyNothingIsSafeTest(suite)
>> 0123    0.4   1453  16     ChaosMonkeyNothingIsSafeTest.test
>> 0123    0.4   1726  26     CloudSolrClientTest.preferLocalShardsTest
>> 0123    0.7   1726  64     CloudSolrClientTest.preferReplicaTypesTest
>> 0123    1.9   1682  34     CollectionsAPIAsyncDistributedZkTest.testAsyncIdRaceCondition
>> 0123    0.9   1676  14     CollectionsAPIDistributedZkTest.testCollectionsAPI
>> 0123    1.1   1717  13     ComputePlanActionTest.testNodeLost
>> 0123    0.2   1721  13     ComputePlanActionTest.testNodeWithMultipleReplicasLost
>> 0123    0.4   1684  30     DistributedMLTComponentTest.test
>> 0123    0.4   1707  9      DocValuesNotIndexedTest.testGroupingDVOnly
>> 0123    0.4   1642  9      FullSolrCloudDistribCmdsTest.test
>> 0123    0.9   1663  49     GraphExpressionTest(suite)
>> 0123    0.4   1693  21     GraphExpressionTest.testShortestPathStream
>> 0123    1.8   1660  41     GraphTest(suite)
>> 0123    0.9   1686  20     GraphTest.testShortestPathStream
>> 0123    72.7  88    70     HdfsChaosMonkeySafeLeaderTest(suite)
>> 0123    1.3   1622  32     HttpSolrCallGetCoreTest(suite)
>> 0123    16.6  1992  223    InfixSuggestersTest.testShutdownDuringBuild
>> 0123    0.4   1661  12     LargeVolumeJettyTest(suite)
>> 0123    0.4   1685  12     LargeVolumeJettyTest.testMultiThreaded
>> 0123    0.9   1661  9      LeaderElectionIntegrationTest.testSimpleSliceLeaderElection
>> 0123    3.2   1695  55     MetricTriggerIntegrationTest.testMetricTrigger
>> 0123    3.9   1624  84     MoveReplicaHDFSTest.testFailedMove
>> 0123    4.2   1716  71     ScheduledTriggerIntegrationTest.testScheduledTrigger
>> 0123    2.2   1626  37     SchemaApiFailureTest(suite)
>> 0123    38.1  422   164    ShardSplitTest.test
>> 0123    9.2   329   26     ShardSplitTest.testSplitMixedReplicaTypes
>> 0123    0.7   1701  24     SolrCloudReportersTest.testDefaultPlugins
>> 0123    0.9   1701  36     SolrCloudReportersTest.testExplicitConfiguration
>> 0123    1.9   1683  25     SolrJmxReporterCloudTest.testJmxReporter
>> 0123    11.8  568   12     StreamDecoratorTest.testClassifyStream
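The weekly bookkeeping described above, annotate tests that failed in all of the last 4 rollups (unless on the "Do not annotate" list), and remove annotations from tests that were quiet for all 4, can be sketched as set operations. All inputs below are illustrative, not the real rollup contents:

```python
# Sketch of the weekly BadApple update: intersect the last 4 rollups to find
# tests failing every week (annotate, minus a do-not-annotate list such as
# InfixSuggestersTest.testShutdownDuringBuild), and de-annotate tests that
# failed in none of them. Sample data is hypothetical.

def update_annotations(rollups, annotated, do_not_annotate):
    """rollups: the last 4 weekly sets of failing test names."""
    failed_all = set.intersection(*rollups)   # failed every week
    failed_any = set.union(*rollups)          # failed at least once
    to_add = sorted(failed_all - annotated - do_not_annotate)
    to_remove = sorted(t for t in annotated if t not in failed_any)
    return to_add, to_remove

rollups = [
    {"ShardSplitTest.test", "GraphTest(suite)", "InfixSuggestersTest.testShutdownDuringBuild"},
    {"ShardSplitTest.test", "InfixSuggestersTest.testShutdownDuringBuild"},
    {"ShardSplitTest.test", "InfixSuggestersTest.testShutdownDuringBuild", "TestRecovery.stressLogReplay"},
    {"ShardSplitTest.test", "InfixSuggestersTest.testShutdownDuringBuild"},
]
annotated = {"RecoveryZkTest.test"}  # previously annotated, quiet for 4 rollups
do_not_annotate = {"InfixSuggestersTest.testShutdownDuringBuild"}
add, remove = update_annotations(rollups, annotated, do_not_annotate)
print(add)     # ['ShardSplitTest.test']
print(remove)  # ['RecoveryZkTest.test']
```

This matches the two halves of the report above: the "will be BadApple'd unless there are objections" list and the "annotations will be removed" list.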
Re: BadApple report. Seems like I'm wasting my time.
Hi Erick,

I think it's valuable to continue the BadApple process as you're currently running it. I'm guessing most people will not engage, but some will, myself included (though I don't claim to read the list every week).

I'm working on fixing InfixSuggestersTest.testShutdownDuringBuild (SOLR-12606), so please don't BadApple it.

Thanks,

--
Steve
www.lucidworks.com

> On Jul 30, 2018, at 1:52 PM, Erick Erickson wrote:
>
> Is anybody paying the least attention to this or should I just stop bothering?
>
> I'd hoped to get to a point where we could get at least semi-stable and start whittling away at the backlog. But with an additional 63 tests to BadApple (a little fudging here because of some issues with counting suite-level tests vs. individual tests) it doesn't seem like we're going in the right direction at all.
>
> Unless there's some value here, defined by people stepping up and at least looking (and once a week is not asking too much) at the names of the tests I'm going to BadApple to see if they ring any bells, I'll stop wasting my time.
>
> There are currently 100 BadApple tests. That number will increase by a hefty percentage _this week alone_.
>
> I suppose I'll just be the latest example of tilting at this windmill.
>
> Erick
>
> **Annotated tests/suites that didn't fail in the last 4 weeks.
>
> **Annotations will be removed from the following tests because they haven't failed in the last 4 rollups.
>
> **Methods: 12
> AddReplicaTest.test
> DeleteReplicaTest.deleteReplicaFromClusterState
> LeaderVoteWaitTimeoutTest.testMostInSyncReplicasCanWinElection
> MaxSizeAutoCommitTest
> OverseerRolesTest.testOverseerRole
> PeerSyncReplicationTest.test
> RecoveryZkTest.test
> RollingRestartTest.test
> TestCloudConsistency.testOutOfSyncReplicasCannotBecomeLeader
> TestCloudPivotFacet.test
> TestLargeCluster.testSearchRate
> TestPullReplicaErrorHandling.throws
>
> **Suites: 0
>
> Failures in Hoss' reports for the last 4 rollups.
>
> All tests that failed 4 weeks running will be BadApple'd unless there are objections.
>
> Failures in the last 4 reports..
> Report  Pct   runs  fails  test
> 0123    0.9   1689  27     AutoAddReplicasIntegrationTest.testSimple
> 0123    1.1   1698  36     CdcrBootstrapTest.testConvertClusterToCdcrAndBootstrap
> 0123    0.9   1430  30     ChaosMonkeyNothingIsSafeTest(suite)
> 0123    0.4   1453  16     ChaosMonkeyNothingIsSafeTest.test
> 0123    0.4   1726  26     CloudSolrClientTest.preferLocalShardsTest
> 0123    0.7   1726  64     CloudSolrClientTest.preferReplicaTypesTest
> 0123    1.9   1682  34     CollectionsAPIAsyncDistributedZkTest.testAsyncIdRaceCondition
> 0123    0.9   1676  14     CollectionsAPIDistributedZkTest.testCollectionsAPI
> 0123    1.1   1717  13     ComputePlanActionTest.testNodeLost
> 0123    0.2   1721  13     ComputePlanActionTest.testNodeWithMultipleReplicasLost
> 0123    0.4   1684  30     DistributedMLTComponentTest.test
> 0123    0.4   1707  9      DocValuesNotIndexedTest.testGroupingDVOnly
> 0123    0.4   1642  9      FullSolrCloudDistribCmdsTest.test
> 0123    0.9   1663  49     GraphExpressionTest(suite)
> 0123    0.4   1693  21     GraphExpressionTest.testShortestPathStream
> 0123    1.8   1660  41     GraphTest(suite)
> 0123    0.9   1686  20     GraphTest.testShortestPathStream
> 0123    72.7  88    70     HdfsChaosMonkeySafeLeaderTest(suite)
> 0123    1.3   1622  32     HttpSolrCallGetCoreTest(suite)
> 0123    16.6  1992  223    InfixSuggestersTest.testShutdownDuringBuild
> 0123    0.4   1661  12     LargeVolumeJettyTest(suite)
> 0123    0.4   1685  12     LargeVolumeJettyTest.testMultiThreaded
> 0123    0.9   1661  9      LeaderElectionIntegrationTest.testSimpleSliceLeaderElection
> 0123    3.2   1695  55     MetricTriggerIntegrationTest.testMetricTrigger
> 0123    3.9   1624  84     MoveReplicaHDFSTest.testFailedMove
> 0123    4.2   1716  71     ScheduledTriggerIntegrationTest.testScheduledTrigger
> 0123    2.2   1626  37     SchemaApiFailureTest(suite)
> 0123    38.1  422   164    ShardSplitTest.test
> 0123    9.2   329   26     ShardSplitTest.testSplitMixedReplicaTypes
> 0123    0.7   1701  24     SolrCloudReportersTest.testDefaultPlugins
> 0123    0.9   1701  36     SolrCloudReportersTest.testExplicitConfiguration
> 0123    1.9   1683  25     SolrJmxReporterCloudTest.testJmxReporter
> 0123    11.8  568   12     StreamDecoratorTest.testClassifyStream
> 0123    10.5  1133  60     StreamDecoratorTest.testExecutorStream
> 0123    2.6   1133  16     StreamDecoratorTest.testParallelComplementStream
> 0123    5.3   1134  15