Re: BadApple report, but please read the first bit
Thanks Kevin; clearly I missed the link to that which I can now see at fucit. I was worried I may have worked on something that could have perturbed this recent issue but no -- I don't think so.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley

On Wed, Aug 12, 2020 at 9:08 AM Kevin Risden wrote:

> http://fucit.org/solr-jenkins-reports/history-trend-of-recent-failures.html#series/org.apache.solr.cloud.SharedFSAutoReplicaFailoverTest.test
>
> David for that specific test you asked the failures are recent with as far
> as I know no change to HDFS stuff. Starting June/July failing regularly.
>
> Kevin Risden
Re: BadApple report, but please read the first bit
Didn’t think at first (only one cup of coffee). Here are the e-mails that test appears in; the formatting is poor… After that is the raw data from Hoss’ rollups, which might be easier to ingest.

I have 1.3G of this kind of historical data. I’ve had vague thoughts about putting it someplace accessible to others but haven’t done anything with it. I suppose, wrapped around this, is the entire question of how much value it’ll have depending on what happens with Mark’s reference impl...

“Suite” fails are things like object tracker failures.

e-mail-2018-03-26.txt:SharedFSAutoReplicaFailoverTest.java
e-mail-2018-04-02.txt:SharedFSAutoReplicaFailoverTest.java
e-mail-2018-04-09.txt:SharedFSAutoReplicaFailoverTest.java
e-mail-2018-04-16.txt:SharedFSAutoReplicaFailoverTest.java
e-mail-2018-04-30.txt:SharedFSAutoReplicaFailoverTest.java
e-mail-2018-05-21.txt:SharedFSAutoReplicaFailoverTest.java
e-mail-2018-06-11.txt: SharedFSAutoReplicaFailoverTest.test
e-mail-2018-06-11.txt:3 100.02 2 SharedFSAutoReplicaFailoverTest(suite)
e-mail-2018-06-11.txt:SharedFSAutoReplicaFailoverTest.test
e-mail-2018-06-18.txt: 0100.02 2 SharedFSAutoReplicaFailoverTest(suite)
e-mail-2018-06-25.txt: 0174.1 29 22 SharedFSAutoReplicaFailoverTest(suite)
e-mail-2018-06-25.txt: 0 5.9 34 2 SharedFSAutoReplicaFailoverTest.test
e-mail-2018-07-02.txt: 012 74.1 56 42 SharedFSAutoReplicaFailoverTest(suite)
e-mail-2018-07-02.txt: 01 5.1 73 4 SharedFSAutoReplicaFailoverTest.test
e-mail-2018-07-09.txt: 0123 74.1 83 62 SharedFSAutoReplicaFailoverTest(suite)
e-mail-2018-07-09.txt: 0122.3 117 5 SharedFSAutoReplicaFailoverTest.test
e-mail-2018-07-16.txt: 0123 74.1 108 80 SharedFSAutoReplicaFailoverTest(suite)
e-mail-2018-07-16.txt: 0123 17.6 151 11 SharedFSAutoReplicaFailoverTest.test
e-mail-2018-07-23.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2018-07-23.txt:SharedFSAutoReplicaFailoverTest.test
e-mail-2018-07-30.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2018-07-30.txt:SharedFSAutoReplicaFailoverTest.test
e-mail-2018-08-06.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2018-08-06.txt:SharedFSAutoReplicaFailoverTest.test
e-mail-2018-08-14.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2018-08-14.txt:SharedFSAutoReplicaFailoverTest.test
e-mail-2018-08-20.txt: SharedFSAutoReplicaFailoverTest.test
e-mail-2018-08-20.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2018-08-20.txt:SharedFSAutoReplicaFailoverTest.test
e-mail-2018-08-27.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2018-09-03.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2018-09-10.txt: 0 20.05 1 SharedFSAutoReplicaFailoverTest.test
e-mail-2018-09-10.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2018-09-18.txt: 0133.38 2 SharedFSAutoReplicaFailoverTest.test
e-mail-2018-09-18.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2018-10-08.txt:3 33.33 1 SharedFSAutoReplicaFailoverTest.test
e-mail-2018-10-08.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2018-12-24.txt: 0 3 33.36 2 SharedFSAutoReplicaFailoverTest.test
e-mail-2018-12-24.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2019-01-08.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2019-01-15.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2019-02-12.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2019-02-18.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2019-03-04.txt: 133.33 1 SharedFSAutoReplicaFailoverTest.test
e-mail-2019-03-04.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2019-03-11.txt: 2 33.33 1 SharedFSAutoReplicaFailoverTest.test
e-mail-2019-03-11.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2019-03-18.txt:3 33.33 1 SharedFSAutoReplicaFailoverTest.test
e-mail-2019-03-18.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2019-03-25.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2019-04-01.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2019-04-08.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
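
The listing above looks like grep output over the saved weekly e-mails, so one rough way to get the "longitudinal" view asked about elsewhere in this thread is to pull the report date out of each file name and count how many weekly reports mention the test. The sketch below is only an illustration under that assumption: the function names `report_dates` and `reports_per_month` are made up here, and it deliberately ignores the numeric columns because several of them are fused together in the archived text.

```python
import re
from collections import Counter
from datetime import date

# Matches the grep-style prefix "e-mail-YYYY-MM-DD.txt:" seen in the listing.
LINE_RE = re.compile(r"^e-mail-(\d{4})-(\d{2})-(\d{2})\.txt:")


def report_dates(lines):
    """Return the sorted, de-duplicated report dates found in grep-style lines."""
    found = set()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            year, month, day = (int(part) for part in match.groups())
            found.add(date(year, month, day))
    return sorted(found)


def reports_per_month(dates):
    """Count how many weekly reports mention the test in each (year, month)."""
    return Counter((d.year, d.month) for d in dates)


# A few lines copied from the listing; the same report can appear more than once.
sample = [
    "e-mail-2018-06-11.txt: SharedFSAutoReplicaFailoverTest.test",
    "e-mail-2018-06-11.txt:SharedFSAutoReplicaFailoverTest.test",
    "e-mail-2018-06-18.txt: 0100.02 2 SharedFSAutoReplicaFailoverTest(suite)",
    "e-mail-2018-07-02.txt: 01 5.1 73 4 SharedFSAutoReplicaFailoverTest.test",
]
dates = report_dates(sample)
counts = reports_per_month(dates)
```

Counting by month only shows how often the test turned up in a report; it says nothing about failure rates, which would need the unfused Pct, runs and fails columns from the original files.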
Re: BadApple report, but please read the first bit
http://fucit.org/solr-jenkins-reports/history-trend-of-recent-failures.html#series/org.apache.solr.cloud.SharedFSAutoReplicaFailoverTest.test

David for that specific test you asked the failures are recent with as far as I know no change to HDFS stuff. Starting June/July failing regularly.

Kevin Risden

On Wed, Aug 12, 2020 at 9:03 AM Erick Erickson wrote:

> I have the weekly rollups (with a few gaps) going back to about April
> 2018, but nothing’s been done to try to make them generally available. Each
> BadApple report has rates for the last 4 weeks in the attached file, just
> below "Failures over the last 4 weeks, but not every week. Ordered
> most-recent first:”
Re: BadApple report, but please read the first bit
I have the weekly rollups (with a few gaps) going back to about April 2018, but nothing’s been done to try to make them generally available. Each BadApple report has rates for the last 4 weeks in the attached file, just below "Failures over the last 4 weeks, but not every week. Ordered most-recent first:”

> On Aug 12, 2020, at 2:06 AM, David Smiley wrote:
>
> Do we have any long term (aka "longitudinal") pass/fail rates for tests?
>
> SharedFSAutoReplicaFailoverTest in particular is kinda-sorta tied to HDFS,
> and that's going away to a plug-in for 9.0. The shared file system notion
> isn't well supported in SolrCloud, I think.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
Re: BadApple report, but please read the first bit
Do we have any long term (aka "longitudinal") pass/fail rates for tests?

SharedFSAutoReplicaFailoverTest in particular is kinda-sorta tied to HDFS, and that's going away to a plug-in for 9.0. The shared file system notion isn't well supported in SolrCloud, I think.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley

On Mon, Aug 3, 2020 at 7:26 AM Erick Erickson wrote:

> There are several tests that are causing a lot of noise:
>
> SharedFSAutoReplicaFailoverTest is failing 90%+ of the time.
> TestBulkSchemaConcurrent 31%
> StressHdfsTest 16%
> SchemaApiFailureTest 13.88%
>
> I encourage people to look at:
> http://fucit.org/solr-jenkins-reports/failure-report.html and see if
> anything looks like it is affected by recent work. TestBulkSchemaConcurrent
> has been failing off and on for a long time, but the failure rate picked up
> dramatically in the last couple of weeks. Ditto SchemaApiFailureTest.
>
> Do we even care about Hdfs? Are we deprecating it or not?
>
> Holding relatively steady otherwise:
>
> Raw fail count by week totals, most recent week first (corresponds to
> bits):
> Week: 0 had 82 failures
> Week: 1 had 94 failures
> Week: 2 had 502 failures
> Week: 3 had 19 failures
>
>
> Failures in Hoss' reports for the last 4 rollups.
>
> There were 562 unannotated tests that failed in Hoss' rollups. Ordered by
> the date I downloaded the rollup file, newest->oldest. See above for the
> dates the files were collected
> These tests were NOT BadApple'd or AwaitsFix'd
>
> Failures in the last 4 reports..
> Report   Pct   runs  fails  test
> 0123     0.3   1271      8  RollingRestartTest.test
> 0123    93.3     41     36  SharedFSAutoReplicaFailoverTest.test
> 0123     3.5    627     16  TestCircuitBreaker.testBuildingMemoryPressure
> 0123     1.0    627      8  TestCircuitBreaker.testResponseWithCBTiming
> 0123     5.8   1483     79  TestContainerPlugin.testApiFromPackage
> 0123     2.3   1335     23  TestDistributedGrouping.test
>
>
>
> Full report:
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
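
For anyone wanting to work with the "Failures in the last 4 reports" rows programmatically, here is a minimal sketch of one way to split a row into fields. It rests on two assumptions that the thread does not confirm: that each digit in the Report column is the index of a weekly rollup in which the test failed (0 being the most recent, which is how "corresponds to bits" reads), and that the columns are separated by whitespace. The names `ReportRow` and `parse_row` are hypothetical. The Pct value is taken as reported rather than recomputed, since it does not appear to equal fails divided by runs.

```python
from typing import NamedTuple, Tuple


class ReportRow(NamedTuple):
    weeks: Tuple[int, ...]  # assumed week indices from the Report column (0 = most recent)
    pct: float              # Pct column, exactly as reported
    runs: int
    fails: int
    test: str


def parse_row(line: str) -> ReportRow:
    """Parse one whitespace-separated report row, ignoring leading '>' quote marks."""
    tokens = line.lstrip("> ").split()
    if len(tokens) < 5:
        raise ValueError(f"not a report row: {line!r}")
    # Read from the right so a Report column containing blanks (e.g. "0 3") still parses.
    test = tokens[-1]
    fails = int(tokens[-2])
    runs = int(tokens[-3])
    pct = float(tokens[-4])
    weeks = tuple(int(ch) for ch in "".join(tokens[:-4]))
    return ReportRow(weeks, pct, runs, fails, test)


rows = [
    parse_row("> 0123 0.3 1271 8 RollingRestartTest.test"),
    parse_row("> 0123 93.3 41 36 SharedFSAutoReplicaFailoverTest.test"),
    parse_row("> 0123 2.3 1335 23 TestDistributedGrouping.test"),
]
# Rank by the reported Pct column rather than recomputing it from runs and fails.
worst = max(rows, key=lambda row: row.pct)
```

Rows whose columns have run together in an archived copy (for example a Pct value fused with the runs count) would not parse correctly with this approach and would need the original attachment.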