Re: BadApple report, but please read the first bit

2020-08-13 Thread David Smiley
Thanks Kevin; clearly I missed the link to that which I can now see at
fucit.

I was worried I may have worked on something that could have perturbed this
recent issue but no -- I don't think so.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Wed, Aug 12, 2020 at 9:08 AM Kevin Risden wrote:

>
> http://fucit.org/solr-jenkins-reports/history-trend-of-recent-failures.html#series/org.apache.solr.cloud.SharedFSAutoReplicaFailoverTest.test
>
> David, for the specific test you asked about, the failures are recent, with
> (as far as I know) no change to the HDFS code. It has been failing regularly
> since June/July.
>
> Kevin Risden


Re: BadApple report, but please read the first bit

2020-08-12 Thread Erick Erickson
Didn’t think at first (only one cup of coffee). Here are the e-mails that test 
appears in; the formatting is poor…

After that is the raw data from Hoss’ rollups that might be easier to ingest.

I have 1.3G of this kind of historical data. I’ve had vague thoughts about 
putting it someplace accessible to others but haven’t done anything with it.

I suppose, wrapped around this, is the entire question of how much value it’ll 
have depending on what happens with Mark’s reference impl...

“Suite” fails are things like object tracker failures.

e-mail-2018-03-26.txt:SharedFSAutoReplicaFailoverTest.java
e-mail-2018-04-02.txt:SharedFSAutoReplicaFailoverTest.java
e-mail-2018-04-09.txt:SharedFSAutoReplicaFailoverTest.java
e-mail-2018-04-16.txt:SharedFSAutoReplicaFailoverTest.java
e-mail-2018-04-30.txt:SharedFSAutoReplicaFailoverTest.java
e-mail-2018-05-21.txt:SharedFSAutoReplicaFailoverTest.java
e-mail-2018-06-11.txt:   SharedFSAutoReplicaFailoverTest.test
e-mail-2018-06-11.txt:3 100.02  2  SharedFSAutoReplicaFailoverTest(suite)
e-mail-2018-06-11.txt:SharedFSAutoReplicaFailoverTest.test
e-mail-2018-06-18.txt: 0100.02  2  SharedFSAutoReplicaFailoverTest(suite)
e-mail-2018-06-25.txt: 0174.1   29 22  SharedFSAutoReplicaFailoverTest(suite)
e-mail-2018-06-25.txt: 0  5.9   34  2  SharedFSAutoReplicaFailoverTest.test
e-mail-2018-07-02.txt: 012   74.1   56 42  SharedFSAutoReplicaFailoverTest(suite)
e-mail-2018-07-02.txt: 01 5.1   73  4  SharedFSAutoReplicaFailoverTest.test
e-mail-2018-07-09.txt: 0123  74.1   83 62  SharedFSAutoReplicaFailoverTest(suite)
e-mail-2018-07-09.txt: 0122.3  117  5  SharedFSAutoReplicaFailoverTest.test
e-mail-2018-07-16.txt: 0123  74.1  108 80  SharedFSAutoReplicaFailoverTest(suite)
e-mail-2018-07-16.txt: 0123  17.6  151 11  SharedFSAutoReplicaFailoverTest.test
e-mail-2018-07-23.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2018-07-23.txt:SharedFSAutoReplicaFailoverTest.test
e-mail-2018-07-30.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2018-07-30.txt:SharedFSAutoReplicaFailoverTest.test
e-mail-2018-08-06.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2018-08-06.txt:SharedFSAutoReplicaFailoverTest.test
e-mail-2018-08-14.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2018-08-14.txt:SharedFSAutoReplicaFailoverTest.test
e-mail-2018-08-20.txt:   SharedFSAutoReplicaFailoverTest.test
e-mail-2018-08-20.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2018-08-20.txt:SharedFSAutoReplicaFailoverTest.test
e-mail-2018-08-27.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2018-09-03.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2018-09-10.txt: 0 20.05  1  SharedFSAutoReplicaFailoverTest.test
e-mail-2018-09-10.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2018-09-18.txt: 0133.38  2  SharedFSAutoReplicaFailoverTest.test
e-mail-2018-09-18.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2018-10-08.txt:3  33.33  1  SharedFSAutoReplicaFailoverTest.test
e-mail-2018-10-08.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2018-12-24.txt: 0  3  33.36  2  SharedFSAutoReplicaFailoverTest.test
e-mail-2018-12-24.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2019-01-08.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2019-01-15.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2019-02-12.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2019-02-18.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2019-03-04.txt:  133.33  1  SharedFSAutoReplicaFailoverTest.test
e-mail-2019-03-04.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2019-03-11.txt:   2   33.33  1  SharedFSAutoReplicaFailoverTest.test
e-mail-2019-03-11.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2019-03-18.txt:3  33.33  1  SharedFSAutoReplicaFailoverTest.test
e-mail-2019-03-18.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2019-03-25.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2019-04-01.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
e-mail-2019-04-08.txt:SharedFSAutoReplicaFailoverTest.SharedFSAutoReplicaFailoverTest suite
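
For anyone who wants to work with a listing like the one above, here is a minimal sketch (not part of the original tooling) that pulls the report dates out of grep-style lines and flags stretches where the test did not show up. The only thing taken from the thread is the "e-mail-YYYY-MM-DD.txt:" prefix; the function names `report_dates` and `gaps` and the 7-day threshold are assumptions made for illustration.

```python
import re
from datetime import date

# Assumption: every relevant line starts with "e-mail-YYYY-MM-DD.txt:",
# as in the listing above. The helper names below are hypothetical.
LINE_RE = re.compile(r"^e-mail-(\d{4})-(\d{2})-(\d{2})\.txt:")


def report_dates(lines):
    """Return the sorted, de-duplicated report dates found in grep-style lines."""
    found = set()
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            year, month, day = map(int, m.groups())
            found.add(date(year, month, day))
    return sorted(found)


def gaps(dates, max_days=7):
    """Return (earlier, later) pairs of consecutive dates more than max_days apart."""
    return [(a, b) for a, b in zip(dates, dates[1:]) if (b - a).days > max_days]


sample = [
    "e-mail-2018-04-16.txt:SharedFSAutoReplicaFailoverTest.java",
    "e-mail-2018-04-30.txt:SharedFSAutoReplicaFailoverTest.java",
    "e-mail-2018-05-21.txt:SharedFSAutoReplicaFailoverTest.java",
]
dates = report_dates(sample)
missing = gaps(dates)
```

A gap here only means the test name did not appear in a weekly e-mail; it may also reflect one of the weeks for which no rollup was collected.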

Re: BadApple report, but please read the first bit

2020-08-12 Thread Kevin Risden
http://fucit.org/solr-jenkins-reports/history-trend-of-recent-failures.html#series/org.apache.solr.cloud.SharedFSAutoReplicaFailoverTest.test

David, for the specific test you asked about, the failures are recent, with
(as far as I know) no change to the HDFS code. It has been failing regularly
since June/July.

Kevin Risden





Re: BadApple report, but please read the first bit

2020-08-12 Thread Erick Erickson
I have the weekly rollups (with a few gaps) going back to about April 2018, but 
nothing’s been done to try to make them generally available. Each BadApple 
report has rates for the last 4 weeks in the attached file, just below 
"Failures over the last 4 weeks, but not every week. Ordered most-recent first:"





-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: BadApple report, but please read the first bit

2020-08-12 Thread David Smiley
Do we have any long term (aka "longitudinal") pass/fail rates for tests?

SharedFSAutoReplicaFailoverTest in particular is kinda-sorta tied to HDFS,
and that's going away to a plug-in for 9.0.  The shared file system notion
isn't well supported in SolrCloud, I think.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Mon, Aug 3, 2020 at 7:26 AM Erick Erickson wrote:

> There are several tests that are causing a lot of noise:
>
> SharedFSAutoReplicaFailoverTest is failing 90%+ of the time.
> TestBulkSchemaConcurrent 31%
> StressHdfsTest  16%
> SchemaApiFailureTest 13.88%
>
> I encourage people to look at:
> http://fucit.org/solr-jenkins-reports/failure-report.html and see if
> anything looks like it is affected by recent work. TestBulkSchemaConcurrent
> has been failing off and on for a long time, but the failure rate picked up
> dramatically in the last couple of weeks. Ditto SchemaApiFailureTest.
>
> Do we even care about Hdfs? Are we deprecating it or not?
>
> Holding relatively steady otherwise:
>
> Raw fail count by week totals, most recent week first (corresponds to bits):
> Week: 0  had  82 failures
> Week: 1  had  94 failures
> Week: 2  had  502 failures
> Week: 3  had  19 failures
>
>
> Failures in Hoss' reports for the last 4 rollups.
>
> There were 562 unannotated tests that failed in Hoss' rollups. Ordered by
> the date I downloaded the rollup file, newest->oldest. See above for the
> dates the files were collected
> These tests were NOT BadApple'd or AwaitsFix'd
>
> Failures in the last 4 reports..
>  Report   Pct   runs  fails  test
>    0123   0.3   1271      8  RollingRestartTest.test
>    0123  93.3     41     36  SharedFSAutoReplicaFailoverTest.test
>    0123   3.5    627     16  TestCircuitBreaker.testBuildingMemoryPressure
>    0123   1.0    627      8  TestCircuitBreaker.testResponseWithCBTiming
>    0123   5.8   1483     79  TestContainerPlugin.testApiFromPackage
>    0123   2.3   1335     23  TestDistributedGrouping.test
> 
>
>
> Full report:
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
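
The "Report / Pct / runs / fails / test" rows quoted above are regular enough to parse mechanically. What follows is a rough sketch, not the script that produced the report: the `Row` class, `parse_row`, and the 5% cutoff are hypothetical, and the code assumes the Report field is one unbroken token such as "0123" (rows with blanks in that field, e.g. "0  3", would need column-position parsing instead). The thread does not define how Pct is computed, so the sketch reads it as printed rather than recomputing it from runs and fails.

```python
from dataclasses import dataclass


@dataclass
class Row:
    weeks: str   # which of the last 4 rollups had a failure, e.g. "0123"
    pct: float   # the Pct column, taken as printed
    runs: int
    fails: int
    test: str


def parse_row(line):
    """Parse one report row; return None for headers or anything that doesn't fit."""
    parts = line.lstrip("> ").split()
    if len(parts) != 5:
        return None
    weeks, pct, runs, fails, test = parts
    try:
        return Row(weeks, float(pct), int(runs), int(fails), test)
    except ValueError:
        return None


lines = [
    ">Report   Pct runsfails   test",
    ">  0123   0.3 1271  8  RollingRestartTest.test",
    ">  0123  93.3   41 36  SharedFSAutoReplicaFailoverTest.test",
    ">  0123   5.8 1483 79  TestContainerPlugin.testApiFromPackage",
]
rows = [r for r in map(parse_row, lines) if r is not None]

# Assumed cutoff: treat anything at or above 5% as "noisy".
noisy = sorted((r for r in rows if r.pct >= 5.0), key=lambda r: r.pct, reverse=True)
```

Sorting by the printed Pct puts the tests worth a first look at the top, which is roughly what the "causing a lot of noise" list at the start of the report does by hand.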