Alexandre: Feel free! What I'm struggling with is not that someone checked in some code that all of a sudden started breaking things. Rather, it's that a test that's been working perfectly will fail once, then won't reproducibly fail again, and does _not_ appear to be related to recent code changes.
In fact, that's the crux of the matter: it's difficult/impossible to tell at a glance when a test fails whether it is or is not related to a recent code change.....

Erick

On Wed, Aug 1, 2018 at 8:05 AM, Alexandre Rafalovitch <arafa...@gmail.com> wrote:
> Just a completely random thought that I do not have deep knowledge for
> (still learning my way around Solr tests).
>
> Is this something that Machine Learning could help with? The Github
> repo/history is a fantastic source of learning on who worked on which
> file, how often, etc. We certainly should be able to get some 'most
> significant developer' stats out of that.
>
> Regards,
>    Alex.
>
> On 1 August 2018 at 10:56, Erick Erickson <erickerick...@gmail.com> wrote:
>> Shawn:
>>
>> Trouble is, there were 945 tests that failed at least once in the last
>> 4 weeks. And the trend is all over the map on a weekly basis.
>>
>> e-mail-2018-06-11.txt: There were 989 unannotated tests that failed
>> e-mail-2018-06-18.txt: There were 689 unannotated tests that failed
>> e-mail-2018-06-25.txt: There were 555 unannotated tests that failed
>> e-mail-2018-07-02.txt: There were 723 unannotated tests that failed
>> e-mail-2018-07-09.txt: There were 793 unannotated tests that failed
>> e-mail-2018-07-16.txt: There were 809 unannotated tests that failed
>> e-mail-2018-07-23.txt: There were 953 unannotated tests that failed
>> e-mail-2018-07-30.txt: There were 945 unannotated tests that failed
>>
>> I'm BadApple'ing tests that have failed every week for the last 4 weeks,
>> on the theory that those are not temporary issues (hey, we all commit
>> code that breaks something and then have to figure out why and fix it).
>>
>> I also have the feeling that somewhere, somehow, our test framework is
>> making some assumptions that are invalid. Or too strict. Or too fast.
>> Or there's some fundamental issue with some of our classes. Or... The
>> number of sporadic issues where the Object Tracker spits stuff out, for
>> instance, screams that some assumption we're making, either in the code
>> or in the test framework, is flawed.
>>
>> What I don't know is how to make visible progress. It's discouraging
>> to fix something and then next week have more tests fail for unrelated
>> reasons.
>>
>> Visibility is the issue to me. We have no good way of saying "these
>> tests _just_ started failing for a reason". As a quick experiment, I
>> extended the triage to 10 weeks (no attempt to ascertain whether these
>> tests even existed 10 weeks ago). Here are the tests that have _only_
>> failed in the last week, not the previous 9. BadApple'ing anything
>> that's only failed once seems like overkill.
>>
>> The test that failed 77 times does stand out, though....
>>
>> week   pct   runs  fails  test
>>    0   0.2    460      1  CloudSolrClientTest.testVersionsAreReturned
>>    0   0.2    466      1  ComputePlanActionTest.testSelectedCollections
>>    0   0.2    464      1  ConfusionMatrixGeneratorTest.testGetConfusionMatrixWithBM25NB
>>    0   8.1     37      3  IndexSizeTriggerTest(suite)
>>    0   0.2    454      1  MBeansHandlerTest.testAddedMBeanDiff
>>    0   0.2    454      1  MBeansHandlerTest.testDiff
>>    0   0.2    455      1  MetricTriggerTest.test
>>    0   0.2    455      1  MetricsHandlerTest.test
>>    0   0.2    455      1  MetricsHandlerTest.testKeyMetrics
>>    0   0.2    453      1  RequestHandlersTest.testInitCount
>>    0   0.2    453      1  RequestHandlersTest.testStatistics
>>    0   0.2    453      1  ScheduledTriggerIntegrationTest(suite)
>>    0   0.2    451      1  SearchRateTriggerTest.testWaitForElapsed
>>    0   0.2    425      1  SoftAutoCommitTest.testSoftCommitWithinAndHardCommitMaxTimeRapidAdds
>>    0  14.7    525     77  StreamExpressionTest.testSignificantTermsStream
>>    0   0.2    454      1  TestBadConfig(suite)
>>    0   0.2    465      1  TestBlockJoin.testMultiChildQueriesOfDiffParentLevels
>>    0   0.6    462      3  TestCloudCollectionsListeners.testCollectionDeletion
>>    0   0.2    456      1  TestInfoStreamLogging(suite)
>>    0   0.2    456      1  TestLazyCores.testLazySearch
>>    0   0.2    473      1  TestLucene70DocValuesFormat.testSortedSetAroundBlockSize
>>    0  15.4     26      4  TestMockDirectoryWrapper.testThreadSafetyInListAll
>>    0   0.2    454      1  TestNodeLostTrigger.testTrigger
>>    0   0.2    453      1  TestRecovery.stressLogReplay
>>    0   0.2    505      1  TestReplicationHandler.testRateLimitedReplication
>>    0   0.2    425      1  TestSolrCloudWithSecureImpersonation.testForwarding
>>    0   0.9    461      4  TestSolrDeletionPolicy1.testNumCommitsConfigured
>>    0   0.2    454      1  TestSystemIdResolver(suite)
>>    0   0.2    451      1  TestV2Request.testCloudSolrClient
>>    0   0.2    451      1  TestV2Request.testHttpSolrClient
>>    0   9.1     77      7  TestWithCollection.testDeleteWithCollection
>>    0   3.9     77      3  TestWithCollection.testMoveReplicaWithCollection
>>
>> So I don't know what I'm going to do here; we'll see if I get more
>> optimistic when the fog lifts.
>>
>> Erick
>>
>> On Wed, Aug 1, 2018 at 7:15 AM, Shawn Heisey <apa...@elyograg.org> wrote:
>>> On 7/30/2018 11:52 AM, Erick Erickson wrote:
>>>>
>>>> Is anybody paying the least attention to this or should I just stop
>>>> bothering?
>>>
>>> The job you're doing is thankless. That's the nature of the work. I'd love
>>> to have the time to really help you out. If only my employer didn't expect
>>> me to spend so much time *working*!
>>>
>>>> I'd hoped to get to a point where we could get at least semi-stable
>>>> and start whittling away at the backlog. But with an additional 63
>>>> tests to BadApple (a little fudging here because of some issues with
>>>> counting suite-level tests vs. individual tests) it doesn't seem like
>>>> we're going in the right direction at all.
>>>>
>>>> Unless there's some value here, defined by people stepping up and at
>>>> least looking (and once a week is not asking too much) at the names of
>>>> the tests I'm going to BadApple to see if they ring any bells, I'll
>>>> stop wasting my time.
>>>
>>> Here's a crazy thought, which might be something you already considered:
>>> try to figure out which tests pass consistently and BadApple *all the rest*
>>> of the Solr tests. If there are any Lucene tests that fail with some
>>> regularity, BadApple those too.
>>>
>>> There are probably disadvantages to this approach, but here are the
>>> advantages I can think of: 1) The noise stops quickly. 2) Future heroic
>>> efforts will result in measurable progress -- to quote you, "whittling away
>>> at the backlog."
>>>
>>> Thank you a million times over for all the care and effort you've put into
>>> this.
>>>
>>> Shawn

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
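[Editorial note] For readers not familiar with what "BadApple'ing" a test in the thread above actually involves: the Lucene test framework exposes a @BadApple annotation on LuceneTestCase, and the weekly counts above refer to failures from tests that are "unannotated", i.e. not marked this way. The following is a minimal, hypothetical sketch; the class name, test method, and bugUrl are placeholders, not a real Solr test or JIRA issue.

    import org.apache.lucene.util.LuceneTestCase;
    import org.apache.lucene.util.LuceneTestCase.BadApple;
    import org.junit.Test;

    public class FlakyExampleTest extends LuceneTestCase {

      // Hypothetical flaky test. The annotation marks it as a known
      // "bad apple" so that report scripts can separate known-flaky
      // tests from tests that just started failing. The bugUrl below
      // is a placeholder; real annotations point at the tracking
      // JIRA issue for the flaky behavior.
      @Test
      @BadApple(bugUrl = "https://issues.apache.org/jira/browse/SOLR-XXXXX")
      public void testSometimesFlaky() throws Exception {
        assertTrue("placeholder standing in for the real assertion", true);
      }
    }

Whether annotated tests actually execute is controlled by the tests.badapples system property (for example, -Dtests.badapples=false skips them), so BadApple'ing a test quiets the noise in the reports without deleting the test itself.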