[
https://issues.apache.org/jira/browse/SOLR-10032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15858889#comment-15858889
]
Mark Miller commented on SOLR-10032:
------------------------------------
For this next report I have switched to an 8-core machine from a 16-core
machine. It looks like that may have made some of the more resource/environment-sensitive
tests pop out a little more. The first report was created on a single
machine, so I went with 16 cores just to try to generate it as fast as
possible. 16 cores was not strictly needed; I run 10 tests at a time on my 6-core
machine with similar results. It may even be a little too much CPU for our use
case, even when running 10 instances of a test in parallel.
I have moved on from using just one machine, though. It took 2-3 days to
generate the first report, as I was still working out some speed issues. The
first run had roughly 2 minutes and 40 seconds of 'build' overhead per test run
for most of the report, and just barely enough RAM to handle 10 tests at a
time - for a few test failures on heavy tests (e.g. hdfs) there was not enough
RAM, since there is also no swap space on those machines. Anyway, beasting ~900
tests is time consuming even in the best case.
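For reference, a single per-test beasting run boils down to something like the
rough sketch below. This is not the actual tooling behind the attached reports;
the class name, the placeholder testcase name, and the exact ant invocation are
assumptions that depend on the checkout. It also shows why that per-run 'build'
overhead adds up - every iteration pays it.
{code:java}
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical driver: beast one test class ITERATIONS times,
// PARALLEL runs at once, and report how many iterations failed.
public class BeastOneTest {

  static final int ITERATIONS = 30; // "Running 30 iterations"
  static final int PARALLEL = 10;   // "10 at a time"

  public static void main(String[] args) throws Exception {
    // Placeholder name; pass the real test class as the first argument.
    final String testcase = args.length > 0 ? args[0] : "TestSomeSolrFeature";

    ExecutorService pool = Executors.newFixedThreadPool(PARALLEL);
    AtomicInteger failures = new AtomicInteger();
    List<Future<?>> futures = new ArrayList<>();

    for (int i = 0; i < ITERATIONS; i++) {
      final int iter = i;
      futures.add(pool.submit((Callable<Void>) () -> {
        // Each iteration shells out to the build, so it pays the full
        // 'build' overhead; the exact target/properties are an assumption.
        File log = new File("beast-" + testcase + "-" + iter + ".log");
        Process p = new ProcessBuilder("ant", "test", "-Dtestcase=" + testcase)
            .redirectErrorStream(true)
            .redirectOutput(log)
            .start();
        if (p.waitFor() != 0) {
          failures.incrementAndGet();
        }
        return null;
      }));
    }

    for (Future<?> f : futures) {
      f.get(); // surface any unexpected exceptions from the workers
    }
    pool.shutdown();

    System.out.printf("%s: %d/%d iterations failed%n",
        testcase, failures.get(), ITERATIONS);
  }
}
{code}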
Two tests also hung, and that slowed things down a bit. Now I am more on the
lookout for that - I've @BadAppled a test method involved in producing one of
the hangs, and for this report I locally @BadAppled the other. They both look
like legit bugs to me. I should have used @Ignore for the second hang, since the
test report runs @BadApple and @AwaitsFix tests. Losing one machine for a long
time when you are using 10 costs you a lot in report creation time. Now I at
least know to pay attention to my email while running reports, though. Luckily,
the instances I'm using will auto-pause after 30 minutes of no real activity and
I get an email, so now I can be a bit more vigilant while creating the report.
Also helps that I've gotten down to about
I used 5 16-core machines for the second report. I can't recall exactly how long
that took, but it was still in the realm of an all-night job.
For this third report I am using 10 8-core machines.
I think we should be using these annotations like this (a rough example follows
the list):
* @AwaitsFix - we basically know something key is broken and it's fairly clear
what the issue is; we are waiting for someone to fix it. You don't expect these
tests to be run regularly, but you can pass a system property to run them.
* @BadApple - the test is too flakey and fails too much for unknown or varied
reasons. You do expect that some test runs would or could still include these
tests and give some useful coverage information - flakiness in many of the more
integration-type tests can be the result of unrelated issues and can clear up
over time. Or get worse.
* @Ignore - the test is never run; it can hang, OOM, or do something negative to
other tests.
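For concreteness, here is a minimal sketch of what that looks like on a test
class. The class, method names, and bug URLs are placeholders, and the opt-in
system properties mentioned in the comments (-Dtests.awaitsfix=true,
-Dtests.badapples=true) are how I understand the test framework's switches, so
double-check them against LuceneTestCase:
{code:java}
import org.apache.lucene.util.LuceneTestCase;
import org.apache.lucene.util.LuceneTestCase.AwaitsFix;
import org.apache.lucene.util.LuceneTestCase.BadApple;
import org.junit.Ignore;

// Placeholder test class; the bug URLs stand in for real JIRA issues.
public class TestAnnotationUsage extends LuceneTestCase {

  // Known, understood breakage: skipped until someone fixes it.
  // Can be opted back in via a system property (-Dtests.awaitsfix=true).
  @AwaitsFix(bugUrl = "https://issues.apache.org/jira/browse/SOLR-XXXXX")
  public void testKnownBrokenThing() throws Exception {
    // ...
  }

  // Too flakey for regular runs, but still worth occasional coverage,
  // e.g. in a beasting report (-Dtests.badapples=true).
  @BadApple(bugUrl = "https://issues.apache.org/jira/browse/SOLR-XXXXX")
  public void testFlakeyIntegration() throws Exception {
    // ...
  }

  // Never run: it hangs, OOMs, or does something negative to other tests.
  @Ignore("hangs the JVM - see the linked issue")
  public void testThatHangs() throws Exception {
    // ...
  }
}
{code}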
I'll put up another report soon. After that I probably won't do another one
until I have tackled the flakey-rating issues above; I'm hoping that's just a
couple to a few weeks at most, but that may be wishful thinking.
> Create report to assess Solr test quality at a commit point.
> ------------------------------------------------------------
>
> Key: SOLR-10032
> URL: https://issues.apache.org/jira/browse/SOLR-10032
> Project: Solr
> Issue Type: Task
> Security Level: Public(Default Security Level. Issues are Public)
> Components: Tests
> Reporter: Mark Miller
> Assignee: Mark Miller
> Attachments: Lucene-Solr Master Test Beast Results
> 01-24-2017-9899cbd031dc3fc37a384b1f9e2b379e90a9a3a6 Level Medium- Running 30
> iterations, 12 at a time .pdf, Lucene-Solr Master Test Beasults
> 02-01-2017-bbc455de195c83d9f807980b510fa46018f33b1b Level Medium- Running 30
> iterations, 10 at a time.pdf
>
>
> We have many Jenkins instances blasting tests - some official, some Policeman,
> and I and others have or have had our own - and the email trail proves the
> power of the Jenkins cluster to find test failures.
> However, I still have a very hard time with some basic questions:
> what tests are flakey right now? which test fails actually affect devs most?
> did I break it? was that test already flakey? is that test still flakey? what
> are our worst tests right now? is that test getting better or worse?
> We really need a way to see exactly which tests are the problem - not because
> of OS or environmental issues, but because of more basic test quality issues.
> Which tests are flakey, and how flakey are they, at any point in time.