[ https://issues.apache.org/jira/browse/SOLR-13864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris M. Hostetter updated SOLR-13864:
--------------------------------------
    Attachment: apache_Lucene-Solr-BadApples-Tests-master_531.log.txt
        Status: Open (was: Open)

[~jbernste] - it looks like you only fixed testGammaDistribution? What about testZipFDistribution, testGeometricDistribution, and testFuzzyKmeans? As mentioned above, they also seem to fail sporadically due to explicit assumptions about the underlying random distributions.

Attaching a recent jenkins failure from testZipFDistribution...

{noformat}
[junit4] 2> 288619 INFO (TEST-MathExpressionTest.testZipFDistribution-seed#[4B489A4C6D218B8D]) [ ] o.a.s.SolrTestCaseJ4 ###Ending testZipFDistribution
[junit4] 2> NOTE: reproduce with: ant test -Dtestcase=MathExpressionTest -Dtests.method=testZipFDistribution -Dtests.seed=4B489A4C6D218B8D -Dtests.multiplier=2 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=ja-JP -Dtests.timezone=America/St_Lucia -Dtests.asserts=true -Dtests.file.encoding=UTF-8
[junit4] ERROR 0.07s J1 | MathExpressionTest.testZipFDistribution <<<
[junit4] > Throwable #1: java.lang.Exception: Zipf distribution not descending!!!
[junit4] > at __randomizedtesting.SeedInfo.seed([4B489A4C6D218B8D:6FFDF7687A8983A5]:0)
[junit4] > at org.apache.solr.client.solrj.io.stream.MathExpressionTest.testZipFDistribution(MathExpressionTest.java:3766)
{noformat}

> MathExpressionTest non-reproducible failures due to assertions of
> non-absolutes and randomization beyond test seed
> ------------------------------------------------------------------
>
>                 Key: SOLR-13864
>                 URL: https://issues.apache.org/jira/browse/SOLR-13864
>             Project: Solr
>          Issue Type: Test
>      Security Level: Public(Default Security Level. Issues are Public)
>            Reporter: Chris M. Hostetter
>            Assignee: Joel Bernstein
>            Priority: Major
>         Attachments: apache_Lucene-Solr-BadApples-Tests-master_531.log.txt
>
>
> We're seeing a fairly steady trickle of MathExpressionTest failures from
> various jenkins boxes going back quite a while ... mostly from
> testGammaDistribution, but other tests pop up now and then.
> The crux of the problem with this test seems to break down into 2 categories:
> # tests that make assumptions about the relative values that will come out
> of taking samples from different random distributions that aren't guaranteed
> to be true
> ** ie: comparing 2 random samples from 2 differently shaped gamma
> distributions and expecting one to always be strictly greater than the other.
> I'm not a stats guy, but my naive understanding is that on the low end some
> of these shapes may cross over, so every possible random sample from one
> shape is not guaranteed to be less than every possible random sample from a
> different shape
> # the code being tested does its own randomization outside of the control
> of the test framework (or test client)
> ** this causes the seeds to not reproduce
> ----
> Tests should not be making assertions about random data that aren't 100%
> guaranteed to be true in all cases (ie: {{random().nextInt(5) < (5.0D +
> (double) random().nextInt(5))}} is one thing, {{random().nextInt(5) <
> (4.99999D + (double) random().nextInt(5))}} is a different story).
> Randomized behavior in solr (non-test) code should ideally have some way of
> being controlled by the client/tests ... either via a request param used to
> initialize any new Random instances, or for example the use of the
> "tests.seed" property in various places in the code to try and provide some
> reproducibility even when the external solr client isn't even aware of
> randomization being a factor in the behavior of the code.
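
The two points in the description can be illustrated with a small standalone sketch. This is not Solr or MathExpressionTest code: the class {{SeededSampling}} and its methods are hypothetical names made up for illustration, and reading the seed from the "tests.seed" system property is an assumption based on the suggestion above. It uses plain integer bounds to contrast a comparison that can never fail with one that occasionally does, and shows that building every Random from the same seed makes a run repeatable.

```java
import java.util.Random;

/** Illustrative sketch only; class and method names are hypothetical, not Solr code. */
final class SeededSampling {

    /** Parses a hex seed such as "4B489A4C6D218B8D" (anything after ':' is ignored). */
    static Random fromSeed(String seed) {
        String master = seed.split(":")[0];
        return new Random(Long.parseUnsignedLong(master, 16));
    }

    /**
     * Counts how often "left < offset + right" fails over n pairs of draws,
     * where left and right are each uniform on 0..4.
     */
    static int countViolations(Random random, int n, int offset) {
        int violations = 0;
        for (int i = 0; i < n; i++) {
            int left = random.nextInt(5);
            int right = random.nextInt(5);
            if (!(left < offset + right)) {
                violations++;
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        // Assumption: the seed is supplied via the "tests.seed" system property.
        String seed = System.getProperty("tests.seed", "4B489A4C6D218B8D");
        // offset 5: left is at most 4 and the right side is at least 5, so this never fails.
        System.out.println("offset 5: " + countViolations(fromSeed(seed), 10_000, 5));
        // offset 4: fails whenever left == 4 and right == 0, so it is not an absolute.
        System.out.println("offset 4: " + countViolations(fromSeed(seed), 10_000, 4));
    }
}
```

Because both counts are driven by a Random built from the same seed, rerunning with the same "tests.seed" value gives the same counts. Code that creates its own unseeded Random would lose that property, which is the second category of failure described above.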