[ https://issues.apache.org/jira/browse/CRUNCH-542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Josh Wills updated CRUNCH-542:
------------------------------
    Attachment: CRUNCH-542.patch

Patch for the same.

> Wider tolerance for flaky scrunch PCollectionTest
> -------------------------------------------------
>
>                 Key: CRUNCH-542
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-542
>             Project: Crunch
>          Issue Type: Improvement
>          Components: Scrunch
>    Affects Versions: 0.10.0, 0.11.0, 0.12.0
>            Reporter: Josh Wills
>            Priority: Minor
>             Fix For: 0.13.0
>
>         Attachments: CRUNCH-542.patch
>
>
> One of the Scrunch tests uses an unseeded version of the sample() function
> that verifies that it works correctly by ensuring that an actual sampling of
> elements is within ~ 3 standard deviations of the expected value. Given this,
> we expect the test to fail about once every 370 times it is run, or once a
> year if the tests were run every day.
> My issue is that we test about a dozen versions of Crunch automatically in
> Jenkins every day, and so I'm having this test fail on at least some version
> about once every month. I'd like to bump the control limit up to a little
> over 5 standard deviations so that the test fails around once every
> millennium and/or get rid of the test entirely and only rely on the seeded
> versions of the test.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
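
The failure-rate estimates in the description can be sanity-checked with a short calculation. The sketch below is not part of the attached patch. It assumes the sampled count is approximately normally distributed and that the test applies a two-sided control limit, which the issue does not state explicitly. The function names are hypothetical and chosen only for illustration.

```python
import math


def two_sided_tail(k_sigma: float) -> float:
    """P(|Z| > k_sigma) for a standard normal Z (assumed model for the sample count)."""
    return math.erfc(k_sigma / math.sqrt(2.0))


def days_between_failures(k_sigma: float, runs_per_day: float) -> float:
    """Expected number of days between spurious failures at a given run rate."""
    return 1.0 / (two_sided_tail(k_sigma) * runs_per_day)


if __name__ == "__main__":
    # (control limit in standard deviations, test runs per day)
    for k, runs in [(3.0, 1), (3.0, 12), (5.0, 12)]:
        days = days_between_failures(k, runs)
        print(f"{k} sigma, {runs} run(s)/day: "
              f"~{days:.0f} days (~{days / 365.25:.1f} years) between failures")
```

Under these assumptions, a 3 standard deviation limit gives roughly one spurious failure per 370 runs, which matches the figure quoted in the issue, and a dozen daily runs bring that down to about one failure a month. The spacing grows very quickly with the limit, so a limit somewhat above 5 standard deviations is needed to reach the once-a-millennium range at a dozen runs per day.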