Re: SamplingLongPrimitiveIteratorTest fails

Sean Owen Wed, 02 Jan 2013 02:00:08 -0800

It passes for me. It's asserting about the result of a random process though.

10% of 1000 elements are sampled, and the number sampled should be
approximately normally distributed with mean 100 and stdev
sqrt(0.9*0.1*1000) ~= 9.5. The test asserts it's within 4 standard
deviations, which should only fail about 1 out of 16,000 times. This
is run 1000 times.
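
To make that arithmetic concrete, here is a rough sketch, not code from
the test itself. The class and method names are made up for
illustration. It takes the 1-in-16,000 figure at face value and assumes
"run 1000 times" means 1000 independent checks in one test run.

```java
final class SamplingBounds {

  // Standard deviation of a Binomial(n, p) count.
  static double stdDev(int n, double p) {
    return Math.sqrt(n * p * (1.0 - p));
  }

  // Chance that at least one of `checks` independent checks fails,
  // given the failure probability of a single check (an assumption here).
  static double anyFailure(double perCheck, int checks) {
    return 1.0 - Math.pow(1.0 - perCheck, checks);
  }

  public static void main(String[] args) {
    int n = 1000;
    double p = 0.1;
    double mean = n * p;
    double sd = stdDev(n, p);
    System.out.printf("sd = %.2f%n", sd);
    System.out.printf("accepted range = [%.2f, %.2f]%n", mean - 4 * sd, mean + 4 * sd);
    System.out.printf("P(at least one failure in 1000 checks) = %.4f%n",
        anyFailure(1.0 / 16000, 1000));
  }
}
```

Under those assumptions a single unseeded test run would trip somewhere
around 6% of the time, so an occasional failure is not that surprising.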

I suppose it wouldn't be so strange for it to fail eventually, since
it will over time be run tens of thousands of times. The thing is, the
tests are supposed to always start from the same random seed state, so
should be deterministic.

But then: a short while ago I cleverly optimized this iterator by
having it pick the # of elements to skip from a geometric distribution
instead of actually checking a probability a bunch of times.
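
For anyone who hasn't seen the trick, a minimal sketch of the idea
follows. It is not the actual Mahout code: the class and method names
are hypothetical, and it draws the skip count by inverting the geometric
CDF with a plain java.util.Random instead of using Commons Math.

```java
import java.util.Random;

final class GeometricSkipSampler {

  private final Random random;
  private final double logOneMinusP;

  GeometricSkipSampler(double p, Random random) {
    if (!(p > 0.0 && p < 1.0)) {
      throw new IllegalArgumentException("p must be in (0, 1)");
    }
    this.random = random;
    this.logOneMinusP = Math.log(1.0 - p);
  }

  // Number of elements to skip before the next sampled one:
  // a geometric draw (failures before the first success).
  int nextSkip() {
    double u = 1.0 - random.nextDouble(); // in (0, 1], so log(u) is finite
    return (int) Math.floor(Math.log(u) / logOneMinusP);
  }

  // Counts how many of n elements get sampled, using one random draw
  // per sampled element rather than one per element.
  static int countSampled(int n, double p, Random random) {
    GeometricSkipSampler sampler = new GeometricSkipSampler(p, random);
    int count = 0;
    long index = sampler.nextSkip(); // index of the first sampled element
    while (index < n) {
      count++;
      index += 1 + sampler.nextSkip();
    }
    return count;
  }

  public static void main(String[] args) {
    // Same seed, same count: the result depends only on the supplied RNG.
    System.out.println(countSampled(1000, 0.1, new Random(42L)));
    System.out.println(countSampled(1000, 0.1, new Random(42L)));
  }
}
```

The point of passing the Random in is that the output is fully
determined by the generator the caller supplies.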

But then: Commons Math's implementation doesn't let you supply a
random number generator, so it's internally using its own
non-deterministically seeded RNG, and that may allow different test
results.

But then: in 3.1, released last week, you can supply your own RNG.

I think I will fix this by updating to 3.1 and supplying our RNG, and
also loosening the test bounds a bit.

On Wed, Jan 2, 2013 at 9:11 AM, Dan Filimon <[email protected]> wrote:
> Sorry if you know about this, but the
> testSample(org.apache.mahout.cf.taste.impl.common.SamplingLongPrimitiveIteratorTest)
> fails at line 77,
>       assertTrue(k <= 100 + 4 * sd);
>
> I changed a bunch of code in Mahout (unrelated to this test) and
> Jenkins doesn't seem to point to any failed tests in the last stable
> build [1]. (Trunk currently seems to fail building, not sure why...)
>
> Could anyone check to see if they can reproduce this test failing?
> Thanks!
>
> [1] https://builds.apache.org/job/Mahout-Quality/lastSuccessfulBuild/testReport/
