Re: [rng] Tests in -sampling

Rob Tompkins Thu, 26 Jul 2018 06:44:48 -0700

> On Jul 25, 2018, at 9:48 PM, Gilles <gil...@harfang.homelinux.org> wrote:
> 
> On Wed, 25 Jul 2018 21:08:57 -0400, Rob Tompkins wrote:
>>> On Jul 24, 2018, at 9:13 PM, Rob Tompkins <chtom...@gmail.com> wrote:
>>> 
>>> 
>>> 
>>>> On Jul 24, 2018, at 7:04 PM, Gilles <gil...@harfang.homelinux.org> wrote:
>>>> 
>>>> Hi Rob.
>>>> 
>>>> On Tue, 24 Jul 2018 18:33:40 -0400, Rob Tompkins wrote:
>>>>> I know that the tests will be necessarily non-deterministic, but we
>>>>> can at least get closer to having determinism by running the same test
>>>>> 1000 times and expecting some reasonable number of passes, right? Could
>>>>> we use the underlying distribution that we are testing to sort out
>>>>> this value?
>>>> 
>>>> This *is* what the test is doing, although it repeats 50 times
>>>> (takes quite some time already) instead of 1000.
>>>> As I've reported on this list, it is quite possible that the
>>>> failure probabilities are underestimated; (first) review welcome:
>>>> the tests are fairly well documented as to what they are doing
>>>> but I might have committed some bugs wrt the statistics involved.
>>> 
>>> Once I get the release out, I’ll have a look.
>> 
>> So the curiosity here is a standard probability problem. It seems
>> that we have N tests, each with some probability of failing P_i. For
>> some arbitrary test T, P_T is fairly inconsequential, but when
>> aggregated together with P_1, P_2, …, P_{T-1}, P_{T+1}, …, P_N,
>> the probability that at least one test fails approaches something
>> between 10% and 50%, which is indeed consequential.
> 
> If p is the probability that the test will fail, 1-p is
> the probability that it'll succeed. The probability that
> all N tests succeed is (1-p)^N.
> 
> Example from empirical runs: Overall failure is ~25% (3/12 as
> per previous post); there are ~35 such tests, thus p is ~1%.
> We'd have to look for how to reduce this latter value.
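
For reference, the ~1% figure comes from inverting (1-p)^N. A minimal
sketch, assuming the ~35 tests are independent and all share the same
failure probability p (the function name is illustrative, not something
from the project):

```python
def per_test_failure_probability(overall_failure: float, n_tests: int) -> float:
    """Solve 1 - (1 - p)**n_tests = overall_failure for p.

    Assumes independent tests that all fail with the same probability p.
    """
    return 1.0 - (1.0 - overall_failure) ** (1.0 / n_tests)


# Figures quoted above: ~25% overall failure (3 of 12 runs), ~35 tests.
p = per_test_failure_probability(0.25, 35)
print(f"per-test failure probability: {p:.4f}")
```

With those inputs p comes out at roughly 0.8%, consistent with the "~1%"
estimate.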

If we simply set up Surefire to re-run only the failed tests, we’d overcome the
problem. I checked that into 1.1 last night. I think that’ll help considerably.
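
A rough way to see how much re-running helps, as a sketch only: assume
reruns are independent of the first attempt, and that a test counts as
failed only if it fails the first run and every rerun (this is how
Surefire's rerunFailingTestsCount option behaves; whether that is the
exact setting committed is an assumption here). The per-test failure
probability then drops from p to p^(r+1) for r reruns. The function name
below is illustrative.

```python
def overall_failure_with_reruns(p: float, n_tests: int, reruns: int) -> float:
    """Probability that at least one of n_tests still fails after retries.

    Assumes independent tests and independent reruns; a test only counts
    as failed if it fails the initial run and all `reruns` retries.
    """
    p_effective = p ** (reruns + 1)
    return 1.0 - (1.0 - p_effective) ** n_tests


# p ~ 1% and ~35 tests, as estimated earlier in the thread.
for reruns in range(3):
    prob = overall_failure_with_reruns(0.01, 35, reruns)
    print(f"reruns={reruns}: overall failure probability = {prob:.6f}")
```

Under these assumptions a single rerun takes the overall failure rate
from roughly 30% down to about 0.35%. If the failures are not independent
(for example, a bad seed reused on rerun), the gain would be smaller.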

> 
> Gilles
> 
>> I’m going to have to think
>> about this some. If I recall correctly, we could use the central limit
>> theorem here about overall test failure, right? Could we apply the
>> same characteristic to the overall number of tests in the project? I
>> don’t think we can avoid it. Does surefire accommodate a percentage of
>> test failures for passing the build?
>> 
>> -Rob
>> 
>>> 
>>> Cheers,
>>> -Rob
>>> 
>>>> 
>>>> Regards,
>>>> Gilles
>>>> 
>>>>> 
>>>>> -Rob
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
> 

