> On Jul 25, 2018, at 9:48 PM, Gilles <gil...@harfang.homelinux.org> wrote:
>
> On Wed, 25 Jul 2018 21:08:57 -0400, Rob Tompkins wrote:
>>> On Jul 24, 2018, at 9:13 PM, Rob Tompkins <chtom...@gmail.com> wrote:
>>>
>>>
>>>
>>>> On Jul 24, 2018, at 7:04 PM, Gilles <gil...@harfang.homelinux.org> wrote:
>>>>
>>>> Hi Rob.
>>>>
>>>> On Tue, 24 Jul 2018 18:33:40 -0400, Rob Tompkins wrote:
>>>>> I know that the tests will be necessarily non-deterministic, but we
>>>>> can at least get closer to having determinism by running the same test
>>>>> 1000 times and expecting some reasonable number of passes, right? Could
>>>>> we use the underlying distribution that we are testing to sort out
>>>>> this value?
>>>>
>>>> This *is* what the test is doing, although it repeats 50 times
>>>> (takes quite some time already) instead of 1000.
>>>> As I've reported on this list, it is quite possible that the
>>>> failure probabilities are underestimated; (first) review welcome:
>>>> the tests are fairly well documented as to what they are doing
>>>> but I might have committed some bugs wrt the statistics involved.
>>>
>>> Once I get the release out, I’ll have a look.
>>
>> So the curiosity here is a standard probability problem. It seems
>> that we have N tests, each with some probability of failing P_i. For
>> some arbitrary test T, P_T is fairly inconsequential, but when
>> aggregated across P_1, P_2, …, P_{T-1}, P_{T}, …, P_N,
>> the probability of at least one test failing approaches something
>> between 10% and 50%, which is indeed consequential.
>
> If p is the probability that the test will fail, 1-p is
> the probability that it'll succeed. The probability that
> all N tests succeed is (1-p)^N.
>
> Example from empirical runs: Overall failure is ~25% (3/12 as
> per previous post); there are ~35 such tests, thus p is ~1%.
> We'd have to look at how to reduce this latter value.
If we simply set up surefire to re-run only the failed tests, we’d overcome the
problem. I checked that into 1.1 last night. I think that’ll help considerably.
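
For what it's worth, here is a rough sketch of that arithmetic. It assumes
the tests fail independently and all share the same failure probability p,
which is a simplification; the class and method names are made up for
illustration, and the rerun count is a hypothetical setting, not
necessarily what is configured in the build.

```java
public final class FlakyTestMath {
    private FlakyTestMath() {}

    /**
     * Per-test failure probability p, solved from
     * 1 - (1 - p)^n = overallFailure (independent, identical tests assumed).
     */
    public static double perTestFailure(double overallFailure, int n) {
        return 1.0 - Math.pow(1.0 - overallFailure, 1.0 / n);
    }

    /**
     * Probability that at least one of n tests still fails when every test
     * gets (1 + reruns) attempts and passes if any single attempt passes.
     */
    public static double overallFailureWithReruns(double p, int n, int reruns) {
        double stillFails = Math.pow(p, reruns + 1); // fails every attempt
        return 1.0 - Math.pow(1.0 - stillFails, n);
    }

    public static void main(String[] args) {
        // Figures quoted above: ~25% overall failure across ~35 tests.
        double p = perTestFailure(0.25, 35);
        System.out.printf("per-test failure p = %.4f%n", p);
        for (int reruns = 0; reruns <= 2; reruns++) {
            System.out.printf("reruns = %d -> overall failure = %.6f%n",
                    reruns, overallFailureWithReruns(p, 35, reruns));
        }
    }
}
```

Under those assumptions p comes out a bit under 1%, in line with the
estimate above, and a single re-run of the failed tests brings the overall
failure rate down to well below 1%.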
>
> Gilles
>
>> I’m going to have to think
>> about this some. If I recall correctly, we could apply the central limit
>> theorem here to overall test failure, right? Could we apply the
>> same reasoning to the overall number of tests in the project? I
>> don’t think we can avoid it. Does surefire accommodate a percentage of
>> test failures for passing the build?
>>
>> -Rob
>>
>>>
>>> Cheers,
>>> -Rob
>>>
>>>>
>>>> Regards,
>>>> Gilles
>>>>
>>>>>
>>>>> -Rob
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>