Re: false-positive failure of the root-removal test

Jim Meyering Wed, 14 Oct 2015 21:51:07 -0700

On Wed, Oct 14, 2015 at 6:44 PM, Pádraig Brady <[email protected]> wrote:
> On 14/10/15 18:43, Jim Meyering wrote:
>> Running a massively parallel "make very-expensive-check"
>> (-j73 on a 48-core system), the rm/r-root.sh test would fail
>> about 1-in-2 or 1-in-3 trials due to expiration of the 2-second
>> timeout here:
...
> The comment above that is a bit more explicit, i.e.
>
> # This isn't terribly expensive, but it must not be run under heavy load.
> # The reason is the conservative 'timeout' setting below to limit possible
> # damage in the worst case which yields a race under heavy load.
> # Marking this test as "expensive" therefore is a compromise, i.e., adding
> # this test to the list ensures it still gets _some_ (albeit minimal)
> # coverage while not causing false-positive failures in day to day runs.
>
> It would be better to avoid the race of course.
> timeout(1) isn't great protection anyway
> as 2 seconds allows for a lot of unlink calls.
> Perhaps leveraging gdb to limit the number of unlink calls is better.
> ...
>
> So I had a look and ended up debugging the debugger :/
> https://sourceware.org/bugzilla/show_bug.cgi?id=10079


Whoa. Very nice.
I've tested both this and your tail-test-race patch, and
they have survived 10 iterations of those abusive tests.
