Bug#848060: libx11-protocol-other-perl: FTBFS randomly (failing tests)

2016-12-22 Thread tony mancill
Hi Gregor,

On Thu, Dec 22, 2016 at 05:53:21PM +0100, gregor herrmann wrote:
> 
> I can reproduce the failure by running the test in a loop in the
> chroot [0]; sometimes in the first low-single-digit round, sometimes a
> bit later.
> 
> Interestingly, higher load on the machine seems to make
> the failure go away (usually we see the opposite). But this might
> just as well be a wrong guess.

Sounds like a classic race condition - just in this case winning the
race means the test fails?

> Reading through t/XSetRoot.t is interesting:
> 
> 
> # Something fishy with xvfb test server seems to cause the reconnect below
> # to fail.  Keeping a second connection makes it better.  Dunno why.
> #
> my $keepalive_X = X11::Protocol->new ($display);
> 
> No idea what exactly is going on here, but I note the coincidence of
> the author mentioning xvfb and us running xvfb and the failure
> exactly at this point (i.e. after the diagnostic output and before
> the first actual test).
> 
> Commenting out the line seems to trigger the failure earlier.
> Adding yet another connection seems to push the failure further back.
> Dropping this second object and adding a sleep(1) after the keepalive
> and before the tests (is ugly and) seems to work around the problem.
> Interestingly, the diag output mentions "# DISPLAY :99" and "# DISPLAY
> :100" alternating.
> Ah, and now I triggered a failure even with the sleep(1).
> 
> 
> So ... I'm not sure. It looks like this test has known problems with
> xvfb, and this doesn't show a problem in the code.
> In line with our handling of similar fragile tests, I propose to
> disable it.

100% agreed.  clusterssh and shutter are the only r-deps for this
package, so I feel confident with this course of action.

> Done in git, waiting with an upload for further comments.

I reviewed your commits and they look good to me.  Also, thank you for
providing a much quicker way to run the tests than a full package build
(although I still haven't reproduced it - I guess my machine is always
too slow... :).

Thank you both for your efforts on this.
tony




Bug#848060: libx11-protocol-other-perl: FTBFS randomly (failing tests)

2016-12-22 Thread gregor herrmann
On Thu, 15 Dec 2016 10:26:38 +0100, Santiago Vila wrote:

> > Could you maybe test with 29-1 which I just uploaded?
> Tried 100 times. Failed 80. Build failures here:
> https://people.debian.org/~sanvila/libx11-protocol-other-perl/

Thanks for your work, much appreciated.

So this shows that t/XSetRoot.t is the only test failing, but quite
regularly.


On Wed, 21 Dec 2016 21:26:20 -0800, tony mancill wrote:

> I built the package locally 65 times in a row successfully until I
> observed a test hang once with the perl process consuming a full core,
> but I haven't yet reproduced your build failure.  (However, I did
> observe the same test failure in the reproducible-builds [1]).

I can reproduce the failure by running the test in a loop in the
chroot [0]; sometimes in the first low-single-digit round, sometimes a
bit later.

Interestingly, higher load on the machine seems to make
the failure go away (usually we see the opposite). But this might
just as well be a wrong guess.


Reading through t/XSetRoot.t is interesting:


# Something fishy with xvfb test server seems to cause the reconnect below
# to fail.  Keeping a second connection makes it better.  Dunno why.
#
my $keepalive_X = X11::Protocol->new ($display);

No idea what exactly is going on here, but I note the coincidence of
the author mentioning xvfb and us running xvfb and the failure
exactly at this point (i.e. after the diagnostic output and before
the first actual test).

Commenting out the line seems to trigger the failure earlier.
Adding yet another connection seems to push the failure further back.
Dropping this second object and adding a sleep(1) after the keepalive
and before the tests (is ugly and) seems to work around the problem.
Interestingly, the diag output mentions "# DISPLAY :99" and "# DISPLAY
:100" alternating.
Ah, and now I triggered a failure even with the sleep(1).


So ... I'm not sure. It looks like this test has known problems with
xvfb, and this doesn't show a problem in the code.
In line with our handling of similar fragile tests, I propose to
disable it.

Done in git, waiting with an upload for further comments.


Cheers,
gregor


[0]
while : ; do xvfb-run -a prove --blib --verbose t/XSetRoot.t || break ; done
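
A bounded variant of the loop in [0] can give a rough failure rate
instead of stopping at the first failure. This is only a sketch: the
helper name count_failures and the run count in the example are made up
for illustration, and the test command is simply passed in as arguments.

```shell
#!/bin/sh
# count_failures N CMD [ARGS...]
# Hypothetical helper (not part of the package): run CMD N times and
# print "failed/total" based on the exit status of each run.
count_failures() {
    total=$1
    shift
    failed=0
    i=0
    while [ "$i" -lt "$total" ]; do
        # Discard the output; only the exit status matters here.
        "$@" >/dev/null 2>&1 || failed=$((failed + 1))
        i=$((i + 1))
    done
    echo "$failed/$total"
}

# Example, assuming xvfb-run and prove are installed and the current
# directory is the built source tree:
# count_failures 100 xvfb-run -a prove --blib t/XSetRoot.t
```

Counting over a fixed number of runs makes results from different
machines easier to compare than "it broke after a few rounds".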
 

-- 
 .''`.  https://info.comodo.priv.at/ - Debian Developer https://www.debian.org
 : :' : OpenPGP fingerprint D1E1 316E 93A7 60A8 104D  85FA BB3A 6801 8649 AA06
 `. `'  Member of VIBE!AT & SPI, fellow of the Free Software Foundation Europe
   `-   NP: Alannah Myles




Bug#848060: libx11-protocol-other-perl: FTBFS randomly (failing tests)

2016-12-22 Thread Santiago Vila
On Wed, 21 Dec 2016, tony mancill wrote:

> So if you could provide details about your build environment that
> might make the failure more readily reproducible, I would
> appreciate it.  And thank you for helping ferret out flakey tests.

I have just written this to describe my building environment:

https://people.debian.org/~sanvila/my-building-environment.txt

Hope it helps.

Thanks.



Bug#848060: libx11-protocol-other-perl: FTBFS randomly (failing tests)

2016-12-21 Thread tony mancill
On Thu, 15 Dec 2016 00:44:59 +0100 Santiago Vila wrote:
> On Wed, Dec 14, 2016 at 11:34:31PM +0100, gregor herrmann wrote:
..
> What follows might look like a rant but it's not:
>
> Perhaps I need to describe my building environment more accurately so
> that people (in general) can reproduce more easily the FTBFS-randomly
> bugs I report?
>
> I ask because the number of bugs of this kind I've reported is already
> too high:
>
> https://bugs.debian.org/cgi-bin/pkgreport.cgi?tag=ftbfs-randomly;users=sanv...@debian.org
>
> and I should really not be the only person to reproduce them.
> If a package really FTBFS randomly, everybody should be able to
> "reproduce the randomness" (so to speak).

Hi Santiago,

If you could provide more information about your build environment as you
offer above, that would be appreciated.  This bug has the potential to
prevent clusterssh from being shipped with stretch, which is something I
would like to avoid.

I suspect that the failing test cases are due either to assumptions that
the test author made about the build environment or to a dependency of
libx11-protocol-other-perl.  Which is to say, I don't believe that they
necessarily reflect the quality of the module itself (but then again, I
might be off base).

I built the package locally 65 times in a row successfully until I
observed a test hang once with the perl process consuming a full core,
but I haven't yet reproduced your build failure.  (However, I did
observe the same test failure in the reproducible-builds [1]).

So if you could provide details about your build environment that
might make the failure more readily reproducible, I would
appreciate it.  And thank you for helping ferret out flakey tests.

Cheers,
tony

[1] 
https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/libx11-protocol-other-perl.html



Bug#848060: libx11-protocol-other-perl: FTBFS randomly (failing tests)

2016-12-15 Thread Santiago Vila
> Could you maybe test with 29-1 which I just uploaded?

Tried 100 times. Failed 80. Build failures here:

https://people.debian.org/~sanvila/libx11-protocol-other-perl/

Thanks.



Bug#848060: libx11-protocol-other-perl: FTBFS randomly (failing tests)

2016-12-14 Thread Santiago Vila
Sorry for the duplicate, mutt segfaults...



Bug#848060: libx11-protocol-other-perl: FTBFS randomly (failing tests)

2016-12-14 Thread Santiago Vila
On Wed, Dec 14, 2016 at 11:34:31PM +0100, gregor herrmann wrote:

> Could you maybe test with 29-1 which I just uploaded?
> It has changes in the tests but I'm not sure if they address the
> issues you discovered.

Thanks a lot for acting on this bug so quickly.

Yes, I'll try to trigger a bunch of rebuilds tomorrow.

But bear in mind that the failure rate for this one is about 90% on
the single-CPU virtual machines I use. I would really love to see this
failure happen on your machine as well.
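
One way to approximate a single-CPU machine on a multi-core box might be
to pin the test run to one CPU. The following is an untested guess at a
reproduction aid, not a description of the build environment mentioned
here, and pinning is not the same thing as a real single-CPU VM. The
helper name run_on_one_cpu is hypothetical; taskset(1) comes from
util-linux.

```shell
#!/bin/sh
# run_on_one_cpu CMD [ARGS...]
# Hypothetical helper: confine CMD to a single CPU. Falls back to running
# CMD unpinned when taskset is missing or pinning is not permitted.
run_on_one_cpu() {
    if command -v taskset >/dev/null 2>&1; then
        # First CPU this shell may use, e.g. "0" from an affinity list "0-3".
        cpu=$(taskset -cp $$ 2>/dev/null | sed 's/.*: *//; s/[-,].*//')
        cpu=${cpu:-0}
        # Probe with a no-op first so a refused pin does not hide CMD's status.
        if taskset -c "$cpu" true >/dev/null 2>&1; then
            taskset -c "$cpu" "$@"
            return
        fi
    fi
    "$@"
}

# Example, assuming xvfb-run and prove are installed:
# run_on_one_cpu xvfb-run -a prove --blib --verbose t/XSetRoot.t
```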

What follows might look like a rant but it's not:

Perhaps I need to describe my building environment more accurately so
that people (in general) can reproduce more easily the FTBFS-randomly
bugs I report?

I ask because the number of bugs of this kind I've reported is already
too high:

https://bugs.debian.org/cgi-bin/pkgreport.cgi?tag=ftbfs-randomly;users=sanv...@debian.org

and I should really not be the only person to reproduce them.
If a package really FTBFS randomly, everybody should be able to
"reproduce the randomness" (so to speak).

Some weeks ago I naively believed that all the FTBFS-randomly bugs
were already reported, but I was very wrong: people keep uploading NEW
packages to unstable that suffer from this problem every day...

I wish we had a "D" day in the Release Managers' calendar for stretch:

[D] Last day to upload packages that only build half of the time :-)

Anyway, I am lucky that you and Niko really care about this kind of
bug (not everybody does, just see the list above). You are my
personal heroes in this battle against build randomness.

Thanks.




Bug#848060: libx11-protocol-other-perl: FTBFS randomly (failing tests)

2016-12-14 Thread gregor herrmann
On Tue, 13 Dec 2016 17:53:55 +0100, Santiago Vila wrote:

> Package: src:libx11-protocol-other-perl
> Version: 28-1
> Severity: serious
> 
> I tried to build this package in stretch with "dpkg-buildpackage -A"
> (which is what the "Arch: all" autobuilder would do to build it)
> but it failed:

> Test Summary Report
> ---
> t/XSetRoot.t  (Wstat: 65280 Tests: 0 Failed: 0)
>   Non-zero exit status: 255
>   Parse errors: Bad plan.  You planned 7 tests but ran 0.
> Files=47, Tests=, 12 wallclock secs ( 0.13 usr  0.02 sys +  1.08 cusr  0.08 csys =  1.31 CPU)
> Result: FAIL
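
For pulling the failing test files out of a long build log, something
like the following might help. It is only a sketch: the function name
failed_test_files is made up, and it assumes the summary lines have
exactly the shape shown in the report above. The Wstat value is the raw
wait status, so the exit code is its high byte (65280 / 256 = 255), which
matches the "Non-zero exit status: 255" line.

```shell
#!/bin/sh
# failed_test_files: read a prove/Test::Harness log on stdin and print
# "<test file> <exit code>" for every line of the Test Summary Report
# that carries a Wstat field. Hypothetical helper for illustration.
failed_test_files() {
    awk '/\(Wstat: [0-9]+ Tests: [0-9]+ Failed: [0-9]+\)/ {
        # $1 is the test file, $3 the raw wait status; the exit code
        # is the wait status divided by 256 (its high byte).
        print $1, int($3 / 256)
    }'
}

# Example, assuming the build log was saved as build.log:
# failed_test_files < build.log
```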


Thanks for your bug report.

Could you maybe test with 29-1 which I just uploaded?
It has changes in the tests but I'm not sure if they address the
issues you discovered.


Thanks in advance,
gregor


-- 
 .''`.  https://info.comodo.priv.at/ - Debian Developer https://www.debian.org
 : :' : OpenPGP fingerprint D1E1 316E 93A7 60A8 104D  85FA BB3A 6801 8649 AA06
 `. `'  Member of VIBE!AT & SPI, fellow of the Free Software Foundation Europe
   `-   NP: Tom Waits: Sins Of My Father

