Bug#1026061: [+externe Mail+] Re: Bug#1026061: bart: FTBFS randomly in bullseye (failing test)

2024-05-07 Thread Uecker, Martin
On Tuesday, 07.05.2024 at 19:23 +0200, Santiago Vila wrote:
> On 7/5/24 at 18:50, Uecker, Martin wrote:
> > On Tuesday, 07.05.2024 at 17:59 +0200, Santiago Vila wrote:
> > > On 1/1/23 at 16:55, Uecker, Martin wrote:
> > > In the meantime, I became a member of debian-med, so in theory,
> > > I could fix this myself via team upload. Would you prefer
> > > that I take care of this myself that way?
> > 
> > I wouldn't mind if you did. There are some other bugs
> > which could easily be fixed with a new upload (i.e. with
> > minor patches in the bug tracker)
> > > 
> > > (I would also handle bart-cuda, also affected but not reported yet)
> > 
> > bart-cuda needs to be updated to 0.9.00, which should
> > be straightforward in principle but needs more work
> > and a binary upload.
> 
> Well, just in case: I'm talking about making an upload
> for bullseye, using the same fix in bookworm, then
> sending a bug to release.debian.org explaining the
> issue, etc.
> 
> That's mainly bureaucratic and will not interfere with your normal
> work in unstable regarding new upstream releases and such (which
> I still prefer not to handle myself).
> 
Yes, if you do not mind doing the work, please update bullseye.
That would be much appreciated.

Martin




Bug#1026061: [+externe Mail+] Re: Bug#1026061: bart: FTBFS randomly in bullseye (failing test)

2024-05-07 Thread Uecker, Martin
On Tuesday, 07.05.2024 at 17:59 +0200, Santiago Vila wrote:
> On 1/1/23 at 16:55, Uecker, Martin wrote:
> > I can apply the patch, but I do not have much time now.
> > Is there some urgency?
> 
> Hello. A lot of time passed without activity on this bug.
> 
> In the meantime, I became a member of debian-med, so in theory,
> I could fix this myself via team upload. Would you prefer
> that I take care of this myself that way?

I wouldn't mind if you did. There are some other bugs
which could easily be fixed with a new upload (i.e. with
minor patches in the bug tracker)
> 
> (I would also handle bart-cuda, also affected but not reported yet)

bart-cuda needs to be updated to 0.9.00, which should
be straightforward in principle but needs more work
and a binary upload.

Otherwise, I would have more time in July to do some work.

Martin

> 
> (Yes, I still think it would be worthwhile to fix this in bullseye,
> for posterity and before it becomes LTS).
> 
> Thanks.





Bug#1026061: bart: FTBFS randomly in bullseye (failing test)

2023-01-01 Thread Uecker, Martin
> 
> 
> On 01.01.23 at 15:55, Uecker, Martin wrote:
> > 
> > One could just relax (or simply remove) the test from bullseye
> > or backport the version from bookworm.
> 
> I guess that would be applying 0003-relax-failing-unit-test.patch
> to the Bullseye version.

Yes, this should do it.

> 
> > The wine code is broken (it violates the effective types
> > rules of ISO C).
> 
> Ok, just wanted to offer an option.
> 

Yes, thanks.

> On 01.01.23 at 16:02, Santiago Vila wrote:
> 
> > Hi. Such a failure rate differs a lot from what I get, which is
> > about 50% in some systems (which is why I believe we should fix this
> > in bullseye).
> > 
> > Maybe this is an issue of tests optimized for Intel and failing
> > a lot on AMD or vice versa (there was another package for which that
> > happened).
> 
> 
> If it might be of any help - my system is a "AMD Ryzen 7 1700",
> the qemu VM runs with "-enable-kvm -cpu host -smp 16".
> 
> By locking the process to just a single cpu I do not get any failures:
>    taskset -c 0 bash -c "while true; do ./test_nufft ; done"
> 
> If I allow two cpus the failure rate is at 77%:
>    taskset -c 0,1 bash -c "while true; do ./test_nufft ; done"
> 
> 
> This still does not affect a Bookworm VM, no failures even with
> the relax patch removed.
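
The failure rates quoted above (none on one CPU, 77% on two) come from running the test binary in a loop. A minimal sketch of a harness that turns such a loop into a number is given here. It assumes that the test binary exits with a non-zero status when a test fails, which the thread does not state explicitly, and the function name `failure_rate` is made up for this illustration.

```python
import subprocess


def failure_rate(cmd, runs=100):
    """Run `cmd` `runs` times and return the fraction of runs that failed.

    A run counts as failed when the process exits with a non-zero status
    (assumption: the test binary signals a failing test that way).
    """
    failures = 0
    for _ in range(runs):
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            failures += 1
    return failures / runs
```

To repeat the two-CPU experiment one would call, for example, `failure_rate(["taskset", "-c", "0,1", "./test_nufft"], runs=1000)` from the build directory. The CPU pinning is left to `taskset`, as in the quoted commands.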

This is likely a numerical error caused by reordering a
floating point sum by parallelization. It is not worth
spending time on it.
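
The effect can be shown with a small sketch. This is not BART code. It emulates single-precision accumulation in Python by rounding to float32 after every addition, and it compares a plain left-to-right sum with a sum over two chunks whose partial results are added at the end, as a parallel reduction would do. The helper names are made up for this illustration.

```python
import struct


def f32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def sum_f32(values):
    """Accumulate left to right, rounding to float32 after every addition."""
    total = 0.0
    for v in values:
        total = f32(total + f32(v))
    return total


def chunked_sum_f32(values, chunks):
    """Sum each chunk separately, then add the partial sums."""
    size = (len(values) + chunks - 1) // chunks
    partials = [sum_f32(values[i:i + size]) for i in range(0, len(values), size)]
    return sum_f32(partials)


# One large value followed by sixteen small ones.
values = [1e8] + [1.0] * 16

# Sequentially, every 1.0 is lost: the float32 spacing near 1e8 is 8.
print(sum_f32(values))             # 100000000.0

# With two chunks, eight of the small values are summed on their own first
# and survive as a single 8.0.
print(chunked_sum_f32(values, 2))  # 100000008.0
```

Both results are correctly rounded for their own order of operations. A test with a tight tolerance can therefore pass or fail depending only on how the work was split across threads.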

I can apply the patch, but I do not have much time now.
Is there some urgency?

Martin







Bug#1026061: bart: FTBFS randomly in bullseye (failing test)

2023-01-01 Thread Uecker, Martin

One could just relax (or simply remove) the test from bullseye
or backport the version from bookworm.


The wine code is broken (it violates the effective types
rules of ISO C).

Martin

On Sunday, 01.01.2023 at 15:39 +0100, Bernhard Übelacker wrote:
> Dear Maintainer,
> I could reproduce this failure in a bullseye VM.
> There the "test_nufft_adjoint" fails in about 1.2 % of the runs.
> 
> Attached diff helps to make it more visible.
> 
> It looks like the float comparison fails because the
> limit of "1.E-6f" is slightly too tight.
> 
> If I interpret the following floating point comparison document correctly,
> then the failing cases are just 8 representable floats ("ULPs")
> away from the expected value; below 8 it does not fail.
> 
>    https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/
> 
> Maybe upstream could consider changing that
> float comparison to something like this:
> 
>    https://source.winehq.org/git/wine.git/blob/HEAD:/dlls/ddraw/tests/ddraw7.c#l61
> 
> 
> In the newer Bookworm package I found the following patch,
> which relaxes exactly this test:
> 
>    https://salsa.debian.org/med-team/bart/-/blob/master/debian/patches/0003-relax-failing-unit-test.patch
> 
> But for some reason it still does not fail if I remove that
> patch in the Bookworm version.
> 
> Kind regards,
> Bernhard
> 
> 
> 
> make utest
> while true; do ./test_nufft ; done
> 
> 
> Bullseye/stable/bart-0.6.00:
> - -1.067273+3.247031i - -1.067273+3.247030i - sc1=6.3619987432198080e+01 
> sc2=6.3619926397041880e+01 diff=9.6109602054639254e-07 diff_ulp=7
> adjoint diff: 0.01 9.6109602054639254e-07, limit: 9.999747524271e-07
>  ./test_nufft:  1/ 1 passed.
> 
> - -1.067273+3.247030i - -1.067273+3.247030i - sc1=6.3619926397041830e+01 
> sc2=6.3619926397041880e+01 diff=8.3446502685546875e-07 diff_ulp=7
> adjoint diff: 0.01 8.3446502685546875e-07, limit: 9.999747524271e-07
>  ./test_nufft:  1/ 1 passed.
> 
> - -1.067273+3.247031i - -1.067273+3.247030i - sc1=6.3619987432198087e+01 
> sc2=6.3619926397041880e+01 diff=8.5963040419301251e-07 diff_ulp=6
> adjoint diff: 0.01 8.5963040419301251e-07, limit: 9.999747524271e-07
>  ./test_nufft:  1/ 1 passed.
> 
> - -1.067272+3.247031i - -1.067273+3.247030i - sc1=6.3619987432198073e+01 
> sc2=6.3619926397041880e+01 diff=1.0662403155947686e-06 diff_ulp=8
> adjoint diff: 0.01 1.0662403155947686e-06, limit: 9.999747524271e-07
> ERROR: ./test_nufft:  0/ 1 passed.
> 
> - -1.067273+3.247031i - -1.067273+3.247030i - sc1=6.3619987432198087e+01 
> sc2=6.3619926397041880e+01 diff=8.5963040419301251e-07 diff_ulp=6
> adjoint diff: 0.01 8.5963040419301251e-07, limit: 9.999747524271e-07
>  ./test_nufft:  1/ 1 passed.
> 
> - -1.067273+3.247031i - -1.067273+3.247030i - sc1=6.3619987432198080e+01 
> sc2=6.3619926397041880e+01 diff=9.6109602054639254e-07 diff_ulp=7
> adjoint diff: 0.01 9.6109602054639254e-07, limit: 9.999747524271e-07
>  ./test_nufft:  1/ 1 passed.
> 
> 
> Bookworm/testing/bart-0.8.00:
> - -1.067272+3.247031i - -1.067273+3.247030i - sc1=6.3619987432198073e+01 
> sc2=6.3619926397041880e+01 diff=1.0662403155947686e-06 diff_ulp=8
> adjoint diff: 0.00 3.1195452265819767e-07, limit: 9.999747524271e-07
>  ./test_nufft:  1/ 1 passed.
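
The `diff_ulp` column in the log above counts representable single-precision values between two results. A minimal sketch of such a count is given here. It is neither the code from the attached diff nor the linked wine helper, and the function names are made up for this illustration. The bit pattern is obtained by packing the value into bytes and unpacking it as an integer, which is a byte copy and not a read through a pointer of another type. NaN and infinity are not handled.

```python
import struct


def float32_to_ordered_int(x):
    """Map a float32 value to an integer that sorts like the float does."""
    (bits,) = struct.unpack("<i", struct.pack("<f", x))
    # Negative floats are stored as sign plus magnitude, so mirror them
    # around zero; -0.0 and +0.0 both map to 0.
    return bits if bits >= 0 else -(bits & 0x7FFFFFFF)


def ulp_distance(a, b):
    """Number of float32 steps between a and b (0 if they are equal)."""
    return abs(float32_to_ordered_int(a) - float32_to_ordered_int(b))


def almost_equal_ulps(a, b, max_ulps):
    """True if a and b are at most max_ulps float32 steps apart."""
    return ulp_distance(a, b) <= max_ulps


one_step = 2.0 ** -23  # spacing of float32 values just above 1.0
print(ulp_distance(1.0, 1.0 + one_step))              # 1
print(almost_equal_ulps(1.0, 1.0 + 8 * one_step, 7))  # False
```

A comparison of this kind scales with the magnitude of the values, whereas a fixed limit such as `1.E-6f` does not.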



Bug#1010164: fails autopkgtest against Octave 7

2022-05-01 Thread Uecker, Martin
On Mon, 25 Apr 2022 17:23:25 +0200 Sébastien Villemot wrote:
> Package: octave-bart
> Version: 0.7.00-2
> Severity: serious
> Tags: patch
> Control: block 1009865 by -1
> 
> Dear Maintainer,
> 
> The autopkgtest for octave-bart fails against octave 7.1.0-2 recently uploaded
> to unstable. See:
> https://ci.debian.net/data/autopkgtest/testing/amd64/b/bart/21135046/log.gz
> 
> The problem comes from a message that is printed to stderr:
> 
>   error: ignoring const execution_exception& while preparing to exit
> 
> This message only appears when running the autopkgtest in a dedicated chroot
> as done on the DebCI infrastructure. I wasn’t able to reproduce it in other
> contexts, and I could not figure out what causes it.
> 
> Since this message is essentially harmless, and given that the test passes
> otherwise, I would suggest to simply add the “allow-stderr” keyword to
> “Restrictions” in the octave-integration stanza of debian/tests/control.

I am happy to do this if this is necessary, but isn't this
obviously caused by a bug in octave?

Martin




Bug#969804: Bug#969804: bart: autopkgtest should be marked superficial

2020-09-21 Thread Uecker, Martin
On Monday, 21.09.2020, 17:57 +0200, Andreas Tille wrote:
> On Mon, Sep 21, 2020 at 05:30:20PM +0200, Andreas Tille wrote:
> > 
> > Maybe I misunderstood you - but if you do not run the test at all
> > (as done in some architectures) how will you know whether the test
> > might fail?  Maybe I miss your point here and thus I implemented
> > my suggestion and we'll see what happens in the buildd logs soon.

Thank you! No, I meant running the tests but ignoring the results,
so exactly what you implemented!

> For instance
> 
>    https://buildd.debian.org/status/fetch.php?pkg=bart&arch=i386&ver=0.6.00-3&stamp=1600702554&raw=0
> 
> has
> 
> ...
> ./test_linop_matrix 
>  ./test_linop_matrix:  4/ 4 passed.
> ./test_linop 
>  ERROR: ./test_linop:  2/ 3 passed.
> make[3]: *** [Makefile:685: utests-all] Error 1
> make[3]: Leaving directory '/<>'
> make[2]: *** [Makefile:273: utest] Error 2
> 
> 
> --> I guess access to i386 should be simple.  You can either use
> qemu or maybe the issue can even be reproduced in an i386
> pbuilder chroot

Yes, I think this is a problem I looked at before. Essentially
the floating point computation has a bigger error for some
reason. But I did not have the time to get to the bottom of it.

What would be really useful is to get bug reports for
failing tests which are not critical.

> 
> The s390x build
> 
> 
> https://buildd.debian.org/status/fetch.php?pkg=bart&arch=s390x&ver=0.6.00-3&stamp=1600702311&raw=0
> 
> has the known issue.
> 
> So at least the logs do not hide the issues that might be worth
> investigating or not.

Yes, this is useful. I will try to change the tests so that a
failing test outputs more information.

Best,
Martin



Bug#969804: Bug#969804: bart: autopkgtest should be marked superficial

2020-09-21 Thread Uecker, Martin

Hi Andreas,

On Monday, 21.09.2020, 09:45 +0200, Andreas Tille wrote:
> Hi Martin,
> 
> On Sat, Sep 19, 2020 at 05:50:34PM +0000, Uecker, Martin wrote:
> > > I'm not sure whether this is a good idea in general.  If
> > > we can be sure that for s390x there is an issue with the
> > > tool chain I could imagine something like:
> > > 
> > >   if build on s390x
> > >   run_test || true # FIXME: explanation why we ignore test
> > >   else
> > >   run_test
> > > 
> > > I do not think that ignoring the tests on all architectures
> > > is a good idea.
> > 
> > Without any way to investigate it is difficult
> > to be sure.
> > 
> > But we run very comprehensive tests upstream.
> 
> I'm pretty sure about this and I appreciate the effort you are putting into it.
> 
> > So it is
> > actually very likely that all problems we can catch in
> > Debian (and not already earlier) are problems
> > with the tool chain on some architecture.
> 
> To the best of my knowledge the fact that a test runs on amd64 but fails
> on some other architecture is not only caused by issues in the tool
> chain.  For instance, I recently learned that if char is
> used as a "very short int", it matters whether it is declared as char,
> signed char or unsigned char.  

I know, this is also why I originally thought it would be
a good idea.
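
The char remark can be made concrete. In C the signedness of plain `char` is implementation-defined: it is signed on amd64 Linux but unsigned on several other Debian ports, s390x and arm64 among them. The sketch below is an illustration only, not code from bart. It uses Python's `struct` module to read the same byte once as `signed char` and once as `unsigned char`.

```python
import struct

raw = b"\xff"  # one byte with all bits set

as_signed = struct.unpack("b", raw)[0]    # how C `signed char` reads it
as_unsigned = struct.unpack("B", raw)[0]  # how C `unsigned char` reads it

print(as_signed, as_unsigned)  # -1 255

# A typical portability trap: comparing against a -1 sentinel.
print(as_signed == -1, as_unsigned == -1)  # True False
```

Code that stores small negative numbers in plain `char` therefore behaves differently per architecture without any tool chain bug being involved.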

> To spot this kind of issue it makes
> perfect sense to run the test suite on all architectures - except
> for those where we explicitly know that a broken tool chain is breaking
> the build.

If there were an easy way to log into a s390x machine
and debug the problem, the outcome would likely be a useful
bug report against GCC which would be useful to the community
and the s390x port.  But as things are, it only causes
unnecessary work for us to not get the package removed while
we learned nothing about the underlying issue.

> Thus my suggestion to exclude only the affected s390x from the test.
> If my suggestion is not worded clearly enough feel free to ping me and
> I implement my suggestion in d/rules to show what I mean.

Well, the idea would be to do exactly this but pro-actively
for all architectures except amd64. We would still get
the information when tests fail (which is useful) but
avoid wasting effort on fixing it each time.

Best,
Martin

> Kind regards and thanks for working on bart
> 
>   Andreas.
>  
> 

Bug#969804: Bug#969804: bart: autopkgtest should be marked superficial

2020-09-19 Thread Uecker, Martin
On Saturday, 19.09.2020, 17:09 +0200, Andreas Tille wrote:
> Hi Martin,
> 
> On Sat, Sep 19, 2020 at 01:15:47PM +0000, Uecker, Martin wrote:
> > > The severity on this bug can be downgraded, however the FTBFS on s390x
> > > remains a release critical bug, since s390x is a release architecture.
> > > 
> > > Either the FTBFS gets fixed, or removal of the s390x binaries can be 
> > > requested.
> > 
> > I wonder whether we should simply turn off all tests on
> > non-x86-64 architectures.
> > 
> > These usually turn up issues with the toolchain which
> > then cause additional work for me although the bug is
> > elsewhere.
> 
> I'm not sure whether this is a good idea in general.  If
> we can be sure that for s390x there is an issue with the
> tool chain I could imagine something like:
> 
>   if build on s390x
>   run_test || true # FIXME: explanation why we ignore test
>   else
>   run_test
> 
> I do not think that ignoring the tests on all architectures
> is a good idea.

Without any way to investigate it is difficult
to be sure.

But we run very comprehensive tests upstream. So it is
actually very likely that all problems we can catch in
Debian (and not already earlier) are problems
with the tool chain on some architecture.


Best,
Martin






Bug#969804: [Debian-med-packaging] Bug#969804: bart: autopkgtest should be marked superficial

2020-09-19 Thread Uecker, Martin
Hi Graham,

On Saturday, 19.09.2020, 14:04 +0200, Graham Inggs wrote:
> On Sat, 19 Sep 2020 at 13:51, Uecker, Martin wrote:
> > Could severity be downgraded?
> 
> The severity on this bug can be downgraded, however the FTBFS on s390x
> remains a release critical bug, since s390x is a release architecture.
> 
> Either the FTBFS gets fixed, or removal of the s390x binaries can be 
> requested.

I wonder whether we should simply turn off all tests on
non-x86-64 architectures.

These usually turn up issues with the toolchain which
then cause additional work for me although the bug is
elsewhere.

Best,
Martin

Bug#969804: bart: autopkgtest should be marked superficial

2020-09-19 Thread Uecker, Martin

Could severity be downgraded?


(The build on s390x seems to use a newer compiler.
The failure may be a compiler bug, but similar to
the RISC-V bug I can't investigate as I never got
access to porter boxes.)


On Thu, 17 Sep 2020 15:05:16 +0200 Andreas Tille wrote:
> Control: tags -1 normal
> 
> On Thu, Sep 17, 2020 at 01:40:29PM +0200, Paul Gevers wrote:
> > The thread starts here:
> > https://lists.debian.org/debian-devel/2020/09/msg00071.html
> > 
> > mostly follow-ups from here on:
> > https://lists.debian.org/debian-devel/2020/09/msg00219.html
> 
> Thanks for the pointers and as I'd love to repeat for all your
> work on this
>  Andreas.
> 
> -- 
> http://fam-tille.de
> 
> 

Bug#897466: bart-view: FTBFS: multind.h:17:10: fatal error: misc/nested.h: No such file or directory

2018-05-03 Thread Uecker, Martin

Hi Andreas,

we will need to re-upload the bart package 
to export the missing files. I will prepare a new
upload.

Best,
Martin

On Thursday, 03.05.2018, 12:07 +0200, Andreas Tille wrote:
> Hi Martin,
> 
> can you please have a look?
> 
> Kind regards
> 
>  Andreas.
> 
> On Wed, May 02, 2018 at 09:55:11PM +0200, Lucas Nussbaum wrote:
> > Source: bart-view
> > Version: 0.1.00-1
> > Severity: serious
> > Tags: buster sid
> > User: debian...@lists.debian.org
> > Usertags: qa-ftbfs-20180502 qa-ftbfs
> > Justification: FTBFS on amd64
> > 
> > Hi,
> > 
> > During a rebuild of all packages in sid, your package failed to build on
> > amd64.
> > 
> > Relevant part (hopefully):
> > > cc -g -O2 -fdebug-prefix-map=/<>=. -fstack-protector-strong
> > > -Wformat -Werror=format-security -std=c11 -fopenmp -export-dynamic
> > > -o view -I/usr/include/bart/ `pkg-config --cflags gtk+-3.0`
> > > src/main.c src/view.c src/draw.c `pkg-config --libs gtk+-3.0`
> > > /usr/lib/bart//libmisc.a /usr/lib/bart//libnum.a -lm -lpng
> > > In file included from src/view.c:24:0:
> > > /usr/include/bart/num/multind.h:17:10: fatal error: misc/nested.h: No such file or directory
> > >  #include "misc/nested.h"
> > >   ^~~
> > > compilation terminated.
> > > make[1]: *** [Makefile:52: view] Error 1
> > 
> > The full build log is available from:
> >    http://aws-logs.debian.net/2018/05/02/bart-view_0.1.00-1_unstable.log
> > 
> > A list of current common problems and possible solutions is
> > available at
> > http://wiki.debian.org/qa.debian.org/FTBFS . You're welcome to
> > contribute!
> > 
> > About the archive rebuild: The rebuild was done on EC2 VM instances from
> > Amazon Web Services, using a clean, minimal and up-to-date chroot. Every
> > failed build was retried once to eliminate random failures.
> > 
> > ___
> > Debian-med-packaging mailing list
> > debian-med-packag...@alioth-lists.debian.net
> > https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/debian-med-packaging
> 
>