Bug#848063: +patch: ri-li FTBFS on single-CPU buildds
On Mon, Feb 20, 2017 at 01:56:54AM +0100, Steve Cotton wrote:
> On Sun, Feb 19, 2017 at 09:38:12PM +0100, Steve Cotton wrote:
> > Sorry, but it's turned out that my patch either doesn't completely
> > avoid the bug, or doesn't avoid another bug which gives the same error
> > message. The build has failed on arm64.
>
> I've tried to reproduce this locally (on amd64, not arm64). With my
> patch, I can't replicate the failure. Santiago, please would you test
> how it fares on your autobuilders?

Version -5 builds ok in my single-cpu autobuilders (tried 100 times),
but Markus has just uploaded version -6 which has a different fix.

I can't tell if the bug in -4 that made it fail on single-cpu systems
and on reproducible builds autobuilders (which are not single-cpu) was
the same as the bug in -5 that made arm64 fail.

I'll try version -6 anyway.

Thanks.
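A repeated-build check like the 100 runs mentioned above can be scripted. The following is only a minimal sketch, assuming a POSIX shell is available to std::system; the helper name count_failures is hypothetical and is not taken from anyone's build setup in this thread.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: run `command` through the shell `runs` times and
// return how many of those runs ended with a non-zero status.
int count_failures(const std::string& command, int runs) {
    int failures = 0;
    for (int i = 0; i < runs; ++i) {
        // std::system returns 0 when the shell command succeeded.
        if (std::system(command.c_str()) != 0) {
            ++failures;
        }
    }
    return failures;
}
```

Something like count_failures("dpkg-buildpackage -A", 100) would give a failure count for one environment. Zero failures in such a run only suggests that the failure is rare there; it does not show that the bug is gone.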
Bug#848063: +patch: ri-li FTBFS on single-CPU buildds
On 20.02.2017 01:56, Steve Cotton wrote:
> On Sun, Feb 19, 2017 at 09:38:12PM +0100, Steve Cotton wrote:
>> Sorry, but it's turned out that my patch either doesn't completely
>> avoid the bug, or doesn't avoid another bug which gives the same error
>> message. The build has failed on arm64.
>
> I've tried to reproduce this locally (on amd64, not arm64). With my
> patch, I can't replicate the failure. Santiago, please would you test
> how it fares on your autobuilders?
>
> Testing by removing my patch and trying to debug the root cause, I
> haven't found the cause yet. But I think it will need a complex patch
> to either libsdl1.2 or xvfb, and I don't think this bug justifies any
> complex patch during the freeze.
>
> It seems that one of the XOpenDisplay calls in SDL's X11VideoInit fails.
> Having a GDB breakpoint at the start of X11VideoInit, or running MakeDat
> under strace, makes the bug unreproducible. Adding a long sleep to the
> last point in the ri-li package's code before SDL_Init doesn't help.
> Running another SDL program first in the same xvfb server doesn't
> change anything, so it seems that the server can't be primed in advance.
>
> Markus, I'm really sorry that what looked like a risk-free patch has
> caused a failed rebuild. Depending on the TC's ruling, does it sound
> sensible to say that 2.0.1+ds-4 is in Stretch, and -5 doesn't affect -4?

Hi Steve,

no worries and thanks for providing a helping hand here. I've also tried
a couple of different options in the past hours. I think your initial
patch wasn't completely wrong and the underlying issue has something to
do with how SDL initializes its subsystems, but I have reverted it for now.

I have read about issues when using SDL in virtual machines, but I have
found only one clue, namely to manually link with -lX11 to avoid this
kind of error. I have tried this solution at least 20 times on
asachi.debian.org (arm64 porterbox) and couldn't reproduce the build
failure anymore.

I have uploaded another revision a few minutes ago. -5 and -6 don't
affect Stretch. I don't know about a TC ruling, but since it is obvious
that the claim of "99 %" build failures is not true, I stand by my
opinion that this is not release critical. We could also stop building
the data from source, but this isn't something I would call an
improvement over the current situation.

Regards,

Markus
Bug#848063: +patch: ri-li FTBFS on single-CPU buildds
On Sun, Feb 19, 2017 at 09:38:12PM +0100, Steve Cotton wrote:
> Sorry, but it's turned out that my patch either doesn't completely
> avoid the bug, or doesn't avoid another bug which gives the same error
> message. The build has failed on arm64.

I've tried to reproduce this locally (on amd64, not arm64). With my
patch, I can't replicate the failure. Santiago, please would you test
how it fares on your autobuilders?

Testing by removing my patch and trying to debug the root cause, I
haven't found the cause yet. But I think it will need a complex patch
to either libsdl1.2 or xvfb, and I don't think this bug justifies any
complex patch during the freeze.

It seems that one of the XOpenDisplay calls in SDL's X11VideoInit fails.
Having a GDB breakpoint at the start of X11VideoInit, or running MakeDat
under strace, makes the bug unreproducible. Adding a long sleep to the
last point in the ri-li package's code before SDL_Init doesn't help.
Running another SDL program first in the same xvfb server doesn't
change anything, so it seems that the server can't be primed in advance.

Markus, I'm really sorry that what looked like a risk-free patch has
caused a failed rebuild. Depending on the TC's ruling, does it sound
sensible to say that 2.0.1+ds-4 is in Stretch, and -5 doesn't affect -4?

Steve
Bug#848063: +patch: ri-li FTBFS on single-CPU buildds
Hi Markus and Santiago,

Sorry, but it's turned out that my patch either doesn't completely
avoid the bug, or doesn't avoid another bug which gives the same error
message. The build has failed on arm64.

On Sun, Feb 19, 2017 at 09:05:07PM +0100, Santiago Vila wrote:
> Even if I had not the technical skills to find a fix for this bug, I
> explained clearly how to reproduce it, and I even offered a machine
> for you to reproduce it.

There's a difference between being able to reproduce a bug, and being
able to debug it. Giving an entire VM setup and saying that that is
enough to reproduce it doesn't start to cut it down to the minimal test
case in which a developer could debug it.

It was only Jonathan Dowland's response to debian-devel, saying that the
main difference in your buildd was the single CPU, that made me realise
there was a simple way to reproduce it in a debugging environment. Or
not, as the arm64 failure shows.

BR,
Steve
Bug#848063: +patch: ri-li FTBFS on single-CPU buildds
On 19.02.2017 21:05, Santiago Vila wrote:
> On Sun, Feb 19, 2017 at 08:03:37PM +0100, Markus Koschany wrote:
>
>> You constantly ignore different views and arguments and I am not the
>> only one who questions your approach and your aggressive behaviour,
>
> Please stop the name-calling.

No, I won't stop stating the truth.

>> [...]
>> for pushing your agenda on other people. [...]
>
> Sorry, but this is not "my" agenda, it's Release Policy and
> Debian Policy: Packages *must* build from source provided the
> build-dependencies are installed.

This is _your_ interpretation of Debian Policy and there are people who
disagree with it. Not every build failure is automatically release
critical, because we can't support all setups in existence; hence the
separation between release architectures and ports. The same goes for
every custom build environment. You are trying hard to push this agenda
on -devel, but maybe you should also read when other people disagree
with your agenda or refute your 99 % build failure claim, e.g. [1] [2].

> Even if I had not the technical skills to find a fix for this bug, I
> explained clearly how to reproduce it, and I even offered a machine
> for you to reproduce it.

You are expecting other people to share your opinion and do the work
for you. This is antisocial.

> Next time please be more collaborative and accept the offer before
> gratuitously downgrading the bug. Even if it was me who reported the
> bug and it was my machines where the failed builds happened, the build
> failure was never my fault, it was a bug in the package.

You don't know if the root cause was in the package. The package simply
tries to initialize SDL on the host system. This has always worked in
the past, so there is certainly the possibility that the bug is not in
ri-li, which might be only _affected_ by it.

[1] https://lists.debian.org/debian-devel/2017/02/msg00345.html
[2] https://lists.debian.org/debian-devel/2017/02/msg00293.html
Bug#848063: +patch: ri-li FTBFS on single-CPU buildds
On Sun, Feb 19, 2017 at 08:03:37PM +0100, Markus Koschany wrote:

> You constantly ignore different views and arguments and I am not the
> only one who questions your approach and your aggressive behaviour,

Please stop the name-calling.

> [...]
> for pushing your agenda on other people. [...]

Sorry, but this is not "my" agenda, it's Release Policy and
Debian Policy: Packages *must* build from source provided the
build-dependencies are installed.

Even if I had not the technical skills to find a fix for this bug, I
explained clearly how to reproduce it, and I even offered a machine
for you to reproduce it.

Next time please be more collaborative and accept the offer before
gratuitously downgrading the bug. Even if it was me who reported the
bug and it was my machines where the failed builds happened, the build
failure was never my fault, it was a bug in the package.

Thanks.
Bug#848063: +patch: ri-li FTBFS on single-CPU buildds
On 19.02.2017 19:31, Santiago Vila wrote:
> On Sun, Feb 19, 2017 at 07:00:41PM +0100, Markus Koschany wrote:
>
>> Thank you for the patch. I will apply this one today. Please note that
>> this is simply a workaround for a limitation of Santiago's build
>> environment.
>
> Absolutely not.
>
> This is basic Computer Science: Everything that you can do with
> several CPUs may be done with a single CPU as well (even if it's slower).
>
> So having a single CPU is never to be considered a "limitation" in the
> build environment (except by you, of course).
>
>> This kind of error won't occur for the vast majority of the
>> package's target audience.
>
> We don't have "target audiences" to consider here. Packages *must* build
> from source when the build-dependencies are met.

Of course we have target audiences, that is if you actually interact
with those people and think outside of your technical box. It is also
not true that the package fails 99 % of the time. It's basic math,
Santiago: you have to count all possible outcomes, not just the ones in
your custom limited build environment.

You constantly ignore different views and arguments and I am not the
only one who questions your approach and your aggressive behaviour,
claiming you are the only one who is right. There is no point in making
a virtual environment the benchmark for the whole archive. You don't
solve any real life issues but you are wasting countless hours of
developer time for pushing your agenda on other people. Send in patches
like Steve did for a change if you really think this is "critical", but
stop telling people how serious this bug is when you don't know anything
about the package and its users.

Markus
Bug#848063: +patch: ri-li FTBFS on single-CPU buildds
On Sun, Feb 19, 2017 at 07:00:41PM +0100, Markus Koschany wrote:

> Thank you for the patch. I will apply this one today. Please note that
> this is simply a workaround for a limitation of Santiago's build
> environment.

Absolutely not.

This is basic Computer Science: Everything that you can do with
several CPUs may be done with a single CPU as well (even if it's slower).

So having a single CPU is never to be considered a "limitation" in the
build environment (except by you, of course).

> This kind of error won't occur for the vast majority of the
> package's target audience.

We don't have "target audiences" to consider here. Packages *must* build
from source when the build-dependencies are met.

Your package had an undeclared "Build-CPU: 2". The problem is that we
don't have a Build-CPU control field, and you want to make multi-core
build-essential, bypassing all the decision-making procedures in
Debian. You should really stop doing that.

Multi-core is not part of build-essential, and the standard to build
packages is defined by the build-essential definition and the
build-depends. Whatever the official build daemon of the day has
(a single CPU or several, a slow or a fast CPU) is *not* to be
considered build-essential.

Thanks.
Bug#848063: +patch: ri-li FTBFS on single-CPU buildds
Control: tags -1 pending

On 19.02.2017 18:18, Steve Cotton wrote:
> package ri-li
> tags 848063 +patch
> thanks
> [...]
> Hi,
>
> The attached patch solves this with taskset, although I haven't tried with a
> real single-CPU'd buildd.
>
> It's a one-liner at the start of a resource-generator tool that's only run
> during the build, and isn't included in the actual packages. There are no
> calls to SDL_AddTimer, so I think there's no need for SDL_INIT_TIMER, and
> removing it avoids some interaction between SDL, xvfb and only having one CPU.
>
> -  if( SDL_Init(SDL_INIT_VIDEO|SDL_INIT_TIMER|SDL_INIT_AUDIO|SDL_INIT_NOPARACHUTE) < 0 ) {
> +  if( SDL_Init(SDL_INIT_VIDEO|SDL_INIT_AUDIO|SDL_INIT_NOPARACHUTE) < 0 ) {
>      cerr <<"Impossible d'initialiser SDL:"< exit(-1);
>

Thank you for the patch. I will apply this one today. Please note that
this is simply a workaround for a limitation of Santiago's build
environment. This kind of error won't occur for the vast majority of the
package's target audience. The claim that the build fails 99% of the
time is wrong. If you try to argue with mathematics you must take _all
possible outcomes_ into account, not only the outcomes for your sample
set. Then it becomes obvious that this issue is negligible and
definitely not release critical.

Markus
Bug#848063: +patch: ri-li FTBFS on single-CPU buildds
package ri-li
tags 848063 +patch
thanks

On Sun, Feb 19, 2017 at 04:09:56PM +0100, Steve Cotton wrote:
> On Wed, Feb 15, 2017 at 06:26:51PM +0100, Santiago Vila wrote:
> > The following packages FTBFS for me randomly. First column is the bug
> > number, second column is the estimated probability of failure in my
> > build environment, which is described here:
> >
> > https://people.debian.org/~sanvila/my-building-environment.txt
>
> For ri-li, and hopefully many of the other bugs on the list, there's a much
> simpler way to replicate the bug in developers' normal dev environments, using
> the taskset command from package util-linux to run the build on CPU #0 only:
>
> $ taskset --cpu-list 0 dpkg-buildpackage -A

Hi,

The attached patch solves this with taskset, although I haven't tried with a
real single-CPU'd buildd.

It's a one-liner at the start of a resource-generator tool that's only run
during the build, and isn't included in the actual packages. There are no
calls to SDL_AddTimer, so I think there's no need for SDL_INIT_TIMER, and
removing it avoids some interaction between SDL, xvfb and only having one CPU.

-  if( SDL_Init(SDL_INIT_VIDEO|SDL_INIT_TIMER|SDL_INIT_AUDIO|SDL_INIT_NOPARACHUTE) < 0 ) {
+  if( SDL_Init(SDL_INIT_VIDEO|SDL_INIT_AUDIO|SDL_INIT_NOPARACHUTE) < 0 ) {
     cerr <<"Impossible d'initialiser SDL:"
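For anyone who wants the single-CPU restriction from inside a small test program rather than through the taskset wrapper, the following is a rough sketch only. It is not part of the attached patch. It assumes Linux with glibc, and the helper names pin_to_single_cpu and allowed_cpu_count are hypothetical. Unlike `taskset --cpu-list 0`, it picks the lowest-numbered CPU the process is already allowed to use, which is CPU #0 on an unrestricted machine.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // exposes the CPU_* macros on glibc; g++ normally defines it already
#endif
#include <sched.h>

// Hypothetical helper: restrict the calling thread to the lowest-numbered
// CPU it is currently allowed to run on. Returns that CPU number, or -1
// on error. Processes forked or exec'd afterwards inherit the mask.
int pin_to_single_cpu() {
    cpu_set_t allowed;
    CPU_ZERO(&allowed);
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0) {
        return -1;
    }
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, &allowed)) {
            cpu_set_t only;
            CPU_ZERO(&only);
            CPU_SET(cpu, &only);
            // pid 0 means "the calling thread".
            return sched_setaffinity(0, sizeof(only), &only) == 0 ? cpu : -1;
        }
    }
    return -1;
}

// Hypothetical helper: number of CPUs the calling thread may run on,
// or -1 on error.
int allowed_cpu_count() {
    cpu_set_t allowed;
    CPU_ZERO(&allowed);
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0) {
        return -1;
    }
    return CPU_COUNT(&allowed);
}
```

Calling pin_to_single_cpu() before launching the program under test should give it the same one-CPU view of the machine that the taskset command above provides, and allowed_cpu_count() can be used to confirm that the mask took effect.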