Bug#951497: rumur: building from sources makes machine un-usable or OOM/kills kernel

2020-02-17 Thread Matthew Fernandez


> On Feb 17, 2020, at 13:54, Anatoly Pugachev wrote:
> 
> On Mon, Feb 17, 2020 at 11:12 PM Matthew Fernandez wrote:
>> 
>> Yikes, sorry.
>> 
>> The test suite for this package auto-detects the number of CPUs and 
>> parallelises across all of them. This is not very well behaved when calling 
>> it from the outer build system, as it doesn’t pay attention to any -j flag 
>> you’ve passed there. To compound the situation, some of the individual test 
>> cases themselves are also multithreaded.
>> 
>> I think the simplest solution is to force the test suite to always run 
>> single-threaded when called via ctest. Can I close this bug by making this 
>> change when packaging the next upstream release? Or do I also need to upload 
>> new packages for the existing versions in Debian?
> 
> 
> Matthew,
> 
> since "Version: 2020.01.27-1" is already available in the repository, is it
> possible to fix the test suite to run single-threaded (or maybe with -j2) via
> a Debian package patch, bump the version to "2020.01.27-2", and upload?
> 
> Thanks.

Yes, good point, fair enough. I was midway through cutting a new upstream 
release when the first report arrived, so I’ll make the fix there and package 
that, and then backport it to a 2020.01.27-2 upload.

The new test suite responsible for this hammering was actually added several 
upstream releases back, but it looks like 2020.01.27-1 is the only package to 
make it through to unstable so far, so I assume I don’t need to backport this 
change to -2 uploads for older packages that only ever existed in testing.


Bug#951497: rumur: building from sources makes machine un-usable or OOM/kills kernel

2020-02-17 Thread Anatoly Pugachev
On Mon, Feb 17, 2020 at 11:12 PM Matthew Fernandez wrote:
>
> Yikes, sorry.
>
> The test suite for this package auto-detects the number of CPUs and 
> parallelises across all of them. This is not very well behaved when calling 
> it from the outer build system, as it doesn’t pay attention to any -j flag 
> you’ve passed there. To compound the situation, some of the individual test 
> cases themselves are also multithreaded.
>
> I think the simplest solution is to force the test suite to always run 
> single-threaded when called via ctest. Can I close this bug by making this 
> change when packaging the next upstream release? Or do I also need to upload 
> new packages for the existing versions in Debian?


Matthew,

since "Version: 2020.01.27-1" is already available in the repository, is it
possible to fix the test suite to run single-threaded (or maybe with -j2) via
a Debian package patch, bump the version to "2020.01.27-2", and upload?

Thanks.



Bug#951497: rumur: building from sources makes machine un-usable or OOM/kills kernel

2020-02-17 Thread Matthew Fernandez
Yikes, sorry.

The test suite for this package auto-detects the number of CPUs and 
parallelises across all of them. This is not very well behaved when calling it 
from the outer build system, as it doesn’t pay attention to any -j flag you’ve 
passed there. To compound the situation, some of the individual test cases 
themselves are also multithreaded.

I think the simplest solution is to force the test suite to always run 
single-threaded when called via ctest. Can I close this bug by making this 
change when packaging the next upstream release? Or do I also need to upload 
new packages for the existing versions in Debian?
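
To illustrate the kind of change described above, here is a minimal sketch of a test runner picking its job count. It is not the actual rumur code: the function name worker_count and the environment variable TEST_JOBS are hypothetical, chosen only for this example. The idea is to default to one job instead of the detected CPU count, and to allow an explicit opt-in to more.

```python
import os


def worker_count(environ=None, default=1):
    """Return how many test jobs to run in parallel.

    Defaults to a single job instead of os.cpu_count(), so that an outer
    build system stays in control of parallelism. TEST_JOBS is a
    hypothetical override variable, not something rumur defines.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("TEST_JOBS", "")
    try:
        requested = int(raw)
    except ValueError:
        # Unset or malformed value: fall back to the conservative default.
        return default
    # Clamp to the range [1, number of detected CPUs].
    available = os.cpu_count() or 1
    return max(1, min(requested, available))


if __name__ == "__main__":
    print(worker_count())
```

A runner could then pass the result as the size of its worker pool. Since some individual test cases are multithreaded themselves, even a small job count can still occupy several CPUs.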


Bug#951497: rumur: building from sources makes machine un-usable or OOM/kills kernel

2020-02-17 Thread Anatoly Pugachev
Package: rumur
Severity: important

Dear Maintainer,

*** Reporter, please consider answering these questions, where appropriate ***

   * What led up to the situation?

Building rumur from source on Debian sid (unstable):

   * What exactly did you do (or not do) that was effective (or
 ineffective)?

$ apt-get source rumur
$ cd rumur-2020.01.27
$ debuild -b -uc -us -j$num

where $num could be anything. Tested with 4 and 8: builds OK.
But anything from 16 upwards (tested with 16 and 32) brings the machine to
an OOM condition and sometimes kills it (via kernel OOM / OOPS).

For example, as already mentioned, -j4 or -j8 works, but building with -j16
(on a machine with 32G RAM and 32 vCPUs) results in:

mator@ttip:~/rumur/rumur-2020.01.27$ debuild -b -uc -us -j16
...
[100%] Built target rumur
make[2]: Leaving directory '/home/mator/rumur/rumur-2020.01.27/obj-sparc64-linux-gnu'
/usr/bin/cmake -E cmake_progress_start /home/mator/rumur/rumur-2020.01.27/obj-sparc64-linux-gnu/CMakeFiles 0
make[1]: Leaving directory '/home/mator/rumur/rumur-2020.01.27/obj-sparc64-linux-gnu'
   dh_auto_test
cd obj-sparc64-linux-gnu && make -j16 test ARGS\+=-j16
make[1]: Entering directory '/home/mator/rumur/rumur-2020.01.27/obj-sparc64-linux-gnu'
Running tests...
/usr/bin/ctest --force-new-ctest-process -j16
Test project /home/mator/rumur/rumur-2020.01.27/obj-sparc64-linux-gnu
Start 1: tests
...

Opening another terminal window:

mator@ttip:~/rumur$ ps ax  && free -m
...

   3681 pts/2    SN+    0:00 /usr/bin/perl /usr/bin/debuild -b -uc -us -j16
   3698 pts/2    SN+    0:00 tee ../rumur_2020.01.27-1_sparc64.build
   3699 pts/2    SN+    0:00 /usr/bin/perl /usr/bin/dpkg-buildpackage -us -uc -ui -b -j16
   3716 pts/2    SN+    0:00 /usr/bin/make -f debian/rules binary
   3718 pts/2    SN+    0:00 /usr/bin/perl /usr/bin/dh binary
   3987 pts/3    SNs    0:00 -bash
   4334 pts/2    SN+    0:00 /usr/bin/perl /usr/bin/dh_auto_test
   4336 pts/2    SN+    0:00 make -j16 test ARGS+=-j16
   4339 pts/2    SN+    0:00 /usr/bin/ctest --force-new-ctest-process -j16
   4340 pts/2    SNl+   0:06 python3 /home/mator/rumur/rumur-2020.01.27/tests/run-tests.py
   4356 pts/2    RN+    0:19 python3 /home/mator/rumur/rumur-2020.01.27/tests/run-tests.py
   4357 pts/2    RN+    0:20 python3 /home/mator/rumur/rumur-2020.01.27/tests/run-tests.py
   4358 pts/2    RN+    0:20 python3 /home/mator/rumur/rumur-2020.01.27/tests/run-tests.py
   4359 pts/2    RN+    0:21 python3 /home/mator/rumur/rumur-2020.01.27/tests/run-tests.py
   4360 pts/2    RN+    0:20 python3 /home/mator/rumur/rumur-2020.01.27/tests/run-tests.py
   4361 pts/2    RN+    0:17 python3 /home/mator/rumur/rumur-2020.01.27/tests/run-tests.py
   4362 pts/2    SN+    0:19 python3 /home/mator/rumur/rumur-2020.01.27/tests/run-tests.py
   4363 pts/2    RN+    0:20 python3 /home/mator/rumur/rumur-2020.01.27/tests/run-tests.py
   4364 pts/2    RN+    0:19 python3 /home/mator/rumur/rumur-2020.01.27/tests/run-tests.py
   4365 pts/2    RN+    0:20 python3 /home/mator/rumur/rumur-2020.01.27/tests/run-tests.py
   4367 pts/2    RN+    0:19 python3 /home/mator/rumur/rumur-2020.01.27/tests/run-tests.py
   4368 pts/2    RN+    0:20 python3 /home/mator/rumur/rumur-2020.01.27/tests/run-tests.py
   4369 pts/2    RN+    0:18 python3 /home/mator/rumur/rumur-2020.01.27/tests/run-tests.py
   4370 pts/2    RN+    0:18 python3 /home/mator/rumur/rumur-2020.01.27/tests/run-tests.py
   4371 pts/2    RN+    0:20 python3 /home/mator/rumur/rumur-2020.01.27/tests/run-tests.py
   4373 pts/2    SN+    0:20 python3 /home/mator/rumur/rumur-2020.01.27/tests/run-tests.py
   4374 pts/2    RN+    0:17 python3 /home/mator/rumur/rumur-2020.01.27/tests/run-tests.py
   4375 pts/2    RN+    0:19 python3 /home/mator/rumur/rumur-2020.01.27/tests/run-tests.py
   4376 pts/2    RN+    0:21 python3 /home/mator/rumur/rumur-2020.01.27/tests/run-tests.py
   4377 pts/2    RN+    0:17 python3 /home/mator/rumur/rumur-2020.01.27/tests/run-tests.py
   4379 pts/2    RN+    0:19 python3 /home/mator/rumur/rumur-2020.01.27/tests/run-tests.py
   4380 pts/2    RN+    0:18 python3 /home/mator/rumur/rumur-2020.01.27/tests/run-tests.py
   4381 pts/2    RN+    0:20 python3 /home/mator/rumur/rumur-2020.01.27/tests/run-tests.py
   4382 pts/2    RN+    0:19 python3 /home/mator/rumur/rumur-2020.01.27/tests/run-tests.py
   4383 pts/2    RN+    0:19 python3 /home/mator/rumur/rumur-2020.01.27/tests/run-tests.py
   4384 pts/2    RN+    0:18 python3 /home/mator/rumur/rumur-2020.01.27/tests/run-tests.py
   4385 pts/2    RN+    0:18 python3 /home/mator/rumur/rumur-2020.01.27/tests/run-tests.py
   4386 pts/2    RN+    0:15 python3 /home/mator/rumur/rumur-2020.01.27/tests/run-tests.py
   4387 pts/2    RN+    0:20 python3 /home/mator/rumur/rumur-2020.01.27/tests/run-tests.py
   4561 pts/2    RN+    0:50 /tmp/tmpeurf_1xz/model.exe
   4564 pts/2    RN+    0:53 /tmp/tmpv809w4xi/model.exe
   4565 pts/2    RN+    0:53 /tmp/tmp_2whw2i8/mode