The failure seems reproducible only in the autopkgtest environment.
It does not occur, for example, in containers or VMs of different architectures,
but it does occur in autopkgtest running on a cloud VM image.

I see it hanging after stage 4.
On LP infra it then fails with a timeout after ~30 minutes.

In my local debug I found:
0     0  5294  4703  20   0      0     0 -      Zs   ?          0:00            
                  \_ [test-nbd.sh] <defunct>

Now that is stage 5 which is the failing one.

Defunct means the process has exited, but its parent has not yet read the
exit status, e.g. with a "wait" call.
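
To spot such defunct children without scanning the whole process tree, the ps output can be filtered on the state column. This is only a minimal sketch: list_zombies is a made-up helper name, and it assumes the input columns are PID, PPID, STAT and command, as produced by procps ps with the -o option shown in the comment.

```shell
# Print "PID PPID" for every zombie (state starting with Z).
# Reads ps-style lines on stdin: PID PPID STAT COMMAND...
list_zombies() {
    awk '$3 ~ /^Z/ { print $1, $2 }'
}

# Typical use on a live system (procps ps, headers suppressed):
#   ps -eo pid=,ppid=,stat=,comm= | list_zombies
```

The second column is the parent that still has to reap the child, which is the process to look at next.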

The parent is:
0     0  4702  4701  20   0   4608   860 -      S    ?          0:00            
          \_ /bin/sh -c /usr/bin/python3 -u /usr/share/meson/meson test 
--no-rebuild --print-errorlogs


Reproducible still when I call meson in that environment:
4     0  5656  5386  20   0  61580  4400 -      S+   pts/0      0:00  |         
  \_ sudo ./debian/tests/upstream
4     0  5657  5656  20   0  13012  3444 -      S+   pts/0      0:00  |         
      \_ /bin/bash ./debian/tests/upstream
0     0  5672  5657  20   0  14468  3580 -      S+   pts/0      0:00  |         
          \_ ninja -v -C obj-x86_64-linux-gnu test
0     0  5673  5672  20   0   4608   824 -      S+   pts/0      0:00  |         
              \_ /bin/sh -c /usr/bin/python3 -u /usr/share/meson/meson test 
--no-rebuild --print-errorlogs
0     0  5674  5673  20   0 146400 37584 -      Sl+  pts/0      0:03  |         
                  \_ /usr/bin/python3 -u /usr/share/meson/meson test 
--no-rebuild --print-errorlogs
4     0  6319  5674  20   0      0     0 -      Zs   ?          0:00  |         
                      \_ [test-nbd.sh] <defunct>


From the dep8 script down to the meson call, it is still reproducible.

Meson calls this:
$ /bin/bash -ex 
/tmp/autopkgtest.c37m28/build.c4y/casync-2/obj-x86_64-linux-gnu/test-nbd.sh

That then spawns child:
/tmp/autopkgtest.c37m28/build.c4y/casync-2/obj-x86_64-linux-gnu/casync -v 
--digest=sha512-256 make /var/tmp/test-casync.20192/test.caibx 
/var/tmp/test-casync.20192/blob

But that immediately vanishes and leaves it in the broken state.

The call to the test script itself is interesting:
- as user ubuntu it works
- as root it fails

And while that is happening, dmesg reports:
[ 2079.920901] block nbd0: Device being setup by another task
[ 2079.921256] block nbd1: Device being setup by another task
[ 2079.921564] block nbd2: Device being setup by another task
[ 2079.921894] block nbd3: Device being setup by another task
[ 2079.924287] block nbd4: Device being setup by another task

The test runs further when executed as root.
As user ubuntu it ends with:
++ id -u
+ '[' 1000 == 0 ']'

But as root it goes on.
So the failing section only runs as root.
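
The trace above suggests a uid guard around the nbd part of the test. A rough sketch of such a guard follows. It is not the actual code of test-nbd.sh: is_root is a hypothetical helper and the echoed messages are placeholders.

```shell
# Hypothetical helper: succeeds only if the given uid (default: the
# current one) is 0, i.e. root.
is_root() {
    [ "${1:-$(id -u)}" -eq 0 ]
}

if is_root; then
    echo "running root-only nbd section"
else
    echo "skipping root-only nbd section"
fi
```

This would explain why the run as user ubuntu passes: it never reaches the mkdev/nbd steps at all.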

  + modprobe nbd
  + test -e /dev/nbd0
  ++ 
/tmp/autopkgtest.c37m28/build.c4y/casync-2/obj-x86_64-linux-gnu/notify-wait 
/tmp/autopkgtest.c37m28/build.c4y/casync-2/obj-x86_64-linux-gnu/c
  + MKDEV_PID=7357
  + 
/tmp/autopkgtest.c37m28/build.c4y/casync-2/obj-x86_64-linux-gnu/test-calc-digest
 sha512-256
  + dd if=/var/tmp/test-casync.rnd/test-node bs=102400 count=80
  0+0 records in
  0+0 records out 
  0 bytes copied, 1,8723e-05 s, 0,0 kB/s
  + diff -q /var/tmp/test-casync.rnd/test.digest 
/var/tmp/test-casync.rnd/mkdev.digest
  Files /var/tmp/test-casync.rnd/test.digest and 
/var/tmp/test-casync.rnd/mkdev.digest differ


The call that seems to fail is the casync setup that should create 
/var/tmp/test-casync.rnd/test-node

Later the dd tries to read from there (it does not exist), and everything
after that is a follow-on error.

Command should be:
/tmp/autopkgtest.c37m28/build.c4y/casync-2/obj-x86_64-linux-gnu/notify-wait 
/tmp/autopkgtest.c37m28/build.c4y/casync-2/obj-x86_64-linux-gnu/casync -v 
--digest=sha512-256 mkdev /var/tmp/test-casync.rnd/test.caibx 
/var/tmp/test-casync.rnd/test-node

Here "/var/tmp/test-casync.rnd/test.caibx" seems not to exist, but that is
due to the cleanup after the test.
I modified the test to leave the artifacts around.
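
One way to make the cleanup optional is sketched below. This is an assumption about how it could be done, not the upstream script: the cleanup function and the KEEP_ARTIFACTS variable are hypothetical names, and the scratch directory comes from mktemp here rather than from the test's own path.

```shell
# Hypothetical cleanup helper: remove the scratch directory unless
# KEEP_ARTIFACTS is set to 1.
cleanup() {
    if [ "${KEEP_ARTIFACTS:-0}" = 1 ]; then
        echo "keeping $1" >&2
    else
        rm -rf -- "$1"
    fi
}

scratch=$(mktemp -d)
# Run the cleanup when the script exits, whether it passed or failed.
trap 'cleanup "$scratch"' EXIT
```

Setting KEEP_ARTIFACTS=1 before the run then leaves the scratch directory in place for inspection.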

After this (no cleanup):
/var/tmp/test-casync.4821/test-node is at /dev/nbd8

I can successfully read from that node and the digest is good.
But the dd in the test does not (it copies 0 bytes).
So is it just "too fast", and nbd in this env needs slightly more time to set up?

A trivial sleep 5s before the dd fixes the test.
Now what is the "right" way to ensure nbd is ready in this environment?
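
One candidate, instead of a fixed sleep, is to poll with a timeout until the device looks usable. This is only a sketch under assumptions: wait_until and nbd_has_size are made-up helper names, and "ready" is assumed to mean that blockdev --getsize64 reports a non-zero size for the node. I have not confirmed that this is the condition nbd actually needs here.

```shell
# Retry a command once per second until it succeeds or the timeout
# (in seconds) expires. Returns 1 on timeout.
wait_until() {
    timeout=$1; shift
    elapsed=0
    until "$@"; do
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep 1
        elapsed=$((elapsed + 1))
    done
}

# Assumed readiness check: the block device reports a non-zero size.
nbd_has_size() {
    size=$(blockdev --getsize64 "$1" 2>/dev/null) || return 1
    [ "$size" -gt 0 ]
}

# Possible use in place of the sleep, before the dd:
#   wait_until 10 nbd_has_size /var/tmp/test-casync.rnd/test-node || exit 1
```

Compared to the sleep this returns as soon as the check passes, and it fails explicitly if the device never comes up.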

https://bugs.launchpad.net/bugs/1736733

Title:
  autopkgtest crashes in test env
