TL;DR: Recent kernels seem to have a pretty nasty bug that corrupts filesystems and is triggered by virt-test's migrate.with_reboot tests. As a result I can't update JeOS, despite having tried every newer kernel version that Fedora has shipped over the last 8 months or so.

Hello folks,

I am writing this email to update you on something I've been trying to do for a while now: replace the 9-month-old JeOS (based on Fedora 17, running the 3.6 kernel) with a new JeOS (based on Fedora 19, running the 3.11 kernel).

From the time I've spent working on this, it appears that, under newer kernels, we get filesystem corruption when we run the default migrate.with_reboot tests. Using the current JeOS:

./run -r -j -g JeOS.17.x86_64 --no q35 -t qemu --tests "migrate_with_reboot.tcp migrate.with_reboot.unix migrate.with_reboot.exec.default_exec migrate.with_reboot.exec.gzip_exec migrate.with_reboot.fd"

Running setup. Please wait...
SETUP: PASS (15.32 s)
DATA DIR: /home/lmr/virt_test
DEBUG LOG: /home/lmr/Code/virt-test.git/logs/run-2013-10-01-20.43.05/debug.log
TESTS: 4
(1/4) migrate.with_reboot.unix: PASS (60.82 s)
(2/4) migrate.with_reboot.exec.default_exec: PASS (60.44 s)
(3/4) migrate.with_reboot.exec.gzip_exec: PASS (60.72 s)
(4/4) migrate.with_reboot.fd: PASS (61.94 s)
TOTAL TIME: 243.95 s (04:03)
TESTS PASSED: 4
TESTS FAILED: 0
SUCCESS RATE: 100.00 %

Under the newer one:

./run -r -j -g JeOS.19.x86_64 --no q35 -t qemu --tests "migrate_with_reboot.tcp migrate.with_reboot.unix migrate.with_reboot.exec.default_exec migrate.with_reboot.exec.gzip_exec migrate.with_reboot.fd"
Running setup. Please wait...
SETUP: PASS (16.42 s)
DATA DIR: /home/lmr/virt_test
DEBUG LOG: /home/lmr/Code/virt-test.git/logs/run-2013-10-01-20.54.05/debug.log
TESTS: 4
(1/4) migrate.with_reboot.unix: PASS (64.81 s)
(2/4) migrate.with_reboot.exec.default_exec: FAIL (439.86 s)
...

All the remaining tests fail as well, because by this point the image's filesystem is already corrupted.

2013-10-01 20:55:31: ] Started udev Kernel Device Manager.
2013-10-01 20:55:31: systemd-fsck[155]: /dev/vda1: Unattached inode 401018
2013-10-01 20:55:31: systemd-fsck[155]: /dev/vda1: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
2013-10-01 20:55:31: systemd-fsck[155]: (i.e., without -a or -p options)
2013-10-01 20:55:31:
2013-10-01 20:55:31: [    1.567614] EXT4-fs (vda1): warning: mounting fs with errors, running e2fsck is recommended
2013-10-01 20:55:31: [    1.573209] EXT4-fs (vda1): re-mounted. Opts: (null)
2013-10-01 20:55:31: Welcome to emergency mode! After logging in, type "journalctl -xb" to view
2013-10-01 20:55:31: system logs, "systemctl reboot" to reboot, "systemctl default" to try again
...
2013-10-01 21:00:42: [  311.131516] EXT4-fs (vda1): error count: 5
2013-10-01 21:00:42: [  311.131723] EXT4-fs (vda1): initial error at 1380671721: ext4_mb_generate_buddy:755
2013-10-01 21:00:42: [  311.131723] EXT4-fs (vda1): last error at 1380671722: ext4_mb_generate_buddy:755
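
The "initial error at" and "last error at" values in the last two messages look like Unix timestamps (seconds since the epoch), which is how ext4 records error times in its superblock. A minimal sketch for decoding them follows, so they can be lined up with the rest of the log. The helper name `decode_ext4_errors` and the regular expression are illustrative only and are not part of virt-test; the two input lines are copied from the log above.

```python
import re
from datetime import datetime, timezone

# Two kernel messages quoted from the console log above.
LOG_LINES = [
    "[  311.131723] EXT4-fs (vda1): initial error at 1380671721: ext4_mb_generate_buddy:755",
    "[  311.131723] EXT4-fs (vda1): last error at 1380671722: ext4_mb_generate_buddy:755",
]

# Matches "<initial|last> error at <epoch seconds>: <function>:<line>".
# Assumption: the number after "error at" is seconds since the Unix epoch.
ERROR_RE = re.compile(r"(initial|last) error at (\d+): (\w+):(\d+)")


def decode_ext4_errors(lines):
    """Return (kind, UTC datetime, function, line) for each matching message."""
    decoded = []
    for line in lines:
        match = ERROR_RE.search(line)
        if match is None:
            continue  # not an "error at" message, skip it
        kind, epoch, func, lineno = match.groups()
        when = datetime.fromtimestamp(int(epoch), tz=timezone.utc)
        decoded.append((kind, when, func, int(lineno)))
    return decoded


if __name__ == "__main__":
    for kind, when, func, lineno in decode_ext4_errors(LOG_LINES):
        print(f"{kind:7s} error: {when.isoformat()} in {func}:{lineno}")
```

Decoded this way, the initial error falls at 2013-10-01 23:55:21 UTC and the last one a second later. If the host clock that stamps the log lines runs at UTC-3 (an assumption, since the log does not state its timezone), that is about ten seconds before the systemd-fsck messages at 20:55:31, which would place the first recorded error inside this same test run rather than in some earlier use of the image.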

So, given this, I can't realistically expect to update the JeOS without hearing lots of people saying that "autotest is broken".

_______________________________________________
Virt-test-devel mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/virt-test-devel
