Bug#783620: initramfs-tools: initramfs broken on first boot into Jessie, Unable to mount root fs on unknown-block(0, 0)
On Thu, 2015-04-30 at 09:41 +0200, Bernhard Schmidt wrote:
> Hi maximilian,
>
> > > [ copied from debian-user again ]
> > >
> > > ---
> > > Got another system with the symptoms and managed to get a snapshot. It is really extremely weird. The kernel output is
> > >
> > >     List of all partitions:
> > >     No filesystem could mount root, tried:
> > >     Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
> > >
> > > This is reproducible. To fix it, it is enough to boot into the Wheezy kernel (even with init=/bin/sh), then reboot. It apparently does something to the root-fs (fsck?) which allows the Jessie kernel to boot.
> > >
> > > I have asked our Windows guys to make a screencast, it is uploaded here.
> > > http://users.birkenwald.de/~berni/volatile/783620.mkv

I can see that the GRUB menu entry for 3.16.0-4-amd64 does seem to include an initramfs. Unfortunately the frame rate is quite low so I don't see any messages from GRUB indicating whether it succeeded or failed to load the file.

> > > We still have the snapshot available, if you have an idea please drop me a note.
> >
> > this means linux didn't get the initramfs passed by the bootloader. In the old days this happened when lilo was not run, these days it could be some grub modules out of sync (very wild guess). did you try, before booting into that image, to run grub-install in it?
>
> I don't have access to the snapshot until Monday, but I don't think it will help. As you can see in the video a simple fsck/mount in initrd in the old kernel is enough, and grub isn't touched there.

fsck.xfs does nothing (see the manual page). Mounting the filesystem, however, will replay any changes that were only written to the journal and not yet written to their usual locations on disk.

Is it possible that this system was not cleanly shut down following the upgrade? I don't think that GRUB reads journals, so this would probably explain what you've shown.

Ben.

> But I'll test on Monday to be sure.
>
> Bernhard

-- 
Ben Hutchings
Q. Which is the greater problem in the world today, ignorance or apathy?
A. I don't know and I couldn't care less.
Bug#783620: initramfs-tools: initramfs broken on first boot into Jessie, Unable to mount root fs on unknown-block(0, 0)
On 02.05.2015 21:26, Ben Hutchings wrote:

Hi,

> I can see that the GRUB menu entry for 3.16.0-4-amd64 does seem to include an initramfs. Unfortunately the frame rate is quite low so I don't see any messages from GRUB indicating whether it succeeded or failed to load the file.

Even directly in front of the console I can't make out any error message, and if I change the filename to something nonexistent I get an error message and have to press a key to continue.

> > > > We still have the snapshot available, if you have an idea please drop me a note.
> > >
> > > this means linux didn't get the initramfs passed by the bootloader. In the old days this happened when lilo was not run, these days it could be some grub modules out of sync (very wild guess). did you try, before booting into that image, to run grub-install in it?
> >
> > I don't have access to the snapshot until Monday, but I don't think it will help. As you can see in the video a simple fsck/mount in initrd in the old kernel is enough, and grub isn't touched there.
>
> fsck.xfs does nothing (see the manual page). Mounting the filesystem, however, will replay any changes that were only written to the journal and not yet written to their usual locations on disk.

Should I be able to see that somehow (xfs_info from a rescue system or something like that)? Doesn't XFS log anything when it replays the log? (I know ext4 does, but I don't recall ever seeing that for XFS.)

> Is it possible that this system was not cleanly shut down following the upgrade? I don't think that GRUB reads journals, so this would probably explain what you've shown.

The system is normally rebooted using /sbin/reboot soon after dist-upgrade is finished. There are no errors and our customizations don't touch the reboot part at all. If there was a problem with unclean shutdown it would be a common error; we are seeing ~20% failure rates on upgrades.

If I understand you correctly, since grub isn't erroring out on the initrd filename, the file is likely there on disk, but in an out-of-date version. I recall the initrd being generated twice, so maybe the file that is on the disk (read by grub) is somehow incomplete and the first boot syncs the updated image from journal to disk. Or maybe it is really unreadable ... Another guess would be a corruption that can be replayed by 3.2.0, but not by 3.16.0 (seems unlikely).

If I mount the filesystem in a rescue system with norecovery and the initrd is either different or missing, that would narrow it down, no? And a workaround would be calling sync before the reboot.

Bernhard

-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: https://lists.debian.org/55453602.2050...@birkenwald.de
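The norecovery check proposed above comes down to comparing the same file as seen in two trees: one copied out of a read-only mount that skipped log replay, and one from a normal mount. A minimal sketch of that comparison follows. The function name `compare_copy` and the directory layout are made up for illustration and are not part of any Debian tool.

```shell
# Compare one file as seen under two directory trees.
# $1 and $2 are the tree roots, $3 is the path relative to each root.
# Prints "same", "different", or "missing" (if either copy does not exist).
compare_copy() {
    if [ ! -f "$1/$3" ] || [ ! -f "$2/$3" ]; then
        echo missing
    elif cmp -s "$1/$3" "$2/$3"; then
        echo same
    else
        echo different
    fi
}
```

From a rescue system one might mount the root filesystem with `mount -o ro,norecovery` (the device name depends on the system), copy the initrd aside, unmount, mount again normally so that the log is replayed, and then run something like `compare_copy /tmp/before /mnt boot/initrd.img-3.16.0-4-amd64`. A result of "different" or "missing" would be consistent with the image only being complete after log replay.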
Bug#783620: initramfs-tools: initramfs broken on first boot into Jessie, Unable to mount root fs on unknown-block(0, 0)
Hi maximilian,

> > [ copied from debian-user again ]
> >
> > ---
> > Got another system with the symptoms and managed to get a snapshot. It is really extremely weird. The kernel output is
> >
> >     List of all partitions:
> >     No filesystem could mount root, tried:
> >     Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
> >
> > This is reproducible. To fix it, it is enough to boot into the Wheezy kernel (even with init=/bin/sh), then reboot. It apparently does something to the root-fs (fsck?) which allows the Jessie kernel to boot.
> >
> > I have asked our Windows guys to make a screencast, it is uploaded here.
> > http://users.birkenwald.de/~berni/volatile/783620.mkv
> >
> > We still have the snapshot available, if you have an idea please drop me a note.
>
> this means linux didn't get the initramfs passed by the bootloader. In the old days this happened when lilo was not run, these days it could be some grub modules out of sync (very wild guess). did you try, before booting into that image, to run grub-install in it?

I don't have access to the snapshot until Monday, but I don't think it will help. As you can see in the video a simple fsck/mount in initrd in the old kernel is enough, and grub isn't touched there.

But I'll test on Monday to be sure.

Bernhard

Archive: https://lists.debian.org/5541dc9c.7070...@birkenwald.de
Bug#783620: initramfs-tools: initramfs broken on first boot into Jessie, Unable to mount root fs on unknown-block(0, 0)
Hi,

[ copied from debian-user again ]

---
Got another system with the symptoms and managed to get a snapshot. It is really extremely weird. The kernel output is

    List of all partitions:
    No filesystem could mount root, tried:
    Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)

This is reproducible. To fix it, it is enough to boot into the Wheezy kernel (even with init=/bin/sh), then reboot. It apparently does something to the root-fs (fsck?) which allows the Jessie kernel to boot.

I have asked our Windows guys to make a screencast, it is uploaded here.
http://users.birkenwald.de/~berni/volatile/783620.mkv

We still have the snapshot available, if you have an idea please drop me a note.
---

Bernhard
Bug#783620: initramfs-tools: initramfs broken on first boot into Jessie, Unable to mount root fs on unknown-block(0, 0)
Package: initramfs-tools
Followup-For: Bug #783620

Hi,

from the debian-user mailing list ...

Bernhard Schmidt be...@birkenwald.de wrote:
Don Armstrong d...@debian.org wrote:

Hi Don,

has anyone observed something similar to https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=783620 on their upgrade from Wheezy to Jessie? I'm still trying to figure out what's happening, and I don't really know where to look. I was unable to attach the screenshot so far (mail is accepted but never makes it to the BTS), so I've put the screenshot here: http://users.birkenwald.de/~berni/volatile/783620.png

Could you run something like this on the initrds?

    diff -u <(zcat workinginitrd) <(zcat brokeninitrd);

It's possible that something has corrupted the initrds in some subtle way, or some part of the cpio archive has been truncated, which causes an issue for the kernel but is ignored by cpio.
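Before comparing contents, a truncated or damaged compressed image can be ruled out cheaply. This is a minimal sketch, assuming the initrd is a single gzip stream (as the use of zcat in this thread implies). The function name `gzip_status` is invented for illustration.

```shell
# Print "ok" if the file is a complete, valid gzip stream, else "corrupt".
# gzip -t decompresses without writing output and verifies the trailing
# CRC and length, so a truncated image is reported as corrupt.
gzip_status() {
    if gzip -t "$1" 2>/dev/null; then
        echo ok
    else
        echo corrupt
    fi
}
```

For example, `gzip_status /boot/initrd.img-3.16.0-4-amd64`. This only checks the compression layer: a cpio archive that was already cut short before it was compressed would still be reported as "ok".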
Bug#783620: initramfs-tools: initramfs broken on first boot into Jessie, Unable to mount root fs on unknown-block(0, 0)
On Wed, Apr 29, 2015 at 01:06:07PM +0200, Bernhard Schmidt wrote:
> Hi,
>
> [ copied from debian-user again ]
>
> ---
> Got another system with the symptoms and managed to get a snapshot. It is really extremely weird. The kernel output is
>
>     List of all partitions:
>     No filesystem could mount root, tried:
>     Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
>
> This is reproducible. To fix it, it is enough to boot into the Wheezy kernel (even with init=/bin/sh), then reboot. It apparently does something to the root-fs (fsck?) which allows the Jessie kernel to boot.
>
> I have asked our Windows guys to make a screencast, it is uploaded here.
> http://users.birkenwald.de/~berni/volatile/783620.mkv
>
> We still have the snapshot available, if you have an idea please drop me a note.

this means linux didn't get the initramfs passed by the bootloader. In the old days this happened when lilo was not run, these days it could be some grub modules out of sync (very wild guess).

did you try, before booting into that image, to run grub-install in it?

Archive: https://lists.debian.org/20150429144533.gd10...@stro.at
Bug#783620: initramfs-tools: initramfs broken on first boot into Jessie, Unable to mount root fs on unknown-block(0, 0)
On Tue, 2015-04-28 at 21:39 +0200, Bernhard Schmidt wrote:
> Hi,
>
> I have tried two times to send the screenshot to this bug, but it was always eaten (delivered to @bugs.debian.org, but never made it to the BTS). I have put it online at http://users.birkenwald.de/~berni/volatile/783620.png
>
> Note that there is a bit of local integration work in these systems (a few additional packages, and the upgrade procedure switches from the legacy VMware tools to open-vm-tools), but nothing that deep that should affect initramfs. Also 90% of the upgrades go through without any issues. And the initrd content is binary-identical, so ...

This is a kernel panic, which usually means the initramfs wasn't loaded at all.

Which boot loader is used on this system? GRUB or something else?

Ben.

-- 
Ben Hutchings
Beware of programmers who carry screwdrivers. - Leonard Brandwein
Bug#783620: initramfs-tools: initramfs broken on first boot into Jessie, Unable to mount root fs on unknown-block(0, 0)
Package: initramfs-tools
Version: 0.120
Severity: important

Dear Maintainer,

I have a hard time wrapping my head around this bug, feel free to assign somewhere else.

We have started upgrading some of our production VMs to Jessie. The test systems worked fine, but I have hit the following bug for the second time on a production VM now.

- dist-upgrade works flawlessly
- on first boot into Jessie I get an immediate (1s) kernel panic (see attached screenshot) about being unable to find the root fs. Unfortunately I'm unable to get the full boot log, since I don't have a serial console there and kernel messages scroll by too fast.
- To fix the issue I have to boot into the old Wheezy kernel (3.2.0-4-amd64) in grub and regenerate the initrd for the Jessie kernel

      # update-initramfs -k 3.16.0-4-amd64 -u

  Then it works fine.

Now comes the interesting part ... I have saved the broken initrd for later analysis.

The compressed size is marginally different (broken being about 300 bytes smaller)

    -rw-r--r-- 1 root root 14339199 Apr 28 13:59 initrd.img-3.16.0-4-amd64
    -rw-r--r-- 1 root root 14338898 Apr 28 13:58 initrd.img-3.16.0-4-amd64.broken

The uncompressed size is the same

    root@lxmhs63:/tmp# zcat /boot/initrd.img-3.16.0-4-amd64.broken > initrd.img-3.16.0-4-amd64.broken
    root@lxmhs63:/tmp# zcat /boot/initrd.img-3.16.0-4-amd64.broken > /tmp/initrd.img-3.16.0-4-amd64.broken
    root@lxmhs63:/tmp# ls -la /tmp/initrd.img-3.16.0-4-amd64*
    -rw-r--r-- 1 root root 45304832 Apr 28 14:44 /tmp/initrd.img-3.16.0-4-amd64
    -rw-r--r-- 1 root root 45304832 Apr 28 14:44 /tmp/initrd.img-3.16.0-4-amd64.broken

The checksum is different

    root@lxmhs63:/tmp# md5sum /tmp/initrd.img-3.16.0-4-amd64*
    7b24aa901b697dc5dfdbad03bd199072  /tmp/initrd.img-3.16.0-4-amd64
    5e467c0a49afa4ddae315cc6e818d7ac  /tmp/initrd.img-3.16.0-4-amd64.broken

Now comes the puzzling part ... the _content_ of the initrd is exactly the same

    root@lxmhs63:/tmp# mkdir broken && cd broken && cpio -id < ../initrd.img-3.16.0-4-amd64.broken
    88486 blocks
    root@lxmhs63:/tmp/broken# cd ..
    root@lxmhs63:/tmp# mkdir ok && cd ok && cpio -id < ../initrd.img-3.16.0-4-amd64
    88486 blocks
    root@lxmhs63:/tmp/ok# cd ..
    root@lxmhs63:/tmp# diff -urN broken ok

I will try to capture a screenlog on the next upgrades, maybe there is something interesting in there.

Bernhard

Archive: https://lists.debian.org/20150428125306.20837.15594.report...@badwlrz-clbsc01.ws.lrz.de
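The two decompressed archives have the same size and the same extracted content but different checksums, so it may help to see how many bytes actually differ. A small number would point to archive metadata, for example cpio header fields such as inode numbers or timestamps, which `diff -urN` on the extracted trees does not compare. The following is a minimal sketch with an invented function name, `count_differing_bytes`. It assumes both images are plain gzip streams of equal decompressed length, as reported above.

```shell
# Print the number of byte positions at which two gzip-compressed images
# differ after decompression (0 means the decompressed streams are identical).
# The body runs in a subshell so its variables do not leak to the caller.
count_differing_bytes() (
    tmpdir=$(mktemp -d) || exit 1
    if gzip -dc "$1" > "$tmpdir/a" && gzip -dc "$2" > "$tmpdir/b"; then
        # cmp -l prints one line per differing byte; count those lines.
        cmp -l "$tmpdir/a" "$tmpdir/b" | wc -l | tr -d ' '
        status=0
    else
        status=1
    fi
    rm -rf "$tmpdir"
    exit "$status"
)
```

For example, `count_differing_bytes /boot/initrd.img-3.16.0-4-amd64 /boot/initrd.img-3.16.0-4-amd64.broken`. Dropping the `wc -l` stage shows the offsets of the differing bytes, which can then be inspected with a hex dump.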
Bug#783620: initramfs-tools: initramfs broken on first boot into Jessie, Unable to mount root fs on unknown-block(0, 0)
> The uncompressed size is the same
>
>     root@lxmhs63:/tmp# zcat /boot/initrd.img-3.16.0-4-amd64.broken > initrd.img-3.16.0-4-amd64.broken
>     root@lxmhs63:/tmp# zcat /boot/initrd.img-3.16.0-4-amd64.broken > /tmp/initrd.img-3.16.0-4-amd64.broken
>     root@lxmhs63:/tmp# ls -la /tmp/initrd.img-3.16.0-4-amd64*
>     -rw-r--r-- 1 root root 45304832 Apr 28 14:44 /tmp/initrd.img-3.16.0-4-amd64
>     -rw-r--r-- 1 root root 45304832 Apr 28 14:44 /tmp/initrd.img-3.16.0-4-amd64.broken

Err, wrong paste:

    zcat /boot/initrd.img-3.16.0-4-amd64.broken > /tmp/initrd.img-3.16.0-4-amd64.broken
    zcat /boot/initrd.img-3.16.0-4-amd64 > /tmp/initrd.img-3.16.0-4-amd64

Bernhard

Archive: https://lists.debian.org/553f850a.10...@birkenwald.de
Bug#783620: initramfs-tools: initramfs broken on first boot into Jessie, Unable to mount root fs on unknown-block(0, 0)
Hi Ben,

On 28.04.2015 22:01, Ben Hutchings wrote:
> On Tue, 2015-04-28 at 21:39 +0200, Bernhard Schmidt wrote:
> > Hi,
> >
> > I have tried two times to send the screenshot to this bug, but it was always eaten (delivered to @bugs.debian.org, but never made it to the BTS). I have put it online at http://users.birkenwald.de/~berni/volatile/783620.png
> >
> > Note that there is a bit of local integration work in these systems (a few additional packages, and the upgrade procedure switches from the legacy VMware tools to open-vm-tools), but nothing that deep that should affect initramfs. Also 90% of the upgrades go through without any issues. And the initrd content is binary-identical, so ...
>
> This is a kernel panic, which usually means the initramfs wasn't loaded at all.
>
> Which boot loader is used on this system? GRUB or something else?

Thanks for the feedback. Standard grub2 installation, nothing special about it. Grub did not print any errors about not finding a file and I only ran update-initramfs and rebooted.

I hope the next time it happens it will be with a less critical machine and I can keep it down a bit to debug further.

Best Regards,
Bernhard

Archive: https://lists.debian.org/553feddc.4050...@birkenwald.de
Bug#783620: initramfs-tools: initramfs broken on first boot into Jessie, Unable to mount root fs on unknown-block(0, 0)
Hi, I have tried two times to send the screenshot to this bug, but it was always eaten (delivered to @bugs.debian.org, but never made it to the BTS). I have put it online at http://users.birkenwald.de/~berni/volatile/783620.png Note that there is a bit of local integration work in these systems (a few additional packages, and the upgrade procedure switches from the legacy VMware tools to open-vm-tools), but nothing that deep that should affect initramfs. Also 90% of the upgrades go through without any issues. And the initrd content is binary-identical, so ... Best Regards, Bernhard