Re: Wheezy fails to complete boot after last update
On 04/11/14 01:10 AM, Gary Dale wrote:
> On 03/11/14 12:23 PM, Gary Dale wrote:
>> After the last upgrade (kernel 3.2.0-4-amd64 and wget), I have a Debian/Wheezy server that isn't starting. It gets to the network then stops. The system isn't hung - it will still output messages to the screen when I plug and unplug USB devices, and I can use the SysRq key to do the reisub bit and see the messages from the various commands.
>>
>> When I boot to rescue mode, I get to the same point but it shows more network messages that seem to show the eth0 link is up, but it doesn't get beyond that.
>>
>> I can boot from sysrescuecd and everything works. And I can run the system in chroot and start the services I need for it to do its job temporarily (bind, samba, cups) but I can't ssh to it. This is a pain because the server is remote (35 minutes away) so I can only work on it in person. And in person I have to steal a keyboard and monitor from another machine.
>>
>> Anyway, within the chroot I can view the various system logs but they don't show much. They show me (remotely) rebooting the server after the kernel upgrade then nothing until I chroot into it from sysrescuecd. Dmesg is similarly absent. The dmesg log is from the previous time I rebooted.
>>
>> The onscreen messages during a boot showed the system failing to load the firmware for the Realtek NIC (8111E) but installing the Realtek firmware package didn't help - it just fixed the error message.
>>
>> The system is an ASUS M5A88-M mainboard with an AMD FX-4100 processor. It has a 3-disk mdadm RAID array set up as a RAID-1 /dev/md0 boot partition and RAID-5 /dev/md1p1 as root and /dev/md1p2 as /home.
>>
>> Does anyone have any ideas on how I can track down and/or fix this problem?
>
> Update: booting using sysrescuecd, I removed the /etc/rcS.d/S13networking link and rebooted into single-user mode. It completed OK, so I then ran /etc/init.d/networking start, which exhibited the same problems it did when run by init. However this time I was able to kill the process, at which point I noted the network had been started. Exiting from the single-user mode allowed the boot to continue to a normal command prompt.
>
> This still leaves me with the problem of not being able to reboot the computer remotely, since the workaround involves disabling the network.
>
> dmesg shows the following from the last boot:
>
> [ 181.805309] r8169 :03:00.0: firmware: agent loaded rtl_nic/rtl8168e-3.fw into memory
> [ 181.917202] r8169 :03:00.0: eth0: link down
> [ 181.920948] r8169 :03:00.0: eth0: link down
> [ 181.928316] ADDRCONF(NETDEV_UP): eth0: link is not ready
> [ 184.284849] r8169 :03:00.0: eth0: link up
> [ 184.292178] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> [ 195.016023] eth0: no IPv6 routers present
>
> The "eth0: link becomes ready" line is the last line I get on the screen when I boot into single-user mode with networking enabled.
>
> The previous dmesg.0 log from two months ago shows instead:
>
> [ 10.391548] r8169 :03:00.0: firmware: agent aborted loading rtl_nic/rtl8168e-3.fw (not found?)
> [ 10.391740] r8169 :03:00.0: eth0: unable to load firmware patch rtl_nic/rtl8168e-3.fw (-2)
> [ 10.405264] r8169 :03:00.0: eth0: link down
> [ 10.405333] r8169 :03:00.0: eth0: link down
> [ 10.408540] ADDRCONF(NETDEV_UP): eth0: link is not ready
> [ 11.061379] RPC: Registered named UNIX socket transport module.
>
> The first (firmware) message change reflects the installation of the firmware-realtek package in an effort to fix the problem. I'm guessing that the other change is that networking is now bringing the link up because I switched to a static IP address, whereas previously it was brought up by dhcp (with the router assigning an IP address using its internal dhcp server).
>
> This is my /etc/network/interfaces file:
>
> auto lo
> iface lo inet loopback
>
> auto eth0
> iface eth0 inet static
>     address 192.168.1.17
>     netmask 255.255.255.0
>     network 192.168.1.0
>     broadcast 192.168.1.255
>     gateway 192.168.1.1
>
> Should it have something about IPv6 in it (or at least somewhere)? Or is there some other error?

Christian, thanks for your help in my other problem. I was wondering if you have any suggestions on this one?

-- 
To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: https://lists.debian.org/54623dc8.8010...@torfree.net
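On the "is there some other error?" question, the static stanza can at least be checked for internal consistency. The following is a minimal sketch, not part of the original thread: the helper name `check_static_stanza` is made up for illustration, and it only verifies that the `network`, `broadcast` and `gateway` values agree with `address` and `netmask`. It says nothing about why the networking script hangs.

```python
import ipaddress


def check_static_stanza(address, netmask, network, broadcast, gateway):
    """Return a list of inconsistencies in a static IPv4 stanza (empty if none).

    Hypothetical helper for illustration; it does not read
    /etc/network/interfaces itself, the values are passed in by hand.
    """
    # Derive the subnet from the address and dotted-quad netmask.
    net = ipaddress.IPv4Interface(f"{address}/{netmask}").network
    problems = []
    if str(net.network_address) != network:
        problems.append(f"network should be {net.network_address}, not {network}")
    if str(net.broadcast_address) != broadcast:
        problems.append(f"broadcast should be {net.broadcast_address}, not {broadcast}")
    if ipaddress.IPv4Address(gateway) not in net:
        problems.append(f"gateway {gateway} is outside {net}")
    return problems


# Values copied from the interfaces file quoted above.
problems = check_static_stanza(
    address="192.168.1.17",
    netmask="255.255.255.0",
    network="192.168.1.0",
    broadcast="192.168.1.255",
    gateway="192.168.1.1",
)
print(problems or "stanza is internally consistent")
```

For the values posted, the list comes back empty, which suggests the stanza itself is not the source of the hang.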
Re: Wheezy fails to complete boot after last update
On 11.11.2014 at 17:48, Gary Dale wrote:
> Update: booting using sysrescuecd, I removed the /etc/rcS.d/S13networking link and rebooted into single-user mode. It completed OK, so I then ran /etc/init.d/networking start, which exhibited the same problems it did when run by init. However this time I was able to kill the process, at which point I noted the network had been started. Exiting from the single-user mode allowed the boot to continue to a normal command prompt.
>
> This still leaves me with the problem of not being able to reboot the computer remotely, since the workaround involves disabling the network.
>
> dmesg shows the following from the last boot:
>
> [ 181.805309] r8169 :03:00.0: firmware: agent loaded rtl_nic/rtl8168e-3.fw into memory
> [ 181.917202] r8169 :03:00.0: eth0: link down
> [ 181.920948] r8169 :03:00.0: eth0: link down
> [ 181.928316] ADDRCONF(NETDEV_UP): eth0: link is not ready
> [ 184.284849] r8169 :03:00.0: eth0: link up
> [ 184.292178] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> [ 195.016023] eth0: no IPv6 routers present
>
> The "eth0: link becomes ready" line is the last line I get on the screen when I boot into single-user mode with networking enabled.

I have no idea what's going on there, but there seems to be a kernel bug in there somewhere. When the system doesn't boot, did you try SysRq-w (hung tasks) and/or SysRq-t (all tasks)? Or SysRq-l (stack trace)?

> Should it have something about IPv6 in it (or at least somewhere)? Or is there some other error?

IPv6 should be irrelevant here. The 'no IPv6 routers present' message just means that the kernel didn't see any router advertisements - if you don't have IPv6 in your subnet, that's normal and harmless.

Christian

-- 
Archive: https://lists.debian.org/5462405d.5080...@iwakd.de
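The SysRq dumps mentioned above can also be requested without the keyboard by writing the key letter to /proc/sysrq-trigger. The sketch below is not from the thread: it assumes a root shell is still reachable (for example on a second console), and the helper name `trigger_sysrq` is hypothetical. The output lands in the kernel log, so it would be read afterwards with dmesg.

```python
# Key letters and meanings as described in the kernel's SysRq documentation.
SYSRQ_KEYS = {
    "w": "dump tasks that are in uninterruptible (blocked) state",
    "t": "dump a list of all current tasks",
    "l": "show a stack backtrace for all active CPUs",
}


def trigger_sysrq(key, trigger_path="/proc/sysrq-trigger"):
    """Write one SysRq key letter to the trigger file and return its meaning.

    Hypothetical helper for illustration. Writing to the real path needs
    root; trigger_path is a parameter so the function can be tried safely
    against an ordinary file.
    """
    if key not in SYSRQ_KEYS:
        raise ValueError(f"unsupported SysRq key: {key!r}")
    with open(trigger_path, "w") as trigger:
        trigger.write(key)
    return SYSRQ_KEYS[key]


# Example (as root, while the networking script is stuck):
#   trigger_sysrq("w")   # then inspect the blocked tasks with dmesg
```

If no shell is available at all, as in the hung boot described here, the keyboard combination remains the only option.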
Re: Wheezy fails to complete boot after last update
On 11/11/14 11:59 AM, Christian Seiler wrote:
> On 11.11.2014 at 17:48, Gary Dale wrote:
>> Update: booting using sysrescuecd, I removed the /etc/rcS.d/S13networking link and rebooted into single-user mode. It completed OK, so I then ran /etc/init.d/networking start, which exhibited the same problems it did when run by init. However this time I was able to kill the process, at which point I noted the network had been started. Exiting from the single-user mode allowed the boot to continue to a normal command prompt.
>>
>> This still leaves me with the problem of not being able to reboot the computer remotely, since the workaround involves disabling the network.
>>
>> dmesg shows the following from the last boot:
>>
>> [ 181.805309] r8169 :03:00.0: firmware: agent loaded rtl_nic/rtl8168e-3.fw into memory
>> [ 181.917202] r8169 :03:00.0: eth0: link down
>> [ 181.920948] r8169 :03:00.0: eth0: link down
>> [ 181.928316] ADDRCONF(NETDEV_UP): eth0: link is not ready
>> [ 184.284849] r8169 :03:00.0: eth0: link up
>> [ 184.292178] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
>> [ 195.016023] eth0: no IPv6 routers present
>>
>> The "eth0: link becomes ready" line is the last line I get on the screen when I boot into single-user mode with networking enabled.
>
> I have no idea what's going on there, but there seems to be a kernel bug in there somewhere. When the system doesn't boot, did you try SysRq-w (hung tasks) and/or SysRq-t (all tasks)? Or SysRq-l (stack trace)?

No, I didn't. The server is remote, so next time I'm out that way I'll give it a try.

This server is also running Samba4 from the wheezy-backports repository (I tested it on my local server first, and it seemed to be working OK), so perhaps there is something about the backport that is causing a problem. Again, I'll check that out next time I'm out there.

Thanks.

-- 
Archive: https://lists.debian.org/546261c0.8080...@torfree.net
Wheezy fails to complete boot after last update
After the last upgrade (kernel 3.2.0-4-amd64 and wget), I have a Debian/Wheezy server that isn't starting. It gets to the network then stops. The system isn't hung - it will still output messages to the screen when I plug and unplug USB devices, and I can use the SysRq key to do the reisub bit and see the messages from the various commands.

When I boot to rescue mode, I get to the same point but it shows more network messages that seem to show the eth0 link is up, but it doesn't get beyond that.

I can boot from sysrescuecd and everything works. And I can run the system in chroot and start the services I need for it to do its job temporarily (bind, samba, cups) but I can't ssh to it. This is a pain because the server is remote (35 minutes away) so I can only work on it in person. And in person I have to steal a keyboard and monitor from another machine.

Anyway, within the chroot I can view the various system logs but they don't show much. They show me (remotely) rebooting the server after the kernel upgrade then nothing until I chroot into it from sysrescuecd. Dmesg is similarly absent. The dmesg log is from the previous time I rebooted.

The onscreen messages during a boot showed the system failing to load the firmware for the Realtek NIC (8111E) but installing the Realtek firmware package didn't help - it just fixed the error message.

The system is an ASUS M5A88-M mainboard with an AMD FX-4100 processor. It has a 3-disk mdadm RAID array set up as a RAID-1 /dev/md0 boot partition and RAID-5 /dev/md1p1 as root and /dev/md1p2 as /home.

Does anyone have any ideas on how I can track down and/or fix this problem?

-- 
Archive: https://lists.debian.org/20141103122323.12086m34tpmzm...@www.torfree.net
Re: Wheezy fails to complete boot after last update
On 03/11/14 12:23 PM, Gary Dale wrote:
> After the last upgrade (kernel 3.2.0-4-amd64 and wget), I have a Debian/Wheezy server that isn't starting. It gets to the network then stops. The system isn't hung - it will still output messages to the screen when I plug and unplug USB devices, and I can use the SysRq key to do the reisub bit and see the messages from the various commands.
>
> When I boot to rescue mode, I get to the same point but it shows more network messages that seem to show the eth0 link is up, but it doesn't get beyond that.
>
> I can boot from sysrescuecd and everything works. And I can run the system in chroot and start the services I need for it to do its job temporarily (bind, samba, cups) but I can't ssh to it. This is a pain because the server is remote (35 minutes away) so I can only work on it in person. And in person I have to steal a keyboard and monitor from another machine.
>
> Anyway, within the chroot I can view the various system logs but they don't show much. They show me (remotely) rebooting the server after the kernel upgrade then nothing until I chroot into it from sysrescuecd. Dmesg is similarly absent. The dmesg log is from the previous time I rebooted.
>
> The onscreen messages during a boot showed the system failing to load the firmware for the Realtek NIC (8111E) but installing the Realtek firmware package didn't help - it just fixed the error message.
>
> The system is an ASUS M5A88-M mainboard with an AMD FX-4100 processor. It has a 3-disk mdadm RAID array set up as a RAID-1 /dev/md0 boot partition and RAID-5 /dev/md1p1 as root and /dev/md1p2 as /home.
>
> Does anyone have any ideas on how I can track down and/or fix this problem?

Update: booting using sysrescuecd, I removed the /etc/rcS.d/S13networking link and rebooted into single-user mode. It completed OK, so I then ran /etc/init.d/networking start, which exhibited the same problems it did when run by init. However this time I was able to kill the process, at which point I noted the network had been started. Exiting from the single-user mode allowed the boot to continue to a normal command prompt.

This still leaves me with the problem of not being able to reboot the computer remotely, since the workaround involves disabling the network.

dmesg shows the following from the last boot:

[ 181.805309] r8169 :03:00.0: firmware: agent loaded rtl_nic/rtl8168e-3.fw into memory
[ 181.917202] r8169 :03:00.0: eth0: link down
[ 181.920948] r8169 :03:00.0: eth0: link down
[ 181.928316] ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 184.284849] r8169 :03:00.0: eth0: link up
[ 184.292178] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 195.016023] eth0: no IPv6 routers present

The "eth0: link becomes ready" line is the last line I get on the screen when I boot into single-user mode with networking enabled.

The previous dmesg.0 log from two months ago shows instead:

[ 10.391548] r8169 :03:00.0: firmware: agent aborted loading rtl_nic/rtl8168e-3.fw (not found?)
[ 10.391740] r8169 :03:00.0: eth0: unable to load firmware patch rtl_nic/rtl8168e-3.fw (-2)
[ 10.405264] r8169 :03:00.0: eth0: link down
[ 10.405333] r8169 :03:00.0: eth0: link down
[ 10.408540] ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 11.061379] RPC: Registered named UNIX socket transport module.

The first (firmware) message change reflects the installation of the firmware-realtek package in an effort to fix the problem. I'm guessing that the other change is that networking is now bringing the link up because I switched to a static IP address, whereas previously it was brought up by dhcp (with the router assigning an IP address using its internal dhcp server).

This is my /etc/network/interfaces file:

auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
    address 192.168.1.17
    netmask 255.255.255.0
    network 192.168.1.0
    broadcast 192.168.1.255
    gateway 192.168.1.1

Should it have something about IPv6 in it (or at least somewhere)? Or is there some other error?

-- 
Archive: https://lists.debian.org/54586dcb.5070...@torfree.net
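The comparison between the two dmesg excerpts can be made mechanical. The sketch below is an illustration only, with made-up helper names (`parse_dmesg`, `new_messages`): it parses the `[ seconds] message` lines as posted and lists the messages from the recent boot that did not occur in the older dmesg.0 excerpt.

```python
import re

# Matches lines such as "[ 181.805309] r8169 :03:00.0: eth0: link up".
DMESG_LINE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def parse_dmesg(text):
    """Return (timestamp_seconds, message) tuples for each dmesg-style line."""
    entries = []
    for line in text.splitlines():
        match = DMESG_LINE.match(line.strip())
        if match:
            entries.append((float(match.group(1)), match.group(2)))
    return entries


def new_messages(old_text, new_text):
    """Entries in new_text whose message text never appears in old_text."""
    seen = {message for _, message in parse_dmesg(old_text)}
    return [(ts, msg) for ts, msg in parse_dmesg(new_text) if msg not in seen]


# The two excerpts exactly as posted in this message.
OLD_BOOT = """\
[ 10.391548] r8169 :03:00.0: firmware: agent aborted loading rtl_nic/rtl8168e-3.fw (not found?)
[ 10.391740] r8169 :03:00.0: eth0: unable to load firmware patch rtl_nic/rtl8168e-3.fw (-2)
[ 10.405264] r8169 :03:00.0: eth0: link down
[ 10.405333] r8169 :03:00.0: eth0: link down
[ 10.408540] ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 11.061379] RPC: Registered named UNIX socket transport module.
"""

LAST_BOOT = """\
[ 181.805309] r8169 :03:00.0: firmware: agent loaded rtl_nic/rtl8168e-3.fw into memory
[ 181.917202] r8169 :03:00.0: eth0: link down
[ 181.920948] r8169 :03:00.0: eth0: link down
[ 181.928316] ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 184.284849] r8169 :03:00.0: eth0: link up
[ 184.292178] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 195.016023] eth0: no IPv6 routers present
"""

for timestamp, message in new_messages(OLD_BOOT, LAST_BOOT):
    print(f"{timestamp:12.6f}  {message}")
```

On these excerpts the new entries are the firmware load, "link up", "link becomes ready" and the IPv6 router notice. The older excerpt stops before any link-up event, so the comparison is only as good as the lines that were captured.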