Re: [Linux-users] Box is locking up

Kent Fredric Sat, 01 Nov 2014 05:37:13 -0700

On 1 November 2014 21:12, Barry <[email protected]> wrote:

>
> I have read and reread all suggestions for which I thank you all.
>
> At last, I think I have found the problem by searching journalctl -d after
> yesterdays lockup. The following lines are what I found......
> ----------------------------
> -- Reboot --
> Oct 31 18:00:01 TheBox ifplugd(wlp2s9)[1300]: client: SIOCSIFMTU: Invalid
> argument
> Oct 31 18:00:01 TheBox avahi-daemon[784]: Registering new address record
> for 192.168.2.58 on wlp2s9.IPv4.
> Oct 31 18:00:01 TheBox avahi-daemon[784]: New relevant interface
> wlp2s9.IPv4 for mDNS.
> Oct 31 18:00:01 TheBox avahi-daemon[784]: Joining mDNS multicast group on
> interface wlp2s9.IPv4 with address 192.168.2.58.
> Oct 31 18:00:01 TheBox dhclient[19573]: Trying recorded lease 192.168.2.58
> Oct 31 18:00:01 TheBox avahi-daemon[784]: Interface wlp2s9.IPv4 no longer
> relevant for mDNS.
> Oct 31 18:00:01 TheBox avahi-daemon[784]: Leaving mDNS multicast group on
> interface wlp2s9.IPv4 with address 192.168.2.59.
> Oct 31 18:00:01 TheBox avahi-daemon[784]: Withdrawing address record for
> 192.168.2.59 on wlp2s9.
> Oct 31 18:00:01 TheBox ifplugd(wlp2s9)[1300]: client: 1 packets
> transmitted, 0 received, +1 errors, 100% packet loss, time 0ms
> Oct 31 18:00:01 TheBox ifplugd(wlp2s9)[1300]: client: --- 192.168.2.1 ping
> statistics ---
> Oct 31 18:00:01 TheBox ifplugd(wlp2s9)[1300]: client: PING 192.168.2.1
> (192.168.2.1) 56(84) bytes of data.
> -------------------------------
> wlp2s9 is a wifi card inside my box. Earlier tonight I had another lockup
> so rebooted and went straight to journalctl. This is what I found and it
> refers to the same wifi card....
> ------------------------------
> -- Reboot --
> Nov 01 20:36:57 TheBox kernel: wlp2s9: associated
> Nov 01 20:36:57 TheBox kernel: wlp2s9: RX AssocResp from 08:86:3b:66:90:28
> (capab=0x411 status=0 aid=3)
> Nov 01 20:36:57 TheBox kernel: wlp2s9: associate with 08:86:3b:66:90:28
> (try 1/3)
> Nov 01 20:36:57 TheBox kernel: wlp2s9: authenticated
> Nov 01 20:36:57 TheBox kernel: wlp2s9: send auth to 08:86:3b:66:90:28 (try
> 2/3)
> Nov 01 20:36:57 TheBox kernel: wlp2s9: send auth to 08:86:3b:66:90:28 (try
> 1/3)
> Nov 01 20:36:57 TheBox kernel: wlp2s9: authenticate with 08:86:3b:66:90:28
> Nov 01 20:36:57 TheBox ifplugd(wlp2s9)[1345]: client: SIOCSIFMTU: Invalid
> argument
> Nov 01 20:36:57 TheBox avahi-daemon[784]: Registering new address record
> for 192.168.2.83 on wlp2s9.IPv4.
> -----------------------------------
> Btw no hw alterations have been made, and I have been running mageia2 for
> the last 2 weeks or so with no problems. Would I be correct in assuming a
> faulty software driver is the cause of my problem?
>
> My cure is to install the latest kernel and hope.
>
> Thanks
>


Late to the party here, but if you can ssh in to the box and prod at
things, that really doesn't look like it's certainly a hardware issue. It
*could* be, but I've seen software bugs like that. Or software bugs that
are exacerbated by random events in the environment ( like wifi becoming
less stable, causing invisible servers X11 was depending on to vanish, and
X11 blocking forever waiting for them to come back, which seems especially
relevant given that you've been assigned a _NEW_ IP address and X11 may be
listening bound to the old one, which is no longer valid ).
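
To make the address churn in the quoted log easier to spot, here is a minimal sketch that pulls the avahi-daemon register/withdraw events out of journal text. It assumes each journal entry sits on a single line, as journalctl prints it (the quoted mail wraps them), and the name `address_events` is made up for illustration.

```python
import re

# Matches the two avahi-daemon messages seen in the quoted log.
ADDRESS_RE = re.compile(
    r"(Registering new|Withdrawing) address record for "
    r"(\d+(?:\.\d+){3}) on (\w+)"
)


def address_events(lines):
    """Return (action, ip, interface) tuples in the order they appear."""
    events = []
    for line in lines:
        match = ADDRESS_RE.search(line)
        if match is None:
            continue  # not an address record message
        action = "add" if match.group(1).startswith("Registering") else "remove"
        events.append((action, match.group(2), match.group(3)))
    return events
```

Feeding it the output of journalctl for the boot in question should show whether an address was withdrawn and a different one registered around the time of the lockup.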

I would see if you can get your kernel compiled with magic SysRq support
enabled ( CONFIG_MAGIC_SYSRQ ), and then next time it gets pinned, attempt
to sysrq it back to life.
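
Before rebuilding anything, it may be worth checking whether the running kernel already has it. A rough sketch, assuming the usual /proc/sys/kernel/sysrq file is present (it only exists when the kernel was built with SysRq support); the bit meanings follow the kernel's sysrq documentation, and the helper names are hypothetical:

```python
# Bit values as listed in the kernel's sysrq documentation.
SYSRQ_FLAGS = {
    2: "control of console logging level",
    4: "control of keyboard (SAK, unraw)",
    8: "debugging dumps of processes",
    16: "sync command",
    32: "remount read-only",
    64: "signalling of processes (term, kill, oom-kill)",
    128: "reboot/poweroff",
    256: "nicing of all RT tasks",
}


def decode_sysrq(value):
    """Translate the numeric sysrq setting into a list of enabled functions."""
    if value == 0:
        return ["disabled"]
    if value == 1:
        return ["all functions enabled"]
    return [name for bit, name in SYSRQ_FLAGS.items() if value & bit]


def read_sysrq(path="/proc/sys/kernel/sysrq"):
    """Read the current setting; raises OSError if the file is missing."""
    with open(path) as fh:
        return int(fh.read().strip())


if __name__ == "__main__":
    try:
        print(decode_sysrq(read_sysrq()))
    except (OSError, ValueError) as exc:
        print("could not read sysrq setting:", exc)
```

Many distributions ship a restricted value here, so some of the key combinations below may be ignored from the keyboard even though the kernel supports them.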

"MagicSysrq" is pressing left alt, print-screen and some other key
simultaneously.

MagicSysrq-s # sync drives to disk
MagicSysrq-r # release keyboard from X11
MagicSysrq-k # Kill whatever is running on the current VT ( Probably X11 )

Sometimes that is enough to get linux to nuke X and get you back to a VT.

Sometimes however, you'll get VT control, but the display may remain
disabled, and you can play by memory, log in as root, confirm you have
control by running `beep`, etc, then manually invoking commands to restart
X11.

And sometimes that's sufficient to bring back X if it is just some weird
X11 leak or something weird happening in a video driver.

And you can probably attempt much of this remotely too. I reiterate:
Sometimes just nuking all X11 apps and restarting X is all it takes.
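
For the remote case there is no keyboard to press, but root can write the same command letter to /proc/sysrq-trigger over ssh. A minimal sketch, assuming that file exists on the box; the function name and the whitelist of letters ( the ones mentioned in this mail ) are my own choices for illustration:

```python
# Command letters discussed in this message; anything else is rejected
# so a typo cannot fire an unintended SysRq function.
ALLOWED_KEYS = set("srkueifnbo")


def sysrq_trigger(key, path="/proc/sysrq-trigger"):
    """Write a single SysRq command letter to the trigger file (needs root)."""
    if len(key) != 1 or key not in ALLOWED_KEYS:
        raise ValueError("unsupported sysrq key: %r" % (key,))
    with open(path, "w") as fh:
        fh.write(key)
```

For example, `sysrq_trigger("s")` run as root asks the kernel to sync. Per the kernel documentation, writes to the trigger file are honoured for root regardless of the keyboard restriction in /proc/sys/kernel/sysrq.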

However, it does get interesting if you enter such a state and there
are apps that, for whatever reason, refuse to die. Usually if there's one,
there will be others mutually blocking it somehow, so going on a `kill`-ing
spree ( hello nsa, sorry, talking about terminals here ) can help weed out
the silent party at fault, or isolate other elements that are involved in
the "refusal to die" problem.

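One way to shortlist candidates before the spree: processes that ignore even kill -9 are often sitting in uninterruptible sleep ( state `D` in /proc ), typically waiting on I/O or a wedged driver. That is my assumption about what "refuse to die" looks like here, not something the logs confirm. A small sketch that lists them, with hypothetical helper names:

```python
import os


def parse_stat(text):
    """Split a /proc/<pid>/stat record into (pid, command, state).

    The command sits in parentheses and may itself contain spaces or
    parentheses, so split on the first '(' and the last ')'.
    """
    left = text.index("(")
    right = text.rindex(")")
    pid = int(text[:left].strip())
    command = text[left + 1:right]
    state = text[right + 1:].split()[0]
    return pid, command, state


def stuck_processes(proc="/proc"):
    """Return sorted (pid, command) pairs for processes in state D."""
    stuck = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        stat_path = os.path.join(proc, entry, "stat")
        try:
            with open(stat_path, errors="replace") as fh:
                pid, command, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited, or record was unreadable
        if state == "D":
            stuck.append((pid, command))
    return sorted(stuck)


if __name__ == "__main__":
    for pid, command in stuck_processes():
        print(pid, command)
```

A process that shows up here repeatedly across several runs is a better suspect than one that appears once, since short stays in state `D` are normal during disk access.
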
Finally, there are a few other MagicSysrq combinations that are useful here:

MagicSysrq-u # Remount all drives readonly
MagicSysrq-e # kill -15 all tasks ( except init )
MagicSysrq-i # kill -9 all tasks ( except init )
MagicSysrq-f # initiate OOM Killer
MagicSysrq-n # renice all high-priority tasks to normal priority


And when you've given up and are happy the system is cooked ( nb: do this
only _after_ syncing; the next commands are practically raw ACPI commands
and you should treat them with the care you would give to touching the
physical buttons, they're just more convenient for some hardware ):

MagicSysrq-b # reboot
MagicSysrq-o # poweroff


-- 
Kent

*KENTNL* - https://metacpan.org/author/KENTNL

_______________________________________________
Linux-users mailing list
[email protected]
http://lists.canterbury.ac.nz/mailman/listinfo/linux-users
