On 12/7/23 17:06, gene heskett wrote:
I was quitting a 3D slicer to restart it and reorganize its cache, and it took my machine to a lockup, showing half a workspace on one half of the screen and half an adjacent workspace on the other half, with no response to the keyboard and no mouse. That was at 28 days of uptime, which is better than most uptimes I get with bookworm. 2 weeks seems to be the norm.

TL;DR:

So I went to the front panel and pressed the reset button, but when it came time to enter my pw, it just looped back to the login prompt.  So I did a full powerdown, enough times that the PSU finally went face down in the pool.  That was a good excuse to do an overdue D & C and some update installs while I was changing the supply, adding a 16 port SATA-III controller and 8T worth of SSDs so I could use them to get amanda started again. Powered it up, the BIOS found the new drives, but my pw was still rejected; it looped back to the login in 2 or 3 seconds.


A failing power supply can cause very strange behavior, including file corruption that will haunt you after replacing the power supply.


I do not normally have a root pw set, just me, using sudo when I need to do root stuff. So at grub I hit e and changed quiet to single, eventually spotting, in all the spew, something flying by about the root account being locked, so "single" does not work.  And of course there is no scrollback to read that message's detail.

That's bug #1. Single for rescue demands a root pw. If there is no root pw, it should default to the name in the sudoers file. So I'm still locked out, and this is the 3rd time Debian 12 has done this to me. So I got out a 12.2 net-install DVD and rebooted to it, several times, and was able to set a root pw so that single worked, though I still had to enter the new pw.


While I always set a root password, I do agree that people should be able to run Debian without a root password and everything should still work. Perhaps you should consider filing a bug report against grub (?).


At this point I successfully changed /my/ passwd to something a bit longer expressing my frustration. Since all my other machine control machines have a full desktop with full net access, I went to one of the bananapi-m5's and sent firefox to google with the search string "linux login loops back to login". The 2nd hit, on stackexchange, said to delete anything that looked like ./XAuthority* and ditto ./xsession*; something was contaminated, killing X or its newer wayland cousin.  Bingo! Next reboot was normal.

So that is bug #3. Deleting that stuff and re-starting X (etc) with a clean slate, possibly losing some setup details, is far preferable to another format and reinstall just to get rid of the contaminated files. Formatting everything cost me 25 years of my history when those 2 new 2T seacrates died at about 6 weeks runtime, within hours of each other, a bit over a year ago.

Seems to me stuff like this ought to be fixable.

Many thanks for reading this far. Maybe someone with write perms on bugzilla can investigate?

Cheers, Gene Heskett.


If I am understanding you correctly, $HOME/.Xauthority and/or $HOME/.xsession-errors became corrupt when the PSU failed, and the fix was to replace the PSU and delete those files? About the only idea that comes to mind would be to put in a feature request against Xorg that Xorg implement some automatic failure recovery protocol that includes those files when Xorg fails to start the windowing environment.
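
For anyone hitting the same login loop, here is a rough sketch of that cleanup in Python. It moves the files aside instead of deleting them, so nothing is lost if they turn out not to be the cause. The patterns come from the file names mentioned in this thread; the function name and the backup directory name are made up for illustration, and none of this is something Xorg itself provides.

```python
import shutil
import time
from pathlib import Path

# Patterns based on the file names discussed in this thread.
PATTERNS = (".Xauthority*", ".xsession*")


def quarantine_session_files(home, stamp=None):
    """Move matching files out of `home` into a backup directory.

    Returns the names of the files that were moved, in the order processed.
    """
    home = Path(home)
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    # Hypothetical directory name; it is chosen so it does not match PATTERNS.
    backup = home / f"x-session-backup-{stamp}"
    moved = []
    for pattern in PATTERNS:
        for path in sorted(home.glob(pattern)):
            if not path.is_file():
                continue  # leave directories alone
            backup.mkdir(exist_ok=True)  # created only if something matches
            shutil.move(str(path), str(backup / path.name))
            moved.append(path.name)
    return moved
```

Calling quarantine_session_files(Path.home()) from a text console or rescue shell, logged in as the affected user rather than root, would be the typical use. The next graphical login should then recreate whatever it needs.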


That said, I have little faith in the "find the needle in the haystack" and "put Humpty Dumpty back together again" approach to fixing operating system instances. Even when it "seems to work", I rarely have confidence in the results -- "what needles did I miss?" My response to this situation is disaster preparedness -- I take images of my OS drives monthly, or as needed. In fact, I had a UEFI Debian image go sideways last weekend during maintenance. I spent ~15 minutes investigating/ trouble-shooting/ STFW/ RTFM/ etc., could not find the answer, spent ~15 minutes re-imaging, spent ~15 minutes updating, and was back in business with a known good OS instance in less than an hour without any outside help.
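
To illustrate the idea of taking a drive image, here is a minimal sketch in Python that copies a source byte for byte and records a SHA-256 checksum so the image can be verified later. This is only an illustration of the concept, not the tool or procedure I actually use; the function names and chunk size are arbitrary choices.

```python
import hashlib

CHUNK = 4 * 1024 * 1024  # 4 MiB per read; an arbitrary choice


def sha256_of(path):
    """Return the SHA-256 hex digest of the file (or device) at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(CHUNK):
            digest.update(chunk)
    return digest.hexdigest()


def take_image(source, image):
    """Copy `source` to `image` byte for byte.

    Returns the SHA-256 hex digest of the data as it was read from `source`.
    """
    digest = hashlib.sha256()
    with open(source, "rb") as src, open(image, "wb") as dst:
        while chunk := src.read(CHUNK):
            dst.write(chunk)
            digest.update(chunk)
    return digest.hexdigest()
```

With hypothetical paths, take_image("/dev/sdX", "/backup/os.img") would need root and a drive that is not mounted, for example when booted from live media. Comparing the returned digest against sha256_of("/backup/os.img") afterwards confirms that the image on disk matches what was read.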


David
