On Sun, Jun 3, 2018 at 7:18 PM, D. Hugh Redelmeier via talk <[email protected]> wrote:
> | From: o1bigtenor via talk <[email protected]>
>
> | My server has been operational for about a year and I am working on a
> | number of different projects on it. Twice now (this last Friday and 5
> | weeks earlier) I came into the office to find that the server has somehow
> | been taken down and has rebooted itself (process setup in the BIOS)
> | but as it doesn't quite complete the boot process, I have to hit a key
> | to tell it to continue and then finally to log in to read Debian
> | (stable).
> |
> | So I am trying to determine what may have caused the system to do a
> | reboot,
>
> Often a crash prevents logging. Clearly logging would have to happen
> after the crash, something that isn't easy when the system has
> crashed. But there is some hope.

Using the suggestions offered, I think I have been able to pinpoint the issue.

> Do you have a working UPS? I don't, and I lose power a few times a
> year. That knocks out my computers (and clocks everywhere).
>
> Aside: all device classes evolve to have enough intelligence to have
> clocks that need setting, and then evolve to be networked to set their
> own clocks. The timing of these steps is not fixed.
>
> Can you believe that I grew up with phones that had no clock?
>
> The first small computers I used had no clocks. The big ones did so
> that IBM could charge for the time that they were used (eg. one used
> to rent machines and have to pay overtime if they worked more than one
> shift). CP/M's file system didn't have timestamps (they were added
> long after I moved on). MS-DOS stupidly used local time for
> timestamps, even though UNIX got it right (used UTC) before MS-DOS.
>
> | AIUI servers should be
> | able to run happily for years without issues (barring hardware
> | problems) so I want that kind of reliability. Where in /var/log will I
> | be finding the most clues as to the events that lead up to this
> | 'reboot'?
>
> Not being a debian user, I don't know which files are most useful. If
> you are using systemd you might find that journalctl is the command
> you need.
>
> You could look at them all (you can skip the ones which haven't changed
> recently).
>
>
> I don't know why your system stops at the POST page. Could it be that
> your HDD doesn't spin up quickly enough for the normal boot logic?

Dell has some kind of goofy BIOS stuff, so one needs to choose one of 2
options; then the UEFI stuff happens and the reboot works. The waiting
for input is not the issue here (the system has always been this way - - - grin!).

> I have one server that hangs because the EFI System Partition's
> filesystem gets corrupted during a crash (oops). I think that the
> problem is that the OS leaves /boot/efi mounted most of the time
> (that's dumb) so the filesystem gets marked as "dirty" and the
> firmware doesn't like that.

Thanks for the ideas!

Dee
---
Talk Mailing List
[email protected]
https://gtalug.org/mailman/listinfo/talk
