Re: syspatch -> no partition found ; any simple fix?

2020-10-30 Thread Amelia A Lewis
Heylas again,

On Thu, 29 Oct 2020 21:40:05 -0700, Greg Thomas wrote:
> On Thu, Oct 29, 2020 at 8:42 PM Amelia A Lewis  wrote:
[snip]
> 
>  If you were just running syspatch I'd be worried that a hardware failure
> showed up on reboot.  I'm way out of practice for troubleshooting OpenBSD
> but booting the installer from a USB drive or CD, dropping to a shell and
> checking your disk info will answer the hardware question for you.

On Fri, 30 Oct 2020 09:21:23 - (UTC), Stuart Henderson wrote:
> "No active partition" sounds like no MBR partition is marked as active.
> 
> I would boot the installer, shell, "fdisk sd0" and see how it looks, or
> possibly the MBR partition table is not written correctly or has been
> somehow overwritten.
> 

Thanks to both of you; I followed up by cracking the case (partly 
because the only drive the BIOS had in its boot order was the Toshiba, 
and I was pretty sure the boot volume was on a Crucial SSD). With a 
little fiddling (changing boot order when it let me, switching 
uefi+legacy to legacy only, and even uefi only, though the only drive 
that has gpt is the big data drive, the Toshiba, which doesn't have 
anything bootable), the BIOS found the bootloader again.

What seems to have happened, weirdly enough, is that my SSDs have gone 
from sd in 6.7 and before (at least 6.6) to wd in 6.8. I've got my 
daily output from 29 Oct (I keep most recent daily output emails, in 
case I need them), which lists everything as sd (sd0 [ssd, boot volume] 
and sd2 [toshiba data drive]). Now everything but the boot volume is 
disconnected, and it's not sd0, it's wd0. Which might explain its 
disappearance ... no, wrong level.

I just brought it up using 'boot /bsd.sp', which bypasses the kernel 
crash (which I didn't mention before because I hadn't seen it before): 
apparently, when bsd.mp crashes, it drops into ddb, and something 
happens that registers in bios: the disk stops being available to the 
bios. Variations on unplugging and replugging it, and fiddling with 
boot order and 'csm' options will make it find the bootloader again.

Since the behavior is rather strikingly weird (though prolly 
irreproducible by sane mortals), I'm gonna open a bug report, on the 
chance that I've triggered something that folks there might recognize.

Amy!
-- 
Amelia A. Lewis                    amyzing {at} talsever.org
It is practically impossible to teach good programming to students that 
have had a prior exposure to BASIC: as potential programmers they are 
mentally mutilated beyond hope of regeneration.
-- Edsger Dijkstra



Re: syspatch -> no partition found ; any simple fix?

2020-10-30 Thread Stuart Henderson
On 2020-10-30, Amelia A Lewis  wrote:
> It won't start the boot, but displays "No active partition". Checking 
> online, this message seems to indicate a failed upgrade, with the 
> bootloader load incomplete, and (because I was distracted, and running 
> three updates in a state of fatigue), it's actually likely that what I 
> did was to Ctrl-B D out of tmux before it returned from kernel 
> relinking, and then hit doas reboot unthinkingly. Anyway, that's my 
> guess.

If rebooting during relinking does cause some problem I don't think it
would manifest itself like that. (I've done this multiple times - there's
no indication that relinking is still taking place and it can take
surprisingly long on systems with poor disk io - and *touch wood*
when issuing reboot it has been ok so far - though I have been less
lucky with power failures during relinking).

"No active partition" sounds like no MBR partition is marked as active.

I would boot the installer, shell, "fdisk sd0" and see how it looks, or
possibly the MBR partition table is not written correctly or has been
somehow overwritten.

There should be an OpenBSD partition (and maybe some EFI partition if you
use that, I don't use EFI enough to remember..) and one should be
flagged with a * indicating that it's active.
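
For anyone curious what that * corresponds to on disk, here is a minimal Python sketch (not from the thread) that decodes the four MBR entries from a 512-byte sector and reports which ones carry the active flag. It assumes the classic MBR layout: table at byte 446, 16 bytes per entry, status byte 0x80 for active, 0x55AA signature at byte 510, and partition id A6 for OpenBSD. The function names and the synthetic demo sector are made up for illustration; fdisk remains the tool to trust.

```python
import struct

# All names below are illustrative, not part of fdisk or any OpenBSD tool.
SECTOR_SIZE = 512
TABLE_OFFSET = 446    # the four partition entries start at byte 446
ENTRY_SIZE = 16
ACTIVE_FLAG = 0x80    # status byte of the entry fdisk prints with a '*'
OPENBSD_TYPE = 0xA6   # MBR partition id used by OpenBSD


def read_mbr_entries(sector):
    """Return a list of (index, active, type_id, start_lba, size) tuples."""
    if len(sector) < SECTOR_SIZE:
        raise ValueError("need a full 512-byte sector")
    if sector[510:512] != b"\x55\xaa":
        raise ValueError("missing 0x55AA boot signature")
    entries = []
    for i in range(4):
        begin = TABLE_OFFSET + i * ENTRY_SIZE
        raw = sector[begin:begin + ENTRY_SIZE]
        # status(1) chs_start(3) type(1) chs_end(3) lba_start(4) sectors(4)
        status, _chs_a, type_id, _chs_b, start, size = struct.unpack(
            "<B3sB3sII", raw)
        entries.append((i, status == ACTIVE_FLAG, type_id, start, size))
    return entries


def active_partitions(sector):
    """Return the indexes of all entries marked active."""
    return [index
            for index, active, _type, _start, _size in read_mbr_entries(sector)
            if active]


def make_demo_sector(active):
    """Build a synthetic MBR with one OpenBSD entry in slot 3 (demo data)."""
    sector = bytearray(SECTOR_SIZE)
    entry = struct.pack("<B3sB3sII",
                        ACTIVE_FLAG if active else 0x00,
                        b"\x00\x00\x00", OPENBSD_TYPE, b"\x00\x00\x00",
                        64, 1000000)
    begin = TABLE_OFFSET + 3 * ENTRY_SIZE
    sector[begin:begin + ENTRY_SIZE] = entry
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)


if __name__ == "__main__":
    for flagged in (True, False):
        print("active entries:", active_partitions(make_demo_sector(flagged)))
```

An empty list from active_partitions is the situation that would plausibly produce "No active partition" at boot. To try it on real data one would feed it the first 512 bytes of a disk image; reading the raw device directly is possible too, but the device path is system-specific and not covered here.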

If the partition is there but without a *, edit with "fdisk -e sd0" and use
the "flag" command to set the relevant partition active, e.g. "flag 3".
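
As far as I understand it, "flag" without an explicit value marks the chosen entry bootable and clears the flag on the other three. A rough Python sketch of that byte-level change follows (not from the thread). It works on an in-memory copy of the sector only and writes nothing to disk; set_active is a made-up name, and the offsets assume the classic MBR layout (table at byte 446, 16-byte entries, 0x80 meaning active).

```python
# Illustrative only: this mimics the effect of fdisk's "flag N" on a copy
# of an MBR sector held in memory. It never touches a real disk.
SECTOR_SIZE = 512
TABLE_OFFSET = 446    # start of the four 16-byte partition entries
ENTRY_SIZE = 16
ACTIVE_FLAG = 0x80    # status byte value meaning "active"


def set_active(sector, index):
    """Return a copy of the sector with only entry `index` marked active."""
    if len(sector) != SECTOR_SIZE or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a 512-byte MBR sector with a 0x55AA signature")
    if index not in range(4):
        raise ValueError("MBR partition index must be 0..3")
    out = bytearray(sector)
    for i in range(4):
        # The status byte is the first byte of each entry.
        status_offset = TABLE_OFFSET + i * ENTRY_SIZE
        out[status_offset] = ACTIVE_FLAG if i == index else 0x00
    return bytes(out)


if __name__ == "__main__":
    demo = bytearray(SECTOR_SIZE)
    demo[510:512] = b"\x55\xaa"
    fixed = set_active(bytes(demo), 3)
    print(hex(fixed[TABLE_OFFSET + 3 * ENTRY_SIZE]))
```

Only the four status bytes change; partition types, offsets and sizes are left alone, which is why setting the flag is a low-risk first repair compared with rewriting the whole table.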

If some partition information is shown but it doesn't look like it does on
a working OpenBSD system then maybe someone has an idea if you post it here.

If *no* partition is listed there and you are sure that you used a default
"use whole disk for openbsd" when installing then fdisk -i sd0 (if you used
MBR) or fdisk -gi sd0 (if GPT) may help. This writes a new default MBR
partition table with OpenBSD spanning the whole disk but leaves other
information (including the OpenBSD "disklabel" partition table) intact.

> Is there a straightforward way to install kernel and bootloader without 
> requiring a system reinstall? Can I 'upgrade' with an install cd or usb 
> stick from (broken) 6.8+sp3 to 6.8, and then syspatch it up to date?

An 'upgrade' install to the same version would do that but would not mark
the MBR partition as active. I don't think it will fix this problem.

> I'm trying to avoid full reinstall because that seems likely to wipe 
> out existing configuration. I figure my fallback is create install 
> stick/cd (from the other local 6.8, which was successfully updated), 
> boot from that, pull backups of all the configuration so I don't have 
> to reconfigure all the services (and double-check sizes and locations 
> of disk slices on the boot drive, and store that somewhere safe), then 
> reinstall and copy stuff back (it's all backed up, in fact, but it's 
> not backed up recently enough for confidence). So ... faster way to fix 
> my screwup, when I've probably borked my kernel and the bootloader, 
> somehow?

If you need to recover files then I would try doing an install to a USB
stick and boot from that, to give a more full environment than the install
kernel with which to investigate/copy files/etc. Alternatively move the
drive to a working machine as an additional drive and see if you can mount
from there.

That is a couple of steps on though. Check fdisk first.



Re: syspatch -> no partition found ; any simple fix?

2020-10-29 Thread Greg Thomas
On Thu, Oct 29, 2020 at 8:42 PM Amelia A Lewis  wrote:

> Heylas,
>
> So, I ran 6.8 syspatch (patches 002 and 003 together) for three systems
> today (yesterday by the time anyone sees this, most likely). Two came
> right back up as expected. The third didn't, but as it's local, I could
>
> [snip]
>
> Or if it is entirely impossible that "No active partition" could be the
> result of kernel relinking borkage, and it's obvious to someone that
> something else (hardware failure showing up on a reboot?) happened, I'd
> welcome clues. Thanks.
>

 If you were just running syspatch I'd be worried that a hardware failure
showed up on reboot.  I'm way out of practice for troubleshooting OpenBSD
but booting the installer from a USB drive or CD, dropping to a shell and
checking your disk info will answer the hardware question for you.


syspatch -> no partition found ; any simple fix?

2020-10-29 Thread Amelia A Lewis
Heylas,

So, I ran 6.8 syspatch (patches 002 and 003 together) for three systems 
today (yesterday by the time anyone sees this, most likely). Two came 
right back up as expected. The third didn't, but as it's local, I could 
go retry at the console (all three were actually patched and rebooted 
via ssh).

It won't start the boot, but displays "No active partition". Checking 
online, this message seems to indicate a failed upgrade, with the 
bootloader load incomplete, and (because I was distracted, and running 
three updates in a state of fatigue), it's actually likely that what I 
did was to Ctrl-B D out of tmux before it returned from kernel 
relinking, and then hit doas reboot unthinkingly. Anyway, that's my 
guess.

Is there a straightforward way to install kernel and bootloader without 
requiring a system reinstall? Can I 'upgrade' with an install cd or usb 
stick from (broken) 6.8+sp3 to 6.8, and then syspatch it up to date?

I'm trying to avoid full reinstall because that seems likely to wipe 
out existing configuration. I figure my fallback is create install 
stick/cd (from the other local 6.8, which was successfully updated), 
boot from that, pull backups of all the configuration so I don't have 
to reconfigure all the services (and double-check sizes and locations 
of disk slices on the boot drive, and store that somewhere safe), then 
reinstall and copy stuff back (it's all backed up, in fact, but it's 
not backed up recently enough for confidence). So ... faster way to fix 
my screwup, when I've probably borked my kernel and the bootloader, 
somehow?

Or if it is entirely impossible that "No active partition" could be the 
result of kernel relinking borkage, and it's obvious to someone that 
something else (hardware failure showing up on a reboot?) happened, I'd 
welcome clues. Thanks.

Amy!
-- 
Amelia A. Lewis                    amyzing {at} talsever.org
Time and trouble will tame an advanced young woman, but an advanced old 
woman is uncontrollable by any earthly force.
-- Sir Impey Biggs [Dorothy L. Sayers, "Clouds of 
Witness"]