September 15, 2023 at 10:15 AM, "Martin Husemann" <mar...@duskware.de> wrote:
> On Fri, Sep 15, 2023 at 04:01:15PM +0000, Emmanuel Dreyfus wrote:
>
> > A multiboot bootloader cannot, because all the information that is passed
> > is about partition numbers. There is no way of specifying a LBA offset,
> > hence the setup where you have a GPT inside raidframe seems impossible
> > to support.
>
> Can you describe your setup and the boot order in more details?
> I guess there is some fundamental misunderstanding somewhere.

Since I was the one who mentioned the issue to Emmanuel, allow me to
chime in here....

> Where does your kernel live and how is it loaded? What partition(s)
> have the bootme flag? I would never expect it to make any sense in the
> inner GPT of a raid set, but only the (outer) GPTs of the components
> (so firmware finds the bootloader).

This came from attempting to install NetBSD-10_BETA on some new
hardware, using sysinst, and following my nose as though I were a
beginner (full disclosure: sysinst was modified to automatically add
'absent' for a missing component for a RAID 1 set so I would only need
to have a single component available. But otherwise a vanilla 10-beta
install).

I took wd0 (or equivalent) and asked sysinst to put a RAID partition on
it. It happily made a new GPT for me, with one partition for RAID (i.e.
no EFI partition, which bit me later). I then put that new RAID
partition into one half of a RAID 1 set, and told sysinst to configure
and install to raid0. I told it 'use default partitions'.
The resulting partitions were similar to:

chicken# gpt show -a raid0
       start      size  index  contents
           0         1         PMBR
           1         1         Pri GPT header
           2        32         Pri GPT table
          34        30         Unused
          64    262144      1  GPT part - EFI System
                               Type: efi
                               TypeID: c12a7328-f81f-11d2-ba4b-00a0c93ec93b
                               GUID: 00c87892-e798-48dd-b0c7-abf421d25302
                               Size: 128 M
                               Label:
                               Attributes: None
      262208  14823680      2  GPT part - NetBSD FFSv1/FFSv2
                               Type: ffs
                               TypeID: 49f48d5a-b10e-11dc-b99b-0019d1879648
                               GUID: ca6ef7c6-551b-475b-9211-fa08d0402165
                               Size: 7238 M
                               Label:
                               Attributes: biosboot
    15085888   4184064      3  GPT part - NetBSD swap
                               Type: swap
                               TypeID: 49f48d32-b10e-11dc-b99b-0019d1879648
                               GUID: 78bbf8c1-d5ac-43ad-8f06-d545206c2fb8
                               Size: 2043 M
                               Label:
                               Attributes: None
    19269952        31         Unused
    19269983        32         Sec GPT table
    19270015         1         Sec GPT header
chicken#

(The above is from a test VM. The original machine that I had this
issue on is now in production with a custom GPT setup. But it is trivial
to replicate the setup.)

That is, it automatically created the EFI partition as the *first*
partition in the RAID set. I later learned that if I made an EFI
partition (with the right /usr/mdec/*.efi bits) on wd0 I would get a lot
further. I would get further still if I didn't just 'use default
partitioning' in sysinst, and instead had my "/" GPT partition at
index 1 in the RAID set. Without Emmanuel's changes "/" needs to be at
GPT partition index 1, otherwise rf_buildroothack() won't work.

To answer your last questions: The kernel lives on raid0 in GPT
partition 2. By default it isn't loaded, because UEFI can't find it:
there is no EFI partition on wd0 by default. If I add an EFI partition
with /usr/mdec/*.efi in efi/boot, then the kernel is correctly loaded
from raid0 in GPT partition 2. The 'bootme' flag is added to GPT
partition 2, but that doesn't help with rf_buildroothack() if '/' isn't
actually at GPT partition 1. And I agree that I wouldn't expect it to
find the EFI bits in the RAID set, but that's where sysinst put things
by default.
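(As a quick cross-check of the gpt listing above, here is a minimal
sketch that converts the sector counts to MiB and confirms that each
partition ends where the next one starts. It assumes 512-byte sectors,
which the listing does not state; the helper names sectors_to_mib and
part_end are made up for this illustration.)

```shell
#!/bin/sh
# Assumption: 512-byte sectors, i.e. 2048 sectors per MiB.

# Convert a sector count to whole MiB (integer division).
sectors_to_mib() {
    echo $(( $1 / 2048 ))
}

# First sector after a partition: start + size.
part_end() {
    echo $(( $1 + $2 ))
}

sectors_to_mib 262144            # index 1, EFI System
sectors_to_mib 14823680          # index 2, FFS
sectors_to_mib 4184064           # index 3, swap

part_end 64 262144               # should match the start of index 2
part_end 262208 14823680         # should match the start of index 3
```

With those assumptions the three sizes agree with the "Size:" lines that
gpt reports, and the end of each partition matches the start column of
the next entry, so the EFI partition really does sit first inside raid0,
at sector 64.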
So no, Emmanuel's changes aren't needed if you really know what you're
doing on an install with sysinst, but a novice just following the
defaults is currently going to be a long way from a bootable system.
(I'm not even touching the need to set 'bootme' or to populate the
out-of-RAID EFI partition or to mark the RAID set as '-A soft' in order
to get to something working.... and I also see addressing these as
something that can happen after netbsd-10 is shipped, as I think this
will require some involved changes to sysinst.)

I hope this helps clarify things. (Yes, I'd be happy if
rf_buildroothack() could Go Away, or at least be reduced in
complexity..)

Thanks.

Later...

Greg Oster