On 5/30/2025 6:29 AM, Chuck Zmudzinski wrote:
> On 5/29/2025 4:02 PM, Greg A. Woods wrote:
>> At Thu, 29 May 2025 15:01:50 -0400, Chuck Zmudzinski <frchu...@gmail.com> 
>> wrote:
>> ... 
>> 
>>> When I pass bootdev=dk12 in boot.cfg, the bootloader strangely tries dk1 as 
>>> root
>>> (which is wrong) and correctly detects dk11 as the dump device. But it never
>>> gives me the chance to enter the correct root device and instead tries to 
>>> load
>>> init which of course it cannot find the NetBSD init on dk1 because dk1 is 
>>> not
>>> the correct NetBSD root device. In fact on this box a Linux distro is 
>>> installed
>>> on dk1, as evidenced by the filesystem type detected on dk1: ext2fs.
>> 
>> Ah, I think that's a bug related to some bizarre/old hacks to find the
>> "booted_partion" for non-GPT disks:
>> 
>>              if (strncmp(xcp.xcp_bootdev, devname, strlen(devname)))
>>                      continue;
>> 
>>              if (is_disk && strlen(xcp.xcp_bootdev) > strlen(devname)) {
>>                      /* XXX check device_cfdata as in x86_autoconf.c? */
>>                      booted_partition = toupper(
>>                              xcp.xcp_bootdev[strlen(devname)]) - 'A';
>>                      DPRINTF(("%s: booted_partition: %d\n", __func__, 
>> booted_partition));
>>              }
>> 
> ...
> A very simple sanity check that might fix this case would be to reject
> the match if the extra digit in the bootdev string is a numerical digit
> instead of a letter of the alphabet between a and p, because only such
> a letter would indicate that the two devices are related as a full disk
> device and a device that is a partition on the full disk.
> 
> Essentially, it looks like we are getting a false positive when searching
> for the device on which the root partition resides, and I think maybe if
> we add that extra sanity check of making sure the extra digit is a letter
> between a and p instead of a numerical digit like 2 we would correctly
> detect dk12 as the root device on my system instead of getting dk1 as a
> false positive.

So, as long as there is no funny business with byte order and "endianness"
with "booted_partition", I think the additional sanity check could be just
a two-line addition between the statement when we compute "booted_partition"
and the DPRINTF statement that logs the value of "booted_partition":

                        if (booted_partition & 0xfffffff0)
                                continue;

This should work because we would expect the booted partition's value
to be an unsigned integer less than 0xf. If any of the higher order
28 bits of "booted_partition" has a 1, then we know we do not have a
valid partition index between 0 and 15 so we should bail out and try
the next devname. Of course this also assumes "booted_partition" is
declared as an uint32_t, or equivalent. I could not find its declaration
in arch/xen/xen/xen_machdep.c and presumably it is declared in a header
somewhere.

Chuck Zmudzinski

Reply via email to