On Thu, Dec 26, 2019 at 09:50:16AM +0100, Mark Kettenis wrote:
> What happens if the bootpath doesn't specify a partition?  Currently
> we end up with bp->val[2] being zero in that case, which means we end
> up booting from 'a' which is the default partition.  So I think you
> need to check for that case when you call setroot().
Right, if no partition is specified val[2] is zero - for both current
code as well as with my diff.

So in setroot() I should have checked whether val[2] was set.

Note how the current code accounts for bp being NULL and sets bootdv
accordingly, but then uses bp->val[2] unconditionally.  This looks like
a potential NULL dereference, perhaps noone has ever hit it?

        bp = nbootpath == 0 ? NULL : &bootpath[nbootpath-1];
        bootdv = (bp == NULL) ? NULL : bp->dev;

#if NMPATH > 0
        if (bootdv != NULL)
                bootdv = mpath_bootdv(bootdv);
#endif

        setroot(bootdv, bp->val[2], RB_USERREQ | RB_HALT);

> However, that means the partition != 0 check is probably meaningless...
Indeed, when omitting the partition OBP passes the bootpath as is

        {0} ok boot /pci@400/pci@2/pci@0/pci@e/scsi@0/disk@w3aa32290d5dcd16c,0 
/bsd.debug
        ...
        Boot device: /pci@400/pci@2/pci@0/pci@e/scsi@0/disk@w3aa32290d5dcd16c,0 
 File and args: /bsd.debug                                                      
        

but stand/ofwboot/ofdev.c:devopen() will default to "a"

        OpenBSD IEEE 1275 Bootblock 1.4                                         
        
        ..>> OpenBSD BOOT 1.15                                                  
        

        ERROR: /iscsi-hba: No iscsi-network-bootpath property                   
        
        Booting 
/pci@400/pci@2/pci@0/pci@e/scsi@0/disk@w3aa32290d5dcd16c,0:a/bsd.debug  

So the kernel will always see a partition with such given bootpath,
hence the val[2] check is always true in that place.

> Also note that booting by WWWN isn't just for fibre-channel and
> hardware RAID.  It works for single disks as well on some controllers.
Ah, OK.  I errornuously implied this from reading the code.

> Why does matching the LUN fail?  If I read the code correctly, you'll
> end up with bp->val[1] = 0.  What is the value of sl->lun?
the LUN always matches, but the Target does not.  Both lun = bp->val[1]
and sl->lun are zero.

However, the last check

                if (bp->val[0] & 0xffffffff00000000 && bp->val[0] != -1) { 
                        ...

                        if (target == sl->target && lun == sl->lun) {           
   
                                nail_bootdev(dev, bp);                          
   
                                return;                                         
   
                        }                                                       
   
                }

fails because of target mismatch.

On my sd0 being the RAID volume I see target = bp->val[2] being
0xd5dcd16c (lower 16 bit of WWID) while sl->target is zero.

Reply via email to