> Date: Thu, 26 Dec 2019 17:55:24 +0100
> From: Klemens Nanni <k...@openbsd.org>
> 
> On Thu, Dec 26, 2019 at 09:50:16AM +0100, Mark Kettenis wrote:
> > What happens if the bootpath doesn't specify a partition?  Currently
> > we end up with bp->val[2] being zero in that case, which means we end
> > up booting from 'a' which is the default partition.  So I think you
> > need to check for that case when you call setroot().
> Right, if no partition is specified val[2] is zero - for both current
> code as well as with my diff.
> 
> So in setroot() I should have checked whether val[2] was set.
> 
> Note how the current code accounts for bp being NULL and sets bootdv
> accordingly, but then uses bp->val[2] unconditionally.  This looks like
> a potential NULL dereference, perhaps noone has ever hit it?
> 
>       bp = nbootpath == 0 ? NULL : &bootpath[nbootpath-1];
>       bootdv = (bp == NULL) ? NULL : bp->dev;
> 
> #if NMPATH > 0
>       if (bootdv != NULL)
>               bootdv = mpath_bootdv(bootdv);
> #endif
> 
>       setroot(bootdv, bp->val[2], RB_USERREQ | RB_HALT);
> 
> > However, that means the partition != 0 check is probably meaningless...
> Indeed, when omitting the partition OBP passes the bootpath as is
> 
>       {0} ok boot /pci@400/pci@2/pci@0/pci@e/scsi@0/disk@w3aa32290d5dcd16c,0 
> /bsd.debug
>       ...
>       Boot device: /pci@400/pci@2/pci@0/pci@e/scsi@0/disk@w3aa32290d5dcd16c,0 
>  File and args: /bsd.debug                                                    
>           
> 
> but stand/ofwboot/ofdev.c:devopen() will default to "a"
> 
>       OpenBSD IEEE 1275 Bootblock 1.4                                         
>         
>       ..>> OpenBSD BOOT 1.15                                                  
>         
> 
>       ERROR: /iscsi-hba: No iscsi-network-bootpath property                   
>         
>       Booting 
> /pci@400/pci@2/pci@0/pci@e/scsi@0/disk@w3aa32290d5dcd16c,0:a/bsd.debug  
> 
> So the kernel will always see a partition with such given bootpath,
> hence the val[2] check is always true in that place.
> 
> > Also note that booting by WWWN isn't just for fibre-channel and
> > hardware RAID.  It works for single disks as well on some controllers.
> Ah, OK.  I errornuously implied this from reading the code.
> 
> > Why does matching the LUN fail?  If I read the code correctly, you'll
> > end up with bp->val[1] = 0.  What is the value of sl->lun?
> the LUN always matches, but the Target does not.  Both lun = bp->val[1]
> and sl->lun are zero.
> 
> However, the last check
> 
>               if (bp->val[0] & 0xffffffff00000000 && bp->val[0] != -1) { 
>                       ...
> 
>                       if (target == sl->target && lun == sl->lun) {           
>    
>                               nail_bootdev(dev, bp);                          
>    
>                               return;                                         
>    
>                       }                                                       
>    
>               }
> 
> fails because of target mismatch.
> 
> On my sd0 being the RAID volume I see target = bp->val[2] being
> 0xd5dcd16c (lower 16 bit of WWID) while sl->target is zero.

Hmm, you really shouldn't end up there if you're booting by WWN.  I
guess the

    bp->val[0] == sl->port_wwn

check is failing in your case.  What are the values of:

    bp->val[0]
    sl->port_wwn
    sl->node_wwn

in your case?

Reply via email to