On 13:17 Thu 24 Jun, Iustin Pop wrote:
> On Wed, Jun 23, 2010 at 04:47:38PM +0300, [email protected] wrote:
> > From: Apollon Oikonomopoulos <[email protected]>
> > +  def Attach(self):
> > +    """Attach to an existing block device.
> > +
> > +    """
> > +    self.attached = False
> > +    try:
> > +      st = os.stat(self.dev_path)
> > +    except OSError, err:
> > +      logging.error("Error stat()'ing %s: %s" % (self.dev_path, str(err)))
> > +      return False
> > +
> > +    if not stat.S_ISBLK(st[stat.ST_MODE]):
> > +      logging.error("%s is not a block device", self.dev_path)
> > +      return False
> 
> Hmm… I wonder if using _ThrowError here wouldn't be better for error
> reporting (but yes, we don't use it everywhere; I don't remember why
> Attach() uses return False).
> 

I'm not sure about that either; I was following the way the other
storage classes are implemented. In any case, I can change it to
whichever approach you deem more appropriate.
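
For what it's worth, an Attach() variant using the _ThrowError style
could look roughly like the sketch below. Note that _ThrowError and
BlockDeviceError here are only stand-ins mimicking the shape of the
helpers in Ganeti's bdev.py, not the actual implementations:

```python
import logging
import os
import stat


class BlockDeviceError(Exception):
  """Stand-in for errors.BlockDeviceError (illustrative only)."""


def _ThrowError(msg, *args):
  """Log an error and raise it, mirroring the bdev._ThrowError pattern."""
  if args:
    msg = msg % args
  logging.error(msg)
  raise BlockDeviceError(msg)


class PersistentBlockDevice(object):
  """Minimal sketch of the device class, exceptions instead of False."""

  def __init__(self, dev_path):
    self.dev_path = dev_path
    self.attached = False
    self.major = self.minor = None

  def Attach(self):
    """Attach to an existing block device, raising on failure."""
    self.attached = False
    try:
      st = os.stat(self.dev_path)
    except OSError as err:
      _ThrowError("Error stat()'ing %s: %s", self.dev_path, err)
    if not stat.S_ISBLK(st.st_mode):
      _ThrowError("%s is not a block device", self.dev_path)
    self.major = os.major(st.st_rdev)
    self.minor = os.minor(st.st_rdev)
    self.attached = True
    return True
```

The callers would then need try/except blocks instead of checking the
return value, which is the trade-off being discussed.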

> > +    self.major = os.major(st.st_rdev)
> > +    self.minor = os.minor(st.st_rdev)
> > +    self.attached = True
> > +
> > +    return True
> > +
> > +  def Assemble(self):
> > +    """Assemble the device.
> > +
> > +    """
> > +    pass
> > +
> > +  def Shutdown(self):
> > +    """Shutdown the device.
> > +
> > +    """
> > +    pass
> > +
> > +  def Open(self, force=False):
> > +    """Make the device ready for I/O.
> > +
> > +    """
> > +    pass
> > +
> > +  def Close(self):
> > +    """Notifies that the device will no longer be used for I/O.
> > +
> > +    """
> > +    pass
> 
> The fact that Open/Close are no-ops means that you'll run all LUNs in
> read/write mode, all the time. That'll make data loss much easier to
> trigger, and might require enhancements to Ganeti to prevent starting an
> instance on multiple nodes (nowadays we kind of rely on the fact that
> we run DRBD in A/P mode, and not A/A).

Yes, this is only basic support, and this is where the storage drivers 
come in. Opening a LUN, for example, means mapping it to an initiator 
group and rescanning the SCSI bus, which involves two separate functions: 
one for the master (communicate with the SAN to map the LUN) and one for 
the node (rescan the SCSI bus). I expect cLVM also provides an 
activate/deactivate facility that can be used for that purpose.

For the record, I also feel that not all disks should be available on 
all nodes at all times. But I wanted to settle the basic design 
guidelines for the storage driver framework before proceeding to that.
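
To make the master/node split concrete, here is a rough, purely
illustrative sketch of how Open/Close might one day delegate to a
storage driver; the StorageDriver interface, its method names, and the
DummyDriver below are all hypothetical, invented for this example:

```python
class StorageDriver(object):
  """Hypothetical per-SAN driver interface (not a real Ganeti class)."""

  def MapLun(self, lun, node):
    """Master side: ask the SAN to map the LUN to the node's group."""
    raise NotImplementedError

  def UnmapLun(self, lun, node):
    """Master side: undo the mapping."""
    raise NotImplementedError

  def RescanBus(self):
    """Node side: rescan the SCSI bus to pick up mapping changes."""
    raise NotImplementedError


class DummyDriver(StorageDriver):
  """Toy driver that just records the calls, for illustration."""

  def __init__(self):
    self.mapped = set()
    self.rescans = 0

  def MapLun(self, lun, node):
    self.mapped.add((lun, node))

  def UnmapLun(self, lun, node):
    self.mapped.discard((lun, node))

  def RescanBus(self):
    self.rescans += 1


class PersistentBlockDevice(object):
  """Sketch of the device delegating Open/Close to a driver."""

  def __init__(self, lun, node, driver):
    self.lun, self.node, self.driver = lun, node, driver

  def Open(self, force=False):
    """Map the LUN to this node and rescan, instead of a no-op."""
    self.driver.MapLun(self.lun, self.node)
    self.driver.RescanBus()

  def Close(self):
    """Unmap the LUN so the device is no longer writable here."""
    self.driver.UnmapLun(self.lun, self.node)
    self.driver.RescanBus()
```

With something like this, a LUN would only be read/write on the node
that has it open, addressing the multi-node activation concern above.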

> > +  def GetSyncStatus(self):
> > +    """Returns the sync status of the device.
> > +
> > +    If this device is a mirroring device, this function returns the
> > +    status of the mirror.
> > +
> > +    If the persistent block device name exists, it is assumed to be in
> > +    sync.
> > +
> > +    The status was already read in Attach, so we just return it.
> > +
> > +    @rtype: objects.BlockDevStatus
> > +
> > +    """
> > +
> > +    if self.attached:
> > +      ldisk_status = constants.LDS_OKAY
> > +    else:
> > +      ldisk_status = constants.LDS_FAULTY
> > +
> > +    return objects.BlockDevStatus(dev_path=self.dev_path,
> > +                                  major=self.major,
> > +                                  minor=self.minor,
> > +                                  sync_percent=None,
> > +                                  estimated_time=None,
> > +                                  is_degraded=False,
> > +                                  ldisk_status=ldisk_status)
> > +
> > +
> > +  def DisconnectNet(self):
> > +    pass
> > +
> > +  def AttachNet(self, multimaster):
> > +    pass
> 
> Hmm, I'm not really sure if we need to implement these last three calls
> (GetSyncStatus, DisconnectNet, AttachNet) at all.
> 
> If I understand the design correctly, Ganeti will be completely
> indifferent to the device status, as the SAN itself will take care of
> syncing, etc. (at least until the storage plugin framework is
> implemented). As such, do we need to query these attributes at all?

Yes, you're right; these are leftovers from a time when I was faking 
DRBD behaviour and are no longer needed. I just forgot to remove them.

> >      if self.adopt_disks: # instead, we must check the adoption data
> > -      all_lvs = set([i["adopt"] for i in self.disks])
> > -      if len(all_lvs) != len(self.disks):
> > -        raise errors.OpPrereqError("Duplicate volume names given for adoption",
> > +      all_disks = set([i["adopt"] for i in self.disks])
> > +      if len(all_disks) != len(self.disks):
> > +        raise errors.OpPrereqError("Duplicate disk names given for adoption",
> >                                     errors.ECODE_INVAL)
> > -      for lv_name in all_lvs:
> > -        try:
> > -          self.cfg.ReserveLV(lv_name, self.proc.GetECId())
> > -        except errors.ReservationError:
> > -          raise errors.OpPrereqError("LV named %s used by another instance" %
> > -                                     lv_name, errors.ECODE_NOTUNIQUE)
> > -
> > -      node_lvs = self.rpc.call_lv_list([pnode.name],
> > -                                       self.cfg.GetVGName())[pnode.name]
> > -      node_lvs.Raise("Cannot get LV information from node %s" % pnode.name)
> > -      node_lvs = node_lvs.payload
> > -      delta = all_lvs.difference(node_lvs.keys())
> > -      if delta:
> > -        raise errors.OpPrereqError("Missing logical volume(s): %s" %
> > -                                   utils.CommaJoin(delta),
> > -                                   errors.ECODE_INVAL)
> > -      online_lvs = [lv for lv in all_lvs if node_lvs[lv][2]]
> > -      if online_lvs:
> > -        raise errors.OpPrereqError("Online logical volumes found, cannot"
> > -                                   " adopt: %s" % utils.CommaJoin(online_lvs),
> > -                                   errors.ECODE_STATE)
> > -      # update the size of disk based on what is found
> > -      for dsk in self.disks:
> > -        dsk["size"] = int(float(node_lvs[dsk["adopt"]][0]))
> > +      if self.op.disk_template == constants.DT_PLAIN:
> > +        for lv_name in all_disks:
> > +          try:
> > +            self.cfg.ReserveLV(lv_name, self.proc.GetECId())
> > +          except errors.ReservationError:
> > +            raise errors.OpPrereqError("LV named %s used by another instance" %
> > +                                       lv_name, errors.ECODE_NOTUNIQUE)
> > +
> > +        node_lvs = self.rpc.call_lv_list([pnode.name],
> > +                                         self.cfg.GetVGName())[pnode.name]
> > +        node_lvs.Raise("Cannot get LV information from node %s" % pnode.name)
> > +        node_lvs = node_lvs.payload
> > +        delta = all_disks.difference(node_lvs.keys())
> > +        if delta:
> > +          raise errors.OpPrereqError("Missing logical volume(s): %s" %
> > +                                     utils.CommaJoin(delta),
> > +                                     errors.ECODE_INVAL)
> > +        online_lvs = [lv for lv in all_disks if node_lvs[lv][2]]
> > +        if online_lvs:
> > +          raise errors.OpPrereqError("Online logical volumes found, cannot"
> > +                                     " adopt: %s" % utils.CommaJoin(online_lvs),
> > +                                     errors.ECODE_STATE)
> > +        # update the size of disk based on what is found
> > +        for dsk in self.disks:
> > +          dsk["size"] = int(float(node_lvs[dsk["adopt"]][0]))
> > +      elif self.op.disk_template == constants.DT_BLOCK:
> > +        node_disks = self.rpc.call_bdev_list([pnode.name])[pnode.name]
> > +        node_disks = node_disks.payload
> > +        logging.info("node_disks: %s", node_disks)
> > +        delta = all_disks.difference(node_disks.keys())
> > +        if delta:
> > +          raise errors.OpPrereqError("Missing block device(s): %s" %
> > +                                     utils.CommaJoin(delta),
> > +                                     errors.ECODE_INVAL)
> > +        for dsk in self.disks:
> > +          dsk["size"] = int(float(node_disks[dsk["adopt"]]))
> 
> Can you confirm in this last block, you're just moving the DT_PLAIN code
> under an 'if', and add the DT_BLOCK under 'elif'?
> 
Precisely, and I renamed all_lvs to all_disks to be more 
storage-independent.
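
As a toy illustration of that restructuring (the names, constants and
payload shapes below are fabricated for the example, not the actual
cmdlib code): the duplicate-name check stays shared, while the
per-template validation moves under the if/elif dispatch.

```python
# Hypothetical disk-template constants, standing in for constants.DT_*.
DT_PLAIN, DT_BLOCK = "plain", "blockdev"


def CheckAdoption(disk_template, disks, node_lvs=None, node_disks=None):
  """Sketch of the shared check followed by per-template dispatch.

  node_lvs maps LV name -> (size, ...) tuples; node_disks maps block
  device path -> size. Both are fabricated sample payload shapes.
  """
  # Shared: reject duplicate adoption targets regardless of template.
  all_disks = set(d["adopt"] for d in disks)
  if len(all_disks) != len(disks):
    raise ValueError("Duplicate disk names given for adoption")

  if disk_template == DT_PLAIN:
    # LV-specific validation and size discovery.
    missing = all_disks.difference(node_lvs)
    if missing:
      raise ValueError("Missing logical volume(s): %s" %
                       ", ".join(sorted(missing)))
    for d in disks:
      d["size"] = int(float(node_lvs[d["adopt"]][0]))
  elif disk_template == DT_BLOCK:
    # Block-device-specific validation and size discovery.
    missing = all_disks.difference(node_disks)
    if missing:
      raise ValueError("Missing block device(s): %s" %
                       ", ".join(sorted(missing)))
    for d in disks:
      d["size"] = int(float(node_disks[d["adopt"]]))
  return disks
```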

Apollon
