While adding proper hotplug support to the blk2scsa driver, I found that
some changes to the API were necessary. While here, I also made some
aesthetic changes that are easy to do now, but would require more
annoying casework later.
A full updated spec is attached.
The main differences here relative to the first spec are:
* b2s_hba renamed to b2s_nexus
* b2s_target renamed to b2s_leaf
* structure passed for nexus allocation, instead of array of pointers
* request entry point associated with nexus structure instead of leaf
* addition of target and lun members to request structure
* support for non-zero lun numbers (multiple luns per target)
* inquiry data now handled via separate request command, rather than at
registration
* no separate allocation/deallocation for leaf structure, done by leaf
attach/detach automatically (possibly deferred for hotplug safety).
I'm extending the case timer another week to allow for the extra review
required. Sorry for the late changes, but thanks!
Generic Block Device to SCSA Translation Layer
Functional Specification
Garrett D'Amore (gdamore at sun.com)
Nov 27, 2007
CHAPTER 1: Introduction
There is an ever-growing number of digital memory formats, as well as
other kinds of storage media, in the market today. Historically, Solaris
has had only limited support for most of them, and only when connected
through a USB or IEEE 1394 media reader.
Modern laptops and mobile computing devices are now shipping with slots
for these readers that are not connected via USB interfaces. Solaris
needs to provide support for media in these slots in a manner similar to
how they are presented via USB.
Writing a full block device driver is one possible way, but, unfortunately,
it also means implementing a number of components besides the block
driver itself, as various userland components exist (such as libsmedia)
which only know how to talk to certain block devices.
The sd(7d) device driver is the most common mass storage block driver, and
it is also how most removable media is presented to the system. Therefore,
it already has most of the hooks necessary to support userland volume
management, format and VTOC management, etc.
Unfortunately, sd(7d) expects to be able to talk to a SCSI disk. Busses
such as USB and IEEE1394 often present mass storage using some subset of
the SCSI command set, so combined with thin translation layers such as
scsa1394 or scsa2usb, it is possible to emulate a SCSI bus sufficiently
that these mass storage devices are usable with sd(7d).
Many of the interesting media formats, however, do not use any form of SCSI
command set. Sometimes the access method is quite different, even though
the underlying media is still typically block-oriented.
CHAPTER 2: blk2scsa translation layer
Our solution to this problem involves the creation of a new kernel/misc
module, blk2scsa. This module provides some straightforward underlying
APIs for block-oriented drivers to implement, and maps those APIs to an
emulated SCSI bus and one or more emulated SCSI disk drives, so that these
devices are usable with sd(7d).
This frees the block device driver from needing to implement most of the
labeling, ioctl, or other complex portions of a block device, and lets it
focus instead on just the core device access functionality.
In a sample system, imagine a block device driver called "nvflash". This
device driver has two independent flash chips per instance, each of which
is block oriented using 512 byte blocks, and can have a separate filesystem
on it. The plumbing might look like this:
+----------+
| nvflash  |
| driver   |
|    +----------+      +----------------+        +--------------+
+----+ blk2scsa |----> | (emulated bus) | -+---> | sd, target 0 |
     +----------+      +----------------+  |     +--------------+
                                           |
                                           |     +--------------+
                                           +---> | sd, target 1 |
                                                 +--------------+
In the diagram above, the flash chips would be addressable as
/dev/{rdsk,dsk}/cXt0d0 and /dev/{rdsk,dsk}/cXt1d0.
Note that the actual flash chips could be removable (perhaps as if they
were slots on a physical media reader), and the sd driver would handle
them properly. This would include automatic volume manager support
by userland components, so that when a device is inserted it is
automatically mounted and presented in a desktop GNOME session, for
example.
The emulated HBA will support auto-sense-request (cannot be disabled),
and emulated targets will support the removable and hotpluggable properties
if they were registered with them.
CHAPTER 3: Programming Interface
=================================
The following API is provided for target devices.
Basic setup
-----------
The device driver shall #include <sys/scsi/adapters/blk2scsa.h>
The device driver shall add -N misc/blk2scsa to its link line, so
that the run-time kernel loader can resolve the symbols
appropriately.
Types
-----
typedef struct b2s_nexus b2s_nexus_t;
The b2s_nexus_t structure is an opaque structure that is used as a handle
to the emulated host bus adapter.
typedef struct b2s_leaf b2s_leaf_t;
The b2s_leaf_t structure is an opaque structure that is used as a handle
to the emulated disk device.
typedef struct b2s_nexus_info b2s_nexus_info_t;
The b2s_nexus_info structure describes the nexus device that should be
allocated. It has the following members:
int nexus_version;
dev_info_t *nexus_dip;
void *nexus_private;
ddi_dma_attr_t *nexus_dma_attr;
boolean_t (*nexus_request)(void *, struct b2s_request *);
The nexus_version field is used for versioning, and represents the version
of the API that the device driver is coded for. It must be B2S_VERSION_0
in this specification.
The nexus_dip is the device node associated with the driver.
The nexus_private field is available for the device driver's own
use, and should point to nexus state.
The nexus_dma_attr field points to the DMA attributes describing the DMA
capabilities of the driver. It may be NULL if the driver is incapable of
DMA.
The nexus_request field is the entry point that the device driver
implements to handle I/O requests. It is passed nexus_private as its first
argument. The second argument describes the request in further detail. If
the driver handles the request (or queues it for handling), then it must
return B_TRUE. If the driver is unwilling to accept the request, but a
future attempt might be successful, then it should return B_FALSE. Please
see the description for b2s_request_t below.
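The B_TRUE/B_FALSE contract for nexus_request can be sketched as follows.
This is only an illustration written as a user-space mock so that it
compiles outside the kernel: the boolean_t stand-in, the reduced
struct b2s_request, the nvflash_state_t type, and the queue depth are all
assumptions, not definitions taken from the blk2scsa header.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for kernel definitions (assumptions, not the real header). */
typedef enum { B_FALSE = 0, B_TRUE = 1 } boolean_t;
struct b2s_request { int br_cmd; };

#define	NVFLASH_QDEPTH	4	/* hypothetical queue depth */

/* Hypothetical per-instance driver state, passed as nexus_private. */
typedef struct nvflash_state {
	struct b2s_request *q[NVFLASH_QDEPTH];
	size_t qlen;
} nvflash_state_t;

/*
 * Request entry point: queue the request and return B_TRUE, or return
 * B_FALSE when the queue is full so that a later attempt may succeed.
 */
static boolean_t
nvflash_request(void *arg, struct b2s_request *req)
{
	nvflash_state_t *sp = arg;

	assert(sp != NULL && req != NULL);
	if (sp->qlen == NVFLASH_QDEPTH)
		return (B_FALSE);
	sp->q[sp->qlen++] = req;
	return (B_TRUE);
}
```

A real driver would later complete each queued request with
b2s_request_done(); that step is omitted here.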
typedef struct b2s_leaf_info b2s_leaf_info_t;
The b2s_leaf_info structure describes a leaf device (emulated SCSI
disk) that should be attached to the system. It has the following
members:
uint_t leaf_target;
uint_t leaf_lun;
uint32_t leaf_flags;
const char *leaf_unique_id;
uint32_t target_flags;
boolean_t (*target_request)(void *, struct b2s_request *);
The leaf_target and leaf_lun fields are the target and lun numbers to
use for the emulated SCSI target. (For example, in the disk device node
/dev/dsk/cXtYdZs2, the leaf_target is represented by Y, and the leaf_lun
is represented by Z.) The combination of these two fields must be unique
for a given leaf node.
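To make the cXtYdZs2 mapping concrete, here is a small user-space sketch
that builds the device path from a target and lun. The function name
leaf_dsk_path is hypothetical, and the controller number is taken as a
parameter because it is assigned by the system rather than by the driver.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format the block device path for slice 2 of an emulated disk.
 * ctrl is the X, target the Y, and lun the Z in /dev/dsk/cXtYdZs2.
 * Returns the value of snprintf().
 */
static int
leaf_dsk_path(char *buf, size_t len, unsigned ctrl, unsigned target,
    unsigned lun)
{
	assert(buf != NULL);
	return (snprintf(buf, len, "/dev/dsk/c%ut%ud%us2", ctrl, target, lun));
}
```

With controller 2, a leaf registered with leaf_target 1 and leaf_lun 0
would appear as /dev/dsk/c2t1d0s2.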
The leaf_unique_id field is an ASCIIZ string that uniquely identifies the
device. The system uses this to protect against incorrect hotplug
operations (i.e. insertion of a different leaf target at the specified
SCSI address while the previous leaf was still open by another consumer).
It may be used in the formulation of device identifiers and GUIDs as well.
The leaf_flags field can contain either or both of the following flags:
B2S_LEAF_REMOVABLE - indicates that the leaf supports removable
media
B2S_LEAF_HOTPLUGGABLE - indicates that the leaf can be hot
plugged (including either removal or attachment).
typedef struct b2s_request b2s_request_t;
This structure is the fundamental handle used for tracking I/O requests
between the SCSA layer and the device driver. It has the following
accessible members:
int br_cmd;
uint64_t br_lba;
uint64_t br_nblks;
b2s_media_t br_media;
b2s_inquiry_t br_inquiry;
uint32_t br_flags;
The br_cmd field is the command code for the operation. See Commands
below for a full explanation. This field is supplied by the blk2scsa
framework.
The br_lba field is the logical block address associated with the request.
It will be supplied by the blk2scsa framework. It is only valid when the
br_cmd field is B2S_CMD_READ or B2S_CMD_WRITE.
The br_nblks field is the number of logical blocks for the request. It will
be supplied by the blk2scsa framework. It is only valid when the br_cmd
field is B2S_CMD_READ or B2S_CMD_WRITE.
The br_media field is a description of the media for the target. It is
filled out by the driver in response to a B2S_CMD_GETMEDIA request.
See the description of b2s_media_t below.
The br_inquiry field contains inquiry data for the target, to be filled
out by the device driver in response to a B2S_CMD_INQUIRY request.
See the description of b2s_inquiry_t below.
The br_flags field is a bitfield of flags associated with the request. The
public flags, which are read-only and supplied by the framework, are
B2S_REQUEST_FLAG_POLL - indicates that a synchronous request without
use of interrupts should be performed. (Such as when sync()'ing
filesystems or performing a crash dump in response to a panic.)
B2S_REQUEST_FLAG_HEAD - indicates that the request should be placed at
the head of any queue, if possible.
B2S_REQUEST_FLAG_LOAD_EJECT - only valid with the B2S_CMD_STOP or
B2S_CMD_START commands, it indicates that the media should either be
loaded (B2S_CMD_START) or ejected (B2S_CMD_STOP) if possible.
B2S_REQUEST_FLAG_IMMED - if present, indicates that the driver should
not wait for the operation to complete on the device before returning
status with b2s_request_done(). It is only valid with the
B2S_CMD_FORMAT, B2S_CMD_START, and B2S_CMD_STOP commands.
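The pairing rules between flags and commands described above can be
expressed as a small check. This is a sketch only: the numeric values of
the command codes and flag bits are placeholders chosen for illustration,
and request_flags_valid is a hypothetical helper, not part of the API.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder values; the real ones come from the blk2scsa header. */
enum {
	B2S_CMD_READ = 1,
	B2S_CMD_WRITE,
	B2S_CMD_FORMAT,
	B2S_CMD_START,
	B2S_CMD_STOP
};
#define	B2S_REQUEST_FLAG_POLL		0x1u
#define	B2S_REQUEST_FLAG_HEAD		0x2u
#define	B2S_REQUEST_FLAG_LOAD_EJECT	0x4u
#define	B2S_REQUEST_FLAG_IMMED		0x8u

/* Return 1 if the flags are consistent with the command, 0 otherwise. */
static int
request_flags_valid(int cmd, uint32_t flags)
{
	/* LOAD_EJECT is only meaningful for START and STOP. */
	if ((flags & B2S_REQUEST_FLAG_LOAD_EJECT) &&
	    cmd != B2S_CMD_START && cmd != B2S_CMD_STOP)
		return (0);
	/* IMMED is only meaningful for FORMAT, START, and STOP. */
	if ((flags & B2S_REQUEST_FLAG_IMMED) &&
	    cmd != B2S_CMD_FORMAT && cmd != B2S_CMD_START &&
	    cmd != B2S_CMD_STOP)
		return (0);
	return (1);
}
```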
typedef struct b2s_media b2s_media_t;
This structure is used in response to B2S_CMD_GETMEDIA. The device driver
shall supply information about the media in the following fields:
uint64_t media_blksz;
uint64_t media_nblks;
uint64_t media_flags;
The media_blksz and media_nblks fields are used to report the size and
total number of logical blocks on the device.
The media_flags bit field can have the flag B2S_MEDIA_FLAG_READ_ONLY set
to indicate that the media loaded in the target is not writable.
typedef struct b2s_inquiry b2s_inquiry_t;
This structure is used in response to B2S_CMD_INQUIRY. The device driver
shall set the following fields for data to include in a standard SCSI
inquiry:
const char *inq_vendor;
const char *inq_product;
const char *inq_revision;
const char *inq_serial;
The inq_vendor, inq_product, inq_revision, and inq_serial
fields are ASCIIZ strings used to identify the device. Note that this
is for the target, and not for any removable media that may be present
in the target. These strings are used to formulate the response to
SCSI inquiries, so if the device driver is able to, it should use
values that are suitable for SCSI-2. (See ANSI X3.131-1994 for more
information.) Note that the driver need not pad these strings with
spaces.
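Because the driver need not pad these strings, something has to expand
them into the fixed-width, space-padded fields of a standard SCSI-2
inquiry response (8 bytes of vendor identification, 16 bytes of product
identification, 4 bytes of revision level). One way that conversion might
look is sketched below; inq_pad is a hypothetical helper, and nothing here
is taken from the actual blk2scsa implementation.

```c
#include <assert.h>
#include <string.h>

/*
 * Copy an ASCIIZ string into a fixed-width field, truncating it if it is
 * too long and padding with spaces if it is too short. The field is not
 * NUL-terminated.
 */
static void
inq_pad(char *field, size_t width, const char *src)
{
	size_t n;

	assert(field != NULL && src != NULL);
	n = strlen(src);
	if (n > width)
		n = width;
	memcpy(field, src, n);
	memset(field + n, ' ', width - n);
}
```

For example, an inq_vendor of "SUN" fills an 8-byte field with "SUN"
followed by five spaces.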
Functions
---------
int b2s_mod_init(struct modlinkage *);
int b2s_mod_fini(struct modlinkage *);
These routines are called at _init and _fini respectively, to set up
and clean up HBA related entries in the device driver's dev_ops field.
As a consequence, a blk2scsa dependent driver need not supply any cb_ops
or bus_ops structure on its own behalf.
b2s_nexus_t *b2s_alloc_nexus(b2s_nexus_info_t *);
This allocates an initial emulated HBA structure, using the supplied
information.
void b2s_free_nexus(b2s_nexus_t *);
This frees an unattached emulated HBA structure.
b2s_leaf_t *b2s_attach_leaf(b2s_nexus_t *, b2s_leaf_info_t *);
This indicates that a leaf (disk) device has been physically
attached to the nexus (HBA). Appropriate hotplug actions will be
performed to make the device accessible, if possible. It returns
an opaque handle to the leaf on success, or NULL on failure.
void b2s_detach_leaf(b2s_leaf_t *leaf);
This indicates that a leaf device has been physically detached from
the nexus, and exists solely to support hotplug operation.
void b2s_request_mapin(b2s_request_t *req, caddr_t *addrp, size_t *lenp);
This ensures that the buffer associated with the request is mapped into
kernel address space, and returns the address and length of the buffer
associated with the request.
void b2s_request_dma(b2s_request_t *req, uint_t *num, ddi_dma_cookie_t **c);
This returns the number of DMA cookies associated with the request, and
a pointer to an array of the actual cookies. Note that requests are
pre-mapped by the framework if a suitable DMA attribute was supplied at
registration time. Note also that the buffer will always be fully mapped.
That is, it will never be necessary for a driver to use ddi_dma_getwin().
void b2s_request_done(b2s_request_t *req, int errno, size_t resid);
This is called by the driver when it has completed processing for a
request. The errno takes an error code (see Errors below), and the resid
indicates the number of residual bytes that were not transferred.
int b2s_attach_nexus(b2s_nexus_t *);
This attaches the HBA (and any registered leaves) to the system. The
target driver should call this as part of its attach(9e) processing. It
returns DDI_SUCCESS on success, DDI_FAILURE otherwise.
int b2s_detach_nexus(b2s_nexus_t *);
This detaches the HBA from the system. It returns DDI_SUCCESS on success,
DDI_FAILURE otherwise.
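The nexus calls above suggest a pairing of allocate/attach on the way up
and detach/free on the way down. The sketch below shows that ordering
against mock versions of the functions so that it compiles in user space.
The mock bodies, the reduced structure layouts, the B2S_VERSION_0 value,
and the nvflash_attach/nvflash_detach names are all assumptions; only the
function names, signatures, and return conventions come from this
specification.

```c
#include <assert.h>
#include <stdlib.h>

#define	DDI_SUCCESS	0
#define	DDI_FAILURE	(-1)
#define	B2S_VERSION_0	0	/* placeholder value */

/* Mock layouts; the real b2s_nexus is opaque to drivers. */
typedef struct b2s_nexus { int attached; } b2s_nexus_t;
typedef struct b2s_nexus_info {
	int nexus_version;
	void *nexus_private;
} b2s_nexus_info_t;

/* Mock implementations standing in for the framework. */
static b2s_nexus_t *
b2s_alloc_nexus(b2s_nexus_info_t *info)
{
	assert(info != NULL && info->nexus_version == B2S_VERSION_0);
	return (calloc(1, sizeof (b2s_nexus_t)));
}

static void
b2s_free_nexus(b2s_nexus_t *n)
{
	assert(!n->attached);	/* only an unattached nexus may be freed */
	free(n);
}

static int
b2s_attach_nexus(b2s_nexus_t *n)
{
	n->attached = 1;
	return (DDI_SUCCESS);
}

static int
b2s_detach_nexus(b2s_nexus_t *n)
{
	n->attached = 0;
	return (DDI_SUCCESS);
}

/* Hypothetical driver attach path: allocate, then attach. */
static b2s_nexus_t *
nvflash_attach(void *state)
{
	b2s_nexus_info_t info = { B2S_VERSION_0, state };
	b2s_nexus_t *n = b2s_alloc_nexus(&info);

	if (n == NULL)
		return (NULL);
	if (b2s_attach_nexus(n) != DDI_SUCCESS) {
		b2s_free_nexus(n);
		return (NULL);
	}
	return (n);
}

/* Hypothetical driver detach path: detach, then free. */
static int
nvflash_detach(b2s_nexus_t *n)
{
	if (b2s_detach_nexus(n) != DDI_SUCCESS)
		return (DDI_FAILURE);
	b2s_free_nexus(n);
	return (DDI_SUCCESS);
}
```

Leaves would be attached with b2s_attach_leaf() and detached with
b2s_detach_leaf() as media slots come and go; that part is left out to
keep the sketch short.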
Commands
--------
B2S_CMD_INQUIRY
This command is sent to the driver to retrieve a description of the
device in response to a SCSI inquiry. The driver shall provide the
details in the br_inquiry field of the associated request.
B2S_CMD_GETMEDIA
This command is sent to the driver to retrieve a description of the
media currently loaded. The driver shall provide the details in the
br_media field of the associated request.
B2S_CMD_START
This command is used to initialize the device for use. If the
B2S_REQUEST_FLAG_LOAD_EJECT is also present, then the device shall
load any removable media, if possible.
B2S_CMD_STOP
This command is used to cease use of the device. (At this point the
device may be powered down.) If the B2S_REQUEST_FLAG_LOAD_EJECT is
present, then the device should attempt to eject any removable media.
This command also implicitly includes the effects of B2S_CMD_SYNC.
B2S_CMD_LOCK
B2S_CMD_UNLOCK
These commands are used to engage or release any locking mechanism
preventing removal of the media.
B2S_CMD_READ
B2S_CMD_WRITE
These commands read blocks from or write blocks to the device. The block
address
to start at is indicated by the br_lba field of the request. The number
of blocks to transfer is indicated by the br_nblks field. The actual
data region can be determined using either the b2s_request_mapin() or
b2s_request_dma() functions (depending on whether or not DMA is to be
used.)
B2S_CMD_FORMAT
This command formats the media. It shall not be possible to format media
while a reservation set with B2S_CMD_RESERVE is in effect. No defect
list is supplied, so the driver is responsible for taking whatever action
it deems appropriate. If the B2S_REQUEST_FLAG_IMMED is set, then the
device should not wait for the format to complete before calling
b2s_request_done().
B2S_CMD_SYNC
This command flushes any cached write data from the device to media.
Errors
------
The following error codes can be returned in response to a request using
b2s_request_done().
B2S_EOK
No error. The operation completed successfully.
B2S_ENOTSUP
The device does not support the requested operation. (This should be
returned if the device does not have a door lock, for example.)
B2S_EFORMATTING
An attempt was made to access the device while it was formatting.
B2S_ENOMEDIA
No media is present in a target with removable media.
B2S_EMEDIACHG
The media may have changed since the last request was made. This is
used to prevent accidental overwrites to changed media.
B2S_ESTOPPED
The target has not been started with B2S_CMD_START yet.
B2S_EHARDWARE
An unknown hardware error occurred.
B2S_ENODEV
The target is not present or has been removed.
B2S_EMEDIA
An error on the medium occurred.
B2S_EDOORLOCK
The B2S_CMD_STOP failed to eject the media, because the doorlock was
engaged.
B2S_EWPROTECT
The media could not be written to because it is not writable.
B2S_EBLKADDR
The supplied logical block address was invalid.
B2S_ESTARTING
The target is still starting up. Try again later.
B2S_EIO
Generic failure.
B2S_ERSVD
Target reserved. (Internal use only.)
B2S_EPARM
Bad parameter occurred in the SCSI packet. (Internal use only.)
B2S_EBADMSG
Invalid SCSI message presented to driver. (Internal use only.)
B2S_EINVAL
An invalid parameter was present in the SCSI packet. (Internal use only.)
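To show how a driver might choose among these codes, here is a sketch of a
pre-flight check run before a read or write is started. The
nvflash_chip_t state structure, the nvflash_precheck name, the order of
the checks, and the numeric code values are all assumptions made for
illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder values; the real ones come from the blk2scsa header. */
enum {
	B2S_EOK = 0,
	B2S_ENODEV,
	B2S_ESTOPPED,
	B2S_ENOMEDIA,
	B2S_EWPROTECT
};

/* Hypothetical per-chip state tracked by the driver. */
typedef struct nvflash_chip {
	int present;		/* target is physically present */
	int started;		/* B2S_CMD_START has been processed */
	int media_loaded;	/* removable media is inserted */
	int read_only;		/* media is write protected */
} nvflash_chip_t;

/* Pick the error code to report before starting a read or write. */
static int
nvflash_precheck(const nvflash_chip_t *cp, int is_write)
{
	assert(cp != NULL);
	if (!cp->present)
		return (B2S_ENODEV);
	if (!cp->started)
		return (B2S_ESTOPPED);
	if (!cp->media_loaded)
		return (B2S_ENOMEDIA);
	if (is_write && cp->read_only)
		return (B2S_EWPROTECT);
	return (B2S_EOK);
}
```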
CHAPTER 4: Interface Table
==========================
Imported Interface            Level                  Comments
SCSI-2                        Committed              ANSI X3.131-1994
scsi_hba_tran                 Committed              SCSI HBA API
scsi_pkt2bp()                 Consolidation Private

Exported Interface            Level                  Comments
sys/scsi/adapters/blk2scsa.h  Consolidation Private  blk2scsa header
misc/blk2scsa                 Consolidation Private  kernel module
B2S_CMD_*                     Consolidation Private  request command codes
B2S_E*                        Consolidation Private  request error codes
B2S_REQUEST_FLAG_*            Consolidation Private  request flags
B2S_VERSION_0                 Consolidation Private  API versioning
struct b2s_request            Consolidation Private  request structure
struct b2s_leaf               Consolidation Private  leaf (disk) structure (opaque)
struct b2s_nexus              Consolidation Private  nexus (hba) structure (opaque)
struct b2s_leaf_info          Consolidation Private  leaf (disk) attach information
struct b2s_nexus_info         Consolidation Private  nexus (hba) registration
struct b2s_media              Consolidation Private  media description
struct b2s_inquiry            Consolidation Private  SCSI inquiry data
b2s_alloc_nexus()             Consolidation Private  nexus (hba) allocation
b2s_free_nexus()              Consolidation Private  nexus (hba) deallocation
b2s_attach_nexus()            Consolidation Private  nexus (hba) attachment
b2s_detach_nexus()            Consolidation Private  nexus (hba) detachment
b2s_attach_leaf()             Consolidation Private  leaf (disk) connection
b2s_detach_leaf()             Consolidation Private  leaf (disk) disconnection
b2s_request_mapin()           Consolidation Private  request buffer PIO access
b2s_request_dma()             Consolidation Private  request buffer DMA access
b2s_request_done()            Consolidation Private  request buffer completion
b2s_mod_init()                Consolidation Private  modlinkage initialization
b2s_mod_fini()                Consolidation Private  modlinkage de-initialization