I'm submitting the following case on my own behalf.  Times out on 7/13/2009.
This case seeks patch binding.

Template Version: @(#)sac_nextcase 1.68 02/23/09 SMI
This information is Copyright 2009 Sun Microsystems
1. Introduction
    1.1. Project/Component Working Name:
         STREAMS and Character Device Coexistence
    1.2. Name of Document Author/Supplier:
         Author:  Garrett D'Amore
    1.3  Date of This Document:
        06 July, 2009
4. Technical Description

STREAMS and Character Device Coexistence
----------------------------------------

Background:
-----------

There are two different frameworks for device drivers in Solaris -- one that
assumes a device is a typical character/block device (and hence supports the
typical read(2), write(2), etc. entry points via cb_ops), and another that
assumes that devices are STREAMs devices and export their entry points via
a streamtab.

Most device drivers fall firmly into one style (character/block) or
the other (STREAMs.)  Usually which kind of device the driver acts as is
determined by what type of hardware it is.  For example, NIC drivers are
required to be STREAMs devices in Solaris.  Audio devices historically were
STREAMs based, although since Boomer (PSARC 2008/318) they are now character
devices.


Problem:
--------

Some devices, for one reason or another, cannot properly be described as
just character/block or just STREAMs.  This is a problem that has recurred many
times in the author's experience.  Some specific examples:

* Venus (Sun Crypto Accelerator 4000) -- this device is a NIC, so it
  must be a STREAMs device.  However, for the Crypto framework used at
  the time (kcl, in Solaris 8) the crypto functionality was expressed
  via character interfaces.  (STREAMs was deemed far too cumbersome for
  this kind of functionality since a lot of complex structures had to
  be copied back and forth.)

* Audio - to be compatible with the legacy Audio API, audio devices
  need to be STREAMs devices.  But the new style OSS API is character
  based, and trying to impose STREAMs created a problem where STREAMs
  queueing semantics resulted in latency that the OSS API could not express
  to applications (and hence applications lost the ability to perform
  accurate positioning within the stream.)

* Converged IB/networking devices.  Devices from some vendors can support
  either Infiniband or 10GbE functionality, based upon firmware configuration
  or external influences.  The Infiniband functionality and framework was
  designed to be character/block driven (in particular, it wants to perform
  efficient mmap(2) operations and to avoid STREAMs latency), while the
  10 gigabit ethernet functionality needs to be based upon STREAMs as part
  of the GLDv3.

The problem here is that Solaris' DDI requires all minor nodes of a single
device (dev_info_t) to be either STREAMs or character/block device.  There
is no way to mix and match.


Historical Workarounds:
-----------------------

The historical workarounds have varied here.  There are two approaches
that the author has seen so far:

1) Nexus driver approach.  In this approach, the device is a nexus,
   and separate device drivers for each type of personality are developed.
   While this works, the problems with it stem from the fact that the nexus
   framework is very awkward, undocumented, and not available to 3rd parties.
   It also creates an artificial branch point in the tree, which might not
   be the right way to model the device.  This approach can also get in the
   way of code sharing.
   For legacy products, like the converged networking device, making use of
   this may require substantial rearchitecture of already complex devices
   and subsystems.

2) Pseudo-driver.  A slightly different approach: by using LDI (or legacy
   equivalents) and a pseudo device, it's possible to dynamically create minor
   nodes of a different device type.  The austr device in the audio subsystem
   was created for this purpose.  While it works, and doesn't violate any
   public DDI, it's incredibly awkward, and requires devices to do extra magic
   outside of the driver to make sure that minor nodes are properly reflected.
   (For example, the audio framework performs a devfsadm -i austr during early
   boot to make sure that the instance for this pseudo driver is attached
   so that it can create and remove minor nodes on demand from the master
   device.)

   A similar approach was used in the Solaris 8 software for Venus -- the
   crypto minor nodes were owned by the crypto framework, rather than the
   physical device instance.  As a result of this, the framework needed
   to ensure that its minor node was always ready, and the drivers exported
   the ddi-no-autodetach and ddi-forceattach properties to make sure that
   hardware associated with the crypto was always ready to go.


Proposed Solution:
------------------

We'd like to propose that it should be possible for a device driver
to export minor nodes of both types (STREAMs and character/block devices.)

The main challenge here is deciding which entry points to use (the
cb_ops or the streamtab ones).  The solution is relatively simple.

At open(9e) time, the STREAMS entry point shall be allowed to return a
special errno (already defined), ENOSTR, to indicate that the minor node
supplied is not associated with STREAMS, but rather that the specfs framework
should retry the open using the character based open(9e).

(As an aside, if some STREAMS entry point did return ENOSTR without
intending this new semantic, the framework will call nodev() -- which
is what STREAMS drivers are supposed to use for the cb_open() entry
point -- returning ENODEV back to the calling application.  While the
semantic change is unfortunate, since ENOSTR is not a documented or
used return code from open(9e) today, it should not be viewed as a
problem.)
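
As an illustration only, the retry described above can be modeled in user
space.  This is a sketch under stated assumptions, not the specfs change
itself: every model_* name is hypothetical, the entry points take only a minor
number (the real open(9e) signatures differ), and the split of the minor space
at 128 is invented for the example.  Only ENOSTR and ENODEV come from the
proposal.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical split: minors below this value are STREAMS nodes. */
#define MODEL_FIRST_CHAR_MINOR 128

typedef int (*model_open_fn)(unsigned int minor);

/* User-space stand-in for a driver's open entry points (not real cb_ops). */
struct model_driver {
    model_open_fn str_open; /* STREAMS open; NULL for a pure char driver */
    model_open_fn chr_open; /* character open (nodev-like if unsupported) */
};

/* Counts how often the character open was reached, for demonstration. */
int model_chr_open_calls = 0;

/* STREAMS open: declines minors that are not STREAMS nodes with ENOSTR. */
int model_str_open(unsigned int minor)
{
    return (minor >= MODEL_FIRST_CHAR_MINOR) ? ENOSTR : 0;
}

/* Character open for the non-STREAMS minors of a mixed driver. */
int model_chr_open(unsigned int minor)
{
    (void)minor;
    model_chr_open_calls++;
    return 0;
}

/* Stand-in for nodev(): what a pure STREAMS driver uses as cb_open. */
int model_nodev(unsigned int minor)
{
    (void)minor;
    return ENODEV;
}

/* Models the proposed open path: try STREAMS first, retry on ENOSTR. */
int model_spec_open(const struct model_driver *drv, unsigned int minor)
{
    int err;

    assert(drv != NULL && drv->chr_open != NULL);
    if (drv->str_open == NULL)
        return drv->chr_open(minor);

    err = drv->str_open(minor);
    if (err == ENOSTR)
        err = drv->chr_open(minor); /* retry with the character open */
    return err;
}
```

With a mixed driver, a STREAMS minor opens without touching the character
entry point, while a character minor is retried through it.  A pure STREAMS
driver that returns ENOSTR ends up in the nodev stand-in and the caller sees
ENODEV, matching the aside above.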

All other specfs entry points can simply change their checks for
STREAMSTAB(major) to a check for an already open stream.

For example, in spec_write():

        if (STREAMSTAB(getmajor(dev))) {
                ASSERT(vp->v_type == VCHR);
                smark(sp, SUPD);
                return (strwrite(vp, uiop, cr));
        }

Would be rewritten as:

        if (vp->v_stream != NULL) {
                ASSERT(vp->v_type == VCHR);
                smark(sp, SUPD);
                return (strwrite(vp, uiop, cr));
        }

(Note that strwrite already has an ASSERT(vp->v_stream) as its first line
of executable code other than variable assignment).
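
To show why the per-vnode check matters, here is a small user-space model of
the two dispatch rules.  It is a sketch only: struct model_vnode, the
major_has_streamtab flag (standing in for what STREAMSTAB(major) reports) and
the model_* functions are hypothetical names, not kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>

enum model_path { MODEL_PATH_STREAM, MODEL_PATH_CHAR };

/* Hypothetical stand-in for the parts of a vnode the check looks at. */
struct model_vnode {
    void *v_stream;          /* non-NULL once a stream is open on the node */
    int major_has_streamtab; /* models the result of STREAMSTAB(major) */
};

/* Existing rule: decide per driver (major number). */
enum model_path model_write_old(const struct model_vnode *vp)
{
    assert(vp != NULL);
    return vp->major_has_streamtab ? MODEL_PATH_STREAM : MODEL_PATH_CHAR;
}

/* Proposed rule: decide per open node, based on an existing stream. */
enum model_path model_write_new(const struct model_vnode *vp)
{
    assert(vp != NULL);
    return (vp->v_stream != NULL) ? MODEL_PATH_STREAM : MODEL_PATH_CHAR;
}
```

For a character minor of a driver that also has a streamtab (v_stream is NULL,
major_has_streamtab is set), the old rule picks the STREAMS path and the new
rule picks the character path.  For a node with an open stream, both rules
agree.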


Minor Number Management:
------------------------

Note that while the above proposal solves the problem generally,
there remains a problem with frameworks where the framework manages the
minor number space.  The GLDv3 is one such framework.

Resolving this problem will need to be done on a case by case basis, and falls
outside of the scope of this case.

However, we would recommend that future creators of such frameworks provide
a mechanism for device drivers to control their own minor number
space.  It seems short-sighted to assume that a single framework will
reasonably be able to know all the ways in which a device might want
to use minor numbers.


6. Resources and Schedule
    6.4. Steering Committee requested information
        6.4.1. Consolidation C-team Name:
                ON
    6.5. ARC review type: FastTrack
    6.6. ARC Exposure: open

