I'm submitting the following case on my own behalf. Times out on 7/13/2009. This case seeks patch binding.
Template Version: @(#)sac_nextcase 1.68 02/23/09 SMI
This information is Copyright 2009 Sun Microsystems
1. Introduction
    1.1. Project/Component Working Name:
         STREAMS and Character Device Coexistence
    1.2. Name of Document Author/Supplier:
         Author: Garrett D'Amore
    1.3  Date of This Document:
         06 July, 2009
4. Technical Description

STREAMS and Character Device Coexistence
----------------------------------------

Background:
-----------

There are two different frameworks for device drivers in Solaris -- one
that treats a device as a typical character/block device (which supports
the usual read(2), write(2), etc. entry points and exports them via
cb_ops), and another that assumes devices are STREAMS devices, which
export their entry points via a streamtab.

Most device drivers fall firmly into one style (character/block) or the
other (STREAMS). Which kind of device the driver acts as is usually
determined by the type of hardware. For example, NIC drivers are
required to be STREAMS devices in Solaris. Audio devices historically
were STREAMS based, although since Boomer (PSARC 2008/318) they are now
character devices.

Problem:
--------

Some devices, for one reason or another, cannot properly be described as
just character/block or just STREAMS. This is a problem that has
recurred many times in the author's experience.

Some specific examples:

  * Venus (Sun Crypto Accelerator 4000) -- this device is a NIC, so it
    must be a STREAMS device. However, for the Crypto framework used at
    the time (kcl, in Solaris 8) the crypto functionality was expressed
    via character interfaces. (STREAMS was deemed far too cumbersome for
    this kind of functionality since a lot of complex structures had to
    be copied back and forth.)

  * Audio - to be compatible with the legacy Audio API, audio devices
    need to be STREAMS devices.
    But the new-style OSS API is character based, and trying to impose
    STREAMS created a problem where STREAMS queueing semantics resulted
    in latency that the OSS API could not express to applications (and
    hence applications lost the ability to perform accurate positioning
    within the stream).

  * Converged IB/networking devices. Devices from some vendors can
    support either InfiniBand or 10GbE functionality, based upon
    firmware configuration or external influences. The InfiniBand
    functionality and framework was designed to be character/block
    driven (in particular, it wants to perform some efficient mmap(2)
    operations and wants to avoid STREAMS latency), while the 10 Gigabit
    Ethernet functionality needs to be based upon STREAMS as part of the
    GLDv3.

The problem here is that Solaris' DDI requires all minor nodes of a
single device (dev_info_t) to be either STREAMS or character/block
devices. There is no way to mix and match.

Historical Workarounds:
-----------------------

The historical workarounds have varied here. There are two approaches
that the author has seen so far:

1) Nexus driver approach. In this approach, the device is a nexus, and
   separate device drivers for each type of personality are developed.
   While this works, the problems with it stem from the fact that the
   nexus framework is very awkward, undocumented, and not available to
   3rd parties. It also creates an artificial branch point in the tree,
   which might not be the right way to handle the device. This approach
   can get in the way of code sharing. For legacy products, like the
   converged networking device, making use of this may require
   substantial rearchitecture of already complex devices and subsystems.

2) Pseudo-driver. In a slightly different approach, using LDI (or
   legacy equivalents) and a pseudo device, it's possible to dynamically
   create minor nodes of a different device type. The austr device in
   the audio subsystem was created for this purpose.
   While it works, and doesn't violate any public DDI, it's incredibly
   awkward, and requires devices to do extra magic outside of the driver
   to make sure that minor nodes are properly reflected. (For example,
   the audio framework performs a devfsadm -i austr during early boot to
   make sure that the instance for this pseudo driver is attached so
   that it can create and remove minor nodes on demand from the master
   device.)

   A similar approach was used in the Solaris 8 software for Venus --
   the crypto minor nodes were owned by the crypto framework, rather
   than the physical device instance. As a result of this, the framework
   needed to ensure that its minor node was always ready, and the
   drivers exported the ddi-no-autodetach and ddi-forceattach properties
   to make sure that hardware associated with the crypto was always
   ready to go.

Proposed Solution:
------------------

We'd like to propose that it should be possible for a device driver to
export minor nodes of both types (STREAMS and character/block devices).

The main challenge here is deciding which entry points to use (the
cb_ops or the streamtab ones). The solution is relatively simple. At
open(9E) time, the STREAMS entry point shall be allowed to return a
special errno (already defined), ENOSTR, to indicate that the minor node
supplied is not associated with STREAMS, but rather that the specfs
framework should retry the open using the character based open(9E).

(As an aside, if some STREAMS entry point did return ENOSTR without
intending this new semantic, the framework will call nodev() -- which is
what STREAMS drivers are supposed to use for the cb_open() entry point
-- returning ENODEV back to the calling application. While the semantic
change is unfortunate, since ENOSTR is not a documented or used return
code from open(9E) today, it should not be viewed as a problem.)

All other specfs entry points can simply change their checks for
STREAMSTAB(major) to a check for an already open stream.
For example, in spec_write():

	if (STREAMSTAB(getmajor(dev))) {
		ASSERT(vp->v_type == VCHR);
		smark(sp, SUPD);
		return (strwrite(vp, uiop, cr));
	}

would be rewritten as:

	if (vp->v_stream != NULL) {
		ASSERT(vp->v_type == VCHR);
		smark(sp, SUPD);
		return (strwrite(vp, uiop, cr));
	}

(Note that strwrite already has an ASSERT(vp->v_stream) as its first
line of executable code other than variable assignment.)

Minor Number Management:
------------------------

Note that while the above proposal solves the problem generally, there
is a remaining problem with frameworks that manage the minor number
space themselves. The GLDv3 is one such framework. Resolving this will
need to be done on a case-by-case basis, and falls outside of the scope
of this case.

However, we would recommend that future creators of such frameworks
provide a mechanism for device drivers to control their own minor number
space. It seems shortsighted to assume that a single framework will
reasonably be able to know all the ways in which a device might want to
use minor numbers.

6. Resources and Schedule
    6.4. Steering Committee requested information
        6.4.1. Consolidation C-team Name:
               ON
    6.5. ARC review type: FastTrack
    6.6. ARC Exposure: open
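
Illustrative sketch (not part of the proposed implementation): the
open-time fallback described under "Proposed Solution" can be modeled in
ordinary user-space C. Everything named model_* or demo_* below is
hypothetical and exists only for this sketch; it is not specfs code. The
split of minors at 16 is likewise an assumption made for the demo. Only
ENOSTR, ENODEV, and the idea of testing v_stream come from the text
above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifndef ENOSTR
#define ENOSTR 60	/* fallback for platforms whose errno.h lacks it */
#endif

/* Hypothetical stand-in for a vnode; only the fields the sketch needs. */
typedef struct model_vnode {
	int minor;	/* minor number being opened */
	void *v_stream;	/* non-NULL once a stream is attached */
} model_vnode_t;

typedef int (*model_open_fn)(model_vnode_t *);

/* Hypothetical stand-in for a driver's two sets of open entry points. */
typedef struct model_driver {
	model_open_fn str_open;	/* STREAMS open; NULL if no streamtab */
	model_open_fn cb_open;	/* character open from cb_ops */
} model_driver_t;

static int stream_token;	/* stands in for an allocated stream head */

/* Demo assumption: minors below 16 are STREAMS, the rest are character. */
static int demo_str_open(model_vnode_t *vp)
{
	if (vp->minor >= 16)
		return ENOSTR;	/* not a STREAMS minor; ask for a retry */
	vp->v_stream = &stream_token;
	return 0;
}

static int demo_cb_open(model_vnode_t *vp)
{
	(void)vp;
	return 0;
}

/* What a pure STREAMS driver supplies as its character open. */
static int model_nodev(model_vnode_t *vp)
{
	(void)vp;
	return ENODEV;
}

/* Try the STREAMS open first; on ENOSTR, retry with the character open. */
static int model_spec_open(const model_driver_t *drv, model_vnode_t *vp)
{
	assert(drv->cb_open != NULL);
	vp->v_stream = NULL;
	if (drv->str_open != NULL) {
		int err = drv->str_open(vp);
		if (err != ENOSTR)
			return err;
		vp->v_stream = NULL;	/* no stream was attached */
	}
	return drv->cb_open(vp);
}

/* Later entry points key off the open stream, not the major number. */
static int model_uses_stream_path(const model_vnode_t *vp)
{
	return vp->v_stream != NULL;
}
```

In this model, a driver that supplies both demo_str_open and
demo_cb_open opens minor 3 as a stream and minor 20 as a character
device, while a driver that supplies model_nodev as its character open
sees ENODEV returned for minor 20, matching the aside about existing
STREAMS drivers.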