Hi everyone,

The just-released version of sfront (0.62) incorporates many of the ideas discussed on a linux-audio-dev thread about sfront last month. This posting describes the implementation decisions that were made, as an epilogue for the LAD archives. See the end of this posting for details on picking up sfront 0.62.

1. Intro
--------

As a reminder, sfront takes MPEG 4 Structured Audio programs that users write, and generates C programs that produce audio when compiled and executed. By specifying "control" and "audio" drivers, the programs can process audio and control input, and send audio output to devices. If the inputs are "non-interactive" in nature, the C program acts as an audio streaming engine for MP4-SA. If the inputs are "interactive", the C program is a low-latency effects processor and/or music instrument. The drivers described in this posting are:

-aout linux: uses the OSS API to send audio to the soundcard.
-ain linux: uses the OSS API to read audio from the soundcard.
-cin linmidi: uses the OSS /dev/midi port to grab non-timestamped MIDI input from the MIDI In jack on a soundcard.

Also recall that the "core" program sfront generates (i.e. without driver code) is ANSI C with only ANSI libraries (so for simple rendering of Structured Audio to files, it's platform independent).

2. Blocking I/O and Sfront
--------------------------

The main impact the linux-audio-dev discussion had on sfront 0.62 involves sfront's "-playback" mode. When the "-playback" flag is used when generating the C file, the "-aout linux" audio driver sets the number of fragments to a small fixed value (4 turned out to be best, given various sfront internals issues), so that I/O blocking happens continuously under normal operation. The "-playback" flag supports interactive as well as streaming applications, so you can use this blocking I/O approach for all sfront uses.

The user can set a total latency value (i.e.
number of seconds of audio held by 4 fragments) as a command-line sfront option, or else sfront uses sensible defaults, based on whether the -ain and -cin choices reflect a streaming application (in which case latency defaults to 0.3 seconds) or an interactive application (in which case latency defaults to 0.002 seconds). The actual latency used (i.e. the fragment size chosen) is the closest latency value that results in power-of-two sizes for fragments.

3. Blocking I/O and SCHED_FIFO
------------------------------

Using geteuid(), the C program sfront generates detects when it is run as root, and sets SCHED_FIFO mode when -playback mode is set. Since blocking should occur regularly in -playback mode, the console should still be active. Several safety mechanisms are built in to prevent system lockup and performance problems:

-- A timer is used to detect when a single "control cycle" of audio (settable by the user's SAOL program, but typically 1 to 50 milliseconds) takes more than 3 seconds of CPU time. If this happens, we assume the SAOL program has an infinite loop in it, and abort the C program with an error message. We use ITIMER_PROF for this, which is not ideal, but we need ITIMER_REAL for another purpose (see below).

-- Using SNDCTL_DSP_GETOPTR ioctls, we sense whether a block occurs on each audio write(). If 2 seconds go by without a block, we do a nanosleep() of 2ms + epsilon, to force a block. This lets any Control-C's through to the xterm, so users can kill the session.

-- A timer is set up to catch MIDI Input overruns. The es1370.c driver (to pick an example) uses an array of 256 bytes to store MIDI Input, which under maximum load corresponds to about 80 milliseconds (256 bytes * 0.32 milliseconds per byte, at the MIDI rate of 31.25 kbaud). Under normal operation, the C program sfront creates should easily service this buffer before overflow. However, under severe load, the buffer may not be serviced, and NoteOff's may be lost, causing stuck notes.
This failure mode has been reported in the field, usually when driving sfront from a sequencer via the MIDI In jack with heavy data streams. This problem is fixed in sfront 0.62: an ITIMER_REAL timer interrupts once every 40 milliseconds to sweep the MIDI data to a temporary buffer. This timer gets reset on control cycle boundaries, and so never goes off in normal circumstances, so it's a low-overhead solution (a timer reset once every k-cycle).

A side effect of using these timers was the need to harden the code against partial I/O operations, as well as EAGAIN and EINTR failures.

4. Non-Blocking I/O and its uses
--------------------------------

In addition to -playback mode, -timesync mode continues to be supported in sfront. In -timesync mode (typically used for interactive applications only):

-- A large number of fragments is requested
-- Sfront spins at each k-cycle boundary

In this mode, if the only I/O that happens is audio output and /dev/midi reads in O_NONBLOCK mode, the C program sfront creates never blocks. Because the program never blocks, if we run in SCHED_FIFO mode, the console is frozen (i.e. no ASCII keyboard or mouse input) while the program is running.

Clearly, the -playback safeguards described above for SCHED_FIFO aren't sufficient here. We add an extra "dead-man" condition: at the start of the C program, the program is in SCHED_OTHER mode. Once a NoteOn or NoteOff is processed from the MIDI In jack, we switch to SCHED_FIFO. The program stays in that mode until 5 seconds elapse without a new NoteOn or NoteOff, at which point the program returns to SCHED_OTHER mode. It returns to SCHED_FIFO mode with the next NoteOn or NoteOff. This works well in practice -- when all else fails, pull the MIDI In cord from the soundcard, and you're guaranteed to get control of the console back within 5 seconds.
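
The dead-man rule above can be sketched as a small state machine. This is only an illustration, not sfront's actual code: the names (struct deadman, deadman_init, deadman_update, deadman_apply) are hypothetical, the clock is assumed to be a monotonic reading in seconds supplied by the caller, and the choice of the minimum SCHED_FIFO priority is an assumption made for the example.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical names throughout; this is a sketch, not sfront's code. */

#define DEADMAN_TIMEOUT_SEC 5.0  /* from the post: 5 seconds without a note */

struct deadman {
    int    policy;         /* SCHED_OTHER or SCHED_FIFO                 */
    double last_note_sec;  /* time of the most recent NoteOn or NoteOff */
};

/* The program starts out in SCHED_OTHER mode. */
void deadman_init(struct deadman *dm)
{
    assert(dm != NULL);
    dm->policy = SCHED_OTHER;
    dm->last_note_sec = 0.0;
}

/*
 * Call once per control cycle.  now_sec is the caller's clock reading;
 * saw_note is nonzero if a NoteOn or NoteOff was processed this cycle.
 * Returns the policy the process should be running under.
 */
int deadman_update(struct deadman *dm, double now_sec, int saw_note)
{
    assert(dm != NULL);
    if (saw_note) {
        /* Any note (re)arms real-time scheduling. */
        dm->last_note_sec = now_sec;
        dm->policy = SCHED_FIFO;
    } else if (dm->policy == SCHED_FIFO &&
               now_sec - dm->last_note_sec >= DEADMAN_TIMEOUT_SEC) {
        /* No notes for 5 seconds: give the console back. */
        dm->policy = SCHED_OTHER;
    }
    return dm->policy;
}

/*
 * Apply a policy to the calling process.  Returns 0 on success and -1
 * on failure (for example EPERM when SCHED_FIFO is requested without
 * root privileges).  The priority value is an assumption of this sketch.
 */
int deadman_apply(int policy)
{
    struct sched_param sp;

    sp.sched_priority =
        (policy == SCHED_FIFO) ? sched_get_priority_min(SCHED_FIFO) : 0;
    return sched_setscheduler(0, policy, &sp);
}
```

A caller would run deadman_update() at each k-cycle boundary and call deadman_apply() only when the returned policy differs from the previous one, so the scheduler is touched only on transitions. Keeping the decision separate from the system call also lets the timeout logic be exercised without root.
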

This dead-man scheme wouldn't work while using audio input without MIDI input: detecting "no audio input" is too risky to implement in a world where soundcards can feed back on themselves. So sfront detects this condition and does not enter SCHED_FIFO mode.

However, for my setup at least (OSS/Free running es1370.c), -timesync SCHED_FIFO is ineffective while using audio input _and_ MIDI input, for a different reason: if you spin on SNDCTL_DSP_GETISPACE to ensure a fragment is always ready before doing a read() from the soundcard, and you're in SCHED_FIFO mode, you seem to (at least sometimes) give up control to the SCHED_OTHER processes anyways. Using select() in addition to SNDCTL_DSP_GETISPACE doesn't seem to help. So, there isn't a way to implement "freeze-mode" when using audio input. I'm unsure if this is an OS or sound driver bug, or reflects correct OSS behavior under SCHED_FIFO mode.

5. In the future ...
--------------------

In the near term, expect only bug-fixes and algorithm tuning for sfront's Linux audio I/O system. In the long term, the remaining issues in Linux audio I/O include:

-- ALSA support: Right now, you can use -aout linux and -ain linux in OSS compatibility mode, but because ALSA users had trouble getting the /dev/midi interface working, there's a contributed -cin alsamidi module (thanks to Steven Pickles). Some of the features mentioned above (most notably, freeze-mode on -timesync, and the ITIMER_REAL timer for MIDI Input overrun) haven't been added to -cin alsamidi, since this SCHED_FIFO stuff really needs to be tested thoroughly on real hardware before a release, and I don't have ALSA installed on my machine at the moment.

-- Experimenting with mlockall(): Sfront 0.62 doesn't use mlockall() to pin down working pages in main memory.
Since a SAOL program can (indirectly) malloc an unbounded amount of memory, any use of mlockall() must not interfere with the fail-safe measures for regaining machine control out of SCHED_FIFO mode, and stress testing needs to be done to verify safe operation.

-- Machine auditing: In sfront 0.62, the C program sfront generates makes suggestions about whether to use -playback or -timesync, what latency value to use if the default isn't good, etc., based on the drivers chosen and the SAOL program. An obvious extension is to do a ps, look for processes and daemons that are known to be bad with respect to kernel latency issues (syslogd syncing the disk, load monitors that use proc), and suggest that the user kill them if highest-quality audio is desired.

-----

Finally, here's the announcement for the new sfront:

Pick up sfront 0.62 07/05/00 at:

http://www.cs.berkeley.edu/~lazzaro/sa/

Change log message:

[1] Standard name cpuload is now supported in -playback and -timesync mode, for all drivers. Cpuload is a ksig that takes on a value between 0 (machine is not loaded at all) and 1 (any further loading of machine risks loss of real-time playback). Cpuload is computed with no temporal filtering or windowing, and shows the performance on the last kcycle.

[2] The -ain/-aout linux and -cin linmidi drivers are enhanced. The drivers are more robust against lockup and MIDI data loss. When run as root, the driver uses POSIX real-time scheduling to reduce audio dropouts (while we have carefully tested these root features, bugs in programs run as root can cause file-system damage: use at your own risk). Sa.c start-up screen now suggests the best sfront flags to use for a given patch. FreeBSD and linux sources are now merged, with many of these features available for both operating systems. Thanks to Paul Barton-Davis, Bertrand Petit, Benno Senoner, Kai Vehmanen, and the folks at saol.net.

[3] Bugfixes in: memory allocation, dsound driver, random number generators, polymorphic table opcode rate semantics, -cin fstr file streaming, wavetable generator sizing, and array parameters in user defined opcodes. Thanks to Richard Dobson and the folks at saol.net.

-------------------------------------------------------------------------
John Lazzaro -- Research Specialist -- CS Division -- EECS -- UC Berkeley
lazzaro [at] cs [dot] berkeley [dot] edu     www.cs.berkeley.edu/~lazzaro
-------------------------------------------------------------------------