[lk] Linux Kernel 2.6: the Future of Embedded Computing, Part I

The embedded computing universe includes computers of all sizes, from tiny portable devices--such as wristwatch cameras--to systems having thousands of nodes distributed worldwide, as is the case with telecommunications switches. Embedded systems can be simple enough to require only small microcontrollers, or they may require massive parallel processors with prodigious amounts of memory and computing power. PDAs, microwaves, mobile phones and the like are a few typical examples.

With the release of kernel 2.6, Linux now poses serious competition to established embedded operating systems, such as VxWorks and Windows CE, in the embedded market space. Linux 2.6 introduces many new features that make it an excellent operating system for embedded computing. Among these new features are enhanced real-time performance, easier porting to new computers, support for large memory models, support for microcontrollers and an improved I/O system. In this two-part article, we look at the improvements in the 2.6 kernel that have empowered it to make a foray into the demanding world of embedded computing.

Characteristics of Embedded Systems

For the uninitiated, embedded systems often need to meet timing constraints reliably. This is the most important criterion of an embedded system: a job that completes after its scheduled deadline is as good as--or as bad as--a job executed incorrectly. In addition, embedded systems have access to far fewer resources than does a normal PC, so they have to squeeze maximum value out of whatever is available. Often, a price difference of as little as $2 or $3 can make or break a consumer device produced on a large scale, such as a wristwatch.

Some real-world applications require the OS to be as reliable as possible, because the application is a part of a mission-critical operation. The OS should perform reliably and, if possible, efficiently, even under extreme load. If a crash occurs in one module, it should not affect other parts of the system. Furthermore, recovery from crashes should be graceful. This more or less holds true for non-critical embedded applications as well.

How Linux 2.6 Satisfies the Requirements

Having seen the requirements of a general embedded system, we now look at how well the Linux 2.6 kernel can satisfy these requirements. As mentioned earlier, embedded systems have stringent timing requirements. Although Linux 2.6 is not yet a true real-time operating system, it does contain improvements that make it a worthier platform than previous kernels when responsiveness is desirable. Three significant improvements are preemption points in the kernel, an efficient scheduler and improved synchronization.

Kernel Preemption

As with most general-purpose operating systems, Linux historically forbade the process scheduler from running while a process was executing in a system call. Therefore, once a task was in a system call, that task controlled the processor until the system call returned, no matter how long that might take. As of kernel 2.6, the kernel is preemptible: a task running kernel code now can be preempted so that an important user application can continue to run. In Linux 2.6, kernel code has been laced with preemption points, instructions that allow the scheduler to run and possibly suspend the current process in favor of a higher-priority process. Linux 2.6 avoids unreasonable delays in system calls by periodically testing a preemption point; during these tests, the calling process may be suspended to let another process run. Thus, under Linux 2.6, the kernel can be interrupted mid-task, so other applications can continue to run even when something low-level and complicated is going on in the background.

Embedded software often has to meet deadlines that render it incompatible with virtual memory demand paging, in which slow handling of page faults would ruin responsiveness. To eliminate this problem, the 2.6 kernel can be built with no virtual memory system. Of course, it then becomes the software designer's responsibility to ensure enough real memory always is available to get the job done.

An Efficient Scheduler

The process scheduler has been rewritten in the 2.6 kernel to eliminate the slow algorithms of previous versions. Formerly, in order to decide which task should run next, the scheduler had to look at each ready task and make a computation to determine that task's relative importance. After all computations were made, the task with the highest score would be chosen. Because the time required for this algorithm varied with the number of tasks, complex multitasking applications suffered from slow scheduling.

The scheduler in Linux 2.6 no longer scans all tasks every time. Instead, when a task becomes ready to run, it is sorted into position on a queue, called the current queue (the active array in the kernel source). Then, when the scheduler runs, it chooses the task at the most favorable position in the queue. As a result, scheduling is done in a constant amount of time. When the task is running, it is given a time slice, or a period of time in which it may use the processor, before it has to give way to another task. When its time slice has expired, the task is moved to another queue, called the expired queue. The task is sorted into this expired queue according to its priority. This new procedure is substantially faster than the old one, and it works equally well whether there are many tasks or only a few in the queue. This new scheduler is called the O(1) scheduler.
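The two-queue idea can be sketched in user-space C. This is a minimal illustration, not the kernel's code: the type and function names (prio_array, runqueue, pick_next and so on) and the 32-level priority range are assumptions made for the example. Each priority level has its own FIFO list, and a bitmap records which levels are non-empty, so finding the best task is a single find-first-set operation no matter how many tasks are queued.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NPRIO 32                 /* hypothetical number of levels; 0 is most favorable */

struct task {
    int id;
    int prio;                    /* must be in 0..NPRIO-1 */
    struct task *next;
};

struct prio_array {
    uint32_t bitmap;             /* bit p is set when the list for priority p is non-empty */
    struct task *head[NPRIO];
    struct task *tail[NPRIO];
};

struct runqueue {
    struct prio_array arrays[2];
    struct prio_array *active;   /* tasks that still have time slice left */
    struct prio_array *expired;  /* tasks that used up their time slice */
};

static void rq_init(struct runqueue *rq)
{
    memset(rq, 0, sizeof *rq);
    rq->active = &rq->arrays[0];
    rq->expired = &rq->arrays[1];
}

/* Append a task to the list for its priority: constant time. */
static void enqueue(struct prio_array *a, struct task *t)
{
    assert(t->prio >= 0 && t->prio < NPRIO);
    t->next = NULL;
    if (a->tail[t->prio])
        a->tail[t->prio]->next = t;
    else
        a->head[t->prio] = t;
    a->tail[t->prio] = t;
    a->bitmap |= (uint32_t)1 << t->prio;
}

/* Remove the first task of the most favorable non-empty priority. */
static struct task *dequeue_best(struct prio_array *a)
{
    if (a->bitmap == 0)
        return NULL;
    int p = __builtin_ctz(a->bitmap);   /* GCC/Clang builtin: index of lowest set bit */
    struct task *t = a->head[p];
    a->head[p] = t->next;
    if (a->head[p] == NULL) {
        a->tail[p] = NULL;
        a->bitmap &= ~((uint32_t)1 << p);
    }
    t->next = NULL;
    return t;
}

/* Pick the next task; when the active array is empty, swap it with expired. */
static struct task *pick_next(struct runqueue *rq)
{
    if (rq->active->bitmap == 0) {
        struct prio_array *tmp = rq->active;
        rq->active = rq->expired;
        rq->expired = tmp;
    }
    return dequeue_best(rq->active);
}

/* Called when a task's time slice runs out. */
static void task_expired(struct runqueue *rq, struct task *t)
{
    enqueue(rq->expired, t);
}
```

A caller would initialize the run queue with rq_init, add ready tasks with enqueue(rq.active, &task), and alternate pick_next and task_expired. Swapping the two array pointers is what lets every task receive a new time slice without a pass over all tasks.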

New Synchronization Primitives

Applications involving the use of shared resources, such as shared memory or shared devices, have to be developed carefully to avoid race conditions. The traditional solution in Linux, the mutex (mutual exclusion lock), ensures that only one task uses the resource at a time. Previously, every mutex operation involved a system call to the kernel to decide whether to block the thread or allow it to continue executing. When the decision was to continue, the time-consuming system call had been unnecessary. The new implementation in Linux 2.6 supports fast user-space mutexes (futexes). These functions can check from user space whether blocking is necessary and perform the system call to block the thread only when it is required. When blocking is not required, avoiding the unneeded system call saves time. The implementation also supports setting priorities to allow applications or threads of higher priority to have first access to the contested resource.
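The fast path and slow path can be sketched as a small lock built directly on the futex system call. This is an illustrative sketch for Linux only, following the well-known three-state design (0 unlocked, 1 locked, 2 locked with possible waiters); the names fmutex, fmutex_lock, fmutex_unlock and run_demo are hypothetical, and real programs normally should use pthread mutexes, which the C library implements on top of futexes. The sketch also assumes that atomic_int has the same size and layout as the 32-bit word the kernel expects.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <linux/futex.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* state: 0 = unlocked, 1 = locked, 2 = locked and there may be waiters */
struct fmutex {
    atomic_int state;
};

/* Thin wrapper: glibc provides no futex() function, so use syscall(). */
static long futex(atomic_int *addr, int op, int val)
{
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

static void fmutex_lock(struct fmutex *m)
{
    int c = 0;
    /* Fast path: uncontended lock taken entirely in user space. */
    if (atomic_compare_exchange_strong(&m->state, &c, 1))
        return;
    /* Slow path: mark the lock contended, then sleep in the kernel. */
    if (c != 2)
        c = atomic_exchange(&m->state, 2);
    while (c != 0) {
        /* Sleeps only if state is still 2; otherwise returns at once. */
        futex(&m->state, FUTEX_WAIT, 2);
        c = atomic_exchange(&m->state, 2);
    }
}

static void fmutex_unlock(struct fmutex *m)
{
    /* Fast path: old state 1 means nobody waited, so no system call. */
    if (atomic_fetch_sub(&m->state, 1) != 1) {
        atomic_store(&m->state, 0);
        futex(&m->state, FUTEX_WAKE, 1);   /* wake one waiter */
    }
}

/* Demo: several threads increment a shared counter under the lock. */
#define DEMO_ITERATIONS 100000
#define DEMO_MAX_THREADS 16

static struct fmutex demo_lock;
static long demo_counter;

static void *demo_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < DEMO_ITERATIONS; i++) {
        fmutex_lock(&demo_lock);
        demo_counter++;
        fmutex_unlock(&demo_lock);
    }
    return NULL;
}

/* Returns the final counter value, or -1 if a thread could not be created. */
static long run_demo(int nthreads)
{
    pthread_t tid[DEMO_MAX_THREADS];
    int created = 0;
    if (nthreads > DEMO_MAX_THREADS)
        nthreads = DEMO_MAX_THREADS;
    demo_counter = 0;
    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tid[i], NULL, demo_worker, NULL) != 0)
            break;
        created++;
    }
    for (int i = 0; i < created; i++)
        pthread_join(tid[i], NULL);
    return created == nthreads ? demo_counter : -1;
}
```

Calling run_demo(4) from main should return 4 times DEMO_ITERATIONS if the lock provides mutual exclusion. On older C libraries the program may need to be linked with -pthread.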

Improved Threading Model and Support for NPTL

LinuxThreads, the thread library used with earlier Linux kernels, has well-known shortcomings. In fact, "the fellow is as brain-damaged as LinuxThreads" is a common expression among kernel hackers. The improved threading model in 2.6 is a 1:1 model, with one kernel thread for each user thread. The kernel's internal threading infrastructure has been rewritten so that the new Native POSIX Thread Library (NPTL) can run on top of it.

NPTL brings an eight-fold improvement over its predecessor. Tests conducted by its authors have shown that Linux, with this new threading, can start and stop 100,000 threads simultaneously in about two seconds. This task took 15 minutes on the old threading model.
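One visible consequence of the 1:1 model can be checked from user space: every POSIX thread is backed by its own kernel thread with its own kernel thread ID, while all threads of a process report the same process ID. The sketch below is a hedged illustration for Linux; the names record_ids and count_native_threads are hypothetical, and the kernel thread ID is read with the raw gettid system call.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_THREADS 64

struct ids {
    pid_t pid;   /* process ID as seen by the thread */
    pid_t tid;   /* kernel thread ID of the thread */
};

static void *record_ids(void *arg)
{
    struct ids *out = arg;
    out->pid = getpid();
    out->tid = (pid_t)syscall(SYS_gettid);
    return NULL;
}

/*
 * Start n threads and count those that share the caller's process ID
 * but run on a kernel thread other than the initial one (whose thread
 * ID equals the process ID). Returns -1 if a thread cannot be created.
 */
static int count_native_threads(int n)
{
    pthread_t th[MAX_THREADS];
    struct ids ids[MAX_THREADS];
    int created = 0, ok = 0;

    if (n > MAX_THREADS)
        n = MAX_THREADS;
    for (int i = 0; i < n; i++) {
        if (pthread_create(&th[i], NULL, record_ids, &ids[i]) != 0)
            break;
        created++;
    }
    for (int i = 0; i < created; i++)
        pthread_join(th[i], NULL);
    if (created != n)
        return -1;

    pid_t self = getpid();
    for (int i = 0; i < n; i++)
        if (ids[i].pid == self && ids[i].tid != self)
            ok++;
    return ok;
}
```

With a 1:1 library such as NPTL, count_native_threads(8) called from main is expected to return 8.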

Along with POSIX threads, 2.6 provides POSIX signals and POSIX high-resolution timers as part of the mainstream kernel. POSIX signals are an improvement over UNIX-style signals, which were the default in previous Linux releases. Unlike UNIX signals, POSIX signals cannot be lost and can carry information as an argument. Also, POSIX signals can be sent from one POSIX thread to another, rather than only from process to process, like UNIX signals.
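Both properties, queueing and a data argument, can be seen with the standard sigqueue and sigwaitinfo calls. In this minimal sketch a process queues several instances of a real-time signal to itself, each carrying an integer, and then collects them one by one. The function name queue_and_collect is hypothetical, and using SIGRTMIN is simply a convenient choice for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <unistd.h>

/*
 * Queue `count` instances of SIGRTMIN to this process, carrying the
 * payloads 1..count, then receive them and store the payloads in `out`.
 * Returns the number received, or -1 on error. SIGRTMIN is left blocked
 * in the calling thread afterward.
 */
static int queue_and_collect(int count, int *out)
{
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGRTMIN);

    /* Block the signal so that instances stay queued until collected. */
    if (sigprocmask(SIG_BLOCK, &set, NULL) != 0)
        return -1;

    for (int i = 1; i <= count; i++) {
        union sigval value;
        value.sival_int = i;             /* data carried by the signal */
        if (sigqueue(getpid(), SIGRTMIN, value) != 0)
            return -1;
    }

    for (int i = 0; i < count; i++) {
        siginfo_t info;
        if (sigwaitinfo(&set, &info) < 0)
            return -1;
        out[i] = info.si_value.sival_int;
    }
    return count;
}
```

With an ordinary signal such as SIGUSR1, several sends while the signal is blocked would typically collapse into a single pending instance. Here each queued instance is received separately, in the order it was sent. The number of signals that can be queued is bounded by a per-user resource limit.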

Embedded systems often need to poll hardware or do other tasks on a fixed schedule. POSIX timers make it easy to arrange any task to be scheduled periodically. The clock that the timer uses can be set to tick at a rate as fine as one kilohertz, so software engineers can control the scheduling of tasks with precision.
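A periodic task of this kind might be arranged as follows. The sketch uses the standard timer_create and timer_settime calls with signal notification and waits for each expiration synchronously. The function name run_periodic and the choice of CLOCK_MONOTONIC and SIGRTMIN are assumptions made for the example, and the actual resolution depends on the kernel's timer configuration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <time.h>

/*
 * Arm a periodic timer with the given period (in milliseconds, > 0),
 * wait for `ticks` notifications, and return how many were seen, or
 * -1 on setup failure. SIGRTMIN is left blocked in the calling thread.
 */
static int run_periodic(long period_ms, int ticks)
{
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGRTMIN);
    /* Block the signal so it can be received with sigwaitinfo(). */
    if (sigprocmask(SIG_BLOCK, &set, NULL) != 0)
        return -1;

    struct sigevent sev;
    memset(&sev, 0, sizeof sev);
    sev.sigev_notify = SIGEV_SIGNAL;     /* notify by sending a signal */
    sev.sigev_signo = SIGRTMIN;

    timer_t timer;
    if (timer_create(CLOCK_MONOTONIC, &sev, &timer) != 0)
        return -1;

    struct itimerspec its;
    its.it_value.tv_sec = period_ms / 1000;               /* first expiration */
    its.it_value.tv_nsec = (period_ms % 1000) * 1000000L;
    its.it_interval = its.it_value;                       /* then repeat */
    if (timer_settime(timer, 0, &its, NULL) != 0) {
        timer_delete(timer);
        return -1;
    }

    int seen = 0;
    while (seen < ticks) {
        siginfo_t info;
        if (sigwaitinfo(&set, &info) < 0)
            break;                       /* for example, interrupted */
        seen++;                          /* the periodic work would go here */
    }

    timer_delete(timer);
    return seen;
}
```

For example, run_periodic(10, 5) waits for five notifications roughly 10 milliseconds apart. If the work takes longer than one period, expirations are not queued individually; timer_getoverrun reports how many were missed. On older C libraries the program may need to be linked with -lrt.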

Subarchitecture Support

Hardware designs in the embedded world often are customized for special applications. It is common for designers to need to solve a design issue in an original way. For example, a purpose-built board may use different IRQ management than what a similar reference design uses. In order to run on the new board, Linux has to be ported or altered to support the new hardware. This porting is easier if the operating system is made of components that are well separated, making it necessary to change only the code that has to change. The components of Linux 2.6 that are likely to be altered for a custom design have been refactored with a concept called Subarchitecture. Components are separated clearly and can be modified or replaced individually with minimal impact on other components of the board support package.

By formalizing Linux's support for the slightly different hardware types, the kernel can be ported more easily to other systems, such as dedicated storage hardware and other components that use industry-dominant processor types.
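The separation can be pictured with a small user-space model. This is only an illustration of the idea, not the kernel's mechanism: the kernel typically selects subarchitecture code when it is built, whereas the sketch switches at run time through a table of function pointers so that both variants fit in one program. All names here (irq_ops, reference_board, custom_board, service_irq) and the two simulated interrupt controllers are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Board-specific IRQ operations; generic code sees only this interface. */
struct irq_ops {
    const char *name;
    void (*mask)(unsigned irq);
    void (*unmask)(unsigned irq);
    int (*is_masked)(unsigned irq);
};

/* Reference design: a set bit in the register means "masked". */
static uint32_t ref_mask_reg;
static void ref_mask(unsigned irq)      { ref_mask_reg |= (uint32_t)1 << irq; }
static void ref_unmask(unsigned irq)    { ref_mask_reg &= ~((uint32_t)1 << irq); }
static int  ref_is_masked(unsigned irq) { return (int)((ref_mask_reg >> irq) & 1u); }

/* Custom board: a set bit in the register means "enabled". */
static uint32_t custom_enable_reg;
static void custom_mask(unsigned irq)      { custom_enable_reg &= ~((uint32_t)1 << irq); }
static void custom_unmask(unsigned irq)    { custom_enable_reg |= (uint32_t)1 << irq; }
static int  custom_is_masked(unsigned irq) { return !((custom_enable_reg >> irq) & 1u); }

static const struct irq_ops reference_board = {
    "reference", ref_mask, ref_unmask, ref_is_masked
};
static const struct irq_ops custom_board = {
    "custom", custom_mask, custom_unmask, custom_is_masked
};

/* Example handler that just counts invocations. */
static int handled;
static void count_handler(unsigned irq)
{
    (void)irq;
    handled++;
}

/*
 * Generic code: identical for every board. Runs the handler with the
 * line masked; returns 1 if the handler ran and 0 if the line was
 * masked. The irq number must be below 32 in this model.
 */
static int service_irq(const struct irq_ops *ops, unsigned irq,
                       void (*handler)(unsigned))
{
    if (ops->is_masked(irq))
        return 0;
    ops->mask(irq);
    handler(irq);
    ops->unmask(irq);
    return 1;
}
```

Porting to the custom board means supplying a new irq_ops table; service_irq and everything built on it stay untouched, which is the property the subarchitecture refactoring aims for.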

Linux on Microcontrollers

In the embedded marketplace, simpler microcontrollers often are the appropriate choice when low cost and simplicity are called for. Linux 2.6 comes with the acceptance and merging of much of the uClinux project into the mainstream kernel. The uClinux project is the Linux for Microcontrollers project. This variant of Linux has been a major driver of support for Linux in the embedded market. Unlike the normal Linux ports we are accustomed to, embedded ports do not have all the features that we associate with the kernel, due to hardware limitations. The primary difference is that these ports run on processors without an MMU, or memory management unit--the hardware that makes a protected-mode OS protected. Although these generally are true multitasking Linux systems, they are missing memory protection and other related features. Without memory protection, it is possible for a wayward process to read the data of, or even crash, other processes on the system. This may make them unsuitable for a multi-user system but an excellent choice for a low-cost PDA or dedicated device.

The 2.6 version of Linux supports several current microcontrollers that don't have memory management units. Linux 2.6 supports Motorola m68k processors, such as Dragonball and ColdFire, as well as Hitachi H8/300 and NEC v850. Also supported is the ETRAX family of networking microcontrollers by Axis.

Audio & Multimedia

With major vendors in the consumer devices market forming such associations as the Consumer Embedded Linux Forum (CELF), Linux is becoming the first choice among operating systems in the consumer devices market. To help with the consumer devices market, Linux 2.6 includes the Advanced Linux Sound Architecture, or ALSA. This state-of-the-art facility supports USB and MIDI devices with fully thread-safe, multi-processor-safe software. With ALSA, a system can run multiple sound cards and do such things as play and record at the same time or mix multiple audio streams.

Video4Linux, the system for supporting video, is all new in Linux 2.6. Although it is not backward-compatible with previous video paradigms, it is intended for the latest stable versions of radio and TV tuners, video cameras and other multimedia. And on a completely new track, Linux 2.6 includes the first built-in support for Digital Video Broadcasting (DVB) hardware. This type of hardware, common in set-top boxes, can be used to make a Linux server into a TiVo-like device, with the appropriate software.

In Part II of this article, we look at human device interfaces, networking filesystems and 64-bit machines.