The PC Boot Process
By Scud-O <s...@thtj.com>
Your PC goes through 3 basic steps when it starts up. After these three
steps, your OS will be loading, and things are different from OS to
OS. Both DOS systems and Linux/UNIX systems will be covered.
Step 1: PC Power
As soon as you press the power button of your computer, electricity
flows to every circuit in the computer.
Step 2: Hardware Check
Once your system is getting power, it needs to know that it has
functioning components, so your computer's BIOS tells the CPU to
go check on the status of the hardware. The hardware check usually
involves a memory check and count, a check for disk drives, ROM checks,
keyboard checks, speaker beepings, etc.
When your computer starts up, the CPU sits there not knowing what to
do next. The CPU is a dumb thing, it always needs to be told what to
do next. Luckily for us, the BIOS ( Basic Input / Output System ) is
on the motherboard to tell the CPU how it can communicate with the
Keyboard, Monitor, Hard Drives, etc. The BIOS is a very basic program
that is stored in ROM ( Read Only Memory ) and is always run when the
computer is starting up. BIOS must be stored in ROM, because most
other memory, namely RAM and cache, loses the information that it
stores when the power is off. ROM never forgets.
All Intel x86 chips have 1 main thing in common: when they power up,
they start right off by executing instructions located 16 bytes below
the 1024k level, at the real mode address FFFF:0000. That is why BIOS
chips get locations in the range of 960k to 1024k, so they can fill
the needs of the PC.
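To see why FFFF:0000 sits 16 bytes below the 1024k level, here is a small sketch of the real mode address arithmetic (segment times 16, plus offset). The function name is made up for this illustration.

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Turn a real mode segment:offset pair into a 20-bit physical address."""
    # The segment is shifted left by 4 bits (multiplied by 16), the offset
    # is added, and the result is kept within a 20-bit (1 MB) address space.
    return ((segment << 4) + offset) & 0xFFFFF


ONE_MEG = 1024 * 1024

reset_vector = real_mode_address(0xFFFF, 0x0000)
print(hex(reset_vector))        # physical address of FFFF:0000
print(ONE_MEG - reset_vector)   # distance below the 1024k level
print(hex(960 * 1024))          # where the 960k BIOS range begins
```

The 16 bytes left at the top are only enough for a jump, which is why the rest of the BIOS lives just below it in the 960k to 1024k range.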
The BIOS's main job initially is to inventory and initialize
everything that is in your PC. This process can be broken down into 5
main steps:
1. Test low memory
For your computer to operate normally, it needs RAM to
work with. Most BIOSs will start by testing the bottom
part of your system's RAM. If this process fails, then
most BIOS will lock up and crash, unable to recover.
2. Scan for other BIOSs
The BIOS on your computer is not made to support
everything out there. Unusual video cards, SCSI cards,
Network cards, etc. all usually have their own BIOS
built into their circuit boards. The main BIOS on your
computer decides to play nice and let these mini-BIOSs
go first. These add-on ROMs normally start with 3 signature
bytes for easy identification. The first two bytes
are hex 55 and then AA, and the third is a number that
indicates how long the BIOS is, counted in 512 byte blocks.
Multiply this number by 512, and you have the BIOS length
in bytes. The main BIOS normally finds these other BIOS
ROMs in the memory addresses between 768k and 960k.
3. Yield to other BIOSs
If your main BIOS finds any other BIOSs, it kindly
lets them go first. For example if you had a video
card with its own BIOS, you would see its copyrights
and notices before the main ROM said that it was done.
Another example could be a SCSI card. When my computer
boots up, the SCSI card's BIOS prints a few copyrights
and notices to the screen, and lets me hit <Alt><Esc>
or something if I want to modify my SCSI IDs, low
level format a drive, add or subtract a drive, etc.
Once these BIOSs have finished all they need to do,
the main BIOS will go back to work.
4. Inventory system
Now that all of the BIOSs have done everything that
they need to do, the main BIOS goes and inventories
everything that it will have control over. At a
minimum the BIOS will at least check the RAM. The BIOS
will also quickly check the hard drives and floppy
drives. You can see this by the quick flashing of the LEDs
on the hard drive and the floppy drive. The BIOS
will wait for these drives to respond, and it can
wait a long time. Some systems will wait as long as 4
or 5 minutes before they report a hard drive
failure. Also during this BIOS startup process, the
CMOS Setup information is read as well. The CMOS gives
your computer a detailed report of hard drive and
floppy drive information as well as disk layout, etc.
5. Test the system
Most of this step is included in the inventory: the
BIOS checks the RAM, floppy disks, hard drives, etc.
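As a rough sketch of the scan in step 2, the code below walks a fake copy of the first megabyte and looks for the 55 AA signature between 768k and 960k. The 2k step size and the function name are assumptions for this illustration, and the two ROMs planted in the fake memory are made up.

```python
ROM_START = 768 * 1024   # 0xC0000
ROM_END = 960 * 1024     # 0xF0000
STEP = 2 * 1024          # assumption: add-on ROMs start on 2k boundaries


def find_option_roms(memory):
    """Return a list of (address, length_in_bytes) for each add-on ROM."""
    found = []
    addr = ROM_START
    while addr < ROM_END:
        if memory[addr] == 0x55 and memory[addr + 1] == 0xAA:
            # The third byte counts 512 byte blocks.
            length = memory[addr + 2] * 512
            found.append((addr, length))
            # Skip over this ROM, rounded up to the next 2k boundary.
            addr += max(STEP, -(-length // STEP) * STEP)
        else:
            addr += STEP
    return found


# Fake first megabyte with two made-up ROMs planted in it.
fake_memory = bytearray(1024 * 1024)
fake_memory[0xC0000:0xC0003] = bytes([0x55, 0xAA, 0x40])  # 0x40 blocks
fake_memory[0xC8000:0xC8003] = bytes([0x55, 0xAA, 0x20])  # 0x20 blocks

roms = find_option_roms(fake_memory)
for address, length in roms:
    print(hex(address), length)
```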
The process I have just described is often referred to as the Power On
Self Test (POST). This is a test performed by the CPU, under the
direction of the BIOS, where it checks the various parts of the
computer to determine if they are working properly. Some things that
you will see on your monitor while the CPU is doing the POST:
1. Memory is counted
2. Messages from the BIOS as peripherals are checked
3. Lights on the keyboard blinking
4. Possibly a beep from the speaker.
After all of this, the CPU then goes hunting for the rest of the OS.
It first checks for, and reads (if present), a floppy disk with the
OS, and if that fails, then it goes to the hard drive.
Step 3: Load the OS
Once the first 2 steps have been completed, the OS is ready to be
loaded. This is where things become OS dependent and they split off
from OS to OS. First I will discuss DOS and how it is loaded, and then
I will show how a Linux/UNIX OS would be loaded for comparison.
DOS:
By far the most common platform for computers is the Microsoft DOS OS.
(Windows 95 is still included in this discussion since it still
follows many of the same instructions that DOS follows for the boot up
process.) When a DOS system has gotten up to this step of the boot up,
it performs the following steps:
1. Scan drives A:, and then C: to find the drive that is
ready with the OS.
2. If the drive the CPU is reading from is C:, load up the
Master Boot Record (MBR). If loading from the A drive,
skip to the DOS Boot Record (step 4).
3. Execute the program in the MBR and find the bootable
partition in the MBR.
4. Load the DOS Boot Record (DBR), the first sector of the
primary DOS partition or the first sector of a bootable
DOS floppy.
5. Pass control to the DBR.
6. The DBR directs the loading of the 'hidden files' IO.SYS
and MSDOS.SYS (for an MS-DOS system, IBMBIO.COM and
IBMDOS.COM for PC-DOS), which comprise most of DOS.
7. The first hidden file (IO.SYS or IBMBIO.COM) reloads the
other hidden files.
8. The first hidden file also loads and interprets the
CONFIG.SYS file and all of its device drivers.
9. Unless the SHELL statement says otherwise, DOS loads the
command shell COMMAND.COM from the root directory (C:\)
10. COMMAND.COM loads and executes AUTOEXEC.BAT
It is time to look at a few of these steps in more detail.
Load the MBR: Once the BIOS has found a drive that is ready for booting
(A: or C:), it loads the first sector of data from
that drive. The coordinates are: cylinder 0, head 0,
sector 1.
Things are a bit different for floppy disks, but over
all they are pretty much the same, and as hard drives
are more common to boot from, we will focus on booting
from a hard drive. The CPU looks on the hard drive for
the MBR when it loads. The MBR is only 512 bytes, so
under normal conditions it should go into memory
quite easily. Otherwise the system will not boot up.
Also, if the MBR is empty, then there is nothing to
load and the system will not boot up: the MBR holds a
program that has to run, because it tells the CPU what to
look at to continue loading. Viruses commonly erase or
modify the MBR for this reason; a messed up MBR leaves
you with a computer that does not run.
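The coordinates cylinder 0, head 0, sector 1 are just another way of saying "the very first sector on the disk". A small sketch of the usual conversion from those coordinates to a plain sector number follows; the drive geometry (16 heads, 63 sectors per track) is a made-up example, not something taken from a particular drive.

```python
def chs_to_lba(cylinder, head, sector, heads_per_cylinder, sectors_per_track):
    """Convert cylinder/head/sector coordinates to a linear sector number."""
    # Cylinders and heads count from 0, but sectors count from 1.
    return (cylinder * heads_per_cylinder + head) * sectors_per_track + (sector - 1)


# Hypothetical geometry, for illustration only.
HEADS = 16
SECTORS = 63

mbr_sector = chs_to_lba(0, 0, 1, HEADS, SECTORS)
next_track = chs_to_lba(0, 1, 1, HEADS, SECTORS)
print(mbr_sector, next_track)
```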
Load the DBR: After the MBR has been loaded, BIOS gives control over
to it. Since hard drives can be broken into many pieces
to run many OSes or for other reasons, the MBR has to
tell the CPU where to go next. The MBR does this, and
the CPU goes and finds the bootable partition. Once the
bootable partition has been found, the MBR passes
control to the first sector of that partition. If you
have ever heard of boot sector viruses, this is where
they become active. They take over the first sector, run
the code they need to spread, and then tell the CPU
where they have stored the DBR and tell it to go there to
finish loading. Once the DBR is in control it finds the
2 'hidden' programs that work behind the scenes to
control the very basic interactions of the OS. These 2
files are IO.SYS and MSDOS.SYS for an MS-DOS system, and
IBMBIO.COM and IBMDOS.COM for those of you using PC-DOS.
If these 2 files are not found, you will see the ever
familiar 'Non-system disk or disk error' message.
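As a rough sketch of how the program in the MBR finds the bootable partition, the code below reads the standard partition table out of a 512 byte sector: four 16 byte entries starting at offset 446, a boot flag of hex 80 on the active entry, and the 55 AA signature in the last two bytes. The function name and the fake partition values are made up for this illustration.

```python
import struct

SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 446
ENTRY_SIZE = 16
BOOT_SIGNATURE = b"\x55\xaa"


def find_bootable_partition(mbr):
    """Return (index, type, first_sector, sector_count) or None."""
    if len(mbr) != SECTOR_SIZE or bytes(mbr[510:512]) != BOOT_SIGNATURE:
        raise ValueError("not a valid MBR")
    for index in range(4):
        start = PARTITION_TABLE_OFFSET + index * ENTRY_SIZE
        entry = bytes(mbr[start:start + ENTRY_SIZE])
        boot_flag = entry[0]        # 0x80 marks the bootable partition
        partition_type = entry[4]
        first_sector, sector_count = struct.unpack_from("<II", entry, 8)
        if boot_flag == 0x80:
            return index, partition_type, first_sector, sector_count
    return None


# Build a fake MBR whose second entry is marked bootable (made-up values).
fake_mbr = bytearray(SECTOR_SIZE)
struct.pack_into("<B3xB3xII", fake_mbr, PARTITION_TABLE_OFFSET + ENTRY_SIZE,
                 0x80, 0x06, 63, 204800)
fake_mbr[510:512] = BOOT_SIGNATURE

active = find_bootable_partition(fake_mbr)
print(active)
```

An all-zero sector fails the signature check here, which matches the point made earlier about an empty MBR.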
Execute the hidden files: After the DBR has loaded the hidden files,
it passes control over to the first hidden file, IO.SYS,
and disappears. The first file then double checks that
it has loaded properly and also checks the other hidden
file, MSDOS.SYS. Once up and running, IO.SYS loads the
CONFIG.SYS file.
Assuming that you do have a CONFIG.SYS file, the CPU
executes the commands in CONFIG.SYS: it applies the
BUFFERS, FILES, STACKS, DOS, LASTDRIVE, and FCBS
settings and loads any device drivers.
Once CONFIG.SYS has been processed, and if there is no SHELL
command in the CONFIG.SYS file, the hidden files will
load COMMAND.COM (if a SHELL command is found, the file
it specifies is loaded in place of COMMAND.COM).
COMMAND.COM is the shell of DOS (just like sh or bash
are shells for UNIX), the program that takes user
input and loads programs at the user's request.
After all of this, COMMAND.COM runs AUTOEXEC.BAT and
executes any commands that are in it.
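The SHELL decision can be sketched in a few lines. This is only an illustration of the rule described above, not how IO.SYS actually does it: the function name is made up, the sample CONFIG.SYS contents are hypothetical, and taking the first SHELL line found is a simplification.

```python
def pick_shell(config_sys_text, default=r"C:\COMMAND.COM"):
    """Return the shell that would be loaded for a given CONFIG.SYS."""
    for raw_line in config_sys_text.splitlines():
        line = raw_line.strip()
        if not line or line.upper().startswith("REM"):
            continue  # skip blank lines and comments
        name, sep, value = line.partition("=")
        if sep and name.strip().upper() == "SHELL":
            parts = value.split()
            if parts:
                # The first word is the program, the rest are its arguments.
                return parts[0]
    return default


# Hypothetical CONFIG.SYS contents, for illustration only.
EXAMPLE = "\n".join([
    "FILES=30",
    "BUFFERS=20",
    r"DEVICE=C:\DOS\HIMEM.SYS",
    r"SHELL=C:\DOS\COMMAND.COM C:\DOS /E:1024 /P",
])

print(pick_shell(EXAMPLE))
print(pick_shell("FILES=30"))
```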
Linux/UNIX:
Linux is a complex Operating System. Linux has many, many parts to it
and thus it must move itself around many times when it is loading.
When an x86 processor is turned on it is a 16-bit processor that can
only see and access 1 MB of RAM. (Yes, even you kids with your little
Pentium II 300MHz computers and 128 MBs of RAM, you only have a dinky
16-bit processor and 1 meg of RAM on start up!) This mode of the
processor is known as 'real mode' and it is necessary for compatibility
with older processors. Everything must be in that 1 meg of RAM: the
firmware BIOS, video buffers, memory for expansion boards and the
infamous little 640k of RAM all must reside there. Adding to this
problem is the fact that the BIOS on a PC only loads half a kilobyte of
code and establishes its own memory layout before it even loads the
first sector. Whatever the boot media might be (floppy, disk, etc.)
the first sector of the bootable partition is loaded into memory at
0x7c00, which is where all of the execution begins. What happens at
0x7c00 all depends on the method used to load Linux. There are 3 main
methods to boot up a Linux kernel, so we will discuss those 3 methods.
They are: booting the kernel from a floppy disk, LILO, and Loadlin.
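To picture how crowded that first megabyte is, here is a small sketch of the layout using the boundaries mentioned in this article (640k, 768k, 960k and 1024k). The region labels and the function name are mine.

```python
KB = 1024

# (start, end, description); end addresses are exclusive.
REGIONS = [
    (0,        640 * KB,  "conventional RAM (the 640k)"),
    (640 * KB, 768 * KB,  "video buffers"),
    (768 * KB, 960 * KB,  "expansion board ROMs"),
    (960 * KB, 1024 * KB, "main BIOS ROM"),
]


def region_of(address):
    """Name the part of the real mode memory map an address falls in."""
    for start, end, name in REGIONS:
        if start <= address < end:
            return name
    return None  # above 1 MB, out of reach in real mode


print(region_of(0x7C00))     # where the first sector is loaded
print(region_of(0xFFFF0))    # where the CPU starts executing
print(region_of(0x100000))   # the 1 MB mark itself
```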
Booting zImage and bzImage
Although most people have moved to using LILO these days, you can
still boot Linux up from a copy of the raw kernel on a floppy disk.
However, with the ever expanding size of the Linux kernel, this may
soon no longer be an easy possibility. To try out this booting without
LILO, place a disk in your floppy drive, and type
' cat zImage > /dev/fd0 '. This should work perfectly on a Linux
system. To configure your new boot kernel, use rdev.
The file zImage that you just copied onto a floppy disk is the
compressed kernel image that resides in 'arch/i386/boot' after
'make zImage' or 'make boot' has been executed. The latter command
seems to be more universal for UNIX, so you can probably get the same
thing on a BSD or Sun box. If, instead you made a "big zImage", the
kernel file is called bzImage and is placed in the same directory.
As stated above, it is hard to boot a Linux kernel on an x86
machine because of the limited amount of memory available on boot up.
Linux has to move itself around several times to maximize the 640k of
memory that it has. When booting a zImage kernel, Linux performs the
following steps to boot up. All of the following path names are
relative to arch/i386/boot.
1. The first sector (0x7c00) moves itself up to 0x90000 and
loads subsequent sectors after itself, getting them from
the boot device using the BIOS's functions to access the
disk. The rest of the kernel is then loaded to 0x10000 to
allow for a maximum size of half a megabyte of data (this
is the compressed image). The boot sector code is at
bootsect.S, a real mode assembly file.
2. The code at 0x90200 (defined in setup.S) takes care of
a few of the hardware initializations and allows the
default text mode (video.S) to be changed. (Text mode
selection has been a compile time option since kernel
2.1.9)
3. Afterward the kernel is moved from 0x10000 (64k) to
0x1000 (4k). This move overwrites the BIOS data stored in
RAM, so BIOS calls can no longer be done. The first
physical page is not touched because it is the so-called
'zero-page' used for dealing with virtual memory.
4. At this point setup.S enters protected mode and jumps to
0x1000 where the kernel lives. All of the available
memory is available to be accessed now, and the system is
allowed to begin to run.
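The addresses in the steps above are easier to follow with the arithmetic written out. This sketch only restates the numbers from the text; the constant names are mine.

```python
BOOTSECT_LOADED_AT = 0x7C00    # where the BIOS puts the first sector
BOOTSECT_MOVED_TO = 0x90000    # where bootsect.S moves itself
SETUP_AT = 0x90200             # where the setup.S code sits
KERNEL_LOADED_AT = 0x10000     # compressed kernel image (64k)
KERNEL_MOVED_TO = 0x1000       # final spot before protected mode (4k)

# setup.S starts right after the 512 byte boot sector.
setup_offset = SETUP_AT - BOOTSECT_MOVED_TO

# Room between the load address and the relocated boot sector.
max_image_size = BOOTSECT_MOVED_TO - KERNEL_LOADED_AT

print(setup_offset)                 # size of one sector
print(max_image_size // 1024)       # room for the image, in kilobytes
print(KERNEL_LOADED_AT // 1024, KERNEL_MOVED_TO // 1024)
```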
The above steps held true when the kernel was under half a megabyte
and able to fit into the half of a megabyte that was assigned to it,
the range between 0x10000 and 0x90000. As Linux has developed, it has
had many features added to it, and it has grown to well over half of
a megabyte of code. Needless to say, the kernel can no longer be
moved to 0x1000. These days the code at 0x1000 is the gunzip part of
gzip. The following steps have now been added to uncompress the
kernel and run it:
5. head.S in the compressed directory is at 0x1000 and it is
in charge of gunzipping the kernel. It does this by
simply calling the decompress_kernel function that is
defined in compressed/misc.c, which calls inflate, which
then goes and writes its output starting at 0x100000 (1MB)
High memory can now be accessed, because the processor is
definitely out of its limited boot environment - also
known as the 'real' mode.
6. After decompression, head.S jumps to the actual beginning
of the kernel. The relevant code is in ../kernel/head.S,
outside of the boot directory.
The boot process is now over, and head.S (the code found at 0x100000
that used to be at 0x1000 before compressed boots were introduced) can
complete the processor initialization and call start_kernel(). After
this step, all of the code is written in C.
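The decompression step can be imitated in user space. The sketch below is only a stand-in for what decompress_kernel does: it uses Python's gzip module in place of the kernel's own inflate code, a bytearray in place of physical memory, and a made-up payload in place of a kernel image.

```python
import gzip

ONE_MEG = 0x100000


def decompress_to_high_memory(memory, compressed):
    """Gunzip an image and write it into 'memory' starting at 1 MB."""
    image = gzip.decompress(compressed)
    if ONE_MEG + len(image) > len(memory):
        raise MemoryError("image does not fit in the simulated memory")
    memory[ONE_MEG:ONE_MEG + len(image)] = image
    return len(image)


# Made-up stand-ins for the compressed kernel and for RAM.
fake_kernel = b"kernel code " * 1000
compressed_image = gzip.compress(fake_kernel)
memory = bytearray(2 * ONE_MEG)

written = decompress_to_high_memory(memory, compressed_image)
print(len(compressed_image), "bytes compressed ->", written, "bytes at 1MB")
```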
The process described above is great, but it only works if the
compressed kernel can fit into half a megabyte of space, something
that some kernels are unable to do. If you have a lot of device drivers
in your kernel, or if you are just installing Linux, and it has all of
its device drivers inside the kernel, half of a megabyte is just
simply not enough. bzImage is the solution, and it was introduced in
kernel version 1.3.73.
You can generate bzImage by typing 'make bzImage' from the top
of the Linux source directory. This kind of kernel boots very
similarly to zImage, with a few modifications:
1. When the system is loaded to 0x10000, a little
helper code routine is called after loading each
64k data block. This helper code moves the data
block into high memory by implementing a special
BIOS call. Only in the newer BIOS versions is this
call implemented, so 'make boot' will still build
only the conventional zImage, but this may change
within a short time period.
2. setup.S does not move the system back into 0x1000
(4k). Instead, after entering protected mode, it
jumps ahead to 0x100000 (1MB) where data has been
moved by BIOS in the previous step.
3. The decompressor found at 1MB writes the
uncompressed kernel image into low memory until it
is exhausted, and then into high memory after the
compressed image. The two pieces are then
reassembled to the address 0x100000. Several
memory moves are needed to perform this correctly.
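Step 1 of the bzImage boot can be imitated with a simple loop: read a 64k block into the low buffer, then have a helper copy it above 1MB. This is only a simulation under my own assumptions; the function name is made up, a bytearray stands in for RAM, and a plain slice copy stands in for the special BIOS call.

```python
CHUNK = 64 * 1024
LOW_BUFFER = 0x10000
HIGH_MEMORY = 0x100000


def load_big_image(memory, image):
    """Load 'image' 64k at a time through the low buffer into high memory."""
    destination = HIGH_MEMORY
    for start in range(0, len(image), CHUNK):
        block = image[start:start + CHUNK]
        # 1. The block is read from the boot device into low memory.
        memory[LOW_BUFFER:LOW_BUFFER + len(block)] = block
        # 2. The helper routine moves it into high memory.
        memory[destination:destination + len(block)] = \
            memory[LOW_BUFFER:LOW_BUFFER + len(block)]
        destination += len(block)
    return destination - HIGH_MEMORY


# Made-up image a little over 256k long, and 4 MB of simulated RAM.
fake_image = bytes(range(256)) * 1024 + b"tail"
memory = bytearray(4 * 1024 * 1024)

moved = load_big_image(memory, fake_image)
print(moved, "bytes now sit at", hex(HIGH_MEMORY))
```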
The rule for building the big compressed image can be read
from the Makefile; it affects several files in arch/i386/boot. A good
thing about bzImage is that when kernel/head.S is called, it doesn't
notice the extra work, and everything goes forward as usual.
LILO : The LInux LOader
Most Linux systems on an x86 box don't boot the raw kernel; they use
LILO, and boot from the hard drive. LILO replaces part of the process
described above so that it can load Linux from a kernel that may be
scattered all throughout a disk. This allows you to boot Linux from a
partition without using a boot floppy. (Although you can run LILO off
of a floppy disk, if you do not wish to modify your hard drive's MBR.)
LILO uses the BIOS services to load single sectors from disk
and then it jumps to setup.S. It arranges the memory layout in the
same fashion as bootsect.S; thus the usual booting procedure can be
done painlessly. LILO is also able to handle kernel command lines,
which is why LILO is more popular to use than loading the raw kernel
from a floppy.
If for some strange reason, you want to boot a bzImage with
LILO, you have to use version 18 or later of LILO. Earlier versions
were not designed to handle loading segments into high memory, an
ability that is needed for loading big images, so that setup.S can
find the memory layout that it needs and expects.
The biggest problem and disadvantage that LILO has is that it
uses the BIOS to load the whole system. This forces the kernel and
other important files to use the first 1024 cylinders of disks to be
accessible to the BIOS. When using a PC's hardware, you can see how
old-fashioned the architecture of the PC really is. (Yes, even you
there with your Pentium II 300, it is as slow and old as my 486/66!)
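The 1024 cylinder limit comes from the old BIOS disk calls, which leave only 10 bits for the cylinder number. A quick sketch of how much disk that reaches; the 16 head, 63 sector geometry is a hypothetical example, and the function name is mine.

```python
SECTOR_SIZE = 512
MAX_CYLINDERS = 2 ** 10   # a 10 bit cylinder number gives 1024 cylinders


def bios_reachable_bytes(heads, sectors_per_track):
    """How many bytes of a disk the BIOS can reach with this geometry."""
    return MAX_CYLINDERS * heads * sectors_per_track * SECTOR_SIZE


# Hypothetical drive geometry, for illustration only.
reachable = bios_reachable_bytes(heads=16, sectors_per_track=63)
print(MAX_CYLINDERS, "cylinders")
print(reachable // (1024 * 1024), "MB reachable")
```

Anything LILO needs at boot time has to sit inside that reachable area, no matter how big the drive really is.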
Loadlin
Say you are running MS-DOS or Win95, and you realize that you need to
load Linux for something real quick. Oh no! Now you have to shut down
the current OS and spend 10 minutes waiting for Linux to load up! Well,
if you have loadlin installed, you can cut a lot of this time out and
boot up Linux while your other OS is running. This program is similar
to LILO in that it loads the kernel from a disk partition and then
jumps to setup.S. However, it is different from LILO in that it not
only faces BIOS restrictions, but it must also dispose of established
memory layouts without compromising the system's stability. However,
loadlin does not face the half a kilobyte length limit that LILO does
because loadlin is a complete program, not just a boot sector. If you
are running anything after version 1.6 of loadlin, you can load the
big images.
Loadlin is able to pass a command line to the kernel and is
therefore as flexible as LILO. For most purposes you will write a
batch file script for automated loadlin loading so that you don't have
to type a huge command line each time you load Linux. Most people call
this batch file linux.bat. Loadlin is also capable of turning any
networked PC into a Linux box. All you need is a kernel equipped for
mounting the root partition via NFS, plus loadlin and a linux.bat
containing the correct IP addresses. You will of course also need a
properly configured NFS server, but any Linux box can do that job.
As an example, the following command line would turn a PC into a Linux
workstation:
loadlin c:\zimage rw nfsroot=/usr/root/minos \
nfsaddrs=168.143.27.120:168.143.27.1:168.143.143.254:\
255.255.255.0:minos.mulder.clark.net
(The above is just an example. This command line would fail
on my Linux box since the IPs are incorrect, and I do not
run an NFS server, nor do I have more than one PC.)
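To make the nfsaddrs string easier to read, here is a sketch that splits it into its colon separated fields. The field order (client, server, gateway, netmask, hostname) is my reading of the kernel's NFS root documentation, so treat the labels as an assumption; the function name is made up.

```python
# Assumed field order for the nfsaddrs= option.
FIELDS = ("client_ip", "server_ip", "gateway_ip", "netmask", "hostname")


def parse_nfsaddrs(value):
    """Split an nfsaddrs= value into labeled fields."""
    return dict(zip(FIELDS, value.split(":")))


example = ("168.143.27.120:168.143.27.1:168.143.143.254:"
           "255.255.255.0:minos.mulder.clark.net")

settings = parse_nfsaddrs(example)
for name in FIELDS:
    print(name, "=", settings[name])
```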
start_kernel()
Once the architecture-specific initializations have been
completed (Linux is available on Alpha chips and SPARC systems
as well as PCs), the code in init/main.c takes control of
whatever processor you may be running, in this case an x86.
start_kernel() first calls on setup_arch(), which is
the very last architecture-specific function. Unlike the other
architecture-specific functions, this call can exploit all of
the processor's features and is much easier source to deal
with than the ones described before. This function is defined
in the kernel/setup.c under the architecture-specific source
tree. start_kernel() then initializes all of the kernel's
subsystems - IPC, networking, buffer cache, etc. After all of
this has taken place, the following 2 lines complete the code:
kernel_thread(init, NULL, 0);
cpu_idle(NULL);
The init thread is process number 1. It mounts the
root partition, executes /linuxrc if CONFIG_INITRD has been
selected during compile time, and then executes init. If
init can not be found, /etc/rc is then executed. In general,
using rc is discouraged, since init is much more flexible than
a shell script in handling system configurations. For kernels
after 2.1.21, the /etc/rc option is removed, thus making
it obsolete.
If neither init nor /etc/rc can run, or if they exit,
/bin/sh is executed repeatedly (2.1.21 and later will only
execute it once). This feature is only there as a safeguard in
case init is removed or corrupted by mistake. If you remove
a.out support from the kernel without recompiling your old
init, you will at least have a shell that you can use to fix
your errors with after rebooting.
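The fallback order just described can be written down as a tiny simulation. This models the older behavior (before 2.1.21, when /etc/rc was still tried). The function name is made up, and /sbin/init is an assumed path, since the text above only says "init".

```python
# Order in which the init thread tries programs, per the text above.
# "/sbin/init" is an assumed location for init.
FALLBACK_ORDER = ("/sbin/init", "/etc/rc", "/bin/sh")


def programs_started(available, initrd=False):
    """List what the init thread would execute, in order."""
    started = []
    if initrd and "/linuxrc" in available:
        started.append("/linuxrc")    # only with CONFIG_INITRD
    for candidate in FALLBACK_ORDER:
        if candidate in available:
            started.append(candidate)
            break                     # the first one found wins
    return started


print(programs_started({"/sbin/init", "/bin/sh"}))
print(programs_started({"/bin/sh"}))
print(programs_started({"/linuxrc", "/etc/rc", "/bin/sh"}, initrd=True))
```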
The kernel has no more jobs that it must do after it
spawns process 1, since all other functions are then handled
by init, /etc/rc, or /bin/sh in user space. What about process
0? This so called 'idle' task executes cpu_idle(), a function
that calls idle() in an endless loop. idle() in turn is an
architecture-dependent function that is usually in charge of
turning off the processor to save power and increase the
processor's lifetime.