Filing systems

Ian . Collier Thu, 2 Feb 1995 18:50:51 +0100

On Tue, 31 Jan 1995 20:12:59 +0000 (GMT), Simon Cooke said:
> > > btw: If anybody would like to come up with details of how to store files 
> > > (I can handle directory structures dead easy) for the hard-drive, I'd be 
> > > most grateful.


> > I thought we went through this a couple of months ago...

> Yeah, but it was everyone saying "Do it with this and that and the 
> other", but with no technical details.

Possibility 1.  I thought we did go into quite a bit of detail.
Possibility 2.  You didn't ask. ;-)

> So, if anyone's willing to spoon-feed me on this one?

OK, I'll feed.  Take it or leave it...
This information comes from "Operating Systems: Design and Implementation"
by A. S. Tanenbaum (Prentice-Hall International, 1987 ISBN 0-13-637331-3),
by the way.  It refers mostly to Minix, which has a slightly miniaturised
file system.

The disk is usually addressed in terms of 1K blocks.  The only part of
the system that knows about 512-byte sectors is the device driver.  For
addressing purposes within the filing system, the blocks are joined together
into zones.  Small disks (e.g. floppies) can have 1K zones so that "block"
and "zone" means the same, but larger zones will probably be useful for
very large disks.  Zones have at least two advantages and at least two
disadvantages.

 D1. Each file wastes on average half a zone, and very small files take
     up one zone each.
 D2. If the operating system allows one to write a file with holes in it
     then care must be taken to clear blocks at an appropriate time.  For
     example, suppose that on a system with 4K zones someone creates a
     1-byte file then seeks to address 4000 and writes another byte.  The
     user will have written to the first and last blocks in a zone and the
     system will have cleared those.  The system should make sure that the
     second and third blocks in the zone are also cleared so that the file
     does not contain garbage bytes.

 A1. Reading files sequentially is a bit faster when large zones are used,
     because the files have more blocks which are close to each other on the
     disk.
 A2. With 4K zones we only need 16 bits to write a disk address for a 256M
     disk, for example.

(don't ask me why they didn't have 512-byte blocks...)

A file is identified by its i-node.  An i-node contains information about
where the file is located on disk, how long it is and what its attributes
are.  A Minix i-node is 32 bytes long and contains the following information.

size/bits description

16        Mode [file type and rwx bits]
16        Uid
32        File size
32        Time of last modification
 8        Links [number of references to this file]
 8        Gid
16 x 7    disk addresses of the first 7 zones in the file
16        Indirect disk address
16        Doubly indirect disk address

If the file is 7 or fewer zones long, all the disk address information is
stored in the i-node.  If the file occupies between 7 and <a large number>
zones then the first 7 are listed in the i-node and a free zone is allocated
to store the rest of the list.  The number of the zone containing this list
is stored as the indirect disk address.  If the file is even bigger than
that, then the remaining zones of the file are listed in free zones, the
zones containing the lists are listed in another zone, and this zone number
is stored as the doubly indirect disk address.

The i-nodes are usually allocated and numbered when the file system is
created and occupy a portion of the disk which is separate from the zones
used to hold the data.  i-node 0 is immediately cleared and marked as
used and is never allocated (the "find free i-node" routine returns 0 on
failure).  A bitmap is used to define which i-nodes are currently used.

Unless someone can spot a fatal flaw, it seems to me that if you don't
feel happy with allocating the i-nodes first (it sets a maximum for the
number of files on the system and might waste a bit of space if the maximum
is not reached) you could store each file's i-node at the start of the file
itself.  In that case, instead of an i-node number for each file you have a
zone number.  However, this might make it more difficult to do consistency
checks or to un-erase lost files.

A directory is a simply a file which contains a list of names and numbers.
To write a file, you write its data, fill in the i-node, and then write the
name of the file and its i-node number in a directory.  Most BSD filing
systems have directories which can store names of varying lengths and
which have entries looking approximately like this.

  position of next entry
  i-node number
  length of this record
  "name ending with nul character"

It is possible to reference the same physical file in two different
directories.  The number of times a file is referenced is kept in the
"link" field in the i-node.  When a file is removed from a directory
this number is decremented and the file itself may be erased when the
link count reaches zero.

A symbolic link is just a file containing the name of another file.

The list of free zones may be kept in a bitmap or in a linked list (or
both).  Since the zones are free, it is OK to write in each zone the next
zone number in the chain.

If the linked list method is used and if erased files are moved to the end
of the list each time then the most recently erased file will not have its
blocks overwritten until the disk is nearly full, so before that it can be
un-erased.  For the i-nodes you might want to use a system which does not
mark an i-node as free immediately when a file is erased.  When i-nodes
become short, you either do a garbage collection (collect all the i-nodes
that have zero link fields) or keep a list of recently erased files so that
their i-nodes can be reclaimed.

Is that enough?

> ps Anyone know anything about the OS/2 HPFS system?

I'm sure someone does, but that person is not I.

imc

Filing systems

Reply via email to