Hoi Imre,

> You did however make the very valid point that implementing a
> compressed file system is not so easy as one might think.

Luckily the task gets a lot easier if the filesystem is
read-only (one tool to create it, one driver to read it).
And such a filesystem is still pretty useful :-).



> There is indeed a problem with random access in a stream.
> That is not very trivial to solve. But it shouldn't be overly
> difficult either.

Most of the compressed filesystems mentioned work around
this problem by compressing the data in blocks, for example
each cluster separately. The compressed blocks can start
at any offset, but offsets are typically rounded up to a
multiple of, say, the physical sector size. If you use big
blocks, you have to decompress more data (and possibly
spend more CPU and RAM) before you can reach the
decompressed data at the end of a block. But if you use
small blocks, you get worse compression ratios, as the
compressor has less context to exploit "redundant" data.
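To make that concrete, here is a rough sketch (all names
and field sizes are made up, this is not any existing
format) of how a block-compressed image could index its
compressed blocks: a small header plus a table of sector
offsets, so random access only needs one table lookup.

#include <stdint.h>

#define CFS_SECTOR_SIZE 512u  /* offsets are counted in physical sectors */

struct cfs_header {
    char     magic[4];    /* e.g. "CFS1" (hypothetical) */
    uint16_t block_size;  /* uncompressed bytes per block, e.g. 4096 or 8192 */
    uint32_t block_count; /* number of compressed blocks in the image */
    /* followed by block_count + 1 uint32_t sector offsets: compressed
     * block i occupies sectors offset[i] .. offset[i+1]-1, so the
     * compressed length never needs to be stored separately. */
};

/* Which block holds a given uncompressed byte position? */
static uint32_t cfs_block_of(const struct cfs_header *h, uint32_t pos)
{
    return pos / h->block_size;
}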

Your driver either needs to buffer a whole decompressed
block, or it will have to decompress the same block more
than once whenever the operating system reads only part
of the block at a time. Most operating systems always
read whole 512-byte sectors, but some read whole clusters
and others read only those parts of a cluster that are
actually being accessed.
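As a rough sketch of the "buffer one decompressed block"
approach (cfs_decompress_block() is a placeholder for
whatever codec ends up being used, all names are made up),
consecutive 512-byte sector reads that hit the same block
then cost only one decompression:

#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE  4096u
#define SECTOR_SIZE  512u

static uint8_t  cache_data[BLOCK_SIZE];    /* lives in low DOS RAM or a UMB */
static uint32_t cache_block = 0xFFFFFFFFu; /* block currently cached, if any */

/* placeholder: read and decompress one whole block into out */
extern int cfs_decompress_block(uint32_t block, uint8_t *out);

/* Copy one 512-byte logical sector into buf, decompressing only on a miss. */
int cfs_read_sector(uint32_t sector, uint8_t *buf)
{
    uint32_t block  = (sector * SECTOR_SIZE) / BLOCK_SIZE;
    uint32_t offset = (sector * SECTOR_SIZE) % BLOCK_SIZE;

    if (block != cache_block) {
        if (cfs_decompress_block(block, cache_data) != 0)
            return -1;             /* read or decode error */
        cache_block = block;
    }
    memcpy(buf, cache_data + offset, SECTOR_SIZE);
    return 0;
}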



As BIOS disk drivers cannot access "high" memory directly,
and EMS pages are a bad idea for filesystem drivers because
some application might be using EMS concurrently, you
typically have to put decompression buffers into low DOS
memory or at least into UMB space... Because you want to
avoid the CPU overhead of decompressing the same block
twice, you probably want to limit the "compression block
size" based on how much DOS RAM you are willing to give a
filesystem driver.

Doing LZSS on chunks of 4k or 8k size would be a start :-).
For comparison, FAT16 normally uses 2k or more per cluster,
FAT12 (harddisk variant) and FAT32 (if the filesystem is
larger than 1/4 GB) use 4k or more per cluster. Min/max
cluster sizes are always 0.5k/32k if you want good
compatibility (otherwise 64k or even(?) 0.5-127.5k cluster
sizes, with sector sizes from 32/64/128 up to 512 bytes
or 2k).
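For illustration, here is a sketch of decoding one such
LZSS chunk. It assumes one common byte-oriented LZSS
variant (a flag byte followed by 8 items, copy tokens with
a 12-bit distance and a 4-bit length), not any particular
existing on-disk format:

#include <stddef.h>
#include <stdint.h>

/* Decode one LZSS-compressed chunk from in[] into out[].
 * Flag bit set = literal byte, clear = 16-bit copy token
 * (low 12 bits: distance-1, high 4 bits: length-3).
 * Returns the number of bytes written to out[]. */
size_t lzss_decode(const uint8_t *in, size_t in_len,
                   uint8_t *out, size_t out_max)
{
    size_t ip = 0, op = 0;
    uint8_t flags = 0;
    int bits = 0;

    while (ip < in_len && op < out_max) {
        if (bits == 0) {                   /* fetch the next flag byte */
            flags = in[ip++];
            bits = 8;
        }
        if (flags & 1) {                   /* literal byte */
            if (ip >= in_len) break;
            out[op++] = in[ip++];
        } else {                           /* copy from earlier output */
            if (ip + 1 >= in_len) break;
            uint16_t tok = (uint16_t)(in[ip] | (in[ip + 1] << 8));
            ip += 2;
            size_t dist = (size_t)(tok & 0x0FFF) + 1;  /* 1..4096 back  */
            size_t len  = (size_t)(tok >> 12) + 3;     /* 3..18 bytes   */
            if (dist > op) break;                      /* corrupt input */
            while (len-- && op < out_max) {
                out[op] = out[op - dist];              /* handles overlap */
                op++;
            }
        }
        flags >>= 1;
        bits--;
    }
    return op;
}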



> No what I mean with this is that if it is a linux driver it
> means that it is inherently linked to the internals of linux.

True. A Linux driver is more likely to be portable to
SHSUCDX-style drivers than to our kernel. Still worth a
try. The idea is that SHSUCDX provides more "semantic"
primitives (files and directories etc.) while the kernel's
built-in FAT driver operates in a more "low-level" way on
FAT-formatted disks (block devices). Of course SHSUCDX
itself also connects to a kernel interface for the
"semantic" side of a filesystem (cdrom, network...). So we
have a choice of interfaces.



> What I did not see was an overview of the structure of the OS

Well for romfs, the URL actually points to a spec for the FS.
I did not even look for an implementation of that one...

A more DOS-friendly compressed filesystem would be one
which is FAT-ish with compressed data clusters, as
suggested in the post which started this thread :-). A
driver for such a filesystem could give DOS a readonly
block device: it would only have to know about the
compression structures and how to transform them back
into normal FAT data whenever DOS reads data or metadata.
Such a DOS driver does not have to understand anything
about directory entries or FAT chains at all :-).
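As a rough sketch of that block-device view (reusing the
hypothetical helpers from above, and assuming for
simplicity that the boot sector, FATs and root directory
are stored uncompressed), the driver only decides whether
a requested sector lies in the metadata area or in the
compressed data area; it never parses directory entries
or FAT chains:

#include <stdint.h>

/* assumed helper: read a sector that is stored as-is in the image */
extern int read_stored_sector(uint32_t image_sector, uint8_t *buf);
/* the caching per-block read from the earlier sketch */
extern int cfs_read_sector(uint32_t data_sector, uint8_t *buf);

struct cfs_volume {
    uint32_t data_start;   /* first logical sector of the data area */
};

/* Handle one readonly block-device request from DOS. */
int cfs_block_read(const struct cfs_volume *v, uint32_t sector, uint8_t *buf)
{
    if (sector < v->data_start)
        /* boot sector, FATs, root dir: hand back exactly what is stored */
        return read_stored_sector(sector, buf);

    /* data area: decompress the block that holds this sector (cached) */
    return cfs_read_sector(sector - v->data_start, buf);
}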

Eric

