I would think that the information contained in the MS-DOS 6 Programmer's
Reference is free of any restrictions related to that lawsuit.
An initial clean-room design would let us imitate the core functionality of
DoubleSpace and read its compressed volume files (CVFs), if for no other
purpose than compatibility. The next step would be to improve the design where
it is lacking.
As for writing to the file system: DoubleSpace allocates the BitFAT, MDFAT,
and FAT at their maximum capacity, so the resulting CVF is sized for the whole
drive in question, even though in most cases the CVF will never be large
enough to need the full capacity of these structures. This preallocation is
what allows the CVF file to grow and shrink very quickly.
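To get a feel for what that preallocation costs, here is a quick
back-of-the-envelope sketch; the entry widths in it are my own guesses for
illustration, not the documented DoubleSpace layout:

    /* Hypothetical sizing of preallocated CVF structures.  The widths
     * (4 bytes per MDFAT entry, 2 bytes per FAT16 entry, 1 bit per
     * sector in the BitFAT) are illustrative assumptions only.
     */
    #include <stdio.h>

    int main(void)
    {
        unsigned long host_sectors   = 1024UL * 1024UL;  /* 512 MB drive  */
        unsigned long sect_per_clust = 16UL;             /* 8 KB clusters */
        unsigned long max_clusters   = host_sectors / sect_per_clust;

        unsigned long mdfat  = max_clusters * 4UL;       /* bytes */
        unsigned long fat    = max_clusters * 2UL;       /* bytes */
        unsigned long bitfat = host_sectors / 8UL;       /* bytes */

        printf("MDFAT %lu KB, FAT %lu KB, BitFAT %lu KB\n",
               mdfat / 1024, fat / 1024, bitfat / 1024);
        return 0;
    }

Even sized for the whole drive, the tables only cost a few hundred KB, which
is why trading that space for fast grow/shrink is reasonable.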
The other alternative might involve a GZIP-style compression applied to each
file, so there is an 'invisible' decompression each time a file is accessed,
and a permanent decompression when the file is moved from a volume designated
'compressed' to one designated 'not compressed'. We could then use the
reserved field in the BPB to mark a compressed file system, or add new entries
to the list of FileSystemType strings at the end of the BPB (e.g., FAT12C,
FAT16C, FAT32C).
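As a rough illustration of that second option, something like the following
could inspect the FileSystemType string in a boot sector (the 'C' suffix is of
course just the proposed marker, nothing existing tools know about):

    /* Sketch: spotting the proposed "compressed" marker in a boot
     * sector.  Offsets 54 (FAT12/16) and 82 (FAT32) are the standard
     * BS_FilSysType string locations; "FAT12C"/"FAT16C"/"FAT32C" are
     * only the labels suggested above.  A real driver would first
     * determine the FAT type from the cluster count rather than
     * trusting these strings.
     */
    #include <string.h>

    enum fs_kind { FS_UNKNOWN, FS_PLAIN, FS_COMPRESSED };

    enum fs_kind classify_boot_sector(const unsigned char sector[512])
    {
        const char *type16 = (const char *)sector + 54;  /* FAT12/FAT16 */
        const char *type32 = (const char *)sector + 82;  /* FAT32       */

        if (memcmp(type16, "FAT12C", 6) == 0 ||
            memcmp(type16, "FAT16C", 6) == 0 ||
            memcmp(type32, "FAT32C", 6) == 0)
            return FS_COMPRESSED;

        if (memcmp(type16, "FAT1", 4) == 0 || memcmp(type32, "FAT3", 4) == 0)
            return FS_PLAIN;

        return FS_UNKNOWN;
    }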
I don't see how writing to the CVF would be a problem, since all the 'disk'
writes would go through the BLOCK device driver, and since the driver is in
effect a 'TSR' of sorts, it can take over Int 13h, portions of Int 21h, Int
25h, and Int 26h (which cover most of the ways DOS applications access the
disk).
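The hook itself is the easy part; a minimal sketch, assuming a Borland-style
16-bit compiler (getvect/setvect/_chain_intr from <dos.h>):

    /* Minimal sketch of hooking Int 13h from the resident driver,
     * assuming Turbo C / Borland C++.  Real code would also hook
     * Int 25h/26h and keep the resident part tiny.
     */
    #include <dos.h>

    static void interrupt (*old_int13)();

    static void interrupt our_int13(void)
    {
        /* Inspect the request here (declare the register parameters
           to examine AX/CX/DX); if it targets sectors inside the
           compressed volume, translate/decompress instead of passing
           it on.  Otherwise hand the call to the original handler. */
        _chain_intr(old_int13);
    }

    void install_hook(void)
    {
        old_int13 = getvect(0x13);
        setvect(0x13, our_int13);
    }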
-T
> Date: Fri, 28 Mar 2008 20:27:50 +0100
> From: [EMAIL PROTECTED]
> To: [email protected]
> Subject: Re: [Freedos-devel] compressed FAT filesystems
>
>
> Hi Antony,
>
> for now, I would not recommend trying to clone any MS
> compressed filesystem. Remember that they themselves
> have had licensing issues with their own compressed
> filesystem. One thing which makes DoubleSpace and co.
> complex is their ability to WRITE to the filesystem
> while it is mounted / in use. Let me explain my own
> suggestion for a simpler system:
>
> - replace the 2nd FAT by an array of cluster sizes (FAT16)
> or cluster offsets (FAT32). The latter has better speed,
> but for the former you can buffer some intermediate
> offsets in RAM (e.g. the offset of every 256th cluster)
>
> - clusters which have the same size as uncompressed ones are
> uncompressed clusters. Simple. Compressed clusters should
> be stored at offsets which are multiples of the sector
> size, to keep access times low and to simplify access...
>
> - clusters which have size 0 are not in use. Simple.
>
> - either you allow no writing, or, if you write, you only
> allow writes for which the compressed cluster does not have
> to grow. If a write would force the cluster to grow, you have
> to change the FAT to use a fresh cluster at the current end
> of the compressed filesystem. Such clusters ARE allowed
> to grow, but only if there are no used clusters at any
> later offset in the compressed filesystem.
>
> - if you allow writing, do not actually shrink compressed
> clusters. Of course this can cause fragmentation. You
> can have a tool which collects all unused clusters of
> nonzero size and recycles the space they take, but this
> would happen while the filesystem is not active / mounted.
>
> - you could use heuristics on initial filesystem creation /
> compression, for example to avoid compression of directories
> (this makes it easier for the directories to grow later).
>
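Here is roughly how I picture the table you describe above, using the FAT16
'sizes' variant with the every-256th-cluster checkpoints kept in RAM (names
and entry widths are my assumptions):

    /* Rough model of the table that replaces the 2nd FAT (FAT16
     * "sizes" variant):
     *   sizes[c] == 0                    -> cluster c not in use
     *   sizes[c] == sectors_per_cluster  -> stored uncompressed
     *   anything smaller                 -> compressed size in sectors
     * checkpoints[] is the in-RAM cache of the absolute start sector
     * of every 256th cluster, so a lookup sums at most 255 sizes.
     */
    #define CHECKPOINT_STEP 256

    unsigned long cluster_offset(const unsigned short *sizes,
                                 const unsigned long *checkpoints,
                                 unsigned long cluster)
    {
        unsigned long base = cluster / CHECKPOINT_STEP;
        unsigned long off  = checkpoints[base];
        unsigned long i;

        for (i = base * CHECKPOINT_STEP; i < cluster; i++)
            off += sizes[i];              /* sizes are in whole sectors */

        return off;                       /* start sector of 'cluster'  */
    }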
>
>
> I hope that such an implementation would be quite straight-
> forward and have good performance for read-only access. It
> can be a FAT block device for DOS as "output" and can take
> a file, partition or other data source as "input". It would
> do the following transformations:
>
> - first FAT and boot sector stay as they are
>
> - access to 2nd FAT is redirected to first FAT
>
> - access to clusters is redirected by reading the table
> explained above (which is stored at the location of the
> former 2nd FAT) to find out the actual starting sector /
> offset of the cluster in the compressed filesystem, and
> then reading and decompressing that cluster and returning
> the decompressed contents. Writing will be more complex
> but is not needed for a first implementation.
>
> - note that clusters are ALWAYS in strict linear order,
> even with the suggested method which would allow writes
> to the filesystem as described above. That avoids any
> need for extra redirections like "cluster heaps" :-).
>
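And the read redirection you describe might look something like this, building
on the sketch above (read_host_sectors() and decompress() are placeholders for
whatever the driver actually uses):

    /* Sketch of the read path: find where the compressed cluster
     * lives via the sizes table and checkpoints, read it, then
     * decompress -- or copy through if it is stored uncompressed.
     */
    #include <string.h>

    #define SECTOR_SIZE 512

    unsigned long cluster_offset(const unsigned short *sizes,
                                 const unsigned long *checkpoints,
                                 unsigned long cluster);
    int read_host_sectors(unsigned long start, unsigned count, void *buf);
    void decompress(const void *in, unsigned in_bytes,
                    void *out, unsigned out_bytes);

    int read_cluster(const unsigned short *sizes,
                     const unsigned long *checkpoints,
                     unsigned long cluster,
                     unsigned sectors_per_cluster,
                     void *out)              /* sectors_per_cluster * 512 bytes */
    {
        unsigned len = sizes[cluster];       /* compressed size in sectors */
        static unsigned char tmp[64 * SECTOR_SIZE];
        unsigned long start;

        if (len == 0)
            return -1;                       /* cluster not in use */

        start = cluster_offset(sizes, checkpoints, cluster);
        if (read_host_sectors(start, len, tmp) != 0)
            return -1;

        if (len == sectors_per_cluster)      /* stored uncompressed */
            memcpy(out, tmp, len * SECTOR_SIZE);
        else
            decompress(tmp, len * SECTOR_SIZE,
                       out, sectors_per_cluster * SECTOR_SIZE);
        return 0;
    }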
>
>
> You also need a tool for initial compression / creation of
> such a compressed filesystem. Luckily it can operate in the
> same space as the uncompressed filesystem, replacing it...
>
> - walk through all clusters, and copy them from their normal
> version into their compressed version. As compressed clusters
> are never bigger than the uncompressed ones, this is as easy as
> successively overwriting all clusters with compressed data
> and filling the former 2nd FAT data area with the size info
> or offset info of all compressed clusters :-)
>
> - this works for both FAT16 and FAT32, but I would not like
> the hassle of making it work for FAT12. Yet it would be
> possible. In FAT12, every entry has 1.5 bytes size, and
> every "compressed size" entry will have to be 1 or 1.5
> bytes as well. Actually 1 byte is enough as all sizes are
> multiples of 1 sector, as suggested above, and one cluster
> is never more than 64 or 128 sectors (of 512 bytes) big.
>
> - note that you can, in principle, save some space by always
> storing only compressed SIZES. Yet as we have as much space
> as the 2nd FAT took, it is more efficient to store 32bit
> OFFSETS (in sector size units: up to 2 terabytes) for FAT32.
> Then you know at once where a compressed cluster starts,
> without having to sum up sizes :-).
>
> - the input filesystem can be an actual partition on disk
> or the FAT filesystem of a RAMDISK, does not really matter
>
> - the output can be stored on the source partition or on
> the ramdisk, but you will typically download it into a
> file when the compression is complete, as such a compressed
> filesystem image is smaller than the original filesystem.
>
> - I suggest using the source disk as the output target for
> the compression tool, then using general tools like DISKCOPY
> to copy the compressed filesystem to a file when done...
>
> - of course you must not write to a filesystem WHILE it is
> being processed by the compression tool :-p.
>
> Eric :-)
>
> PS: 16bit OFFSETS would not be so useful, as they only allow
> partition sizes of up to 32 MB if the unit is sectors ;-).
>
>
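To make the creation tool you outline a bit more concrete, here is how I
picture the in-place compression pass (again the FAT16 'sizes' variant; all
helper routines are placeholders):

    /* Sketch of the one-pass, in-place creation tool: walk the data
     * clusters in order, write each used cluster back in compressed
     * form at the running output position, and record its size (in
     * sectors) where the 2nd FAT used to be.  Because a compressed
     * cluster is never larger than the original and clusters stay in
     * strict linear order, the write position can never overtake the
     * read position.  Cluster indexes here are 0-based for simplicity.
     */
    #define SECTOR_SIZE 512

    int cluster_in_use(unsigned long cluster);   /* consults the 1st FAT */
    int read_host_sectors(unsigned long start, unsigned count, void *buf);
    int write_host_sectors(unsigned long start, unsigned count, const void *buf);
    /* Returns the compressed size rounded up to whole sectors, or
       sectors_per_cluster when compression would not shrink the data
       (in that case the cluster is stored as-is). */
    unsigned compress_cluster(const void *in, unsigned sectors_per_cluster,
                              void *out);

    void compress_volume(unsigned long first_data_sector,
                         unsigned long cluster_count,
                         unsigned sectors_per_cluster,
                         unsigned short *sizes)  /* former 2nd FAT area */
    {
        static unsigned char in[64 * SECTOR_SIZE], out[64 * SECTOR_SIZE];
        unsigned long read_pos  = first_data_sector;
        unsigned long write_pos = first_data_sector;
        unsigned long c;
        unsigned n;

        for (c = 0; c < cluster_count; c++, read_pos += sectors_per_cluster) {
            if (!cluster_in_use(c)) {
                sizes[c] = 0;                    /* unused: takes no space */
                continue;
            }
            read_host_sectors(read_pos, sectors_per_cluster, in);
            n = compress_cluster(in, sectors_per_cluster, out);
            write_host_sectors(write_pos, n,
                               n == sectors_per_cluster ? in : out);
            sizes[c] = (unsigned short)n;
            write_pos += n;
        }
    }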
_______________________________________________
Freedos-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/freedos-devel