Geoff Winkless wrote:
> Why not a tagged format and tools which can handle both types?

I've been fleshing out the details of a potential new format to efficiently
represent any SAM disk type.  It uses your suggested tagged structure, some
of the tag types I mentioned recently, and Edwin's fragment idea for sparse
non-empty sectors.  Rather than wrapping existing image formats, it's a
separate format in its own right (perhaps to replace the existing unfinished
SDF?).

I've written enough code to write images out to a file in the new format, to
see how it handles various existing files.  Writing valid images is fairly
simple, but writing efficient/compact files requires a bit more analysis of
what is being written.  Normal disks are stored very compactly by
recognising patterns in the disk format, and custom disk formats also store
only non-empty tracks.

Each file starts with a 4-byte header: an ASCII signature (currently "XXX"
for testing) and a version byte (0x10 = version 1.0).  The rest of the file
is formed from tagged blocks, each with a 2-byte ASCII type and 2-byte
length (all values are stored in little-endian format), followed by the
specified length of data.  I've currently got the following types:

 "TX" - null-terminated ASCII text block
 "PF" - pre-format details, for regular format disks
 "SD" - sector data to fill pre-formatted areas
 "TK" - custom track format+data
 "RT" - raw track data
 "EN" - file end marker

"TX" : an optional field with no fixed format, most likely used for disk
descriptions, copyrights, etc.
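
As a rough sketch of how a reader could walk the header and tagged blocks described above (Python, not the test encoder itself; the names read_blocks and tx_text are made up for illustration):

```python
import struct

SIGNATURE = b"XXX"  # test signature from the post; the final value is not decided


def read_blocks(image):
    """Return (version, [(tag, data), ...]) for an image held in memory."""
    if len(image) < 4 or image[:3] != SIGNATURE:
        raise ValueError("missing or unknown signature")
    version = image[3]  # 0x10 means version 1.0
    blocks = []
    pos = 4
    while pos + 4 <= len(image):
        tag = image[pos:pos + 2].decode("ascii")
        (length,) = struct.unpack_from("<H", image, pos + 2)  # little-endian
        data = image[pos + 4:pos + 4 + length]
        if len(data) != length:
            raise ValueError("block %r is truncated" % tag)
        blocks.append((tag, data))
        pos += 4 + length
        if tag == "EN":
            break  # anything after the end marker is ignored
    return version, blocks


def tx_text(data):
    """Decode a TX block, dropping the null terminator and anything after it."""
    return data.split(b"\x00", 1)[0].decode("ascii")
```

Note that a 2-byte length limits each block to 65535 bytes of data.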


"PF" : the data is an 11-byte block containing the following 8-bit values:

  Sides - number of sides in the pre-format block (normal=2)
  Tracks - number of tracks in the pre-format block (normal=80)
  Sectors - sectors per track, for all tracks (normal=10)
  Size - sector size code (0=128, 1=256, 2=512 (normal), 3=1024) for all
    sectors
  Fill - fill byte to use for data areas (normal=0x00)
  Base - lowest sector number in track (normal=1)
  Start - first sector number on track 0 of side 0 (normal=1)
  Step - step value between ordered sector numbers (normal=1)
  Interleave - number of sectors to advance between consecutive sectors on a
    track (normal=1)
  TrackSkew - sector gap to the same sector number on the following track
    (MasterDOS=1, SAMDOS=2)
  SideSkew - sector gap to the same sector number on the same track of the
    next side (SAM=0, CP/M=1)

The pre-format block is a rectangular region of tracks spanning out from
track 0 of side 0 to whatever disk extent is needed.  It provides a
convenient way to prepare a region using a regular format pattern, without
having to list each track individually.  Custom data can then be filled in
with "SD" tags, or tracks can be completely reformatted with "TK" tracks.
The 6 parameters controlling the sector order allow almost any regular
format to be efficiently represented, including regular skewed SAM formats
and the Pro-Dos CP/M format.
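
For illustration, the 11-byte PF block could be unpacked like this. This is only a sketch: the Python field names are my own spelling of the names listed above, and sector_bytes relies on the size codes following the 128 << code pattern that the listed values (128, 256, 512, 1024) suggest.

```python
from collections import namedtuple

# Field order follows the PF description; the Python names are illustrative.
PreFormat = namedtuple(
    "PreFormat",
    "sides tracks sectors size fill base start step "
    "interleave track_skew side_skew",
)


def parse_pf(data):
    """Unpack the eleven 8-bit values of a PF block."""
    if len(data) != 11:
        raise ValueError("PF block must be 11 bytes, got %d" % len(data))
    return PreFormat(*data)  # iterating over bytes yields integers


def sector_bytes(size_code):
    """Convert a size code to a byte count (0=128, 1=256, 2=512, 3=1024)."""
    return 128 << size_code
```

With the "normal" values from the list (2 sides, 80 tracks, 10 sectors of 512 bytes), the pre-format region covers 2 * 80 * 10 * 512 = 819200 bytes of sector data.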


"SD" : the data begins with a single byte containing the "physical" location
for the track.  The lower 7 bits contain the track number, with bit 7 set if
the data is for side 2 (this could use separate side and track bytes, but my
thought is that 2 sides and 128 tracks is enough for any real-world disk).  
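
The location byte can be split and rebuilt with a couple of bit operations. In this sketch the helper names are hypothetical, and the side is returned as 0 or 1, with 1 meaning the second side (bit 7 set).

```python
def decode_location(loc):
    """Split a location byte into (side, track); side 1 is the second side."""
    return loc >> 7, loc & 0x7F


def encode_location(side, track):
    """Build a location byte from a 0/1 side index and a track of 0..127."""
    if side not in (0, 1) or not 0 <= track <= 127:
        raise ValueError("side must be 0 or 1, track must be 0..127")
    return (side << 7) | track
```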

The location byte is followed by a 1-byte packing code for each sector
present on the track, and for this reason, the track must already exist from
an earlier PF or TK tag.  After the packing bytes comes the sector data,
stored in the order the sectors are found on the track.  The meaning of the
data depends on the packing code for each sector.  The codes are as follows:

  0 = sector is filled with the default fill byte found in the PF header.
      This is the default for sectors in pre-formatted tracks, but when
      overriding the sector data in an existing track it must be supplied
      again explicitly.
  1 = sector is filled with a single byte, taken from the sector data block.
  2 = sector is a fragment, using 5 bytes in the data block.  The first data
      byte is the default fill byte for the sector, the next two bytes give
      an offset into the sector for the start of the fragment, and the two
      bytes after that are the block length.
  3 = the full sector data is supplied, and taken from the sector data block.
Additional codes and compression methods could be added quite easily, but
I've left out anything more complicated to keep the format simple.  It would
be better to store and supply the image in a zip/gzip file, which will have
compression superior to anything I could add.  Zip files are probably
preferred for familiarity, and have the added benefit of a built-in CRC.
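
As an example of the zip route, Python's standard zipfile module can wrap an image and check the stored CRC when reading it back. The function names and the member name passed in are placeholders, not part of the format.

```python
import io
import zipfile


def zip_image(name, image):
    """Store an image as a single deflated member of an in-memory zip."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(name, image)
    return buf.getvalue()


def unzip_image(blob, name):
    """Read the image back, failing if any member's CRC does not match."""
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        if zf.testzip() is not None:  # testzip() names the first bad member
            raise ValueError("zip CRC check failed")
        return zf.read(name)
```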


"TK" : the data begins with a byte for the physical location of the track
(as with SD) and a byte holding the number of sectors in the track.  This is
followed by a block of 6-byte sector headers, and then the data block for
the sectors.  Each sector header holds:

  Track - id field track value (may be faked)
  Side - id field side value (may be faked)
  Sector - sector number
  Size - size code (as detailed in "PF" above)
  Flags - bit 0 is set if there is a CRC error in the id field header, bit
    1 is set if there is no data field for the header, bit 2 is set if the
    data field has a CRC error, and bit 3 is set if the sector has a
    'deleted' data address mark.  All other bits are unused, and should be
    zero.
  Pack - pack method used to pack the data in each sector, the same as
    described for "SD", except the default fill byte is always taken as zero.

The TK tag is used for non-standard tracks, or any track defined outside the
pre-format block.  The need for sector headers makes it more expensive to
use in comparison with a PF block filled using SD.
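
The front of a TK block could be parsed along these lines. Again this is a sketch: the flag constant names, the SectorHeader tuple and parse_tk_headers are all invented here, and only the bit positions and field order come from the description above.

```python
from collections import namedtuple

# Bit values from the Flags description; the constant names are illustrative.
FLAG_ID_CRC_ERROR = 0x01
FLAG_NO_DATA_FIELD = 0x02
FLAG_DATA_CRC_ERROR = 0x04
FLAG_DELETED_DATA = 0x08

SectorHeader = namedtuple("SectorHeader", "track side sector size flags pack")


def parse_tk_headers(data):
    """Return (location byte, sector headers, remaining packed sector data)."""
    if len(data) < 2:
        raise ValueError("TK block is too short")
    location, count = data[0], data[1]
    end = 2 + 6 * count
    if len(data) < end:
        raise ValueError("TK block is missing sector headers")
    headers = [SectorHeader(*data[i:i + 6]) for i in range(2, end, 6)]
    return location, headers, data[end:]
```

The remaining data would then be expanded sector by sector using each header's Pack code, with a fill byte of zero for code 0.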


"RT" : the data begins with a single byte containing the physical location
for the track data.  The remaining data is what should be returned from a
READ_TRACK floppy controller command, and will be up to 6250 bytes long.
Only software relying on the contents of the track gaps (such as the
protected Defender disk) will need this tag, and even then only for tracks
that are read in raw form.


"EN" : there is no data for this tag, as it marks the end of the disk.
There should be nothing else in the file following the tag, and any extra
data will be ignored.


Running a variety of images through my test encoder gives the following
image sizes:

unformatted disk = 8 bytes (4-byte header + 4-byte EN tag!)
empty SAM disk = 23 bytes (header + PF tag + EN tag)
empty CP/M disk = 23 bytes (header + PF tag + EN tag)
disk containing the 10K samdos2 file = 8230 bytes (header + PF + a few SD
tags + EN)
Lemmings = 746284 bytes (header + lots of TK tags + EN)

The size of regular format disks is now proportional to the amount of data
on them, giving good size savings in most cases.  The final disk is the
largest (2 sides, 83 tracks), strangest (custom format with mixed sector
sizes) one I could find, and was 1019904 bytes in the old SDF format!
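
The two smallest sizes above follow directly from the block layout, and can be checked by building the images by hand. This is an illustrative sketch rather than the test encoder; the helper names are made up, and the PF values are the "normal" SAM ones from the PF list with TrackSkew=2.

```python
import struct

HEADER = b"XXX" + bytes([0x10])  # test signature plus version 1.0


def block(tag, data=b""):
    """Build one tagged block: 2-byte type, 2-byte little-endian length, data."""
    return tag.encode("ascii") + struct.pack("<H", len(data)) + data


def unformatted_image():
    return HEADER + block("EN")


def empty_disk_image(pf_fields):
    """Header, a single PF block built from 11 byte values, then EN."""
    return HEADER + block("PF", bytes(pf_fields)) + block("EN")
```

So an unformatted disk is 4 + 4 = 8 bytes, and an empty pre-formatted disk is 4 + (4 + 11) + 4 = 23 bytes.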

Any comments?  Think of anything that could be improved?  Or anything I've
not explained well enough?

Si
