This is a bit long, so don't read unless you are actually interested in
the file-type/suffix debate.

Okay, I'm going to stick my neck out and summarise the pros and cons
of some existing file type identification mechanisms.

1) MS-DOS: A three character suffix

Pros: Doesn't take up much space. You and your applications don't have to
     open the file to find out what is in it.

Cons: Too short and vague, so people unintentionally use the same suffix
     for different file types, rendering the suffix merely advisory.

2) Mac Finder: A four character Creator field and a four character Type field

Pros: A bit more expressive. The Type field is usable for sharing data in
     well-known formats (e.g. GIF) between applications but the Creator
     field allows applications to be sure they are reading data they wrote.

Cons: Creator fields have to be registered with Apple to be sure
     of uniqueness. When given a Macintosh file, in practice you usually have
     to have a copy of the application it was created by in order to read it,
     even if it is a theoretically portable file format like GIF or ASCII
     CR/LF delimited text.

3) Unix files: A little OS-related file typing, no application file typing.

Pros: If application writers or users want to use suffixes they can, but
     they don't have to. Suffix concatenation is widely supported,
     e.g. "input.c.Z" is clearly a compressed C program file and there is
     no limit to the size of suffixes so "AvonGorge.jpeg" and "system.twmrc"
     are valid filenames (although the latter may only mean something to
     X windows experts).

Cons: Difficult to tell what a file contains or what application created
     the file, unless the application uses a suffix that you know, or it
     is an OS-typed file like an executable. The standard Unix utility 
     "/bin/file" can identify some files by looking at the contents of the
     file for magic strings, magic numbers or identifiable syntax, but it
     ignores meaningful suffixes and needs to be kept up to date.

4) SAM MasterDOS: Some OS-related file typing, some suffix support.

Pros: Similar to DOS if you stick to 3 character suffixes and use the
     default configuration. Similar to Unix, if you change the setting
     which truncates and aligns suffixes in DIR listings.

Cons: Similar to DOS if you stick to 3 character suffixes and use the
     default configuration. Similar to Unix, if you change the setting
     which truncates and aligns suffixes in DIR listings.
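
To make the magic-number idea from (3) concrete, here is a minimal sketch
in Python of the kind of check "/bin/file" performs. The table and the
name sniff() are my own illustration, not how the real utility is
implemented; it only looks at the first few bytes and knows a handful of
formats.

```python
# Illustrative table of leading byte sequences ("magic numbers").
# A real utility would carry a much larger, regularly updated table.
MAGIC = [
    (b"GIF87a", "GIF image"),
    (b"GIF89a", "GIF image"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"\x1f\x9d", "compressed data (.Z)"),
]


def sniff(data):
    """Guess a file type from the first bytes of its contents."""
    for magic, description in MAGIC:
        if data.startswith(magic):
            return description
    return "unknown"


def sniff_file(path):
    """Read just enough of a file to compare against the table."""
    with open(path, "rb") as handle:
        return sniff(handle.read(16))
```

Note that this shares the weakness described above: it ignores the
suffix entirely, and anything not in the table comes back as "unknown".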


I personally like the Unix approach, where the OS only enforces file
typing where it matters to the OS. On the SAM, as mentioned here
already, only the various BASIC file types need to be known to the OS.
In fact it would probably be desirable if "OPENTYPE" files
were indistinguishable in practice from "CODE" files.

A way to get round the disadvantages of the Unix approach is to
encourage application writers to use well-known suffixes for
well-known file formats and choose descriptive suffixes for
application-specific file formats. I don't think limiting the size of
suffixes encourages people to use descriptive suffixes, so having
fixed length suffixes is counter-productive. It should be up to the
application writer to assess the trade-off between the size of the
suffix and the amount of space left for the user to choose a name.

I say: A new OS for the SAM should use file types to distinguish files
      where it really matters to the OS+GUI (e.g. BASIC files, Driver
      files) but leave any other file typing to applications.
      Application writers should be persuaded to use descriptive suffixes.
      If it is felt helpful, somebody could set themselves up as the
      official registry of file suffixes, and could even supply a utility
      like "/bin/file", but no OS would have to be rewritten to support
      new applications.
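
As a rough sketch of what such a registry utility might look like, here
it is in Python. The registry contents and the name describe() are
hypothetical, made up purely for illustration. It handles concatenated
suffixes by reading them from the outside in, and it lives entirely
outside the OS.

```python
# Hypothetical registry: suffix -> human-readable description.
# Suffixes may be any length; nothing here is known to the OS.
REGISTRY = {
    "c": "C program source",
    "Z": "compressed",
    "jpeg": "JPEG image",
    "twmrc": "twm window manager configuration",
}


def describe(filename):
    """Describe each suffix of a filename, outermost suffix first."""
    suffixes = filename.split(".")[1:]
    return [
        REGISTRY.get(suffix, "unregistered suffix '%s'" % suffix)
        for suffix in reversed(suffixes)
    ]


if __name__ == "__main__":
    for name in ("input.c.Z", "AvonGorge.jpeg", "system.twmrc"):
        print(name, "->", ", ".join(describe(name)))
```

Supporting a new application then only means adding an entry to the
table, not changing the OS.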

/<eith
