James Knott wrote:
Johnny Andersson wrote:
So how does Linux know what program to open a certain file with? ...

...I believe it's encoded in the first few bytes of the file. ...

I wouldn't brag about the way *nix handles file-typing. The only advantage it has over Windows is having never incorporated the type=suffix crud into the system. The filename suffix has always been just a part of the filename--nothing special at all.

The current system was bolted on and has only recently been moving toward a standard way to type files. I guess you could say Windows has for 20 years had a *bad* system that mostly works, while *nix has had *no* system. I'd call that a pretty marginal win ;-)

*nix does use the bytes of the file, often the first few bytes, but it's more by accident than by design--there's no standardized encoding of the file type anywhere. It works the same way a virus scanner works: there is a database of patterns that map file contents to file type.
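Here's a rough sketch in Python of what such a pattern database boils down to. It is not how 'file' is actually implemented; the table, the function names and the ASCII fallback are all made up for illustration, and a real database holds far more patterns. The byte signatures themselves are the well-known ones for each format.

```python
# Illustrative pattern table: (offset, magic bytes, description).
# Hypothetical and tiny; the real 'file' database is far larger.
MAGIC_PATTERNS = [
    (0, b"PK\x03\x04", "Zip archive"),
    (0, b"%PDF-", "PDF document"),
    (0, b"\x89PNG\r\n\x1a\n", "PNG image"),
    (0, b"OggS", "Ogg data"),
    (257, b"ustar", "tar archive"),  # not every signature sits at offset 0
]


def sniff_bytes(data: bytes) -> str:
    """Guess a file type from raw content; the file name is never consulted."""
    if not data:
        return "empty"
    for offset, magic, description in MAGIC_PATTERNS:
        if data[offset:offset + len(magic)] == magic:
            return description
    # No pattern matched: fall back to a crude text-versus-binary check.
    try:
        data.decode("ascii")
        return "ASCII text"
    except UnicodeDecodeError:
        return "data"


def sniff_file(path: str, limit: int = 512) -> str:
    """Read the first `limit` bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return sniff_bytes(handle.read(limit))
```

Note that the tar entry matches at offset 257, which is why I said "often" the first few bytes rather than "always".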

This was first put together (at least as far back as the early 1980s) to support the 'file' utility, which takes any file and tries to determine and print the file type. E.g.:

$ file atest_data B2J events.*
atest_data:  directory
B2J:         ASCII text
events.csv:  UTF-8 Unicode C++ program text
events.csv~: UTF-8 Unicode English text
events.ods:  OpenDocument Spreadsheet
events.stc:  OpenOffice.org 1.x Calc spreadsheet
events.sxc:  OpenOffice.org 1.x Calc spreadsheet

You can see that 'file' doesn't need a suffix, and that it sometimes makes mistakes: events.csv is not a C++ program! And I don't know why it thinks events.csv~ is different; they're just different revisions of the same data.

Just to show that the suffix doesn't matter, let's play a trick:
$ cp events.sxc events
$ cp events.sxc events.xls
$ file events*
events:      OpenOffice.org 1.x Calc spreadsheet
events.sxc:  OpenOffice.org 1.x Calc spreadsheet
events.xls:  OpenOffice.org 1.x Calc spreadsheet

'file' still knows that 'events' and even 'events.xls' are Calc docs.

This old strategy has since evolved to provide MIME types instead of human-readable descriptions:

$ file --mime events*
events:      application/x-zip
events.csv:  text/x-c++; charset=utf-8
events.csv~: text/plain; charset=utf-8
events.ods:  application/x-zip
events.stc:  application/x-zip
events.sxc:  application/x-zip
events.xls:  application/x-zip

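You can see this isn't very accurate either: every OpenOffice.org document is reported as a generic zip archive (zip is just the container they are packed in), and events.csv is still taken for C++.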
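A detector can do better by looking inside the zip container. OpenDocument packages carry an entry named 'mimetype' that holds the document's real MIME type; I'm assuming the older OpenOffice.org 1.x files do the same. The sketch below uses Python's standard zipfile module. The function names and the in-memory sample package are invented for illustration; this is not what 'file' or any desktop framework actually runs.

```python
import io
import zipfile


def container_mime_type(source):
    """Return the type declared inside a zip-based document, or None.

    `source` may be a path or a binary file-like object.
    """
    try:
        with zipfile.ZipFile(source) as archive:
            declared = archive.read("mimetype")
    except (zipfile.BadZipFile, KeyError):
        # Not a zip archive at all, or a zip without a 'mimetype' entry.
        return None
    return declared.decode("ascii", errors="replace").strip()


def make_sample_package(mime: str) -> io.BytesIO:
    """Build a tiny stand-in package in memory (hypothetical test data)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", mime)
        archive.writestr("content.xml", "<placeholder/>")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    sample = make_sample_package("application/vnd.oasis.opendocument.spreadsheet")
    print(container_mime_type(sample))
```

A plain zip archive without that entry simply yields None, so a caller could fall back to reporting it as a zip file.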

The *nix desktop application frameworks have their own MIME-type identification systems. They work the same way but are more robust and up-to-date. Still, the system itself provides no 'standard' service, though one is in the process of being standardized.

By the way, there are some cases when file extensions are needed in Linux
too, aren't there? For example .c, .g++, .h, .o, .tar etc. Doesn't an OGG
file in Linux need the .ogg extension?

That I don't know about. Also, file associations are a desktop thing. The command line doesn't know about them.

*nix has (since forever) used filename suffixes _by convention_, as a convenient way for us humans to know the file type. Some programs do understand certain extensions by default, but even there, it's usually just a run-time configuration option.

OOo uses the file suffix when it displays the list of files in the File > Open dialog, but once you tell it to open a particular file, OOo determines the type by looking at the file contents, not the name. E.g. if I open the 'events.xls' file (which is really a copy of the .sxc document), OOo has no problem at all.

FMTYEWTK

<Joe
