On 30 May 2011 at 05:03, Chris Young wrote:

> On Sat, 14 May 2011 20:49:10 +0200, François Revol wrote:
> 
>> [const?] char *fetch_mime_by_ext(const char *filename);
> 
> Personally I hate detecting filetypes by filename/extension - it's
> prone to errors/hacking.  Look at all the problems on Windows,
> renaming a file makes it think the filetype has changed, you can trick
> things like Outlook so easily just by renaming files.

+1, that's why we use MIME types on files in xattrs and have a MIME sniffer in 
Haiku, just like BeOS did for 15 years. :p

> I also realise that sometimes the filename is the only way to
> determine a filetype.  I think an "integrated" approach would be
> better, one function which passes the filename and the data.  I see no
> reason why the data (or at least the first few bytes) couldn't be
> available before you need to work out what type it is.

Sometimes the file is empty :p

Well, there are several cases:
- existing files (file:) that can carry a MIME xattr; the BeOS port already
checks it in fetch_filetype,
- virtual files (URLs not yet downloaded, or cached data held only in memory)
whose MIME type the OS has not sniffed yet, so we have to sniff them
ourselves.

Ideally we would:
- check whether the file is real and has MIME info,
- try to sniff it,
- fall back to extension matching.

Indeed, the API doesn't have to provide separate functions; it can be a single
global one where the data buffer is optional: if it is NULL, we just don't
sniff.
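
Roughly, the order above with an optional buffer could look like the sketch
below. Everything in it is made up for illustration: fetch_mime_guess, its
parameters, the tiny magic and extension tables and the text/plain default are
assumptions, not existing NetSurf or Haiku API. The platform code is assumed to
have read any stored type (e.g. from an xattr) beforehand and to pass it in.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>	/* strcasecmp (POSIX) */

/* Hypothetical integrated lookup, not an existing NetSurf function.
 * stored_type: MIME type the OS already knows for a real file, or NULL.
 * data, len:   first bytes of the content, or NULL to skip sniffing.
 * Returns stored_type itself or a static string, never NULL. */
const char *fetch_mime_guess(const char *filename, const char *stored_type,
		const unsigned char *data, size_t len)
{
	/* 1. Trust MIME info already attached to the file. */
	if (stored_type != NULL && stored_type[0] != '\0')
		return stored_type;

	/* 2. Sniff a few well-known signatures, only if we were given data. */
	if (data != NULL) {
		if (len >= 8 && memcmp(data, "\x89PNG\r\n\x1a\n", 8) == 0)
			return "image/png";
		if (len >= 3 && memcmp(data, "\xff\xd8\xff", 3) == 0)
			return "image/jpeg";
		if (len >= 6 && (memcmp(data, "GIF87a", 6) == 0 ||
				memcmp(data, "GIF89a", 6) == 0))
			return "image/gif";
	}

	/* 3. Fall back to extension matching (illustrative table only). */
	if (filename != NULL) {
		const char *ext = strrchr(filename, '.');

		if (ext != NULL) {
			ext++;	/* skip the dot */
			if (strcasecmp(ext, "html") == 0 ||
					strcasecmp(ext, "htm") == 0)
				return "text/html";
			if (strcasecmp(ext, "css") == 0)
				return "text/css";
			if (strcasecmp(ext, "png") == 0)
				return "image/png";
			if (strcasecmp(ext, "jpg") == 0 ||
					strcasecmp(ext, "jpeg") == 0)
				return "image/jpeg";
			if (strcasecmp(ext, "gif") == 0)
				return "image/gif";
		}
	}

	/* Nothing matched: the default here is an arbitrary choice. */
	return "text/plain";
}
```

An empty file simply fails every length check in step 2 and ends up in the
extension fallback, so that case needs no special handling.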

Of course there are some corner cases, like directories (BeOS also has a MIME
type for those, but a different one from NS's), and common types like
text/html, which the OS might sometimes split too finely or identify
differently (text/xml+html for xhtml?).

For example, Haiku uses text/x-source-code and usually doesn't differentiate
between Python, Perl, Bash, C++ or any other source language.

François.
