Hello,
        I would like to modify how Apache associates files with MIME Types, but I'm 
not exactly clear how to implement it without gutting mod_mime.c (which some Apache 
developers may not like). This document is longer than I would have liked, but if I 
dived into code examples without some background I wouldn't be too popular.

        Generally speaking my goal is to add support for mapping HFS+ file types to 
MIME types in Apache running on Darwin 1.x. Currently Apache only supports mapping 
filename extensions (in a case-insensitive manner) to MIME types, and optionally 
using 'magic' numbers, reading the first few bytes of a file to determine its type. 
Most OS X users would probably find utility in this change. I wish to implement this 
in a manner which:

a) Is at least as flexible as the current filename extension behavior.

b) Doesn't clutter Apache code with Darwin-specific code (and loads of #ifdef 
__APPLE__ macros, nobody wants that)

c) Doesn't clutter Apache resource files

d) Doesn't change the current mechanism significantly

e) Doesn't give anybody a reason to disallow it being merged with Apache source. 

        I have studied the source and documentation off and on for a long time and 
have arrived at a few stumbling blocks, mainly because I don't know which changes are 
likely to draw objections. For example, there is no effort anywhere to 
distinguish between filename extensions and other forms of 'type declaration' in the 
source, config files, database, or directives. It's just implied that, for example, 
"AddType" means add a filename extension to MIME type mapping entry. I have several 
suggestions for how to add support for another mapping type without causing 
confusion. As soon as somebody tells me which is preferable, I will likely seek advice 
on how to implement it without gutting the current source.

Runtime database:
        To implement a new file mapping I can either:

a) Create a new entry, independent of mod_mime's current entries. Thus a new entry 
like forced_types_hfs could be "image/jpeg 'JPEG'".

b) Create a new entry like (a), but also change the filename extension mapping entries 
to be more specific like forced_filename_extension_mapping_types or forced_types_ext.

c) Use the current entries, but change the syntax so filename extensions and file 
types are easily distinguished. Thus a forced_types entry could be "image/jpeg jpeg 
jpg jpe 'JPEG'" where 'JPEG' means file type.

        The difficulty in implementation (explained later) is about the same.
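
To make option (c) concrete, here is a rough sketch of how a single token from such an entry might be classified. Everything in it is an assumption for illustration: classify_key() and the key_kind enum are hypothetical names that do not exist in mod_mime, and I assume a file type is always written as exactly four characters between single quotes.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds; mod_mime has no such enum. */
typedef enum { KEY_EXTENSION, KEY_HFS_TYPE, KEY_INVALID } key_kind;

/*
 * Normalize one mapping token into out.
 *   'JPEG' -> KEY_HFS_TYPE,  out = "JPEG" (case preserved)
 *   .JPG   -> KEY_EXTENSION, out = "jpg"  (period dropped, lowercased)
 */
static key_kind classify_key(const char *tok, char *out, size_t outlen)
{
    size_t len = strlen(tok);
    size_t i;

    if (tok[0] == '\'') {
        /* Assumed form: quote, exactly four characters, quote. */
        if (len != 6 || tok[5] != '\'' || outlen < 5)
            return KEY_INVALID;
        memcpy(out, tok + 1, 4);
        out[4] = '\0';
        return KEY_HFS_TYPE;
    }

    /* The leading period on a filename extension is optional. */
    if (tok[0] == '.') {
        ++tok;
        --len;
    }
    if (len == 0 || len >= outlen)
        return KEY_INVALID;

    for (i = 0; i < len; ++i)
        out[i] = (char)tolower((unsigned char)tok[i]);
    out[len] = '\0';
    return KEY_EXTENSION;
}
```

With this, the tokens of "image/jpeg jpeg jpg jpe 'JPEG'" after the MIME type would yield three extension keys and one file type key, and both kinds could share one table as long as the stored key keeps its quote or some other marker.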

Apache Directive Syntax:
        The current syntax merely implies filename extension mapping. Even the period 
prefix which can be used to signify that an entry is a filename extension is optional 
(like "Addtype image/gif .gif" is the same as "Addtype image/gif gif"). To implement a 
new file mapping directive syntax I can either:

a) Create a new set of directives, similar to the current ones. For example 
"AddTypeHFS image/gif 'GIF '"

b) Do the same, and also change the filename extension syntax to be more specific. For 
example "AddExtensionMimeMapping" or "AddTypeExt"

c) Use the same directive, but change the syntax to allow file type entries like 
"AddType image/jpeg jpeg jpg jpe 'JPEG'"

One possible problem is that a file type isn't limited to 7-bit ASCII. If this is 
important (please tell me) I can change the syntax to "0xFFFFFFFF" or "'0xFFFFFFFF". I 
don't anticipate a problem with upper-ASCII characters since they will be platform 
specific anyway (and will live in their own separate config file).

        The difficulty in implementation is about the same.
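
For what it's worth, accepting both a quoted four-character code and a hexadecimal form could look roughly like the sketch below. The function name parse_file_type() is hypothetical, and the exact hex spelling is an assumption (I take it to be "0x" followed by exactly eight hex digits). The four characters are packed most significant byte first, which is the usual way these codes are written out.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse a file type token into a 32-bit value.
 * Accepted forms (both are assumptions of this sketch):
 *   'ABCD'      four raw bytes between single quotes
 *   0xXXXXXXXX  exactly eight hex digits
 * Returns 1 on success, 0 if the token is neither form.
 */
static int parse_file_type(const char *tok, uint32_t *out)
{
    size_t len = strlen(tok);
    size_t i;

    if (len == 6 && tok[0] == '\'' && tok[5] == '\'') {
        /* unsigned char keeps upper-ASCII bytes from sign-extending */
        const unsigned char *c = (const unsigned char *)tok + 1;
        *out = ((uint32_t)c[0] << 24) | ((uint32_t)c[1] << 16) |
               ((uint32_t)c[2] << 8)  |  (uint32_t)c[3];
        return 1;
    }

    if (len == 10 && tok[0] == '0' && (tok[1] == 'x' || tok[1] == 'X')) {
        for (i = 2; i < len; ++i)
            if (!isxdigit((unsigned char)tok[i]))
                return 0;
        *out = (uint32_t)strtoul(tok + 2, NULL, 16);
        return 1;
    }

    return 0;
}
```

Under these assumptions 'JPEG' and 0x4A504547 parse to the same value, so a config file could use the hex form wherever a type contains bytes that are awkward to type.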

Implementation in source code:
        There is no cost in fetching the file type so long as you do it while fetching 
the filename (where is this done?). If you try to fetch a file type when it isn't set, 
or from a filesystem which doesn't support them, the call returns null. Thus there is 
no reason not to always have this option on when running on Darwin. The primary 
difference between filename extensions and file types is that Apache matches filename 
extensions in a case-insensitive manner, while file types are any 32-bit value (or 
four-char code).
        Apache does this by converting filename extensions to lower case before 
entering them in the runtime database, then converting the filename extension in 
question to lower case, then calling strcmp(). Personally I don't understand why 
strcasecmp() wasn't used instead, perhaps because the current approach is slightly 
faster. I don't know many filename extensions 
which are case dependent (.C vs .c, .Z vs .z) so the option of making it case 
sensitive given a directive would be rather pointless. Anyway I'm not going to 
challenge that decision (unless somebody else does) so I'm not going to muck with it.
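
To illustrate the lowercase-then-strcmp() approach described above, here is a minimal standalone sketch. It is not Apache's code: the table, its contents, and lookup_ext() are all made up for the example, and a fixed array stands in for the runtime database.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

struct ext_map {
    const char *ext;   /* stored already lowercased */
    const char *mime;
};

/* Illustrative entries only. */
static const struct ext_map ext_table[] = {
    { "gif", "image/gif" },
    { "jpg", "image/jpeg" },
    { "z",   "application/x-compress" },
};

/* Lowercase the incoming extension once, then compare exactly. */
static const char *lookup_ext(const char *ext)
{
    char key[32];
    size_t len = strlen(ext);
    size_t i;

    if (len >= sizeof key)
        return NULL;
    for (i = 0; i <= len; ++i)   /* <= len also copies the terminator */
        key[i] = (char)tolower((unsigned char)ext[i]);

    for (i = 0; i < sizeof ext_table / sizeof ext_table[0]; ++i)
        if (strcmp(ext_table[i].ext, key) == 0)
            return ext_table[i].mime;
    return NULL;
}
```

The point of normalizing first is that every later comparison is a plain byte comparison. A file type key could not go through the same path, since 'JPEG' and 'jpeg' are different 32-bit values.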
        However, the lowercasing does mean that if I want to use the same database 
entry for file types, I will have to check whether the first character is "'". The 
current code does this in several places (which makes me wonder why there isn't a macro):

if (*ext == '.')
        ++ext;

ap_str_tolower(ct);

so I would have to change the code to something effectively like:

if (*ext == '.')
{
        ++ext;
        ap_str_tolower(ct);
}
else if (*ext != '\'')  /* no leading quote: assume filename extension */
        ap_str_tolower(ct);
/* otherwise a quoted file type: leave its case untouched */

        If this were a macro, the #ifdef __APPLE__ would only be in one place instead 
of every directive function.
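
Such a macro might look something like the sketch below. The names MIME_IS_FILE_TYPE and MIME_NORMALIZE_KEY are hypothetical, and str_tolower() is only a stand-in for ap_str_tolower() so the fragment compiles on its own. I also assume here that the key itself is what gets lowercased.

```c
#include <ctype.h>

/* Stand-in for Apache's ap_str_tolower(); lowercases a string in place. */
static void str_tolower(char *s)
{
    for (; *s; ++s)
        *s = (char)tolower((unsigned char)*s);
}

/* The only platform test: a leading quote marks a file type on Darwin. */
#ifdef __APPLE__
#define MIME_IS_FILE_TYPE(key) (*(key) == '\'')
#else
#define MIME_IS_FILE_TYPE(key) 0
#endif

/*
 * Hypothetical helper for the directive functions: skip an optional
 * leading period, then lowercase the key unless it is a file type.
 * key must be a modifiable char * lvalue.
 */
#define MIME_NORMALIZE_KEY(key)          \
    do {                                 \
        if (*(key) == '.')               \
            ++(key);                     \
        if (!MIME_IS_FILE_TYPE(key))     \
            str_tolower(key);            \
    } while (0)
```

Each directive function would then call MIME_NORMALIZE_KEY(ext) once, and on other platforms the file type test compiles away to a constant.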

        If this modification to mod_mime.c were done, I would also want to add a 
directive to change which mapping had priority. Thus one Darwin user may prefer 
filename extension mapping, another may prefer file type mapping, and another may want 
filename extension mapping in one directory only.
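
The priority choice itself is small. As a sketch, assuming a hypothetical per-directory setting (the enum and choose_type() are invented names, not existing Apache ones), and assuming each lookup returns NULL when it finds no mapping:

```c
#include <stddef.h>

/* Hypothetical per-directory setting. */
typedef enum { PREFER_EXTENSION, PREFER_FILE_TYPE } mime_priority;

/*
 * Pick the MIME type from the preferred mapping, falling back to the
 * other one when the preferred lookup found nothing (NULL).
 */
static const char *choose_type(mime_priority prio,
                               const char *by_ext,
                               const char *by_file_type)
{
    const char *first  = (prio == PREFER_FILE_TYPE) ? by_file_type : by_ext;
    const char *second = (prio == PREFER_FILE_TYPE) ? by_ext : by_file_type;

    return first ? first : second;
}
```

Because the setting would live in the per-directory configuration, one directory could prefer extensions while the rest of the server prefers file types.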

        If you want me to develop a separate module the priority issue will have to be 
addressed because load order doesn't allow the user to change the priority on a 
directory-specific basis. 

        If you have any suggestions or questions, please go right ahead. I'm partial 
to using the existing directives with a slightly modified syntax and a modified 
mod_mime. All existing config files would work as-is and the source will probably be 
smaller. All I really need to know is which method stands the greatest chance of being 
accepted, where Apache reads the filename, and how to submit the changes. 

Thank you for your time. And remember, vote soon and often |-)
