"Roy T. Fielding" <[EMAIL PROTECTED]> writes: > > Apache 1.3 on Win32 assumes that the names of files served are comprised > > solely of characters from character sets which are a superset of ASCII, > > such as UTF-8 or ISO-8859-1. It has no logic to determine whether or not > > You wanted to say "from character encodings that are a superset".
It is a good thing that I wasn't working on the license :)

Names of file-based resources with Apache 1.3 on Win32

Apache 1.3 on Win32 assumes that the names of files served are comprised
solely of characters from character encodings that are a superset of
ASCII, such as UTF-8 or ISO-8859-1.  By superset, we mean encodings where
the non-ASCII characters do not reuse byte values 0x00-0x7F.  Apache has
no logic to determine whether or not a possible file name contains
invalid characters.  It has no logic to properly match actual non-ASCII
file names with names specified in the Apache configuration file.

Because Apache does not verify that the characters in file names are all
from a valid character encoding, files containing various invalid
characters in their names can be successfully served by Apache.  However,
this is not recommended for the following reasons:

1) Because Apache is unable to properly match actual non-ASCII file
   names with names in the Apache configuration file, taking into
   account any case folding or other transformations handled by the
   operating system when looking up files or otherwise matching file
   names, directives in the Apache configuration file may or may not be
   in force, depending on how the HTTP client specifies the resource.
   This may be a security concern, depending on your configuration.

2) Because Apache assumes that file names are from a character encoding
   which is a superset of ASCII, some of the checks it makes when
   validating file names will flag certain non-ASCII characters as
   invalid.  For example, Apache on Win32 will flag a file name
   containing the ASCII character '|' (0x7C) as invalid.  This logic
   will flag any file name containing the byte 0x7C as invalid, even if
   that byte does not represent '|' in the local character encoding.
   There are other characters checked for as well.  Because of these
   checks, even if there are no security issues in your configuration,
   many Unicode characters or other wide characters cannot be used.

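To make the 0x7C case concrete, here is a small Python sketch.  It is not
Apache's code: flags_as_invalid() and reuses_ascii_bytes() are hypothetical
helpers written only for illustration, and Shift_JIS is my own choice of an
example encoding whose multibyte characters can contain bytes in the
0x00-0x7F range.

```python
# Illustration only: a stand-in for a byte-level file name check of the
# kind described above.  The helper names are hypothetical, not Apache's.

FLAGGED_BYTE = 0x7C  # ASCII '|'


def flags_as_invalid(raw_name: bytes) -> bool:
    """Return True if the raw file name contains the byte 0x7C."""
    return FLAGGED_BYTE in raw_name


def reuses_ascii_bytes(text: str, encoding: str) -> bool:
    """Return True if any non-ASCII character in text is encoded using
    at least one byte in the range 0x00-0x7F."""
    for ch in text:
        if ord(ch) > 0x7F and any(b <= 0x7F for b in ch.encode(encoding)):
            return True
    return False


if __name__ == "__main__":
    # U+30DD (KATAKANA LETTER PO) followed by ".html"; there is no '|' here.
    name = "\u30dd.html"
    for enc in ("utf-8", "shift_jis"):
        raw = name.encode(enc)
        print(enc, raw.hex(), flags_as_invalid(raw),
              reuses_ascii_bytes(name, enc))
```

Under UTF-8 every byte of a non-ASCII character is 0x80 or above, so the
check stays quiet.  Under Shift_JIS the same character is encoded with a
second byte of 0x7C, so a byte-level check rejects a name that contains no
'|' at all.
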
Because of the lack of proper support for non-ASCII characters in file
names, it is recommended that administrators not attempt to use any
non-ASCII characters in file names.  Any other configuration is
unsupported.

Apache 2.0 introduces the UTF-8 convention to access any filenames and
resources in a predictable and safe manner.  The implementation of this
feature is too extensive to consider backporting to Apache 1.3.

-- 
Jeff Trawick | [EMAIL PROTECTED]
Born in Roswell... married an alien...
