On 1/21/2011 2:22 PM, William A. Rowe Jr. wrote:
>
> # Defaults to off, set to On to preserve %2F, Decode to use '/',
> # or choose a unique pattern to avoid path exploits directed at
> # back end servers. Note that either On or Decode can be very
> # risky to back end servers, and may circumvent either httpd's
> # or back end server access restrictions. A third option to
> # accept encoded slashes is to assign them to a special value
> # which would not cause httpd or back end servers to treat them
> # as special characters, as one example, for the private
> # unicode point F02F to represent %2f, enable
> # AllowEncodedSlashes %ef%80%af
>
> I would suggest we push for recognition of second-meanings of the basic
> ASCII characters %00 - %7F as unicode private code points F000-F07F,
> and coordinate this allocation with the ConScript effort;
> http://en.wikipedia.org/wiki/ConScript_Unicode_Registry

That needs to be %ee%bc%af (the UTF-8 encoding of EF2F) in the example
above. In fact, it should not be replaced with an encoded string, but with
the actual Unicode pattern, e.g. a three-octet sequence. Let me think about
the syntax for a bit, so that UTF-8 is simpler to read and type in the
config. I expect that there will be a difference between "%ee%bc%af" and
unquoted %ee%bc%af, both of which would be valid but would have completely
different meanings (either escaped or un-escaped/decoded, respectively).

I goofed. To dodge the WGL4, some win32 silliness, and the AGL, it looks
like it needs a different range. The EF00-EF7F range looks appropriate, and
I'm willing to advance this first as a ConScript proposal, and later raise
it to the Unicode body. Essentially the definition would state something
like:

    Many network transport, storage and carriage control layers use the
    basic ASCII set of characters to define specific behaviors. One of the
    earliest examples is the NULL termination behavior of the 0000 code
    point for string storage. This convention proposes adoption of PUA
    code points EF00 - EF7F to map such reserved characters for safe
    transport for presentation purposes, avoiding all primary-use
    behaviors of their 0000-007F encoding by transport, storage or
    carriage control protocols.

    As two examples, the code point EF2F represents the '/' FORWARD_SLASH
    character independent of and explicitly prohibiting the function of
    002F as a file path delimiter, while the code point EF00 could be used
    to pass a 0000 glyph which averts NULL termination of the character
    string.

I just happen to like the EF00 code page since the implicit 'y' in the
middle of the word should be voiced in its pronunciation.
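
For anyone who wants to check the byte sequences by hand, here is a rough Python sketch of the mapping described above. It is only an illustration under my own assumptions: the names PUA_BASE, shadow, unshadow and percent_utf8 are made up for this example and are not part of httpd or of any proposal text, and the default set of "reserved" characters ('/' and NUL) is simply the two examples mentioned above.

```python
# Hypothetical sketch: map reserved ASCII characters onto the proposed
# PUA range EF00-EF7F and show the percent-encoded UTF-8 form.

PUA_BASE = 0xEF00  # start of the proposed range (assumption of this sketch)


def shadow(text: str, reserved: str = "/\x00") -> str:
    """Replace each reserved ASCII character with its PUA counterpart."""
    return "".join(
        chr(PUA_BASE + ord(ch)) if ch in reserved and ord(ch) < 0x80 else ch
        for ch in text
    )


def unshadow(text: str) -> str:
    """Map any code point in EF00-EF7F back to its ASCII original."""
    return "".join(
        chr(ord(ch) - PUA_BASE) if PUA_BASE <= ord(ch) <= PUA_BASE + 0x7F else ch
        for ch in text
    )


def percent_utf8(ch: str) -> str:
    """Percent-encode every UTF-8 octet of a string, in lowercase hex."""
    return "".join(f"%{b:02x}" for b in ch.encode("utf-8"))


if __name__ == "__main__":
    print(percent_utf8(chr(0xF02F)))   # %ef%80%af (the originally quoted point)
    print(percent_utf8(chr(0xEF2F)))   # %ee%bc%af (shadow of '/')
    print(percent_utf8(chr(0xEF00)))   # %ee%bc%80 (shadow of NUL)
    print(unshadow(shadow("a/b")) == "a/b")  # True
```

The point of the round trip is that the shadowed string contains no 002F or 0000 at all, so nothing between the front end and the back end has a chance to treat it as a path delimiter or a terminator; only the final consumer maps it back for presentation.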
