Adam M. Costello wrote:
>> As an implementor of a web server, how do I know how to interpret
>> the "host:" field in the HTTP header? If I see something that looks
>> like ACE, I suppose I would need to decode it to the original UTF-8
>> to compare with my virtual domain definitions.
>
> No, you wouldn't. If you don't know anything about IDNs, then in your
> mind all host names are ASCII, so you would do an ASCII comparison
> between the Host: field (which is ASCII) and your virtual domain
> definitions (which are also ASCII), and that would work just fine.

This is true only in the mind of an IETF hacker. In the mind of the
common man, a host name is a name composed of letters, digits and a few
other characters. Any letters, not just English letters. The limitation
to English letters is an artificial limit and is not accepted by the
common man. When entering host names, no matter where, the common man
will use the native character set, with any letters used in the native
language.

The same applies to URLs and URIs: they can contain any letter. Today
lots of people use non-ASCII letters in URLs, embed them in links in
HTML, and enter them in browsers. Quite often they work as expected.

The common man cannot understand why the IETF does not take the easy
route and change its artificial and incomprehensible limits on the
characters allowed. Just update the current RFCs to allow any character
outside the ASCII range. It is time everybody faced the facts: if you
define an RFC restricting things to ASCII where it is natural to allow
any letter, people will ignore the RFC and use any letter anyway.

There is so much talk about backward compatibility, but you only think
about ASCII. There is a lot of unofficial use of non-ASCII in URLs,
host names and other places. We must support that as well.

Dan
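[Editor's note: the mechanism Costello describes, an IDN-aware client converting a Unicode name to its ASCII-compatible (ACE) form while an IDN-unaware server matches the Host: field by plain byte comparison, can be sketched as below. This is a hedged illustration using Python's built-in idna codec for the ACE conversion; the host names and paths are made-up examples, not from the original message.]

```python
# Sketch: the client encodes a Unicode host name to its ACE form; the
# server, knowing nothing about IDNs, resolves virtual hosts by plain
# ASCII string comparison. All names below are hypothetical examples.

def client_host_header(unicode_host: str) -> str:
    # An IDN-aware client converts each label to the ACE form before
    # the name ever appears on the wire (Python's "idna" codec).
    return unicode_host.encode("idna").decode("ascii")

# Server-side virtual host table, stored in ACE form, ASCII only.
vhosts = {
    "xn--bcher-kva.example": "/srv/buecher",
    "plain.example": "/srv/plain",
}

def resolve_vhost(host_header: str) -> str:
    # Pure ASCII lookup: the server never decodes the ACE form.
    return vhosts.get(host_header.lower(), "/srv/default")

wire_host = client_host_header("bücher.example")
print(wire_host)                 # ACE form sent in the Host: field
print(resolve_vhost(wire_host))  # matched by ASCII comparison alone
```

The point of the sketch is Costello's: once the conversion happens on the client side, the server's existing ASCII comparison continues to work unchanged.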
