[Sylvain]
> Would there be any interest in asking the HTTP-BIS working group [1] what
> they think about it?
>
> Currently I couldn't find anything in their drafts suggesting they had
> decided to clarify this issue from a protocol's perspective but they might
> consider it to be relevant to their goals.
>
> - Sylvain
>
> [1] http://www.ietf.org/html.charters/httpbis-charter.html

I checked the current version of their replacement for RFC 2616. It says

"""
2.1.3. URI Comparison

When comparing two URIs to decide if they match or not, a client SHOULD use a case-sensitive octet-by-octet comparison of the entire URIs
"""

Which doesn't work if the two URIs to be compared are in different encodings.

I did find this page on the W3C site which at least explains the issues, and does a survey of existing modern browsers for how they encode URIs and IRIs.

http://www.w3.org/International/articles/idn-and-iri/

"""
Paths

The conversion process for parts of the IRI relating to the path is already supported natively in the latest versions of IE7, Firefox, Opera, Safari and Google Chrome. It works in Internet Explorer 6 if the option in Tools>Internet Options>Advanced>Always send URLs as UTF-8 is turned on. This means that links in HTML, or addresses typed into the browser's address bar will be correctly converted in those user agents.

It doesn't work out of the box for Firefox 2 (although you may obtain results if the IRI and the resource name are in the same encoding), but technically-aware users can turn on an option to support this (set network.standard-url.encode-utf8 to true in about:config).

Whether or not the resource is found on the server, however, is a different question. If the file system is in UTF-8, there should be no problem. If not, and no mechanism is available to convert addresses from UTF-8 to the appropriate encoding, the request will fail.

Files are normally exposed as UTF-8 by servers such as IIS and Apache 2 on Windows and Mac OS X. Unix and Linux users can store file names in UTF-8, or use the mod_fileiri module mentioned earlier. Version 1 of the Apache server doesn't yet expose filenames as UTF-8.

You can run a basic check whether it works for your client and resource using this simple test.

Note that, while the basics may work, there are other somewhat more complicated aspects of IRI support, such as handling of bidirectional text in Arabic or Hebrew, which may need some additional time for full implementation.
"""

Alan.

_______________________________________________
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
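
As a footnote to the draft's section 2.1.3 quoted above, here is a minimal Python sketch of why an octet-by-octet comparison fails when two URIs were percent-encoded from different character encodings. The helper names `percent_encode_path` and `same_resource` and the sample path `/café` are hypothetical, chosen only for illustration, and the sketch assumes the encoding used on each side is known, which is precisely the information the protocol does not carry.

```python
from urllib.parse import quote, unquote_to_bytes


def percent_encode_path(path: str, encoding: str) -> str:
    """Encode the path to bytes in the given charset, then percent-encode it.

    Hypothetical helper: mimics what a client does when it sends a
    non-ASCII path using a particular character encoding.
    """
    return quote(path.encode(encoding), safe="/")


def same_resource(uri_a: str, enc_a: str, uri_b: str, enc_b: str) -> bool:
    """Compare two percent-encoded paths by the characters they represent.

    Assumption: the caller knows which encoding each URI was built from.
    """
    chars_a = unquote_to_bytes(uri_a).decode(enc_a)
    chars_b = unquote_to_bytes(uri_b).decode(enc_b)
    return chars_a == chars_b


# The same logical path, as sent by two clients using different encodings.
utf8_uri = percent_encode_path("/café", "utf-8")
latin1_uri = percent_encode_path("/café", "iso-8859-1")

# Case-sensitive octet-by-octet comparison, as the draft recommends.
octets_match = utf8_uri == latin1_uri

# Comparison after decoding each URI with its own (assumed known) encoding.
characters_match = same_resource(utf8_uri, "utf-8", latin1_uri, "iso-8859-1")

print(utf8_uri, latin1_uri, octets_match, characters_match)
```

The two URIs differ as octet strings even though they name the same path, so the draft's comparison reports a mismatch. The character-level comparison only succeeds here because the sketch is told both encodings up front.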