On Mon, 15 Mar 2004, Jørgen Hovland wrote:

> 0x00c0 352e 313b 202e 4e45 5420 434c 5220 312e 5.1;..NET.CLR.1.
> 0x00d0 312e 3433 3232 290d 0a48 6f73 743a 2077 1.4322)..Host:.w
> 0x00e0 7777 2e6a f872 6765 6e2e 6e75 0d0a 436f ww.j.rgen.nu..Co

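To make the byte in question concrete, here is a minimal Python sketch that takes the Host header value from the dump above and checks it two ways. The helper names (`is_plain_hostname`, `is_valid_utf8`) and the letters/digits/hyphen/dot alphabet are assumptions made for illustration; none of this is Squid's actual hostname check.

```python
import string

# Host header value copied from the quoted dump: "www.j<0xf8>rgen.nu".
host = bytes.fromhex("7777772e6af87267656e2e6e75")

# Assumed alphabet for a conventional host name: ASCII letters,
# digits, hyphen and dot.
HOSTNAME_BYTES = frozenset(
    (string.ascii_letters + string.digits + "-.").encode("ascii")
)


def is_plain_hostname(data: bytes) -> bool:
    """True if every byte is in the assumed host name alphabet."""
    return len(data) > 0 and all(b in HOSTNAME_BYTES for b in data)


def is_valid_utf8(data: bytes) -> bool:
    """True if the bytes decode as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The same bytes do decode as ISO-8859-1, where 0xf8 is the letter o-slash.
as_latin1 = host.decode("iso-8859-1")

print(is_plain_hostname(host))  # False
print(is_valid_utf8(host))      # False
```

Both checks fail on the single byte 0xf8, while the ISO-8859-1 view shows what the browser most likely meant to send.
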
This is clearly an invalid HTTP request:

a) 0xf8 is not in the set of valid characters in an HTTP host name
b) 0xf8 is also not valid UTF-8 (that byte never occurs in a valid
   UTF-8 sequence).

Loosely speaking, it can be argued that HTTP should in the future be
modified to support UTF-8 encoding, in line with the internationalisation
guidelines from the IETF. There is, however, a lot of work remaining
before that point is officially reached.

If you want to send requests like this today, you MUST use a browser
which supports the established IDN transition encoding of non-ASCII host
names. Any other use is outside of any standard, and things will fail.

> I see that IE sends an url encoded GET line, which is what it is
> supposed to do. I would put my 5 euro in that this is a non-implemented
> squid feature.

Yes and no. What you are trying to do is outside of any HTTP standard,
and in addition violates the existing standards and guidelines on how it
should be done.

> I'm simply interested in getting things to work. Permitting such domains
> will accomplish this task. I have no control over all the browsers that
> will be using our proxy.

Neither has Squid.

> As you probably are aware of, there are probably more browsers out there
> not IDN capable than capable of IDN.

I am well aware of this, and it is an initial pain which the network has
to go through before IDN gets widely accepted. Until then, IDN names are
plainly not reliable as a technology and should be seen as a bonus, not a
must. Anyone selling IDN names without making sure the customer knows
this is close to committing fraud.

> Rejecting such domains in a proxy software is not going to help anyone.

There are limits on how much brain damage one adds to the proxy to work
around browsers not following the standards, but if you find reasonable
methods to deal with this brain damage you are welcome to submit a patch.

> The smartest thing would be to automatically translate to IDN in squid
> directly (as an optional choice of course).

We have considered adding IDN encoding of UTF-8 host names to Squid, but
so far nobody has submitted any patches for doing so. IDN encoding from
other encodings is questionable.

If you use --disable-hostname-checks then the IDN encoding can be done
in a redirector helper to Squid if desired, without having to modify
Squid. Or, even better, if your DNS supports UTF-8 (or whatever encoding
you use) then no modifications to Squid are needed. The exception is
when your client tries to follow the standards despite being given what
is (according to the standards) garbage input and %nn-encodes the
hostname, in which case this has to be undone (which can also be done in
the redirector helper without modifying Squid).

The use of ISO-8859-X encodings within DNS for host/domain names is a
bastard which in my opinion should never have been let loose on the
Internet. Any DNS operator accepting ISO-8859-X encodings can be
seriously questioned as to whether they consider DNS an important
infrastructure for the Internet or just a quick way to earn money with
no respect for the function DNS provides. The fact that the DNS protocol
is encoding-independent does not warrant such abuse of the DNS hostname
scheme (which is anything but encoding-independent).

Regards
Henrik
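
For illustration only, the conversion step that such a redirector helper would need might look roughly like the Python sketch below. The function name `host_to_idn` is hypothetical, the loop that reads requests from Squid and writes rewritten URLs back is left out, and the sketch relies on Python's built-in "idna" codec rather than anything shipped with Squid. It undoes any %nn encoding, accepts UTF-8 input only, and produces the ASCII IDN form of the name.

```python
from urllib.parse import unquote_to_bytes


def host_to_idn(raw_host: bytes) -> str:
    """Return the ASCII IDN form of a UTF-8 (optionally %nn-encoded) host.

    Hypothetical helper: raises UnicodeDecodeError when the input is not
    UTF-8, for example ISO-8859-X bytes such as 0xf8.
    """
    # Undo %nn escaping that a client may have applied to the hostname.
    host_bytes = unquote_to_bytes(raw_host)
    # Only UTF-8 is accepted; other encodings are rejected, not guessed.
    host = host_bytes.decode("utf-8")
    # The stdlib "idna" codec converts each label to its ASCII
    # ("xn--...") form and leaves plain ASCII labels unchanged.
    return host.encode("idna").decode("ascii")


if __name__ == "__main__":
    print(host_to_idn(b"www.j%C3%B8rgen.nu"))
```

Rejecting anything that is not UTF-8, instead of guessing an ISO-8859-X variant, keeps the helper in line with the point above that IDN encoding from other encodings is questionable.
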