"Gisle Vanem" <[EMAIL PROTECTED]> writes: > "Hrvoje Niksic" <[EMAIL PROTECTED]> said: > >> > If it where not for the "Host:" header, the name could remain >> > un-escaped. I don't know what the standard say about this case. >> > Should the header contain "Host:www.xn--troms-zua.no" ? >> >> The Host header is (I think) not URL-escaped, so we can simply send >> the 8-bit characters as we received them. >> >> Here's a patch; please let me know if it works for you. > > It works kind off; wget resolves the name okay. The problem is that > www.troms�.no is served by a virtual server that gives you what's specified > in the "Host:" header. So doing > wget www.xn--troms-zua.no > > gives the "correct" page while "wget www.troms�.no" does not (the > same in IE also). > > IMHO for this to work, wget needs to know the ACE encoded name prior > to resolving and building the HTTP header. Not a trivial task.
Oh well. I guess the HTTP code that emits the `Host' header could eventually be taught to encode it into something "smart". For now, I think I'll leave this patch in -- processing %HH in host names (and passing through binary characters in them) seems like a good idea anyway. Thanks for testing this.
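
For reference, here is a rough sketch of what "smart" encoding of the `Host' header could look like. It is Python rather than wget's C, the function names are made up for illustration, and it assumes the %HH escapes decode to UTF-8, which may well not hold for names typed in a Latin-1 locale. It leans on Python's built-in "idna" codec (IDNA 2003) to produce the ACE form; it is not what the patch does.

```python
from urllib.parse import unquote_to_bytes


def host_to_ace(raw_host: str, charset: str = "utf-8") -> str:
    """Hypothetical helper: undo %HH escapes, then ACE-encode the name.

    Assumption: the escaped bytes are in `charset` (UTF-8 by default).
    """
    # Step 1: process %HH sequences, yielding the raw 8-bit host name.
    decoded = unquote_to_bytes(raw_host).decode(charset)
    # Step 2: convert each non-ASCII label to its "xn--" form.
    # Plain ASCII names pass through unchanged.
    return decoded.encode("idna").decode("ascii")


def host_header(raw_host: str) -> str:
    """Hypothetical helper: build the header line from a host name."""
    return "Host: " + host_to_ace(raw_host)


if __name__ == "__main__":
    print(host_header("www.troms%C3%B8.no"))
    print(host_header("www.troms\u00f8.no"))
    print(host_header("www.example.com"))
```

The same ACE name would have to be used for the DNS lookup as well, which is the part Gisle describes as not trivial.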
