Re: wget and international characters (ascii > 127)

Hrvoje Niksic Wed, 19 Oct 2005 04:06:44 -0700

Olav Mørkrid <[EMAIL PROTECTED]> writes:

> hrvoje, you say that wget will presume utf-8,


I didn't say that.  I said that the two-byte sequence presumably
(i.e. presumed by me) represents UTF-8, and that Wget leaves it as-is.

> but then wget should have decoded %C3%AD to an accented i (í). but
> today wget simply decodes the characters one by one,

At least on Unix, that seems exactly the right thing to do.  How
should Wget know what the encoding of the file system is?
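
To illustrate why the byte-by-byte decoding is ambiguous, here is a
minimal Python sketch (not Wget code, just an illustration).  It
decodes %C3%AD to raw bytes and then interprets those same bytes under
two encodings; the choice of Latin-1 as the second encoding is my
assumption for the sake of the example.

```python
from urllib.parse import unquote_to_bytes

# Percent-decoding yields bytes, not characters.
raw = unquote_to_bytes("%C3%AD")

# Read as UTF-8, the two bytes form one character: i with acute accent.
as_utf8 = raw.decode("utf-8")

# Read as Latin-1, the same two bytes are two separate characters:
# A with tilde (U+00C3) followed by a soft hyphen (U+00AD).
as_latin1 = raw.decode("latin-1")

print(raw, len(as_utf8), len(as_latin1))
```

The bytes are identical in both cases; only the encoding assumed by
whoever reads the file name differs.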

> wouldn't the correct thing be NOT to decode escaped characters (at
> least over 127), because it could mean anything depending on the
> page author's intention or assumed encoding.

You have a point there.
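
A rough sketch of what that suggestion could look like, in Python
rather than Wget's C.  The function name unescape_ascii_only is
hypothetical and this is not how Wget currently behaves; it only
shows the idea of decoding escapes below 128 and leaving the rest
untouched.

```python
import re

# Matches one percent-escape such as %20 or %C3.
_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")

def unescape_ascii_only(s):
    """Decode %XX only when XX is below 0x80; keep other escapes as-is."""
    def repl(match):
        value = int(match.group(1), 16)
        # ASCII escapes are unambiguous; anything higher depends on an
        # encoding we cannot know, so the escape is left in place.
        return chr(value) if value < 128 else match.group(0)
    return _ESCAPE.sub(repl, s)

print(unescape_ascii_only("caf%C3%A9%20menu"))
```

With this approach the file name keeps %C3%A9 literally, so no guess
about the author's encoding is baked into it.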
