On 2017-02-22, at 17:38, Thomas Dickey wrote:

> On Wed, Feb 22, 2017 at 10:32:24PM +0200, Dimitrios Semitsoglou-Tsiapos wrote:
>> Greetings Lynx developers and users!
>> 
>> I have noticed that in `-dump` mode lynx will percent-encode reserved
>> characters in the "list of links" if `-display_charset=UTF-8` is set (or
>> perhaps any value other than ISO-8859-1). This can cause some URLs to
>> effectively break.
>> 
>> Would it perhaps be correct to simply ignore `display_charset` while
>> printing these URLs?
> 
> not really - it's generating the file (not passing it on), and is
> using a known encoding.
>  
From: https://tools.ietf.org/html/rfc1738
   2.2. URL Character Encoding Issues
    ...
   URLs are written only with the graphic printable characters of the
   US-ASCII coded character set. The octets 80-FF hexadecimal are not
   used in US-ASCII, and the octets 00-1F and 7F hexadecimal represent
   control characters; these must be encoded.

So non-US-ASCII characters, including every octet of a multi-byte UTF-8
sequence, must be percent-encoded.
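
As a rough illustration of that rule (a sketch only, not how Lynx itself
does it; the function name `encode_non_ascii` is made up for this example,
and it assumes the URL's non-ASCII characters are serialized as UTF-8):

```python
def encode_non_ascii(url: str) -> str:
    """Percent-encode the octets RFC 1738 section 2.2 says must be encoded.

    Hypothetical helper for illustration; assumes UTF-8 as the byte encoding.
    """
    out = []
    for byte in url.encode("utf-8"):
        # Control characters (00-1F, 7F) and octets 80-FF are not
        # printable US-ASCII, so they are written as %XX.
        if byte <= 0x1F or byte >= 0x7F:
            out.append(f"%{byte:02X}")
        else:
            # Printable US-ASCII is passed through unchanged.
            out.append(chr(byte))
    return "".join(out)


print(encode_non_ascii("http://example.com/caf\u00e9?q=1"))
# prints http://example.com/caf%C3%A9?q=1
```

Note that this sketch leaves reserved ASCII characters such as `?` and `=`
alone; it encodes only the octets named in the passage quoted above.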

-- gil


_______________________________________________
Lynx-dev mailing list
[email protected]
https://lists.nongnu.org/mailman/listinfo/lynx-dev
