On Thu, Feb 23, 2017 at 12:28:52PM +0200, Dimitrios Semitsoglou-Tsiapos wrote:
> On Wed 22-Feb-17 19:38, Thomas Dickey wrote:
> > On Wed, Feb 22, 2017 at 10:32:24PM +0200, Dimitrios Semitsoglou-Tsiapos
> > wrote:
> > > Greetings Lynx developers and users!
> > >
> > > I have noticed that in `-dump` mode lynx will percent-encode reserved
> > > characters in the "list of links" if `-display_charset=UTF-8` is set (or
> > > perhaps any value other than ISO-8859-1). This can cause some URLs to
> > > effectively break.
> > >
> > > Would it perhaps be correct to simply ignore `display_charset` while
> > > printing these URLs?
> >
> > not really - it's generating the file (not passing it on), and is
> > using a known encoding.
> >
>
> I am probably misinterpreting the problem, so I will give an example. I
> have received email from ebay where they encode URLs multiple times
> within all their links. For example, here's three successive (but not
> necessarily consecutive) chunks of a single URL:
>
> HTML source                         lynx -dump
> ---------------------------------   ---------------------------
> http://rover.ebay.com               http://rover.ebay.com
> https%3A%2F%2Fsvcs.ebay.com         https://svcs.ebay.com
> L%252B                              L%2B
> http%253A%252F%252Frover.ebay.com   http%3A%2F%2Frover.ebay.com
>
> From those I have come up with a minimal example (they probably encode
> too much personal information in their arguments for me to upload the
> whole URL).

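The four chunk pairs quoted above are consistent with exactly one level of percent-decoding being removed between the HTML source and the dumped text. The sketch below only checks that observation against the quoted chunks. It uses Python's `urllib.parse.unquote` as a stand-in decoder and makes no claim about how lynx processes URLs internally. The names `CHUNKS`, `decode_once`, and `matches_single_decode` are made up for this illustration.

```python
from urllib.parse import unquote

# (HTML source chunk, lynx -dump chunk) pairs copied from the quoted table.
CHUNKS = [
    ("http://rover.ebay.com", "http://rover.ebay.com"),
    ("https%3A%2F%2Fsvcs.ebay.com", "https://svcs.ebay.com"),
    ("L%252B", "L%2B"),
    ("http%253A%252F%252Frover.ebay.com", "http%3A%2F%2Frover.ebay.com"),
]


def decode_once(text):
    """Undo a single level of percent-encoding (one pass, not repeated)."""
    return unquote(text)


def matches_single_decode(pairs):
    """Return True if every dumped chunk equals the source chunk decoded once."""
    return all(decode_once(source) == dumped for source, dumped in pairs)


if __name__ == "__main__":
    for source, dumped in CHUNKS:
        print(f"{source:36} -> {decode_once(source)}")
    print("all pairs match one decoding pass:", matches_single_decode(CHUNKS))
```

If that reading is right, a multiply-encoded argument loses its outermost layer in the dump, so a server that expects the original number of layers would see a different value.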
well... "dump" is formatted.  Why don't you simply use the HTML source?

> Would ebay be at fault here (for their encoding or server handling),
> lynx, or I for using the dumped URL directly?

sounds like the last - I'd use lynx on the HTML file (via -source,
or from some other program such as wget)

-- 
Thomas E. Dickey <[email protected]>
http://invisible-island.net
ftp://invisible-island.net
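
One possible way to act on the suggestion of working from the HTML source: save the page (for example with `lynx -source` or wget) and read the `href` values directly, so their percent-encoding is left exactly as written. This is a minimal sketch using Python's standard `html.parser`. The class `HrefCollector`, the function `extract_hrefs`, and the `example.invalid` URL are hypothetical names for illustration only. Note that the parser still resolves HTML character references such as `&amp;` inside attribute values.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect href attribute values from <a> tags without percent-decoding."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def extract_hrefs(html_text):
    """Return the href values found in html_text, in document order."""
    parser = HrefCollector()
    parser.feed(html_text)
    parser.close()
    return parser.hrefs


if __name__ == "__main__":
    # Made-up sample with a doubly encoded argument, as in the quoted chunks.
    sample = '<a href="http://example.invalid/r?u=http%253A%252F%252Fexample.invalid">x</a>'
    for href in extract_hrefs(sample):
        print(href)
```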
_______________________________________________
Lynx-dev mailing list
[email protected]
https://lists.nongnu.org/mailman/listinfo/lynx-dev
