> From: William Prescott <appledesktop...@gmail.com>
> Date: Fri, 17 Feb 2017 03:34:20 -0500
>
> > I would also like to note that, even when the document's links don't
> > contain a tilde, Wget will still fail to fetch the pages as long as there
> > is a tilde in the URL that Wget was called with.
>
> Let's consider the (UTF-8) URL "http://example.com/~foo/bar.html"
> bar.html is Shift_JIS encoded and contains:
>   <meta http-equiv="Content-Type" content="text/html;charset=Shift_JIS">
>   <a href="baz.html">Baz</a>
>
> (this time, bar.html is perfectly valid Shift_JIS and doesn't have a tilde)
>
> A recursive download will fail, because the relative URL appears to get
> processed as
>   sjis_to_utf8(utf8_to_sjis("http://example.com/~foo/") + sjis("baz.html"))
> resulting in
>   http://example.com/‾foo/baz.html
>
> I would have expected
>   utf8("http://example.com/~foo/") + sjis_to_utf8("baz.html")
> resulting in
>   http://example.com/~foo/baz.html

How should wget know that "http://example.com/~foo/bar.html" comes from a UTF-8 encoding? Where should that piece of information come from?
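
For reference, the two processing orders from the quoted report can be compared with a small Python sketch. This is a toy model, not wget's code: the helpers sjis_to_utf8 and utf8_to_sjis are hypothetical stand-ins that assume a JIS X 0201-style table in which byte 0x7E decodes to U+203E (OVERLINE) and "~" is encoded as byte 0x7E. A real iconv implementation may behave differently.

```python
# Toy model of the conversions named in the report (assumed behaviour,
# limited to ASCII-range input; not wget's or iconv's actual code).
OVERLINE = "\u203e"


def sjis_to_utf8(raw: bytes) -> str:
    """Decode ASCII-range bytes, mapping byte 0x7E to OVERLINE (assumed table)."""
    return raw.decode("ascii").replace("~", OVERLINE)


def utf8_to_sjis(text: str) -> bytes:
    """Encode ASCII-range text; '~' passes through as byte 0x7E (assumed)."""
    return text.encode("ascii")


base = "http://example.com/~foo/"  # base of the URL wget was called with
link = b"baz.html"                 # href bytes as found in bar.html

# Order that appears to be used: the base URL makes a round trip
# through the document's encoding, which is lossy for the tilde.
observed = sjis_to_utf8(utf8_to_sjis(base) + link)

# Order the reporter expected: only the link is converted.
expected = base + sjis_to_utf8(link)

print(observed)
print(expected)
```

Under these assumptions the tilde survives only in the second form. That form, however, depends on already knowing that the command-line URL is not in the document's encoding, which is the question raised above.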