Hi,
I encountered a bug in wget that occurs with recursive retrieval when a
page contains two (or more) links such as:
<a href="http://example.com/~user/blah"> and
<a href="http://example.com/%7Euser/blah">
Both links point to the same page, but the encoding is different. wget
doesn't recognise them as the same page and downloads the page 'blah'
twice; the second download overwrites the first downloaded file.
Also, if you specify the conversion option '-k', only one of the two
links is converted.
I had a quick look at the source code. It can be solved by changing
url_parse in url.c to call url_unescape before parsing the URL. This way
you get the same parsed URL for both links. I am not sure if this is a
good way to solve it. The conversion should probably be similar to the
conversion that's done to determine the file name of the URL.
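To illustrate the idea, here is a minimal sketch. It is not wget code:
normalize_escapes, hex_value and is_unreserved are hypothetical helpers,
not functions from url.c. It assumes a more conservative rule than a
full unescape: only %XX sequences that encode an RFC 3986 "unreserved"
character (letters, digits, '-', '.', '_', '~') are decoded, so escapes
such as %2F that change the meaning of the URL are left alone.

```c
/* Return the numeric value of a hex digit, or -1 if c is not one. */
static int hex_value(unsigned char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* RFC 3986 "unreserved" characters: these never need to be escaped. */
static int is_unreserved(int c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
        || (c >= '0' && c <= '9')
        || c == '-' || c == '.' || c == '_' || c == '~';
}

/* Hypothetical helper (not part of wget): rewrite url in place so that
   %XX escapes of unreserved characters are replaced by the character
   itself. All other escapes, and malformed ones, are copied unchanged. */
static void normalize_escapes(char *url)
{
    char *src = url;
    char *dst = url;

    while (*src) {
        if (src[0] == '%') {
            int hi = hex_value((unsigned char)src[1]);
            /* Only look at src[2] if src[1] was a hex digit, so we
               never read past the terminating NUL. */
            int lo = hi >= 0 ? hex_value((unsigned char)src[2]) : -1;
            if (lo >= 0) {
                int c = hi * 16 + lo;
                if (is_unreserved(c)) {
                    *dst++ = (char)c;
                    src += 3;
                    continue;
                }
            }
        }
        *dst++ = *src++;
    }
    *dst = '\0';
}
```

With something like this applied before the URL is parsed, both
http://example.com/~user/blah and http://example.com/%7Euser/blah would
end up as the same string, while a URL containing %2F would keep that
escape.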
Kind regards,
Bram.