"Ian Abbott" <[EMAIL PROTECTED]> writes:

> what I actually used was more like the following:
>
>   wget -r -l 1 http://somesite/~user/index.html \
>        http://somesite/~user/a.html
>
> which resulted in a.html being downloaded twice.
>
> If I replace the ~'s on the command-line with %7E's then it
> correctly downloads a.html only once.

This is informative; thanks.  Does this patch fix the problem:

2001-12-19  Hrvoje Niksic  <[EMAIL PROTECTED]>

        * recur.c (retrieve_tree): Enqueue the canonical representation of
        start_url, so that the test against dl_url_file_map works.

Index: src/recur.c
===================================================================
RCS file: /pack/anoncvs/wget/src/recur.c,v
retrieving revision 1.40
diff -u -r1.40 recur.c
--- src/recur.c 2001/12/18 22:20:14 1.40
+++ src/recur.c 2001/12/19 14:23:22
@@ -196,8 +196,10 @@
      now.  */
   struct url *start_url_parsed = url_parse (start_url, NULL);
 
-  url_enqueue (queue, xstrdup (start_url), NULL, 0);
-  string_set_add (blacklist, start_url);
+  /* Enqueue the starting URL.  Use start_url_parsed->url rather than
+     just URL so we enqueue the canonical form of the URL. */
+  url_enqueue (queue, xstrdup (start_url_parsed->url), NULL, 0);
+  string_set_add (blacklist, start_url_parsed->url);
 
   while (1)
     {
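
To illustrate why the raw start URL slips past the duplicate test, here is a rough standalone sketch.  It does not use Wget's url_parse; canonical_url and same_document are hypothetical helpers invented for this example, and the canonical form chosen here (decoding %XX escapes of unreserved characters such as "~") is an assumption that may not match what Wget produces.  The point is only that both sides of the comparison have to go through the same canonicalization.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Return the value of hex digit C, or -1 if C is not a hex digit.  */
static int
hex_value (char c)
{
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

/* ASCII-only test for characters that never need escaping in a URL.  */
static int
is_unreserved (int c)
{
  return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
    || (c >= '0' && c <= '9')
    || c == '-' || c == '.' || c == '_' || c == '~';
}

/* Hypothetical stand-in for a canonicalizer: return a malloc'ed copy
   of URL with escapes of unreserved characters decoded, so that
   "%7E" and "~" end up spelled the same way.  Returns NULL if the
   allocation fails.  */
char *
canonical_url (const char *url)
{
  char *out = malloc (strlen (url) + 1);
  char *p = out;
  if (!out)
    return NULL;
  while (*url)
    {
      int hi, lo;
      if (url[0] == '%'
          && (hi = hex_value (url[1])) >= 0
          && (lo = hex_value (url[2])) >= 0
          && is_unreserved (hi * 16 + lo))
        {
          *p++ = (char) (hi * 16 + lo);
          url += 3;
        }
      else
        *p++ = *url++;
    }
  *p = '\0';
  return out;
}

/* Compare two URLs by their canonical forms rather than byte by byte.  */
int
same_document (const char *a, const char *b)
{
  char *ca = canonical_url (a);
  char *cb = canonical_url (b);
  int same = ca && cb && strcmp (ca, cb) == 0;
  free (ca);
  free (cb);
  return same;
}
```

With these helpers, a plain strcmp of "http://somesite/~user/a.html" against "http://somesite/%7Euser/a.html" reports a mismatch, while same_document treats them as one document.  The patch applies the same idea to the start URL: what gets enqueued and blacklisted is the parsed, canonical spelling, the same form the later lookups use.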