"Ian Abbott" <[EMAIL PROTECTED]> writes:

> what I actually used was more like the following:
> 
>    wget -r -l 1 http://somesite/~user/index.html \
>    http://somesite/~user/a.html
> 
> which resulted in a.html being downloaded twice.
> 
> If I replace the ~'s on the command-line with %7E's then it
> correctly downloads a.html only once.

This is informative; thanks.  Does the following patch fix the problem?


2001-12-19  Hrvoje Niksic  <[EMAIL PROTECTED]>

        * recur.c (retrieve_tree): Enqueue the canonical representation of
        start_url, so that the test against dl_url_file_map works.

Index: src/recur.c
===================================================================
RCS file: /pack/anoncvs/wget/src/recur.c,v
retrieving revision 1.40
diff -u -r1.40 recur.c
--- src/recur.c 2001/12/18 22:20:14     1.40
+++ src/recur.c 2001/12/19 14:23:22
@@ -196,8 +196,10 @@
      now. */
   struct url *start_url_parsed = url_parse (start_url, NULL);
 
-  url_enqueue (queue, xstrdup (start_url), NULL, 0);
-  string_set_add (blacklist, start_url);
+  /* Enqueue the starting URL.  Use start_url_parsed->url rather than
+     just URL so we enqueue the canonical form of the URL.  */
+  url_enqueue (queue, xstrdup (start_url_parsed->url), NULL, 0);
+  string_set_add (blacklist, start_url_parsed->url);
 
   while (1)
     {
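
To illustrate why the spelling of the starting URL matters, here is a
minimal standalone sketch.  It is not wget code: canonicalize_url,
same_url and demo are hypothetical names, and the normalization rule
(decode %XX escapes of RFC 3986 unreserved characters) is an assumption
chosen for the example; wget's own url_parse may well pick the opposite
spelling, as the %7E observation above suggests.  The point is only that
both the queue and the lookup table must see URLs that went through the
same canonicalization, otherwise "~user" and "%7Euser" look like two
different documents.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of wget): value of one hex digit,
   or -1 if C is not a hex digit.  */
static int
hex_value (int c)
{
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

/* Characters that never need escaping (RFC 3986 "unreserved").  */
static int
is_unreserved (int c)
{
  return isalnum (c) || c == '-' || c == '.' || c == '_' || c == '~';
}

/* Return a malloc'ed copy of URL in which every %XX escape of an
   unreserved character is replaced by the character itself.  All
   other escapes are left untouched.  The caller frees the result.  */
char *
canonicalize_url (const char *url)
{
  char *out = malloc (strlen (url) + 1);
  char *p = out;

  if (!out)
    return NULL;
  while (*url)
    {
      int hi, lo;
      if (url[0] == '%'
          && (hi = hex_value ((unsigned char) url[1])) >= 0
          && (lo = hex_value ((unsigned char) url[2])) >= 0
          && is_unreserved (hi * 16 + lo))
        {
          *p++ = (char) (hi * 16 + lo);
          url += 3;
        }
      else
        *p++ = *url++;
    }
  *p = '\0';
  return out;
}

/* Nonzero if A and B are the same string after canonicalization.  */
int
same_url (const char *a, const char *b)
{
  char *ca = canonicalize_url (a);
  char *cb = canonicalize_url (b);
  int same = ca && cb && strcmp (ca, cb) == 0;

  free (ca);
  free (cb);
  return same;
}

/* Usage sketch: the raw strings differ, the canonical forms do not.  */
void
demo (void)
{
  const char *typed = "http://somesite/~user/a.html";
  const char *escaped = "http://somesite/%7Euser/a.html";

  assert (strcmp (typed, escaped) != 0);
  assert (same_url (typed, escaped));
}
```

A duplicate check that compares the raw command-line string against
canonicalized entries behaves like the strcmp above and misses the
match; comparing canonical forms on both sides behaves like same_url.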
