Did you test your patch? I applied it to my source tree and it doesn't work.
There are a lot of files under http://biz.yahoo.com/edu/, but the patched code only downloaded index.html.

[EMAIL PROTECTED] src]$ ./wget -r --domains=biz.yahoo.com -I /edu/ http://biz.yahoo.com/edu/
[EMAIL PROTECTED] src]$ ls biz.yahoo.com/
edu/
[EMAIL PROTECTED] src]$ ls biz.yahoo.com/edu/
index.html
[EMAIL PROTECTED] src]$

Here is the debug info. Note that in proclist(), frontcmp (p, s) is supposed to return 1, but it returns 0. `p' is "edu/", which keeps the trailing '/' from the parameter, and `s' is "edu", the directory of the crawled URL. Since `s' doesn't start with `p', the match fails. Passing the URL's `path' instead of its `dir' to accdir() might work.

Actually, I really recommend changing the '--include-directories' parameter to '--include-urls' (and likewise for --exclude...). Keeping the '/' characters in the parameter would then make more sense and be easier to use. I used htdig before, which also uses 'exclude_urls: /cgi-bin/' in its configuration.

[EMAIL PROTECTED] src]$ gdb wget
(gdb) b accdir
Breakpoint 1 at 0x806cb42: file utils.c, line 714.
(gdb) run -r --domains=biz.yahoo.com -I /edu/ http://biz.yahoo.com/edu/
Starting program: /home/weiye/downloads/wget-1.8.2/src/wget -r --domains=biz.yahoo.com -I /edu/ http://biz.yahoo.com/edu/
--18:55:07--  http://biz.yahoo.com/edu/
           => `biz.yahoo.com/edu/index.html'
Resolving biz.yahoo.com... done.
Connecting to biz.yahoo.com[66.163.175.141]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]

    [ <=>                                 ] 6,741          6.43M/s

18:55:07 (6.43 MB/s) - `biz.yahoo.com/edu/index.html' saved [6741]

Breakpoint 1, accdir (directory=0x8089df0 "edu", flags=ALLABS) at utils.c:714
714       if (flags & ALLABS && *directory == '/')
(gdb) n
716       if (opt.includes)
(gdb)
718           if (!proclist (opt.includes, directory, flags))
(gdb) s
proclist (strlist=0x807f090, s=0x8089df0 "edu", flags=ALLABS) at utils.c:690
690       for (x = strlist; *x; x++)
(gdb) n
691         if (has_wildcards_p (*x))
(gdb) p *x
$1 = 0x807f0a0 "/edu/"
(gdb) n
698             char *p = *x + ((flags & ALLABS) && (**x == '/')); /* Remove '/' */
(gdb)
699             if (frontcmp (p, s))
(gdb) p p
$2 = 0x807f0a1 "edu/"
(gdb) p s
$3 = 0x8089df0 "edu"
(gdb) p p
$4 = 0x807f0a1 "edu/"
(gdb) n
701         }
(gdb) bt
#0  proclist (strlist=0x807f090, s=0x8089df0 "edu", flags=ALLABS) at utils.c:701
#1  0x806cb76 in accdir (directory=0x8089df0 "edu", flags=ALLABS) at utils.c:718
#2  0x8064d8d in download_child_p (upos=0x807e7e0, parent=0x808c800, depth=0, start_url_parsed=0x8080000, blacklist=0x807e100) at recur.c:514
#3  0x80648b0 in retrieve_tree (start_url=0x807e080 "http://biz.yahoo.com/edu/") at recur.c:348
#4  0x8062179 in main (argc=6, argv=0x9fbff444) at main.c:822
#5  0x804a20d in _start ()
(gdb)

Thanks very much!!

--- "Aaron S. Hawley" <[EMAIL PROTECTED]> wrote:
> no, i think your original idea of getting rid of the code that removes the
> trailing slash is a better idea. i think this would fix it but keep the
> "degenerate case of root directory" (whatever that's about):
>
> Index: src/init.c
> ===================================================================
> RCS file: /pack/anoncvs/wget/src/init.c,v
> retrieving revision 1.54
> diff -u -u -r1.54 init.c
> --- src/init.c 2002/08/03 20:34:57 1.54
> +++ src/init.c 2003/06/13 20:24:16
> @@ -753,7 +753,6 @@
>
>    if (*val)
>      {
> -      /* Strip the trailing slashes from directories. */
>        char **t, **seps;
>
>        seps = sepstring (val);
> @@ -761,10 +760,10 @@
>          {
>            int len = strlen (*t);
>            /* Skip degenerate case of root directory. */
> -          if (len > 1)
> +          if (len == 1)
>              {
> -              if ((*t)[len - 1] == '/')
> -                (*t)[len - 1] = '\0';
> +              if ((*t)[0] == '/')
> +                (*t)[0] = '\0';
>              }
>          }
>        *pvec = merge_vecs (*pvec, seps);
>
> On Thu, 12 Jun 2003, wei ye wrote:
>
> > For the situation I only need '/r/', there is no option for I to do that.
> >
> > If user need '/r*/', they should specify -I '/r*/' instead.
> >
> > Simple patch attached, please consider it. Thanks!!
> >
> > [EMAIL PROTECTED] src]$ diff -u utils.c.orig utils.c
> > --- utils.c.orig Fri May 17 20:05:22 2002
> > +++ utils.c Thu Jun 12 20:24:21 2003
> > @@ -696,7 +696,9 @@
> >        else
> >          {
> >            char *p = *x + ((flags & ALLABS) && (**x == '/')); /* Remove '/' */
> > -          if (frontcmp (p, s))
> > +          /* if *p="c", pass if s is "c" or "c/..." not "ca...". */
> > +          int plen = strlen(p);
> > +          if ( (strncmp (p, s, plen) == 0) && (s[plen] == '/' || s[plen] == '\0') )
> >              break;
> >          }
> >        return *x;
> > [EMAIL PROTECTED] src]$
> >
>
> --
> I get threatening vacation messages from "J K", too.

=====
Wei Ye
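
P.S. To make the mismatch from the gdb session easier to reproduce outside of wget, here is a small standalone sketch. It is not wget code: prefix_match() is only my stand-in for what frontcmp() appears to do in the session above (a plain prefix test), component_match() mirrors the comparison from my earlier utils.c patch, and component_match_noslash() is a hypothetical variant that ignores trailing slashes on the pattern. All three names are made up for this example.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for frontcmp() (assumption): nonzero if `p' is a prefix of `s'. */
int prefix_match (const char *p, const char *s)
{
  return strncmp (p, s, strlen (p)) == 0;
}

/* Comparison from the quoted utils.c patch: `p' must match `s' up to a
   path-component boundary, i.e. `s' continues with '/' or ends there. */
int component_match (const char *p, const char *s)
{
  size_t plen = strlen (p);
  return strncmp (p, s, plen) == 0 && (s[plen] == '/' || s[plen] == '\0');
}

/* Hypothetical variant: same test, but trailing slashes on the pattern
   are ignored, so "edu/" is treated like "edu". */
int component_match_noslash (const char *p, const char *s)
{
  size_t plen = strlen (p);
  while (plen > 0 && p[plen - 1] == '/')
    plen--;
  return strncmp (p, s, plen) == 0 && (s[plen] == '/' || s[plen] == '\0');
}

/* Exercises the cases discussed in this thread. */
void check_examples (void)
{
  /* The failing case from the debug session: p = "edu/", s = "edu". */
  assert (!prefix_match ("edu/", "edu"));
  assert (!component_match ("edu/", "edu"));
  assert (component_match_noslash ("edu/", "edu"));

  /* A plain prefix test also accepts unrelated directories. */
  assert (prefix_match ("edu", "education"));
  assert (!component_match ("edu", "education"));
  assert (component_match ("edu", "edu/sub"));
}
```

With these definitions, "edu/" against "edu" is rejected by both the prefix test and the boundary test, which is the failure I am seeing; only the variant that ignores the pattern's trailing slash accepts it. So whichever side we fix, the pattern and the directory need to agree on whether the trailing '/' is present.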