You're right, the include-directories option operates much the same way (my guess: in the interest of speed) as the rest of the accept/reject options,
which (others have also noticed) is a little flaky.

/a

On Fri, 13 Jun 2003, wei ye wrote:

> Did you test your patch? I patched it on my source code and it doesn't work.
>
> There are a lot of files under http://biz.yahoo.com/edu/, but
> the patched code only downloaded the index.html.
>
> [EMAIL PROTECTED] src]$ ./wget -r --domains=biz.yahoo.com -I /edu/ http://biz.yahoo.com/edu/
> [EMAIL PROTECTED] src]$ ls biz.yahoo.com/
> edu/
> [EMAIL PROTECTED] src]$ ls biz.yahoo.com/edu/
> index.html
> [EMAIL PROTECTED] src]$
>
>
> Here is the debug info. Note that in the proclist() function, frontcmp(p, s)
> is supposed to return 1, but it returns 0.
> `p' is 'edu/', which keeps the trailing '/' from the parameter, and `s'
> is 'edu', the directory of the crawled URL. Since `s' doesn't start with `p',
> the match fails.
>
> If the URL's 'path' is passed to accdir() instead of its 'dir', it may work.
>
> Actually, I really recommend changing the '--include-directories' parameter to
> '--include-urls' (and likewise for --exclude). Then keeping the '/' characters
> in the parameter makes more sense and is easier to use. I used htdig before,
> which uses 'exclude_urls: /cgi-bin/' in its configuration as well.
>
>
> [EMAIL PROTECTED] src]$ gdb wget
> (gdb) b accdir
> Breakpoint 1 at 0x806cb42: file utils.c, line 714.
> (gdb) run -r --domains=biz.yahoo.com -I /edu/ http://biz.yahoo.com/edu/
> Starting program: /home/weiye/downloads/wget-1.8.2/src/wget -r --domains=biz.yahoo.com -I /edu/ http://biz.yahoo.com/edu/
> --18:55:07--  http://biz.yahoo.com/edu/
>            => `biz.yahoo.com/edu/index.html'
> Resolving biz.yahoo.com... done.
> Connecting to biz.yahoo.com[66.163.175.141]:80... connected.
> HTTP request sent, awaiting response... 200 OK
> Length: unspecified [text/html]
>
>     [ <=> ] 6,741          6.43M/s
>
>
> 18:55:07 (6.43 MB/s) - `biz.yahoo.com/edu/index.html' saved [6741]
>
>
> Breakpoint 1, accdir (directory=0x8089df0 "edu", flags=ALLABS) at utils.c:714
> 714         if (flags & ALLABS && *directory == '/')
> (gdb) n
> 716         if (opt.includes)
> (gdb)
> 718         if (!proclist (opt.includes, directory, flags))
> (gdb) s
> proclist (strlist=0x807f090, s=0x8089df0 "edu", flags=ALLABS) at utils.c:690
> 690         for (x = strlist; *x; x++)
> (gdb) n
> 691         if (has_wildcards_p (*x))
> (gdb) p *x
> $1 = 0x807f0a0 "/edu/"
> (gdb) n
> 698         char *p = *x + ((flags & ALLABS) && (**x == '/')); /* Remove '/' */
> (gdb)
> 699         if (frontcmp (p, s))
> (gdb) p p
> $2 = 0x807f0a1 "edu/"
> (gdb) p s
> $3 = 0x8089df0 "edu"
> (gdb) p p
> $4 = 0x807f0a1 "edu/"
> (gdb) n
> 701         }
> (gdb) bt
> #0  proclist (strlist=0x807f090, s=0x8089df0 "edu", flags=ALLABS) at utils.c:701
> #1  0x806cb76 in accdir (directory=0x8089df0 "edu", flags=ALLABS) at utils.c:718
> #2  0x8064d8d in download_child_p (upos=0x807e7e0, parent=0x808c800, depth=0, start_url_parsed=0x8080000, blacklist=0x807e100) at recur.c:514
> #3  0x80648b0 in retrieve_tree (start_url=0x807e080 "http://biz.yahoo.com/edu/") at recur.c:348
> #4  0x8062179 in main (argc=6, argv=0x9fbff444) at main.c:822
> #5  0x804a20d in _start ()
> (gdb)
>
> Thanks very much!!
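
To make the mismatch in the gdb trace concrete, here is a minimal sketch in C. Neither function is wget's code. `prefix_match` is a hypothetical stand-in that is only assumed to behave like frontcmp(), judging from the values printed in the trace ("edu/" against "edu" gives 0). `dir_prefix_match` is an equally hypothetical variant showing one way the trailing '/' in the -I argument could be ignored while still matching only on directory boundaries. It is an illustration of the idea, not a proposed patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the prefix test in the trace (an assumption,
   not wget's source): returns 1 if `prefix` is a leading substring of
   `s`, otherwise 0. */
static int prefix_match(const char *prefix, const char *s)
{
    size_t n = strlen(prefix);
    return strncmp(prefix, s, n) == 0;
}

/* Hypothetical slash-tolerant variant: trailing '/' characters in the
   pattern are ignored, and the match must end at a directory boundary,
   so "edu" matches "edu" and "edu/sub" but not "education". */
static int dir_prefix_match(const char *pattern, const char *dir)
{
    size_t n = strlen(pattern);

    /* Drop trailing slashes: "edu/" is treated like "edu". */
    while (n > 0 && pattern[n - 1] == '/')
        n--;

    /* An empty pattern (or one made only of slashes) matches everything. */
    if (n == 0)
        return 1;

    if (strncmp(pattern, dir, n) != 0)
        return 0;

    /* Accept only an exact match or a match followed by '/'. */
    return dir[n] == '\0' || dir[n] == '/';
}
```

With the values from the trace, `prefix_match("edu/", "edu")` is 0, which is the failure reported, while `dir_prefix_match("edu/", "edu")` is 1. The boundary check is a design choice: without it, a pattern of "edu" would also accept an unrelated directory such as "education".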