On Fri, 11 Aug 2000 [EMAIL PROTECTED] wrote:

> My exclude_urls is set to .gif

Excluding images like that is usually done through bad_extensions rather than exclude_urls, but that's fine.
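
For reference, a minimal sketch of how the two attributes are typically split in htdig.conf (the extension list and patterns here are illustrative, not your actual settings):

    # filter image/binary files by file extension
    bad_extensions:  .gif .jpg .jpeg .png
    # exclude_urls is better suited to URL path patterns, e.g. CGI scripts
    exclude_urls:    /cgi-bin/ .cgi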

> so I can't see a problem with that. The strange thing here is that it
> goes through about 15 of the 50 start_url URLs and then merges. It
> seems to me that htdig thinks that it is finished digging for some
> reason and I can't pinpoint the reason why.

One other thing to check is that you don't have an inadvertent newline in
the start_url list; the config parser ignores anything after the newline.
A good way to list a long series of URLs is to use the `/path/to/file`
backquote syntax to include a file of URLs.
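
For example, in a fragment like this (hostnames are placeholders), everything from the third URL on would be silently dropped, because the second line is missing its trailing '\' continuation character:

    start_url:  http://www.example.com/ \
                http://docs.example.com/
                http://ftp.example.com/pub/ \
                http://search.example.com/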

> I ran the dig with -vvv and the output seemed fine, it was following
> all links, indexing the PDFs, and parsing them perfectly.

But it seems to ignore the URLs after a certain point. That's a good reason
either to hunt for a newline without a '\' continuation character before it,
or to move the URLs into a separate file and include that.
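
A sketch of that separate-file approach, with made-up paths; surrounding the value with backquotes tells htdig to read it from the named file:

    start_url:  `/usr/local/htdig/conf/start_urls`

and /usr/local/htdig/conf/start_urls then just lists one URL per line, with no continuation characters to forget:

    http://www.example.com/
    http://docs.example.com/
    http://ftp.example.com/pub/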

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/

