According to Curtis Ireland:
> Is there any way to have start_url get its list from an SQL back-end?
> Has anyone already built a patch to handle this?
> 
> Here are a couple of solutions I can think of to bypass the problem,
> but I'm sure I'm not alone in desiring this feature.
> 
> 1) Build a PHP page with links to all the sites we want to index.
> Have htDig use this as its start_url
> 2) Before htDig starts its database build, dump all the links to a text
> file and have the htdig.conf include this file
> 
> The one problem with these two solutions is how would the limit_urls_to
> variable work? I want to make sure the links are properly indexed
> without going past the linked site.

Either solution seems workable; which one you choose is a matter of
preference.  For the first solution, you'd need a limit_urls_to setting
that's liberal enough to allow through all the links that the PHP script
will spit out.  You should probably also set max_hop_count to 1, so that
htdig follows only the first hop, from the PHP output to the documents it
references, and goes no further.
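
As a rough illustration of the first approach, here is a minimal sketch
of a script that renders a bare page of links for htdig to start from.
It is written in Python rather than PHP purely for illustration; the
function name and the example URLs are hypothetical, and in practice the
list would come from your SQL query.

```python
from html import escape


def render_link_page(urls):
    """Return a bare HTML page containing one anchor per URL."""
    items = "\n".join(
        # Escape each URL so characters such as & stay valid in the HTML.
        '<li><a href="{0}">{0}</a></li>'.format(escape(u, quote=True))
        for u in urls
    )
    return "<html><body>\n<ul>\n" + items + "\n</ul>\n</body></html>\n"


if __name__ == "__main__":
    # Hypothetical URLs standing in for the rows returned by the database.
    print(render_link_page([
        "http://www.example.com/",
        "http://www.example.org/docs/",
    ]))
```

You would then point start_url at wherever this page is served, with
limit_urls_to and max_hop_count set as described above.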

For the second solution, you could probably just leave limit_urls_to at
its default, which is the value of start_url, and set max_hop_count to 0
so that only the listed URLs themselves are indexed.
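
The dump step for the second approach might look something like the
sketch below.  It assumes a table named "sites" with a "url" column and
uses Python's built-in sqlite3 module as a stand-in for whatever SQL
back-end you actually have; the table, column, and function names are
all hypothetical.

```python
import sqlite3


def dump_start_urls(conn, out_path):
    """Write every URL in the (hypothetical) sites table to out_path,
    one per line, and return the number of URLs written."""
    rows = conn.execute("SELECT url FROM sites ORDER BY url")
    count = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for (url,) in rows:
            out.write(url.strip() + "\n")
            count += 1
    return count
```

Run this before each dig.  As far as I recall, htdig.conf can pull in
the contents of a file by putting the file name in backquotes, along the
lines of start_url: `/path/to/start_urls.txt` (the path here is only a
placeholder), but check the documentation for your version.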

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
