Thanks, Jim, for your response. It was very helpful!
> > 5)
> > If my main robots.txt file says to not scan /member_area since it's a
> > top secret
>
> I assume you are being facetious? At least I hope so ;) Use of
Yes, I was ;p
I'll try to explain better what I was asking:
Say I have htdig_one.conf
start_url: http://www.mydomain.com/
and http://www.mydomain.com/robots.txt has:
Disallow: /members/
Then http://www.mydomain.com/members/ will not get spidered/indexed into the database
for htdig_one.conf
OK, pretty standard and simple. Now the question:
I want to set up a separate database for http://www.mydomain.com/members/, so I do this:
(I realize the data is still accessible, so the separate
database doesn't secure the data; I simply need the data separated.)
htdig_two.conf
start_url: http://www.mydomain.com/members/
that will create the db ...
but http://www.mydomain.com/robots.txt still has:
Disallow: /members/
in it.
So will htdig_two.conf still be able to spider/index http://www.mydomain.com/members/?
Or will the http://www.mydomain.com/robots.txt file stop htdig in its tracks in this
case?
Does that make more sense?
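To make the scenario concrete, here is a minimal Python sketch of how a parser that follows the robots exclusion standard treats it. It uses Python's standard urllib.robotparser, not ht://Dig itself, so it only illustrates the standard's matching rules and does not show what htdig actually does. The "User-agent: *" line and the agent name "htdig" are assumptions added for illustration; the robots.txt above only shows the Disallow line.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt contents. The "User-agent: *" line is an assumption;
# only the Disallow line appears in the robots.txt described above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /members/
"""


def is_allowed(url, user_agent="htdig"):
    """Return True if the robots.txt rules permit fetching the URL.

    The user_agent default is a hypothetical name; with "User-agent: *"
    the rule applies to every crawler regardless of its name.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.mydomain.com/",
        "http://www.mydomain.com/members/",
        "http://www.mydomain.com/members/page.html",
    ):
        print(url, "allowed" if is_allowed(url) else "disallowed")
```

The check looks only at the URL path and the crawler's name. Nothing like start_url enters into it, which is why I'm unsure whether a second config can get around the rule.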
TIA
Dan
_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html