According to Carl Edwards:
> With your suggestion, won't I end up with all the
> URLs that don't require authorization in the
> db twice? Once with each server name.
>
> How about if I index first the insecure server,
> then the secure server with a server alias?
> So if it finds the file already exists in the
> db under another name it will not reread it?

Do you really think it's a good idea to put your secure and insecure
documents into the same database? This opens up a big security hole if
this database is searchable from the insecure server.
(See http://www.htdig.org/FAQ.html#q4.20)

You'd probably be better off indexing the secure and insecure sites
separately, and making only the insecure database available from the
insecure site. For the secure site, you could either offer two separate
searches, with 2 different config files, or you could merge the two
databases together and make that available only on the secure site.

I'm not sure what you're suggesting by using a server alias, but it
doesn't sound to me like this would work, security considerations aside.
If you use server_aliases, then htdig will try to fetch the documents
using the "canonical" server name, so it would be essentially the same
as if you only had the one server listed in your start_url.

> >From: Geoff Hutchison <[EMAIL PROTECTED]>
> ...
> >> What I would really like is to have both servers
> >> in the start_url list and if a requires authentication
> >> response is received from the request then authenticate
> >> and save that servers name in the db.
> >
> >If you enter both servers in the start_url, it will index both. You
> >can't do quite what you want, since an authentication will be sent
> >regardless of whether it's asked. (This doesn't make much difference
> >since the non-authenticating server will just ignore that header.)
> >
> >So what I'd suggest is something like this:
> >
> >start_url: http://server1.foo.com/ http://secure1.foo.com/
> >authentication: ...
> >local_urls: http://server1.foo.com/=/path/to/files/ \
> > http://secure1.foo.com/=/path/to/files/
> >
> >Since each document has one URL and only one URL, you'll need to index
> >"twice." Of course since you're indexing on the filesystem and your OS
> >will cache files, indexing should be relatively quick.

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)

_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
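
For what it's worth, the "index separately, with 2 different config
files" approach from the reply might look roughly like the sketch
below. The file names public.conf and secure.conf and the database_dir
paths are hypothetical, chosen only for illustration; the host names
and the local_urls mappings are the ones from the quoted message, and
the authentication setting is left elided exactly as it was there.

```
# --- public.conf (hypothetical name): searched from the insecure site ---
# Its own database directory, so secure documents never end up in it.
database_dir:   /path/to/db/public
start_url:      http://server1.foo.com/
local_urls:     http://server1.foo.com/=/path/to/files/

# --- secure.conf (hypothetical name): searched only from the secure site ---
# A separate database directory (path is an assumption).
database_dir:   /path/to/db/secure
start_url:      http://secure1.foo.com/
local_urls:     http://secure1.foo.com/=/path/to/files/
# The authentication setting from the quoted message goes here as well;
# its value was elided in the original and is not filled in.
```

Each file would be indexed on its own run. If a single combined search
is wanted on the secure site, htmerge's -m option (merge the databases
of another config file into the current one) is probably the place to
start, but check the htmerge documentation for your version first.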

