> I investigated 3 UNIX servers (there are a lot more) and found 8651 URLs
> with a mix of upper/lowercase characters. All these URLs will be
> ignored if case_sensitive=false.

No, they won't be ignored, but the resulting URL in the query results 
(which will be lowercased) may not work.

> some page, you will miss only the pages that really exist under multiple
> names differing only in case, but all the other ones are treated
> correctly.
> So then you won't miss any page with uppercase characters in its name.

When indexing with case_sensitive: false, ht://Dig will not "miss" any 
pages. It's simply a question of whether an all-lowercase URL will work 
to retrieve a page whose real URL is mixed-case.
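
To make the distinction concrete, here is a toy model in Python. It is not 
ht://Dig code: the function names, the host name and the path are all 
made up for illustration, and it only assumes what is described above, 
namely that URLs are lowercased when case_sensitive is false.

```python
# Toy model only, not ht://Dig code. All names and URLs are hypothetical.

def stored_url(url, case_sensitive):
    """Return the URL as the index would record it under this model."""
    return url if case_sensitive else url.lower()

# Pages that exist on a hypothetical case-sensitive (UNIX) web server.
unix_server_pages = {"http://unix.example.com/Docs/ReadMe.html"}

def retrievable(url):
    """A case-sensitive server only answers for the exact spelling."""
    return url in unix_server_pages

original = "http://unix.example.com/Docs/ReadMe.html"
insensitive = stored_url(original, case_sensitive=False)
sensitive = stored_url(original, case_sensitive=True)

# The page is recorded either way; only the link shown to the user differs.
print(insensitive, retrievable(insensitive))
print(sensitive, retrievable(sensitive))
```

In this model the page is recorded in both cases, but the lowercased link 
no longer matches anything on the case-sensitive server.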

But in any case, you can also index the UNIX servers and Windows servers 
separately with two different config files (i.e. one with 
case_sensitive: false and one with true). Use these for indexing and 
re-indexing. Then copy one of the resulting databases to a new database 
you'll use for searching, and use htmerge to merge the other database 
into it. You'll have all the servers with correct URLs. (It's not quite 
as nice as having a per-server case_sensitive attribute, but it'll work.)
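
The idea behind the two-database approach can be sketched the same way. 
Again this is only a Python model under assumed names (build_index, the 
example hosts and the plain dictionaries are hypothetical), not the 
htmerge implementation or its database format.

```python
# Toy model only, not htmerge. All names and URLs are hypothetical.

def build_index(pages, case_sensitive):
    """Index pages (URL -> text), recording URLs per the case setting."""
    return {
        (url if case_sensitive else url.lower()): text
        for url, text in pages.items()
    }

unix_pages = {"http://unix.example.com/Docs/ReadMe.html": "unix readme"}
windows_pages = {"http://win.example.com/Docs/ReadMe.HTM": "windows readme"}

# One "config" per group of servers.
unix_db = build_index(unix_pages, case_sensitive=True)
windows_db = build_index(windows_pages, case_sensitive=False)

# Copy one database to use for searching, then merge the other into it.
search_db = dict(unix_db)
search_db.update(windows_db)

for url in sorted(search_db):
    print(url)
```

The UNIX URL keeps its original case, the Windows URL is lowercased, and 
both end up in the one database used for searching.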

<http://www.htdig.org/FAQ.html#q4.4>
<http://www.htdig.org/FAQ.html#q4.5>
<http://www.htdig.org/htmerge.html>

Regards,

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/

