According to Olivier Korn:
> Some time ago, Gilles advised me to do the following:
> 
> LC_COLLATE=C htmerge -c site1.conf
> LC_COLLATE=C htmerge -c site2.conf
> LC_COLLATE=C htmerge -c site3.conf
> *then*
> LC_COLLATE=C htmerge -c site1.conf -m site2.conf
> LC_COLLATE=C htmerge -c site1.conf -m site3.conf
> 
> Neither of us was sure whether the first pass was necessary, but I
> still do it this way and it works perfectly.
> 
> Note: "LC_COLLATE=C" is there because I use a locale other than C (or
> en_US).

On many systems, the en_US locale uses a collating sequence that treats
accented characters as unaccented, just like fr_FR and other locales based
on ISO-8859-1.  Set "LC_COLLATE=C" if there's any chance you have accented
characters in your documents.  This is done in the 3.1.6 snapshot.
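To see what the C collating sequence does with an accented character, here
is a small illustration.  It uses the standard sort utility as a stand-in
and is not htmerge's own code; the function name c_sorted is made up for
this example, and the sample input is assumed to be UTF-8.

```shell
#!/bin/sh
# Illustration only: c_sorted is a hypothetical helper, not part of ht://Dig.
# It sorts stdin using the C collating sequence, i.e. plain byte order.
c_sorted() {
    # LC_ALL, when set and non-empty, overrides LC_COLLATE, so clear it
    # for this one command before forcing the C collating sequence.
    LC_ALL= LC_COLLATE=C sort
}
```

For example, `printf 'f\n\303\251\ne\n' | c_sorted` (the octal escapes are
a UTF-8 "e" with an acute accent) lists the accented word after "f", because
its bytes compare higher than any ASCII letter.  A locale such as en_US or
fr_FR would typically place it next to the plain "e" instead.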

As it turns out, the first htmerge pass, after htdig, is needed on each
database before you run htmerge -m.  The code that handles the merging
of two databases expects that the wordlist has already been purged of
control records that htdig uses to tell htmerge about documents to update
or delete.
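
The sequence above could be wrapped up roughly as follows.  This is only a
sketch: the names run, merge_all and DRY_RUN are invented for the example,
and the only htmerge options it relies on are the -c and -m shown in the
quoted commands.

```shell
#!/bin/sh
# Sketch only: run, merge_all and DRY_RUN are hypothetical names.

# Run one command with the C collating sequence.
# With DRY_RUN=1 the command is printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "LC_COLLATE=C $*"
    else
        LC_COLLATE=C "$@"
    fi
}

# merge_all MAIN.conf OTHER.conf...
merge_all() {
    main=$1
    shift
    # Pass 1: a plain htmerge on every database, so each wordlist is
    # purged of htdig's control records before any merging happens.
    run htmerge -c "$main" || return 1
    for conf in "$@"; do
        run htmerge -c "$conf" || return 1
    done
    # Pass 2: merge each of the other databases into the main one.
    for conf in "$@"; do
        run htmerge -c "$main" -m "$conf" || return 1
    done
}
```

Calling `merge_all site1.conf site2.conf site3.conf` runs the same five
commands as in the quoted message, in the same order, and stops at the
first one that fails.  Setting DRY_RUN=1 first lets you check the command
list without touching the databases.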

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930

_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
