According to Alexander I. Lebedev:
> I've commented out the loop that fills the StringLists for characters
> 160..255, so the only problem I can have is that with &nbsp; (and it can
> be easily solved if there will be a need).  Unfortunately, there are so
> many different encodings at the moment (175 charmaps in current i18n
> release), which makes it difficult to find a good solution.  IMHO, the
> best one is to transform "&<> and &nbsp; to SGML form, and leave the rest
> for the locale.  Another solution may be to leave these characters
> configurable in the config file (something similar to valid_punctuation).

The problem with commenting out only that loop is that you remove the
first 96 entries (characters 160..255) from myToList and myNumFromList,
but not from myTextFromList.  As a result, this latter list will no
longer line up with the other two, so none of your text-based SGML
entities will be translated properly, not even those in the bottom half
of the character set.  What you'd really need to do is set

  myTextFromString = "&nbsp;";

and add entries to myToList and myNumFromList for 160, so that at
least that entity is handled from the upper half, and all the bottom
half translations line up.  If you could set it up to do this only for
htsearch, and not for htdig, that might be better, because then you
wouldn't give up the translation of the other SGML entities.  Mind you,
if your results aren't encoded in ISO-8859-1 anyway, maybe it's just as
well you don't translate the others because they won't display as the
right characters in the output.
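
To make the alignment problem concrete, here is a rough, self-contained
sketch.  It is not the actual ht://Dig code: it uses std::vector in place
of StringList, and EntityTable, buildTable and translateNamed are
hypothetical names made up for this illustration.  Only the first three
upper-half entity names are modelled, and a lookup is assumed to work
purely by matching list positions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the three parallel lists discussed above.
struct EntityTable {
    std::vector<std::string> textFrom;  // named entities, e.g. "&nbsp;"
    std::vector<std::string> numFrom;   // numeric entities, e.g. "&#160;"
    std::vector<std::string> to;        // replacement characters
};

// nbspOnly == true : the suggested fix, keeping only entry 160 from the
//                    upper half in all three lists.
// nbspOnly == false: models commenting out just the 160..255 loop, so the
//                    upper-half names stay in textFrom (abbreviated here to
//                    three) while nothing is added to numFrom or to.
EntityTable buildTable(bool nbspOnly) {
    EntityTable t;
    if (nbspOnly) {
        t.textFrom.push_back("&nbsp;");
        t.numFrom.push_back("&#160;");
        t.to.push_back(std::string(1, static_cast<char>(160)));
    } else {
        t.textFrom = {"&nbsp;", "&iexcl;", "&cent;"};
    }

    // Bottom-half entities, appended to all three lists in step.
    const char* names[] = {"&quot;", "&amp;", "&lt;", "&gt;"};
    const char* nums[]  = {"&#34;", "&#38;", "&#60;", "&#62;"};
    const char* chars[] = {"\"", "&", "<", ">"};
    for (int i = 0; i < 4; ++i) {
        t.textFrom.push_back(names[i]);
        t.numFrom.push_back(nums[i]);
        t.to.push_back(chars[i]);
    }

    // With the fix applied, the three lists must have equal length.
    if (nbspOnly) {
        assert(t.textFrom.size() == t.to.size());
        assert(t.numFrom.size() == t.to.size());
    }
    return t;
}

// Translate a named entity by position: find it in textFrom and return the
// entry at the same index in to ("" if not found or out of range).
std::string translateNamed(const EntityTable& t, const std::string& entity) {
    for (std::size_t i = 0; i < t.textFrom.size(); ++i) {
        if (t.textFrom[i] == entity)
            return i < t.to.size() ? t.to[i] : std::string();
    }
    return std::string();
}
```

In this sketch, buildTable(false) leaves textFrom three entries longer
than the other two lists, so a lookup of "&quot;" lands on the slot that
holds ">", and "&lt;" falls off the end entirely.  With buildTable(true),
"&nbsp;" and the four bottom-half entities all map to the right
characters.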

Ultimately, the best solution is to go with full Unicode support for
internal encodings, and map all other charset encodings in indexed
documents to Unicode.  Then, we could do away with all of this locale
nonsense which has been a source of so many problems.  However, that
would be a huge undertaking, and right now no one seems to be up to
the challenge.

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
