According to Schallehn Volker:
> We are running ht://Dig 3.1.6. The Website uses unicode characters for
> displaying german umlauts etc. What htdig does is to transform a unicode
> character for example "&#x00E4;" into "&amp;#x00E4;" We tried translate_amp
> with both options "true" and "false", but without success. Is there any way
> to prevent the "&"-character to be translated to "&amp;"?

Right now, the only way is to change the code in encodeSGML() so it doesn't
change the "&" to "&amp;", around lines 980-981 in htsearch/Display.cc.
That will solve this problem, but it may introduce a new one, as there may
be contexts in your documents in which the "&" needs to be retranslated into
"&amp;".
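
To make the trade-off concrete, here is a small Python model of the two
behaviours. It is not the code from Display.cc: encode_sgml and escape_amp
are names invented for this sketch, and the set of characters escaped
(&, < and >) is an assumption.

```python
def encode_sgml(text: str, escape_amp: bool = True) -> str:
    """Hypothetical stand-in for htsearch's output encoding.

    escape_amp=True models the stock behaviour described above;
    escape_amp=False models the suggested source modification.
    """
    if escape_amp:
        # Must run first, so the "&" of "&lt;"/"&gt;" added below is not re-escaped.
        text = text.replace("&", "&amp;")
    return text.replace("<", "&lt;").replace(">", "&gt;")


print(encode_sgml("&#x00E4;"))                      # &amp;#x00E4;
print(encode_sgml("&#x00E4;", escape_amp=False))    # &#x00E4;
print(encode_sgml("AT&T <b>", escape_amp=False))    # AT&T &lt;b&gt;
```

The last call shows the new problem: with the escaping turned off, a plain
"&" that really should become "&amp;" is passed through untouched.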

The fundamental problem here is that when htdig indexes a document,
it doesn't clearly distinguish between entities it converts and those
it doesn't.

E.g., an HTML guide may say something like this in its source:

   You can encode the &lt; character as &amp;lt;, and encode &#x00E4; as
   &amp;#x00E4;.

When that is indexed by htdig, the excerpt stored in the database will be:

   You can encode the < character as &lt;, and encode &#x00E4; as &#x00E4;.

Now, htsearch changes this to...

   You can encode the &lt; character as &amp;lt;, and encode &amp;#x00E4; as
   &amp;#x00E4;.

... but with the modification above to encodeSGML it would output this
(as HTML)...

   You can encode the &lt; character as &lt;, and encode &#x00E4; as &#x00E4;.
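
The whole chain can be modelled in a few lines of Python. This is only a
sketch of the behaviour described here, not htdig code: index_decode and
encode_output are hypothetical names, and the assumption that indexing
decodes just &lt;, &gt; and &amp; while leaving "&#x00E4;" alone is inferred
from the example.

```python
import re

# Entities the indexer is assumed to decode; anything else is stored verbatim.
DECODED = {"lt": "<", "gt": ">", "amp": "&"}


def index_decode(source: str) -> str:
    """Model of indexing: decode known entities in a single left-to-right pass."""
    return re.sub(r"&(lt|gt|amp);", lambda m: DECODED[m.group(1)], source)


def encode_output(stored: str, escape_amp: bool = True) -> str:
    """Model of search output; escape_amp=False is the modified behaviour."""
    if escape_amp:
        stored = stored.replace("&", "&amp;")
    return stored.replace("<", "&lt;").replace(">", "&gt;")


source = ("You can encode the &lt; character as &amp;lt;, "
          "and encode &#x00E4; as &amp;#x00E4;.")
stored = index_decode(source)

print(stored)                                   # excerpt in the database
print(encode_output(stored))                    # stock output
print(encode_output(stored, escape_amp=False))  # output with the modification
```

After indexing, both occurrences of "&#x00E4;" are identical in the stored
excerpt, so neither output mode can restore the distinction the source made
between them.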

Ultimately, the proper fix might be to change htdig such that any &
character it encounters that's not part of an entity it changes would go
into the database as some other unique character, which htsearch could
then convert back into the & character without forcing the conversion
to &amp;.  That's a bit more involved a change, though.
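
As a rough illustration of that idea, again as a Python model rather than
htdig code: the marker character "\x01", the function names, and the list
of decoded entities are all assumptions made for this sketch, and it
presumes the marker never occurs in real documents.

```python
import re

MARKER = "\x01"  # assumed never to appear in indexed text
DECODED = {"lt": "<", "gt": ">", "amp": "&"}


def index_with_marker(source: str) -> str:
    """Decode known entities; tag every other '&' with the marker."""
    def repl(m):
        name = m.group(1)
        # A known entity is decoded; a bare or unrecognised '&' becomes the marker.
        return DECODED[name] if name else MARKER
    return re.sub(r"&(?:(lt|gt|amp);)?", repl, source)


def render_with_marker(stored: str) -> str:
    """Escape real '&', '<', '>' characters, then turn markers back into '&'."""
    escaped = (stored.replace("&", "&amp;")
                     .replace("<", "&lt;")
                     .replace(">", "&gt;"))
    return escaped.replace(MARKER, "&")


source = ("You can encode the &lt; character as &amp;lt;, "
          "and encode &#x00E4; as &amp;#x00E4;.")
print(render_with_marker(index_with_marker(source)) == source)  # True
```

In this model the "&" of an unconverted "&#x00E4;" is stored as the marker,
while the "&" that came from "&amp;" is stored as a real "&", so the output
step can treat the two differently.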

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)


_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
