I am going to re-read the locales FAQs at least one more time, but thus 
far I have had no success getting my Mandrake 9.2 set-up to run in UTF-8.

UTF-8 characters come out on my machine with htsearch as an entitised 
version of Latin-1, e.g. Genève (G-e-n-e_grave-v-e) comes out as 
Gen&Atilde;&uml;ve.
I thought UTF-8 was the standard encoding for the web (or at least XHTML) now?
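
For what it is worth, the effect can be reproduced outside ht://Dig. Below is 
a minimal Python sketch, assuming the cause is simply that the UTF-8 bytes are 
read as Latin-1 and every non-ASCII character is then turned into an entity. 
I have not confirmed that this is what htsearch does internally, and the 
function name is made up for illustration.

```python
from html.entities import codepoint2name


def entitise_as_latin1(text):
    """Encode text as UTF-8, misread the bytes as Latin-1, then replace
    every non-ASCII character with an HTML entity (hypothetical helper)."""
    misread = text.encode("utf-8").decode("latin-1")
    out = []
    for ch in misread:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)  # plain ASCII passes through untouched
        elif cp in codepoint2name:
            out.append("&" + codepoint2name[cp] + ";")  # named entity
        else:
            out.append("&#%d;" % cp)  # fall back to a numeric reference
    return "".join(out)


# e-grave is two bytes in UTF-8, so it becomes two Latin-1 characters.
print(entitise_as_latin1("Gen\u00e8ve"))
```

If the output matches what htsearch produces, that would suggest the text is 
being treated as Latin-1 somewhere between digging and display.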

The above may be my mis-settings.

But Палата (that is, some Cyrillic inserted in an English-language webpage 
as numeric entities, not transcribed to UTF-8 or any other encoding for 
that matter) shows up nicely in the original webpage, but in the htsearch 
output the entities are displayed as literal text rather than as characters 
(that is, in the source code the & of each entity is now &amp; etc.).
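
The symptom looks like the ampersand of each numeric entity being escaped a 
second time. Here is a small Python sketch of that, assuming decimal 
references for the Cyrillic word and using html.escape purely as a stand-in 
for whatever htsearch actually does (the helper name is hypothetical):

```python
import html


def to_numeric_entities(text):
    """Replace every non-ASCII character with a decimal character reference."""
    return "".join("&#%d;" % ord(ch) if ord(ch) > 127 else ch for ch in text)


# What the page author typed into the HTML source.
page_source = to_numeric_entities("\u041f\u0430\u043b\u0430\u0442\u0430")

# Escaping the ampersands again breaks the references.
escaped_again = html.escape(page_source, quote=False)

print(page_source)                   # references intact
print(escaped_again)                 # each & has become &amp;
print(html.unescape(page_source))    # a browser would show the Cyrillic word
print(html.unescape(escaped_again))  # a browser would show the raw references
```

A browser decodes only one level of entities, so the doubly escaped form is 
displayed as the reference text itself instead of the Cyrillic letters.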

The few non-numeric entities I have checked seem to survive digging/searching 
(&amp; &reg;).

Is there a way to stop the killing of numeric entities?



Michael Chapman.



_______________________________________________
ht://Dig Developer mailing list:
[EMAIL PROTECTED]
List information (subscribe/unsubscribe, etc.)
https://lists.sourceforge.net/lists/listinfo/htdig-dev
