Hello Michi,
Tuesday, January 15, 2002, 5:17:30 PM, you wrote:
MR> Hi!
MR> I want to write a method, that converts text into html-readable characters.
MR> so I have to replace "<", ">", "&" and "\" with their named entities - that is
MR> clear.
Yup, looks like everybody has to write such a function at least a dozen
times in his life :-)!
MR> but what about unicode and characters above the ASCII-128.
MR> I think, if I have got a text (with or without unicode-characters) it is ok, to
MR> substitute all characters above ASCII-128 and all unicode characters with &#xxx;
Well, it's really okay as long as you do not mind spending 6 or more bytes
per char (&#xxx; takes 6 bytes at least).
It depends on what charset you use. (You set it with
response.setContentType("text/html; charset=ISO-8859-1");)
If you set ISO-8859-1 or do not touch anything and get this charset by
default, then you have to encode all the chars > 255 as &#xxx;
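A rough way to test whether a given char can go out as-is under
ISO-8859-1 is to ask the charset encoder. This is only a sketch: the
class and method names (Latin1Check, needsNumericReference) are made up
for illustration, and it assumes a JDK that has
java.nio.charset.StandardCharsets.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the servlet API.
class Latin1Check {
    // True if the char cannot be represented in ISO-8859-1 and so would
    // have to be written as a numeric character reference (&#xxx;).
    static boolean needsNumericReference(char c) {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        return !encoder.canEncode(c);
    }

    public static void main(String[] args) {
        System.out.println(needsNumericReference('A'));      // plain ASCII
        System.out.println(needsNumericReference('\u00E9')); // e acute, code 233
        System.out.println(needsNumericReference('\u03B1')); // Greek alpha, code 945
    }
}
```

Creating a new encoder per call is wasteful; real code would probably
reuse one, or simply compare the char against 255.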
MR> but how to know the right unicode-encoding for the ASCII-characters 128-255 ???
MR> I think the first 256 unicode-characters are identical to iso-8859-1 (is this
MR> correct???).
IMHO it's correct: the Unicode codes 0-255 are the same as in ISO-8859-1.
So you do not have to replace 128-255. Leave them as they are.
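If you want to convince yourself, a small check like the following
should do. It is a sketch only (Latin1Identity is a made-up name): it
encodes each of the codes 0-255 with ISO-8859-1 and verifies that the
result is a single byte with the same value.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only.
class Latin1Identity {
    // Returns true if every Unicode code 0..255 encodes in ISO-8859-1
    // to exactly one byte whose unsigned value equals the code.
    static boolean latin1MatchesUnicode() {
        for (int code = 0; code <= 255; code++) {
            byte[] bytes = String.valueOf((char) code)
                    .getBytes(StandardCharsets.ISO_8859_1);
            if (bytes.length != 1 || (bytes[0] & 0xFF) != code) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(latin1MatchesUnicode());
    }
}
```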
MR> so what if I want to substitute greek-characters (0370-03FF unicode and
MR> iso-8859-7)!
MR> how do I know, how to substitute each character?
See no problem here, just replace any character over 255
with the appropriate &#abc; sequence. For example replace the U+0370
char with &#880; or with &#x370;, whichever you like better.
The code that follows &# in HTML 4.0 is the Unicode code of the char
(either decimal in case of &#abc; or hex in case of &#xabc;).
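Putting it together, such a method might look roughly like this. It is
a minimal sketch, not a definitive implementation: HtmlEscaper and
escape are names I made up, I am assuming the fourth character in your
list is the double quote, and I am assuming the page is served as
ISO-8859-1 so that codes up to 255 can stay as they are. It walks the
text by code point so that characters outside the 16-bit range come out
as one reference rather than two.

```java
// Hypothetical helper class; the names are illustrative only.
class HtmlEscaper {
    // Replaces the HTML special characters with named entities and every
    // character above 255 with a decimal numeric reference (&#NNN;).
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);      // handles surrogate pairs
            i += Character.charCount(cp);
            switch (cp) {
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '&': out.append("&amp;"); break;
                case '"': out.append("&quot;"); break; // assumed 4th char
                default:
                    if (cp > 255) {
                        // Not representable in ISO-8859-1: decimal code
                        out.append("&#").append(cp).append(';');
                    } else {
                        out.append((char) cp); // 0-255 goes out unchanged
                    }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("a < b & \u0370")); // a &lt; b &amp; &#880;
    }
}
```

If you prefer the hex form, append "&#x" followed by
Integer.toHexString(cp) instead of the decimal code.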
Good luck!
--
Best regards,
Anton Tagunov mailto:[EMAIL PROTECTED]