Gabriel

I believe you may be assuming that the characters are in ASCII.  That's
not so; they are in Unicode.  All the English characters are covered by
default, but without the extended Unicode set you won't be able to recognize
the accented Latin characters.  Hope this helps, Stephanie
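
To make that concrete, here is a minimal sketch, not a definitive fix. It
assumes the page bytes are ISO-8859-1 and a Java version that provides
java.nio.charset.StandardCharsets. The class name LatinEntities and the
helper escapeNonAscii are hypothetical, made up for this illustration. It
shows that \351 (octal) and 233 (decimal) are the same character, and one
way to turn non-ASCII characters into numeric HTML entities.

```java
import java.nio.charset.StandardCharsets;

public class LatinEntities {

    // Hypothetical helper: replaces every code point above 7-bit ASCII
    // with a numeric HTML entity such as &#233;.
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (cp > 127) {
                out.append("&#").append(cp).append(';');
            } else {
                out.append((char) cp);
            }
            // Advance by one or two chars, depending on the code point.
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Octal 351 and decimal 233 are the same value, U+00E9.
        System.out.println('\351' == 233);

        // Assumption: the raw input is ISO-8859-1, so name that charset
        // explicitly instead of relying on the platform default.
        byte[] raw = { (byte) 0xE9 };
        String decoded = new String(raw, StandardCharsets.ISO_8859_1);
        System.out.println(escapeNonAscii("caf" + decoded));
    }
}
```

If the text looks right coming from the database but wrong after the
servlet engine, it may be worth checking which charset is used at each
point where bytes are turned into a String.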

-----Original Message-----
From: Gabriel J Zimmerman [mailto:[EMAIL PROTECTED]]
Sent: Thursday, October 12, 2000 11:23 AM
To: [EMAIL PROTECTED]
Subject: A problem with extended latin characters.


Hi,

I was wondering if people had experience parsing for extended Latin
characters (those characters not used in English particularly but in
other European languages such as German, e.g. accented letters like é).
I am trying to parse pages which use these characters so that, for
example, 'é' becomes the HTML equivalent '&eacute;'.

However, there seems to be a strange problem occurring, perhaps having to
do with a difference between how a Windows environment recognizes these
characters as compared to a Unix environment.

Basically, the parser is not finding them. In theory, the int value of
a char such as é would be 233, which is probably what the parser is
looking for. However, there seems to be a different representation in
some cases. When I have my program output the text String that is to be
parsed, the é is replaced by \351. The stranger thing is that when I get
the String directly from the database, the é remains as is and my
parsing program finds it without a problem. However, when using JRun as
the servlet engine, at some point and for whatever reason the é becomes
a \351.

I am wondering if this has something to do with a Windows vs. Unix
representation of this, since sometimes when I paste an é from a Windows
document into a Unix one (simply using telnet) it also shows up as
something like \351.

Does anyone know what might be going on here and how I might be able to
parse these characters?

Thanks,

Gabriel Zimmerman
Groundzero Associates

===========================================================================
To unsubscribe: mailto [EMAIL PROTECTED] with body: "signoff
JSP-INTEREST".
Some relevant FAQs on JSP/Servlets can be found at:

 http://java.sun.com/products/jsp/faq.html
 http://www.esperanto.org.nz/jsp/jspfaq.html
 http://www.jguru.com/jguru/faq/faqpage.jsp?name=JSP
 http://www.jguru.com/jguru/faq/faqpage.jsp?name=Servlets
