[jboss-user] [Tomcat, HTTPD, Servlets & JSP] - Unicode character issue - happens only on Linux

flarosa Mon, 15 Jan 2007 12:26:54 -0800

Hi,

I have a customer who regularly copies text from Word documents and pastes it 
into forms I created for him on his web site. The text often contains 
non-ASCII characters such as U+2019 for curly single quotes or U+201C for 
curly double quotes. We were having some problems storing these characters in 
our database, so I added a filter that replaces them with the plain ASCII 
quote characters.


I tested my work by deploying to a local copy of JBoss on my workstation, which 
is a Windows XP computer, and it worked fine. I did the conversion using the 
String.replace function, for example:

s = s.replace('\u201C', '"');
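A minimal sketch of what such a filter helper might look like. The class name
SmartQuoteNormalizer and the method normalize are hypothetical, and covering
U+2018 and U+201D in addition to the two characters mentioned above is an
assumption about what Word typically produces, not part of the original filter.

```java
/** Sketch only: the class and method names are hypothetical. */
final class SmartQuoteNormalizer {

    private SmartQuoteNormalizer() {}

    /** Replaces curly quotes with their plain ASCII equivalents. */
    static String normalize(String s) {
        if (s == null) {
            return null;
        }
        return s.replace('\u2018', '\'')  // left single quote (assumed extra case)
                .replace('\u2019', '\'')  // right single quote / apostrophe
                .replace('\u201C', '"')   // left double quote
                .replace('\u201D', '"');  // right double quote (assumed extra case)
    }

    public static void main(String[] args) {
        String pasted = "\u201CIt\u2019s fine\u201D";
        System.out.println(normalize(pasted));
    }
}
```

This only helps if the curly quotes actually reach the filter as U+2018 to
U+201D; any other character in the string passes through unchanged.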

However, when I deployed this to my production environment, which has the same 
version of Java and the same version of JBoss but runs on Linux, it failed. To 
see what was going on, I tried logging all the characters of the input string 
using s.codePointAt(). It turns out that instead of getting U+201C and 
U+2019, I'm getting U+FFFD in both cases.
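
A minimal sketch of this kind of code point logging. The class name
CodePointDump and the method dump are hypothetical and are not the code from
the application; only String.codePointAt and Character.charCount are standard
Java calls.

```java
/** Sketch only: the class and method names are hypothetical. */
final class CodePointDump {

    private CodePointDump() {}

    /** Returns the code points of s as space-separated U+XXXX values. */
    static String dump(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(String.format("U+%04X", cp));
            // Advance by one or two chars, depending on whether cp is supplementary.
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The expected input versus what showed up on the Linux server.
        System.out.println(dump("\u201Chi\u2019"));
        System.out.println(dump("\uFFFDhi\uFFFD"));
    }
}
```

U+FFFD is the Unicode REPLACEMENT CHARACTER, which a decoder typically
substitutes for input it cannot map. Seeing it here suggests the original
characters may already have been lost before the filter ran, although that
reading is an assumption and not something I have confirmed.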

Does anyone understand why this is happening? I have been working with Java for 
almost 7 years, and I have never encountered an inconsistency between its 
behavior on Linux and Windows before.

Thanks,
Frank

View the original post : 
http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4001916#4001916

Reply to the post : 
http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=4001916