Bugs item #890978, was opened at 2004-02-05 13:05
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=376685&aid=890978&group_id=22866

Category: JBossCMP
Group: v3.2
Status: Open
Resolution: None
Priority: 5
Submitted By: Milen Dyankov (azzazzel)
Assigned to: Nobody/Anonymous (nobody)
Summary: Problem with non-English characters in UTF-8 encoded queries

Initial Comment:
I'm resubmitting this issue because the last time I did (see #887491) it was misunderstood, closed and marked as "invalid" by "loubyansky" without any effort to clarify it.

--------------------------------
Initial Comment (see #887491 for full text):

It seems that all characters coded with more than one byte (2+ bytes) in UTF-8 encoded queries are incorrectly parsed by [EJB/JBoss]QLParser, as seen in this log fragment:
[...]
If I pass a parameter like '\u0105' instead of '&#261;' then it works.
[...]

Comment By: Alexey Loubyansky (loubyansky):
Why is '&#261;' supposed to be understood? Either you provide unicode content as is (not the '&#261;' form) or you use unicode escapes as defined in the Java spec, i.e. '\u'.
--------------------------------

I know very well that "&#261;" is not supposed to be understood! What I typed in the <TEXTAREA> was the character with Unicode code \u0105, also called "LATIN SMALL LETTER A WITH OGONEK". I guess it was converted to "&#261;" by SF, and I didn't even notice! I bet that if you type Russian characters in a <TEXTAREA> they will also be displayed in &#XXX; form.

This subject was discussed previously on the JBoss-user list, and Alexey Loubyansky was also answering my e-mails there. (See: http://www.mail-archive.com/[EMAIL PROTECTED]/msg35226.html)

I have also contacted Alexey Loubyansky and Dain Sundstrom, since they are listed as the authors of "JBossQLParser.jjt" and "EJBQLParser.jjt". Alexey didn't answer, while Dain stated that he does not work for JBoss any more. I was asked to open a bug report by Heiko Rupp on the jboss-user list!

Now, once again, to make it clear: I do not enter characters in my queries in the form "&#261;" but naturally, in UTF-8 encoding (as they are typed)! It does not work!
It is incorrectly parsed! I believe this is because the parser expects a 1-byte character (\u0105 takes two bytes in UTF-8). As I said before, setting "JAVA_UNICODE_ESCAPE = false" in "JBossQLParser.jjt" and "EJBQLParser.jjt" solves the problem! More specifically, it makes the parser understand UTF-8, but it then no longer understands Unicode-escaped characters (in the form \uXXXX). I don't know how to configure it to understand both!

Could you please have another look at this? Please contact me if you need more information on this subject!

Milen Dyankov

_______________________________________________
JBoss-Development mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/jboss-development
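
The encoding facts behind this report can be checked in plain Java. What follows is a minimal sketch, not code from JBoss: the names Utf8EscapeSketch, utf8Length and decodeUnicodeEscapes are hypothetical and introduced only for illustration. It shows that U+0105 occupies two bytes in UTF-8, and one possible way to accept both literal characters and backslash-u escapes by expanding the escapes before the query text reaches a parser. Whether such a pre-processing step fits the JavaCC-generated parsers is an assumption that would need checking.

```java
import java.nio.charset.StandardCharsets;

class Utf8EscapeSketch {

    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Hypothetical helper (not a JBoss API): expands each backslash-u escape
    // followed by four hex digits into the character it denotes and copies
    // everything else unchanged, so literal non-ASCII characters survive.
    static String decodeUnicodeEscapes(String query) {
        StringBuilder out = new StringBuilder(query.length());
        int i = 0;
        while (i < query.length()) {
            if (isEscapeAt(query, i)) {
                out.append((char) Integer.parseInt(query.substring(i + 2, i + 6), 16));
                i += 6;
            } else {
                out.append(query.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    // True when a backslash, a 'u' and four hex digits start at index i.
    private static boolean isEscapeAt(String s, int i) {
        if (i + 6 > s.length() || s.charAt(i) != '\\' || s.charAt(i + 1) != 'u') {
            return false;
        }
        for (int k = i + 2; k < i + 6; k++) {
            if (Character.digit(s.charAt(k), 16) < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String literal = "\u0105";            // the character itself (U+0105)
        String escaped = "name = '\\u0105'";  // six-character escape, as typed in a query

        System.out.println(utf8Length(literal)); // prints 2
        // Both spellings yield the same text after expansion: prints true
        System.out.println(decodeUnicodeEscapes(escaped).equals("name = '" + literal + "'"));
    }
}
```

The sketch deliberately ignores doubled backslashes and escapes with repeated 'u' characters, which the Java language rules treat specially; a real fix would have to decide how the query language should handle those.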