[JBoss-dev] [ jboss-Bugs-890978 ] Problem with non-English characters in UTF-8 encoded queries

SourceForge.net Thu, 05 Feb 2004 05:10:46 -0800

Bugs item #890978, was opened at 2004-02-05 13:05
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=376685&aid=890978&group_id=22866


Category: JBossCMP
Group: v3.2
Status: Open
Resolution: None
Priority: 5
Submitted By: Milen Dyankov (azzazzel)
Assigned to: Nobody/Anonymous (nobody)
Summary: Problem with non-English characters in UTF-8 encoded queries

Initial Comment:
I'm resubmitting this issue because the last time I
submitted it (see #887491) it was misunderstood,
closed, and marked as "invalid" by "loubyansky" without
any attempt to clarify it.

--------------------------------
Initial Comment (see #887491 for full text):

It seems that all characters coded with more than one
byte (2+ bytes) in UTF-8 encoded queries are
incorrectly parsed by [EJB/JBoss]QLParser as seen in
this log fragment:
[...]
If I pass parameter like '\u0105' instead of '&#261;'
then it works.
[...]


Comment By: Alexey Loubyansky (loubyansky):

Why is '&#261' supposed to be understood?
Either you provide unicode content as is (not the
'&#261' form) or you use unicode escapes as defined in
the Java
spec, i.e. '\u'.

--------------------------------

I know very well that "&#261" is not supposed to be
understood!
What I typed into the <TEXTAREA> was the character with
Unicode code \u0105, also called "LATIN SMALL LETTER A
WITH OGONEK".
I guess it was converted to "&#261" by SF, and I didn't
even notice!
I bet that if you typed Russian characters into the
<TEXTAREA> they would also be displayed in &#XXX; form.

This subject was discussed previously on the JBoss-user
list, and Alexey Loubyansky was also answering my
e-mails there.
(See:
http://www.mail-archive.com/[EMAIL PROTECTED]/msg35226.html)
I have also contacted Alexey Loubyansky and Dain
Sundstrom, since they are listed as the authors of
"JBossQLParser.jjt" and "EJBQLParser.jjt".
Alexey didn't answer, while Dain stated that he does not
work for JBoss any more.
I was asked to open a bug report by Heiko Rupp on the
jboss-user list!

Now once again to make it clear:

I do not enter characters in my queries in the form
"&#261" but naturally, in UTF-8 encoding (as they are
typed)!
It does not work! They are parsed incorrectly! I believe
this is because the parser expects each character to be
1 byte long (\u0105 takes two bytes in UTF-8).
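
The byte count can be checked with a short Java sketch.
It assumes, as a guess at the failure mode rather than a
confirmed diagnosis, that the parser's character stream
turns every input byte into one character. The class and
method names are hypothetical and not part of JBoss:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of JBoss or JavaCC.
class Utf8WidthSketch {

    // Number of bytes the text occupies when encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // Simulates a reader that maps each input byte to one character,
    // which is what decoding UTF-8 bytes as ISO-8859-1 amounts to.
    static String readOneBytePerChar(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String ogonek = "\u0105"; // LATIN SMALL LETTER A WITH OGONEK

        // One character in the query, but two bytes (0xC4 0x85) on the wire.
        System.out.println(utf8Length(ogonek)); // prints 2

        // A byte-per-character reader sees two unrelated characters instead.
        System.out.println(readOneBytePerChar(ogonek).length()); // prints 2
    }
}
```

If the assumption holds, the parser would see the two
characters 0xC4 and 0x85 where the query contains one.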

As I said before, setting "JAVA_UNICODE_ESCAPE = false"
in "JBossQLParser.jjt" and "EJBQLParser.jjt" solves the
problem!
More specifically, it makes the parser understand
UTF-8, but it then no longer understands Unicode-escaped
characters (in the form \uXXXX).
I don't know how to configure it to understand both!
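
One possible way to accept both forms, sketched under
assumptions: build the grammars with
"JAVA_UNICODE_ESCAPE = false" and decode \uXXXX escapes in
the query text before it reaches the parser. The class,
the method, and the sample query below are hypothetical
and not part of JBoss. The sketch is also simpler than the
Java rules: it ignores doubled backslashes and repeated
'u' characters.

```java
// Hypothetical pre-processing helper; not part of JBoss or JavaCC.
class UnicodeEscapeSketch {

    // True if the four characters starting at 'start' are all hex digits.
    private static boolean isHex4(String s, int start) {
        for (int k = 0; k < 4; k++) {
            if (Character.digit(s.charAt(start + k), 16) < 0) {
                return false;
            }
        }
        return true;
    }

    // Replaces each backslash-u escape that is followed by four hex digits
    // with the character it names. Everything else, including characters
    // typed directly as non-ASCII text, is copied through unchanged.
    static String decodeUnicodeEscapes(String query) {
        StringBuilder out = new StringBuilder(query.length());
        int i = 0;
        while (i < query.length()) {
            char c = query.charAt(i);
            boolean escape = c == '\\'
                    && i + 6 <= query.length()
                    && query.charAt(i + 1) == 'u'
                    && isHex4(query, i + 2);
            if (escape) {
                int code = Integer.parseInt(query.substring(i + 2, i + 6), 16);
                out.append((char) code);
                i += 6;
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical query text holding a literal backslash-u escape.
        String escaped = "SELECT OBJECT(p) FROM Person p WHERE p.name = '\\u0105'";
        String decoded = decodeUnicodeEscapes(escaped);

        // After decoding, the escaped form and the directly typed form match.
        String typed = "SELECT OBJECT(p) FROM Person p WHERE p.name = '\u0105'";
        System.out.println(decoded.equals(typed)); // prints true
    }
}
```

The decoded string would then be handed to the parser as
ordinary Unicode text, so only one input form has to be
supported by the grammar itself.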

Can I please ask you to have another look at this?
Please contact me if you need more information on this
subject!

Milen Dyankov



