Re: [Resin-interest] simple utf8 question

2008-03-29 Thread Riccardo Cohen
I have this :

DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

I'll check in the driver if there is a sort of parameter like charset or 
encoding...
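
For MySQL Connector/J, the driver parameters in question are the connection URL properties `useUnicode` and `characterEncoding`. A minimal sketch of such a URL follows; the host `localhost` and database `mydb` are placeholders, not values from this thread, and whether they are needed depends on the driver version and server defaults.

```java
class JdbcUrlExample {
    // Builds a Connector/J URL that asks the driver to exchange text as UTF-8.
    // Host and database are hypothetical placeholders.
    static String utf8Url(String host, String database) {
        return "jdbc:mysql://" + host + "/" + database
                + "?useUnicode=true&characterEncoding=UTF-8";
    }

    public static void main(String[] args) {
        System.out.println(utf8Url("localhost", "mydb"));
    }
}
```

The resulting string goes wherever the JDBC URL is configured for the data source.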

Knut Forkalsrud wrote:
 Riccardo Cohen wrote:
 2) search

 it fails to find any record :

 qr=globalaction.m_manager.createQuery("select u from userinfo u where "
   +"u.lastname like '"+catalogchapter+"%' order by lastname");

 when catalogchapter is £ I find the record, but when it is א or 种 it
 returns zero records without any error (result list count=0).
   
 
 This must be a MySQL issue, either in the JDBC driver or on the server.
 Make sure your table deals with extended unicode ranges.  For example try:
 
   show create table userinfo \G
 
 
 You should get output like
 
 CREATE TABLE userinfo (
 .
 ) ENGINE=InnoDB AUTO_INCREMENT=6796 DEFAULT CHARSET=latin1
 
 
 If the charset is latin1 like in my example you may have problems 
 dealing with the Chinese characters.
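
A latin1 column would explain why £ matches while א and 种 do not: only the first exists in that character set. The sketch below checks this on the Java side, using Java's ISO-8859-1 as a stand-in for MySQL's latin1 (an approximation; the two are close but not identical).

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

class Latin1Check {
    // Returns true if every character of s can be represented in ISO-8859-1.
    static boolean fitsLatin1(String s) {
        CharsetEncoder encoder = Charset.forName("ISO-8859-1").newEncoder();
        return encoder.canEncode(s);
    }

    public static void main(String[] args) {
        System.out.println(fitsLatin1("\u00A3")); // pound sign: true
        System.out.println(fitsLatin1("\u05D0")); // Hebrew alef: false
        System.out.println(fitsLatin1("\u79CD")); // Chinese character: false
    }
}
```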
 
 
 -Knut
 
 
 
 ___
 resin-interest mailing list
 resin-interest@caucho.com
 http://maillist.caucho.com/mailman/listinfo/resin-interest

-- 
Kind regards,

Riccardo Cohen
---
Articque
http://www.articque.com
149 av Général de Gaulle
37230 Fondettes - France
tel : 02-47-49-90-49
fax : 02-47-49-91-49





Re: [Resin-interest] simple utf8 question

2008-03-27 Thread Riccardo Cohen
Thanks a lot for your explanations.

1) displaying

I tried adding res.setContentType("text/html;charset=utf-8"); and it 
worked : strings coming from mysql are correctly displayed in the 
browser.
But since then, all my resource bundles (where I put translation strings 
in utf-8) no longer display correctly.
I found in the PropertyResourceBundle doc that the resource bundle is 
apparently always read in ISO-8859-1 encoding, and it seems that you 
cannot change that (except if you use xml ?). I suppose the strings 
were read incorrectly and then sent incorrectly to the browser, and the 
two errors cancelled out, so the page looked correct before I added 
setContentType().
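
The two-errors hypothesis can be reproduced outside the servlet container. This is only a sketch of the suspected mechanism, not a trace of what Resin actually did: UTF-8 bytes are decoded as ISO-8859-1 (as a properties loader would), then written back out either as ISO-8859-1 or as UTF-8.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class DoubleErrorDemo {
    // Simulates a loader that wrongly decodes UTF-8 file bytes as ISO-8859-1.
    static String misread(byte[] utf8Bytes) {
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] original = "\u00A3".getBytes(StandardCharsets.UTF_8); // C2 A3
        String garbled = misread(original); // two chars: U+00C2, U+00A3

        // Second error: writing as ISO-8859-1 restores the original UTF-8 bytes.
        byte[] sentAsLatin1 = garbled.getBytes(StandardCharsets.ISO_8859_1);
        // With a UTF-8 response, each wrong char is encoded again (4 bytes).
        byte[] sentAsUtf8 = garbled.getBytes(StandardCharsets.UTF_8);

        System.out.println(Arrays.equals(original, sentAsLatin1)); // true
        System.out.println(sentAsUtf8.length); // 4
    }
}
```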

What do you suggest for string localization ? The docs do not say 
how to use xml for a resource bundle, or how to change the encoding of a 
text resource bundle.
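
One possible approach, assuming Java 6 or later, is to build the bundle from a Reader with an explicit charset, since PropertyResourceBundle has a constructor that takes a Reader. The sketch below uses an in-memory stream with a made-up key `greeting` in place of a real properties file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

class Utf8BundleExample {
    // Loads a .properties stream, decoding it as UTF-8 rather than ISO-8859-1.
    static ResourceBundle loadUtf8(InputStream in) {
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            return new PropertyResourceBundle(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for a UTF-8 encoded properties file; the key is hypothetical.
        byte[] file = "greeting=\u52A1\n".getBytes(StandardCharsets.UTF_8);
        ResourceBundle bundle = loadUtf8(new ByteArrayInputStream(file));
        System.out.println(bundle.getString("greeting").equals("\u52A1")); // true
    }
}
```

The other long-standing option is to keep the file in ISO-8859-1 and write non-Latin characters as \uXXXX escapes.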

2) receiving

The html page specifies charset=UTF-8 in its head content, and when I type 
the £ sign I get C2 A3, which is the UTF-8 encoding of £ (U+00A3). With 
that string the search succeeds, but with the Chinese character 务 it fails.
I added <character-encoding>utf-8</character-encoding> to resin.conf, 
and it made a difference (before that, the returned search string was not 
equal to the sent search string). But the search still fails with 
Chinese characters. Maybe this is an issue with amber/mysql ?
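
The effect of the character-encoding setting on incoming form data can be imitated with URLDecoder. This is a sketch of the general mechanism, not of Resin's internals: the same percent-encoded bytes are decoded once as UTF-8 and once as ISO-8859-1.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class FormDecodingDemo {
    // Decodes a percent-encoded form value using the given charset name.
    static String decode(String raw, String charset) {
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String raw = "%E5%8A%A1"; // UTF-8 bytes of U+52A1 as a browser would send them
        System.out.println(decode(raw, "UTF-8").equals("\u52A1")); // true
        System.out.println(decode(raw, "ISO-8859-1").length());    // 3
    }
}
```

Decoded with the wrong charset, the single character becomes three unrelated ones, which matches the symptom of the returned search string differing from the one sent.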


PS: I fill my mysql database with phpmyadmin which seems to handle 
unicode utf8 quite well (better than me !).

If you have any idea, thanks for your help.








Re: [Resin-interest] simple utf8 question

2008-03-26 Thread Knut Forkalsrud

Riccardo Cohen wrote:
> I have a utf-8 html form that searches in the database, and produces 
> results in a utf-8 html page.
>
> I have 2 conversion problems :
>
> 1) displaying items

response.setContentType("text/html;charset=utf-8") may be enough.  It 
sets the encoding that response.getWriter() uses to UTF-8, provided you 
call it before getWriter().



> How do you convert a string to utf8 ?

utf8bytes = name.getBytes("UTF-8") gives you the UTF-8 output.  If you pass that 
directly to response.getOutputStream() that should be it.  Don't try to convert it back 
to a String once you have done the UTF-8 encoding.  If your view is dealing with strings 
or characters instead of bytes, let the view take care of the UTF-8 encoding.  If your 
view is JSP there are page directives for this.  If your view relies on 
response.getWriter(), setting the desired charset in the content type should be enough.  
If your view relies on response.getOutputStream() you need to take care of the encoding 
yourself, but then you write bytes to the output, not characters.
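
The two output paths can be compared without a servlet container. In this sketch a ByteArrayOutputStream stands in for the response stream and an OutputStreamWriter stands in for the response writer (both are stand-ins, not the servlet API).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class WriterVsStream {
    // Byte-oriented path: the caller encodes the string before writing.
    static byte[] viaStream(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        out.write(bytes, 0, bytes.length);
        return out.toByteArray();
    }

    // Character-oriented path: the writer encodes on the caller's behalf.
    static byte[] viaWriter(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String name = "\u00A3 \u52A1";
        System.out.println(Arrays.equals(viaStream(name), viaWriter(name))); // true
    }
}
```

Both paths produce identical bytes; what matters is that the encoding happens exactly once.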


> in the form I receive a string which is encoded in utf8 by the browser, 
> but it is not recognized as utf8 by the server.

You may want to specify what character encoding your resin environment 
expects http://caucho.com/resin/doc/env-tags.xtp#character-encoding .  
Browsers typically return the form data encoded in whatever character 
set the html document holding the form had, so if you present your form 
in UTF-8 you can be reasonably sure you get UTF-8 data back in the 
submission.  Just make sure you indeed have UTF-8 data in the form: with 
a GET request the £ (GBP) character should show up as %C2%A3.
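
To check what a given character should look like in a GET query string, java.net.URLEncoder can produce the percent-encoded form for a chosen charset. A small sketch:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class QueryEncodingDemo {
    // Percent-encodes a form value using the given charset name.
    static String encode(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encode("\u00A3", "UTF-8"));      // %C2%A3
        System.out.println(encode("\u00A3", "ISO-8859-1")); // %A3
        System.out.println(encode("\u52A1", "UTF-8"));      // %E5%8A%A1
    }
}
```

A pound sign arriving as %A3 rather than %C2%A3 would indicate the form was submitted in a Latin-1 style encoding instead of UTF-8.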



> I cannot convert it to utf8 for mysql.

Let the mysql JDBC driver get the String, it will deal with any UTF-8 
encoding internally by itself.  Don't try to force it.
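
Handing the driver a plain String usually means binding it as a statement parameter. The sketch below uses plain JDBC rather than the Amber query API, with the `userinfo` table and `lastname` column from this thread; the helper names and the LIKE escaping (which assumes MySQL's default backslash escape character) are illustrative only.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class LastNameSearch {
    // Escapes LIKE wildcards in user input and appends % for a prefix match.
    static String prefixPattern(String prefix) {
        return prefix.replace("\\", "\\\\")
                     .replace("%", "\\%")
                     .replace("_", "\\_") + "%";
    }

    // Passes the search string to the driver as a String; no manual byte conversion.
    static List<String> findLastNames(Connection con, String prefix) throws SQLException {
        String sql = "SELECT lastname FROM userinfo WHERE lastname LIKE ? ORDER BY lastname";
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, prefixPattern(prefix));
            try (ResultSet rs = ps.executeQuery()) {
                List<String> names = new ArrayList<>();
                while (rs.next()) {
                    names.add(rs.getString(1));
                }
                return names;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(prefixPattern("50%_")); // 50\%\_%
    }
}
```

Binding the value also avoids building the query by string concatenation.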



-Knut
