Re: [Resin-interest] simple utf8 question

2008-03-29 Thread Riccardo Cohen
I have this :

DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

I'll check in the driver if there is a sort of parameter like charset or 
encoding...
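
For MySQL Connector/J, the connection URL accepts the useUnicode and characterEncoding properties. A minimal sketch, assuming hypothetical host and database names (localhost, mydb), of a URL that asks the driver to exchange text with the server as UTF-8:

```java
public class Utf8JdbcUrl {
    // Builds a MySQL Connector/J URL that requests UTF-8 on the connection.
    // The host and database arguments are hypothetical placeholders.
    static String utf8Url(String host, String database) {
        return "jdbc:mysql://" + host + "/" + database
                + "?useUnicode=true&characterEncoding=UTF-8";
    }

    public static void main(String[] args) {
        // Use the result wherever the connection URL is configured,
        // for example DriverManager.getConnection(url, user, password).
        System.out.println(utf8Url("localhost", "mydb"));
    }
}
```

If the URL is written inside an XML configuration file, the & has to be written as &amp;.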

Knut Forkalsrud wrote:
> Riccardo Cohen wrote:
>> 2) search
>>
>> it fails to find any record :
>>
>> qr=globalaction.m_manager.createQuery("select u from userinfo u where
>> u.lastname like '"+catalogchapter+"%' order by lastname");
>>
>> when catalogchapter is £ I find the record, when it is א or 种 it
>> returns zero records without any error (result list count=0).
>>   
> 
> This must be a MySQL issue, either in the JDBC driver or on the server.
> Make sure your table deals with extended unicode ranges.  For example try:
> 
>   show create table userinfo \G
> 
> 
> You should get output like
> 
> CREATE TABLE userinfo (
> .
> ) ENGINE=InnoDB AUTO_INCREMENT=6796 DEFAULT CHARSET=latin1
> 
> 
> If the charset is latin1 like in my example you may have problems 
> dealing with the Chinese characters.
> 
> 
> -Knut
> 
> 
> 
> ___
> resin-interest mailing list
> resin-interest@caucho.com
> http://maillist.caucho.com/mailman/listinfo/resin-interest

-- 
Très cordialement,

Riccardo Cohen
---
Articque
http://www.articque.com
149 av Général de Gaulle
37230 Fondettes - France
tel : 02-47-49-90-49
fax : 02-47-49-91-49





Re: [Resin-interest] simple utf8 question

2008-03-28 Thread Knut Forkalsrud
Riccardo Cohen wrote:
> 2) search
>
> it fails to find any record :
>
> qr=globalaction.m_manager.createQuery("select u from userinfo u where
> u.lastname like '"+catalogchapter+"%' order by lastname");
>
> when catalogchapter is £ I find the record, when it is א or 种 it
> returns zero records without any error (result list count=0).
>   

This must be a MySQL issue, either in the JDBC driver or on the server.
Make sure your table deals with extended unicode ranges.  For example try:

  show create table userinfo \G


You should get output like

CREATE TABLE userinfo (
.
) ENGINE=InnoDB AUTO_INCREMENT=6796 DEFAULT CHARSET=latin1


If the charset is latin1 like in my example you may have problems 
dealing with the Chinese characters.
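
A minimal sketch of why £ can survive a latin1 column while א and 种 cannot. It assumes Java's ISO-8859-1 as a stand-in for MySQL's latin1 (which is closer to windows-1252, but the result is the same for these three characters):

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

public class Latin1Check {
    // True if every character of s can be encoded in the named charset.
    static boolean fits(String s, String charsetName) {
        CharsetEncoder encoder = Charset.forName(charsetName).newEncoder();
        return encoder.canEncode(s);
    }

    public static void main(String[] args) {
        // Pound sign, Hebrew alef, and the Chinese character from the thread.
        String[] samples = { "\u00A3", "\u05D0", "\u79CD" };
        for (String s : samples) {
            System.out.println(s + " latin1=" + fits(s, "ISO-8859-1")
                    + " utf8=" + fits(s, "UTF-8"));
        }
    }
}
```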


-Knut





Re: [Resin-interest] simple utf8 question

2008-03-28 Thread Riccardo Cohen
1) resourcebundle :

I can also write a very simple class that reads the text file in utf8
and builds a hashtable from it; it will be easier for me than 
understanding java abstractions...

I also found in http://java.sun.com/javase/6/jcp/mr2/ that java6 adds
utf8 support... not for now
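
A minimal sketch of the "simple class" idea, assuming the translation file uses the usual key=value format and is saved as UTF-8. It relies on the Reader overload of Properties.load, which was added in Java 6; the class name and the greeting key are hypothetical:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class Utf8Bundle {
    // Reads key=value lines from a UTF-8 stream into a Properties table.
    static Properties load(InputStream in) {
        Properties table = new Properties();
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            table.load(reader); // Reader overload, so no ISO-8859-1 decoding
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return table;
    }

    public static void main(String[] args) {
        // Stand-in for a translation file saved as UTF-8.
        byte[] file = "greeting=\u52A1\n".getBytes(StandardCharsets.UTF_8);
        Properties table = load(new ByteArrayInputStream(file));
        System.out.println(table.getProperty("greeting"));
    }
}
```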

2) search

it fails to find any record :

qr=globalaction.m_manager.createQuery("select u from userinfo u where
u.lastname like '"+catalogchapter+"%' order by lastname");

when catalogchapter is £ I find the record, when it is א or 种 it
returns zero records without any error (result list count=0).
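
A sketch of the same prefix search with a bound parameter instead of string concatenation. It uses plain JDBC rather than the Amber query API from the thread, and it assumes MySQL's default backslash escape for LIKE. Binding the value does not by itself fix a charset mismatch, but it keeps the search text out of the SQL string and leaves the encoding to the driver:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class PrefixSearch {
    // Escapes LIKE wildcards so the user's text is matched literally,
    // then appends % for a prefix match.
    static String likePrefix(String text) {
        return text.replace("\\", "\\\\")
                   .replace("%", "\\%")
                   .replace("_", "\\_") + "%";
    }

    // Table and column names are taken from the query in the thread.
    static List<String> findLastNames(Connection con, String prefix)
            throws SQLException {
        String sql = "select lastname from userinfo"
                   + " where lastname like ? order by lastname";
        List<String> names = new ArrayList<String>();
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, likePrefix(prefix)); // the driver encodes the value
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    names.add(rs.getString(1));
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Only the pattern helper is exercised here; no database is needed.
        System.out.println(likePrefix("\u79CD"));
    }
}
```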

Knut Forkalsrud wrote:
> Riccardo Cohen wrote:
>> What do you suggest for string localization? The docs do not explain 
>> how to use XML for a ResourceBundle, or how to change the encoding of a 
>> text resource bundle.
>>   
> 
> One approach is to use the \uXXXX codes for any character outside 
> Latin-1 (XXXX are the hex digits of the code point in the Unicode tables).
> 
> That may make your resource bundle file less than ideal in terms of 
> readability, but you may choose that for robustness.  You may even want 
> to use the \uXXXX codes for every non-ASCII character if you don't want 
> other users' errors to affect you.
> 
> The other approach is to get familiar with the class loading scheme for 
> resource bundles and write your own implementation.  
> 
> 
>> 2) receiving
>>
>> the html page specifies charset=UTF-8 in head content, and when typing £ 
>> sign I get C2 A3. I suppose it is utf8 even if not exactly the code you 
>> gave.
> 
> Correct, typo on my part.
> 
>> With that string the search succeeds, but with the Chinese 
>> character 务 it fails.
>>   
> 
> Fails, how?  I believe that is character 0x52a1, which 
> should be perfectly representable in Java's 16 bit characters.
> 
>> I added utf-8 in resin.conf, 
>> and it did some change (before that, the returned search string was not 
>> equal to the sent search string). But the search still fails with 
>> chinese char. Maybe this is an issue with amber/mysql ?
>>   
> 
> Possibly.  Hard to say from here.
> 
> -Knut
> 
> 
> 








Re: [Resin-interest] simple utf8 question

2008-03-27 Thread Knut Forkalsrud
Riccardo Cohen wrote:
> What do you suggest for string localization? The docs do not explain 
> how to use XML for a ResourceBundle, or how to change the encoding of a 
> text resource bundle.
>   

One approach is to use the \uXXXX codes for any character outside 
Latin-1 (XXXX are the hex digits of the code point in the Unicode tables).

That may make your resource bundle file less than ideal in terms of 
readability, but you may choose that for robustness.  You may even want 
to use the \uXXXX codes for every non-ASCII character if you don't want 
other users' errors to affect you.

The other approach is to get familiar with the class loading scheme for 
resource bundles and write your own implementation.  
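
A minimal sketch of the first approach: a helper (the class name is hypothetical) that rewrites every non-ASCII character as a \uXXXX escape, so the result can be pasted into a .properties file that will be read as ISO-8859-1:

```java
public class UnicodeEscaper {
    // Replaces each char above 0x7F with a backslash, the letter u and four
    // hex digits. Characters outside the BMP come out as two such escapes,
    // one per surrogate, which is what the properties format expects.
    static String escape(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c > 0x7F) {
                out.append(String.format("\\u%04X", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Pound sign followed by the Chinese character from the thread.
        System.out.println(escape("price=\u00A3 \u52A1"));
    }
}
```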


> 2) receiving
>
> the html page specifies charset=UTF-8 in head content, and when typing £ 
> sign I get C2 A3. I suppose it is utf8 even if not exactly the code you 
> gave.

Correct, typo on my part.

> With that string the search succeeds, but with the Chinese 
> character 务 it fails.
>   

Fails, how?  I believe that is character 0x52a1, which 
should be perfectly representable in Java's 16 bit characters.
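
A small check of that claim, as a sketch: the character is a single code point in the Basic Multilingual Plane, so it occupies exactly one Java char. The helper name is hypothetical:

```java
public class BmpCheck {
    // True if the string is exactly one code point stored in one char.
    static boolean fitsInOneChar(String s) {
        return s.length() == 1 && s.codePointCount(0, s.length()) == 1;
    }

    public static void main(String[] args) {
        String wu = "\u52A1"; // the Chinese character from the thread
        System.out.println(fitsInOneChar(wu));
        System.out.println(Integer.toHexString(wu.charAt(0)));
    }
}
```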

> I added utf-8 in resin.conf, 
> and it did some change (before that, the returned search string was not 
> equal to the sent search string). But the search still fails with 
> chinese char. Maybe this is an issue with amber/mysql ?
>   

Possibly.  Hard to say from here.

-Knut





Re: [Resin-interest] simple utf8 question

2008-03-27 Thread Riccardo Cohen
Thanks a lot for your explanations.

1) displaying

I tried adding res.setContentType("text/html;charset=utf-8"); and it 
worked: strings coming from mysql are correctly displayed in the 
browser.
But since then, all my bundle resources (where I put translation strings 
in UTF-8) no longer appear correctly.
I found in the PropertyResourceBundle doc that apparently the resource 
bundle is always read in ISO-8859-1 encoding and it seems that you 
cannot change that (except if you use xml ?). I suppose that the strings 
were read incorrectly and sent incorrectly to the browser, and the two 
errors compensated each other to show a correct page before setContentType().
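
The compensating errors can be reproduced in a few lines. This is a sketch under the assumption that the response previously went out as ISO-8859-1 (the servlet default); the class and method names are hypothetical:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CompensatingErrors {
    // Simulates one translation string: the UTF-8 file bytes are decoded as
    // ISO-8859-1 (as PropertyResourceBundle does with a stream), then
    // encoded again with the charset of the HTTP response.
    static byte[] bytesSentToBrowser(byte[] utf8FileBytes, Charset responseCharset) {
        String misread = new String(utf8FileBytes, StandardCharsets.ISO_8859_1);
        return misread.getBytes(responseCharset);
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        byte[] file = "\u00A3".getBytes(StandardCharsets.UTF_8); // C2 A3
        // ISO-8859-1 response: the original UTF-8 bytes come back unchanged.
        System.out.println(hex(bytesSentToBrowser(file, StandardCharsets.ISO_8859_1)));
        // UTF-8 response: each misread char is encoded again (double encoding).
        System.out.println(hex(bytesSentToBrowser(file, StandardCharsets.UTF_8)));
    }
}
```

The first line prints C2 A3, which a browser told to expect UTF-8 shows as £; the second prints C3 82 C2 A3, which shows as two wrong characters.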

What do you suggest for string localization? The docs do not explain 
how to use XML for a ResourceBundle, or how to change the encoding of a 
text resource bundle.

2) receiving

the html page specifies charset=UTF-8 in head content, and when typing the £ 
sign I get C2 A3. I suppose it is utf8 even if not exactly the code you 
gave. With that string the search succeeds, but with the Chinese 
character 务 it fails.
I added utf-8 in resin.conf, 
and it did some change (before that, the returned search string was not 
equal to the sent search string). But the search still fails with 
chinese char. Maybe this is an issue with amber/mysql ?


PS: I fill my mysql database with phpmyadmin which seems to handle 
unicode utf8 quite well (better than me !).

If you have any idea, thanks for your help.


Knut Forkalsrud wrote:
> Riccardo Cohen wrote:
>> I have a UTF-8 HTML form that searches the database and produces 
>> its results in a UTF-8 HTML page.
>>   
>> I have 2 conversion problems :
>>
>> 1) displaying items
>>   
> 
> response.setContentType("text/html;charset=utf-8") may be enough.  It 
> will set the response writer/output stream conversion to utf-8.
> 
>> How do you convert a string to utf8 ?
>>   
> utf8bytes = name.getBytes("UTF8") gives you the UTF-8 output.  If you pass 
> that directly to response.getOutputStream() that should be it.  Don't try to 
> convert it back to a String once you have done the UTF-8 encoding.  If your 
> view is dealing with strings or characters instead of bytes, let the view 
> take care of the UTF-8 encoding.  If your view is JSP there are page 
> directives for this.  If your view relies on response.getWriter(), setting the 
> desired charset in the content type should be enough.  If your view relies on 
> response.getOutputStream() you need to take care of it yourself, but you also 
> will write bytes to the output, not characters.
> 
> 
>> in the form I receive a string which is encoded in utf8 by the browser, 
>> but it is not recognized as utf8 by the server.
> 
> You may want to specify what character encoding your resin environment 
> expects.  
> Browsers typically return the form data encoded by whatever character 
> set the html document holding the form had, so if you present your form 
> in UTF-8 you can be reasonably sure you get utf-8 data back in the 
> submission.  Just make sure you indeed have UTF-8 data in the form, with 
> a GET request the £ (GBP) character should show up as %C2%A9.
> 
>> I cannot convert it to  utf8 for mysql.
>>   
> 
> Let the mysql JDBC driver get the String, it will deal with any UTF-8 
> encoding internally by itself.  Don't try to force it.
> 
> 
> -Knut
> 
> 
> 
> 






Re: [Resin-interest] simple utf8 question

2008-03-26 Thread Knut Forkalsrud

Riccardo Cohen wrote:
> I have a UTF-8 HTML form that searches the database and produces 
> its results in a UTF-8 HTML page.
>   
> I have 2 conversion problems :
>
> 1) displaying items
  


response.setContentType("text/html;charset=utf-8") may be enough.  It 
will set the response writer/output stream conversion to utf-8.



> How do you convert a string to utf8 ?
  

utf8bytes = name.getBytes("UTF8") gives you the UTF-8 output.  If you pass that 
directly to response.getOutputStream() that should be it.  Don't try to convert it back 
to a String once you have done the UTF-8 encoding.  If your view is dealing with strings 
or characters instead of bytes, let the view take care of the UTF-8 encoding.  If your 
view is JSP there are page directives for this.  If your view relies on 
response.getWriter(), setting the desired charset in the content type should be enough.  
If your view relies on response.getOutputStream() you need to take care of it yourself, 
but you also will write bytes to the output, not characters.
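
The two routes described above can be compared without a servlet container. In this sketch a ByteArrayOutputStream stands in for the response stream (an assumption made so the example runs on its own); both routes encode the String exactly once and produce the same bytes:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class EncodeOnce {
    // Route 1: encode the String yourself and write raw bytes,
    // as you would with response.getOutputStream().
    static byte[] viaBytes(String name) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] utf8bytes = name.getBytes(StandardCharsets.UTF_8);
        out.write(utf8bytes, 0, utf8bytes.length);
        return out.toByteArray();
    }

    // Route 2: hand characters to a Writer that encodes for you,
    // as response.getWriter() does once the charset is set.
    static byte[] viaWriter(String name) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            writer.write(name);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String name = "\u00A3"; // pound sign
        System.out.println(viaBytes(name).length + " bytes via getBytes");
        System.out.println(viaWriter(name).length + " bytes via Writer");
    }
}
```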


> in the form I receive a string which is encoded in utf8 by the browser, 
> but it is not recognized as utf8 by the server.


You may want to specify what character encoding your resin environment 
expects.  
Browsers typically return the form data encoded by whatever character 
set the html document holding the form had, so if you present your form 
in UTF-8 you can be reasonably sure you get utf-8 data back in the 
submission.  Just make sure you indeed have UTF-8 data in the form, with 
a GET request the £ (GBP) character should show up as %C2%A9.
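
The expected percent-encoding can be checked with java.net.URLEncoder. A minimal sketch (the class and method names are hypothetical): for the pound sign it yields %C2%A3, which matches the C2 A3 bytes reported in the follow-up message.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class FormEncoding {
    // Percent-encodes a form value the way a browser does for a UTF-8 page.
    static String percentEncode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(percentEncode("\u00A3")); // pound sign
        System.out.println(percentEncode("\u52A1")); // Chinese character from the thread
    }
}
```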



> I cannot convert it to utf8 for mysql.
  


Let the mysql JDBC driver get the String, it will deal with any UTF-8 
encoding internally by itself.  Don't try to force it.



-Knut



[Resin-interest] simple utf8 question

2008-03-23 Thread Riccardo Cohen
Hi
I have a UTF-8 HTML form that searches the database and produces 
its results in a UTF-8 HTML page.

I have 2 conversion problems :

1) displaying items

When I append to the page the String received from the ejb (from mysql), 
the result is not good.
I tried all these :
   utf8bytes=name.getBytes("UTF8");
   utf8name=new String(utf8bytes);
   utf8name=new String(utf8bytes,"UTF8");
   utf8name=new String(utf8bytes,"ASCII");

but nothing works. I found this operation to make it display correctly 
in utf8 :

   utf8bytes=name.getBytes("UTF8");
   for (idxtmp=0;idxtmp [...]

[...] http://java.sun.com/docs/books/tutorial/i18n/text/index.html but it did 
not help.
Thanks a lot for any tip, or the good documentation.



