I wrote a small standalone class to compare
  .getString()
with
  .getBytes()
and .getString() doesn't handle the UTF-8 characters
correctly.
You can download the source code at
http://patrick.schlaepfer.com/TestUTF8.tar.gz

With MySQL, getObject() returns an Object and
not a byte[] - which makes sense. So the UTF-8
encoding gets lost there.
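
To illustrate why the charset used for decoding matters, here is a minimal sketch that needs no database. The class and method names are made up for this example, and ISO-8859-1 only stands in for a platform default charset that is not UTF-8; it is not taken from SQLTransformer or the MySQL driver.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Cocoon.
class CharsetDecodeSketch {

    // Decode raw bytes with an explicit UTF-8 charset.
    static String decodeAsUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Decode the same bytes as ISO-8859-1 (assumed stand-in for a non-UTF-8 default).
    static String decodeAsLatin1(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Three umlauts (a, o, u with diaeresis), two bytes each in UTF-8.
        byte[] raw = "\u00e4\u00f6\u00fc".getBytes(StandardCharsets.UTF_8);

        System.out.println(decodeAsUtf8(raw).length());   // prints 3
        System.out.println(decodeAsLatin1(raw).length()); // prints 6
    }
}
```

With the wrong charset every two-byte umlaut turns into two unrelated characters, which is the kind of damage seen when UTF-8 bytes are decoded without naming the charset.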

So in
cocoon-2.1.4/src/blocks/database/java/org/apache/cocoon/transformation/SQLTransformer.java
I changed these lines (the old call is commented out):
// String retval =  SQLTransformer.getStringValue( rs.getObject( i ) );

String retval =  SQLTransformer.getStringValue( rs.getBytes( i ) );

and

// String retval =  SQLTransformer.getStringValue( rs.getObject( name ) );
String retval =  SQLTransformer.getStringValue( rs.getBytes( name ) );

and
retString = "B "+new String( (byte[]) object, "UTF8" );
(the "B " prefix is only for debugging)
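
Without the debugging prefix, the conversion described above could look roughly like the following. This is only a sketch: the class name is made up, the handling of null and of non-byte[] values is an assumption, and it is not the actual getStringValue() from SQLTransformer.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the helper discussed above, not Cocoon's code.
class StringValueSketch {

    static String getStringValue(Object object) {
        if (object == null) {
            return ""; // assumption: treat SQL NULL as an empty string
        }
        if (object instanceof byte[]) {
            // Name the charset explicitly instead of relying on the platform default.
            return new String((byte[]) object, StandardCharsets.UTF_8);
        }
        // Assumption: every other type is rendered via toString().
        return object.toString();
    }
}
```

Hard-coding UTF-8 only fits when the column data really is UTF-8; for other setups the charset would have to be configurable.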

And now the characters are encoded correctly.

I have no idea whether this is also the case with other databases,
but at least with MySQL 4.1.1 it works.
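
For comparison, the alternative Bertrand suggests in the quoted message (use rs.getString() when the metadata reports a string column) might be sketched like this. The class and method names are hypothetical, and which java.sql.Types constants count as "string columns" is an assumption; whether getString() then decodes correctly still depends on the driver and connection settings.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Types;

// Hypothetical helper, not part of SQLTransformer.
class ColumnReadSketch {

    // Assumption: these three JDBC types are treated as text columns.
    static boolean isTextColumn(int sqlType) {
        return sqlType == Types.CHAR
            || sqlType == Types.VARCHAR
            || sqlType == Types.LONGVARCHAR;
    }

    // Read column i (1-based, as in JDBC) as a String.
    static String readColumn(ResultSet rs, int i) throws SQLException {
        int sqlType = rs.getMetaData().getColumnType(i);
        if (isTextColumn(sqlType)) {
            return rs.getString(i); // let the driver do the decoding
        }
        Object value = rs.getObject(i);
        return value == null ? "" : value.toString();
    }
}
```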

Any comments are welcome
Patrick

> -----Original Message-----
> From: Bertrand Delacretaz [mailto:[EMAIL PROTECTED]
> Sent: Thursday, 1 April 2004 07:24
> To: [EMAIL PROTECTED]
> Subject: Re: Unicode Umlauts/SQLTransformer
>
>
> On 31 March 04, at 16:23, Patrick Schlaepfer wrote:
>
> > I observed that SQLTransformer doesn't care
> > that much about character encoding:
> >
> > String retval = SQLTransformer.getStringValue(rs.getObject(i));
> > and then returns a new String((byte[]) object)
>
> According to the Java API, this "Constructs a new String by decoding
> the specified array of bytes using the platform's default charset.".
>
> IIUC the platform's default charset is what can be set with the
> -Dfile.encoding parameter, so things should be fine *if* the encoding
> is correctly handled all the way down the pipeline. I don't know if
> this is the case though, you might want to test it by dumping the
> String at various stages or starting with minimal pipelines.
>
> OTOH I'm wondering if the use of rs.getObject(i) as opposed to
> rs.getString() isn't a problem regarding encoding. It would be
> interesting to compare the two, either in a simple test program outside
> of Cocoon, or by modifying the SQLTransformer to use rs.getString() if
> rs.getMetaData().getColumnType(i) says this is a String column.
>
> -Bertrand
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>

