Hello.

I found the passage Dan quoted on page C-184 of the JDBC 3.0 specification.


As our discussion shows, the difference between ASCII and ISO-8859-1 seems to be
very confusing.

It is not surprising that Derby users confuse ASCII with ISO-8859-1 and treat getAsciiStream() as what would more correctly be called "getISO8859_1Stream()".
I think changing the behavior of getAsciiStream() would break application
programs that rely on the current behavior.
So I think we should keep the current behavior.

// Personally, I feel a kind of solidarity here ...
// Many engineers in Japan, who MUST use characters outside 0x0000 - 0x00ff,
// are often troubled by character encoding problems.
// Even engineers who only use characters inside 0x0000 - 0x00ff can be
// troubled ...
// I think an experienced Japanese engineer would not be very surprised to find
// this kind of behavior.
// Being troubled by character encoding problems is just an everyday experience
// ;_; ( Japanese smiley for crying :) ) .

Well ... difficulty in encoding characters is a worldwide problem ...


It may be better to record this information somewhere.

Adding it at
http://db.apache.org/derby/papers/JDBCImplementation.html#GetAsciiStream%28%29
would make the paper harder to read.
I think writing it somewhere in the wiki is preferable.
I will record it together with an update to this issue.

Best regards .


/*

        Tomohito Nakayama
        [EMAIL PROTECTED]
        [EMAIL PROTECTED]
        [EMAIL PROTECTED]

        Naka
        http://www5.ocn.ne.jp/~tomohito/TopPage.html

*/
----- Original Message ----- From: "Daniel John Debrunner" <[EMAIL PROTECTED]>
To: "Derby Development" <[email protected]>
Sent: Friday, September 23, 2005 12:26 AM
Subject: Re: [jira] Commented: (DERBY-525) getAsciiStream should replace
non-ASCII characters with 0x3f, '?' to match embedded


Bernt M. Johnsen wrote:

Daniel John Debrunner (JIRA) wrote (2005-09-22 15:10:29):

   [ 
http://issues.apache.org/jira/browse/DERBY-525?page=comments#action_12330193 ]

Daniel John Debrunner commented on DERBY-525:
---------------------------------------------

See this link for the justifications on why getAsciiStream() uses 8 bits and 
not 7.

http://db.apache.org/derby/papers/JDBCImplementation.html#GetAsciiStream%28%29

Basically, it's based upon definitions from the JDBC spec.


Ok. But if you map Unicode characters in the range 0x0000-0x00ff to
1-byte values without some translation, you get ISO-8859-1 characters,
not ASCII characters (which only covers the values 0x00-0x7f). I guess
it's user-friendly, but then the userdoc should explicitly explain
what is done in a way that is understandable to people who happen to know
exactly what the different standards define (Europeans and Asians
tend to be somewhat better educated in this than people from the
US.... for obvious reasons).
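
The distinction described above can be checked outside of Derby with the
standard Java charsets. The following is only a minimal sketch, not Derby
code: the class and method names (AsciiVsLatin1, toAscii, toLatin1) are
made up for illustration, and it assumes Java 7 or later for
StandardCharsets. It encodes U+00E9, which lies inside 0x0000-0x00ff but
outside 7-bit ASCII, with both charsets.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; it does not call Derby or JDBC at all.
public class AsciiVsLatin1 {

    // Encode with 7-bit US-ASCII. String.getBytes(Charset) substitutes the
    // charset's replacement byte for characters it cannot map.
    static byte[] toAscii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    // Encode with ISO-8859-1, which maps U+0000..U+00FF to one byte each.
    static byte[] toLatin1(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String eAcute = "\u00e9"; // LATIN SMALL LETTER E WITH ACUTE
        System.out.printf("US-ASCII:   0x%02x%n", toAscii(eAcute)[0] & 0xff);
        System.out.printf("ISO-8859-1: 0x%02x%n", toLatin1(eAcute)[0] & 0xff);
    }
}
```

With US-ASCII the character becomes 0x3f ('?'), while with ISO-8859-1 it
becomes 0xe9, which is the same value as its Unicode code point.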

Hey, don't blame me, first I'm not from the US and secondly, this
behaviour is defined by JDBC (and not clearly at that). :-)

To quote JDBC 3.0:

CHAR(code) Character with ASCII code value code, where code is between 0
and 255

So JDBC defines ASCII as codes 0-255, 8 bit, and since this is a JDBC
function we need to follow the JDBC spec.


Technically getAsciiStream() is *not* converting to ASCII characters,
it's converting to encoded bytes that in turn can be converted to ASCII,
or ISO-8859-1 using character encoding. Ideally I think Sun should have
deprecated this method when getCharacterStream was added to JDBC, then
the same (and clearer) functionality would have been provided using
standard Java character encoding.
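
As a rough illustration of that alternative, the sketch below reads
characters from a Reader and encodes them with an explicitly chosen
charset. It makes several assumptions: a StringReader stands in for the
Reader that ResultSet.getCharacterStream() would return, the class and
method names (CharacterStreamEncoding, encode) are hypothetical, and Java 7
or later is assumed for StandardCharsets and try-with-resources.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; in a real application the Reader would come from
// ResultSet.getCharacterStream(...) rather than from a StringReader.
public class CharacterStreamEncoding {

    // Drain the Reader and encode its characters with the given charset.
    static byte[] encode(Reader in, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, charset)) {
            char[] buffer = new char[1024];
            int n;
            while ((n = in.read(buffer)) != -1) {
                writer.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String value = "caf\u00e9"; // stand-in for a column value
        byte[] latin1 = encode(new StringReader(value), StandardCharsets.ISO_8859_1);
        byte[] utf8 = encode(new StringReader(value), StandardCharsets.UTF_8);
        System.out.println("ISO-8859-1 bytes: " + latin1.length);
        System.out.println("UTF-8 bytes:      " + utf8.length);
    }
}
```

Here the caller names the encoding, so there is no ambiguity about whether
"ASCII" means 7 or 8 bits.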

Or maybe calling it getISO8859_1Stream() would have been a better name!

Dan.




