Tom,

I don't consider it a 'uselessly obstructionist policy' for the client to 
use the encoding the server says it is using :-)  The jdbc code simply 
issues a 'select getdatabaseencoding()' and uses the value the server 
reports.  I would place the blame more on the server for lying to 
the client :-)

I consider this a problem with the backend, in that it requires multibyte 
support to be enabled just to handle single-byte character sets like 
LATIN1 properly.  (True, it supports LATIN1 without multibyte, but it 
doesn't correctly report to the client what character set the server is 
using, so the client has no way of knowing whether it should use LATIN1, 
LATIN2, or KOI8-R -- the character set of the data is an important piece 
of information for a client, especially in Java, where some encoding has 
to be used to convert to UCS-2.)
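
To make the point concrete, here is a small sketch (mine, not taken from 
the driver; the class and method names are made up for illustration) of 
why the client needs to know the real character set.  The same byte from 
the backend decodes to a different character under each single-byte 
encoding, and if the client believes SQL_ASCII, anything with the high 
bit set cannot be represented at all and comes out as '?'.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustrative only: not part of the jdbc driver.
public class SameByteDifferentText {

    // Decode raw bytes from the backend using the named Java charset.
    static String decode(byte[] raw, String javaCharsetName) {
        return new String(raw, Charset.forName(javaCharsetName));
    }

    // Round-trip a Java string through US-ASCII, roughly what happens when
    // the client trusts a SQL_ASCII report for data that is really 8-bit.
    static String throughAscii(String text) {
        byte[] encoded = text.getBytes(StandardCharsets.US_ASCII);
        return new String(encoded, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] raw = { (byte) 0xF1 };
        String[] candidates = { "ISO-8859-1", "ISO-8859-2", "KOI8-R" };
        for (String name : candidates) {
            // Only ISO-8859-1 is guaranteed on every JVM, so check first.
            if (Charset.isSupported(name)) {
                String s = decode(raw, name);
                System.out.printf("%-10s -> U+%04X%n", name, (int) s.charAt(0));
            }
        }
        // The n-tilde is not representable in US-ASCII and is replaced.
        System.out.println(throughAscii("ma\u00F1ana"));
    }
}
```

With LATIN1 the byte 0xF1 is U+00F1; the other encodings map it to 
something else entirely, which is why guessing is not good enough.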

Now it is an easy change in the jdbc code to use LATIN1 when the server 
reports SQL_ASCII, but I really dislike hardcoding support that only 
works in English-speaking countries and Western Europe.  All this does 
is move the problem from one that every non-English country has to one 
that every country outside the English-speaking world and Western Europe 
has (e.g. Eastern Europe, Russia, etc.).

In the current jdbc code it is possible to override the character set 
that is being used (by passing a 'charSet' parameter to the connection), 
so it is possible to use a different encoding than the database is 
reporting.

from Connection.java:
     //Set the encoding for this connection
     //Since the encoding could be specified or obtained from the DB we use the
     //following order:
     //  1.  passed as a property
     //  2.  value from DB if supported by current JVM
     //  3.  default for JVM (leave encoding null)
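
The order in that comment could be sketched roughly as follows.  This is 
a simplified illustration, not the actual Connection.java code: the class 
name, the method names and the small name-mapping table are assumptions 
made for the example, and only the 'charSet' property name comes from the 
driver.  A null result stands for "use the JVM default".

```java
import java.nio.charset.Charset;
import java.util.Properties;

// Illustrative only: a simplified stand-in for the driver's logic.
public class EncodingChoice {

    // Hypothetical mapping from backend encoding names to Java charset
    // names; a real table would cover many more encodings.
    static String toJavaName(String dbEncoding) {
        switch (dbEncoding) {
            case "SQL_ASCII": return "US-ASCII";
            case "LATIN1":    return "ISO-8859-1";
            case "LATIN2":    return "ISO-8859-2";
            default:          return null;
        }
    }

    // info:       the connection properties supplied by the caller
    // dbEncoding: the value returned by 'select getdatabaseencoding()'
    static String chooseEncoding(Properties info, String dbEncoding) {
        // 1. passed as a property
        String requested = info.getProperty("charSet");
        if (requested != null) {
            return requested;
        }
        // 2. value from DB if supported by the current JVM
        if (dbEncoding != null) {
            String mapped = toJavaName(dbEncoding);
            if (mapped != null && Charset.isSupported(mapped)) {
                return mapped;
            }
        }
        // 3. default for JVM (leave encoding null)
        return null;
    }

    public static void main(String[] args) {
        Properties info = new Properties();
        System.out.println(chooseEncoding(info, "SQL_ASCII"));
        info.setProperty("charSet", "ISO-8859-1");
        System.out.println(chooseEncoding(info, "SQL_ASCII"));
    }
}
```

In this sketch a backend that reports SQL_ASCII ends up with US-ASCII 
unless the caller passes 'charSet', which is the override described 
above.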

thanks,
--Barry


Tom Lane wrote:

> Tony Grant <[EMAIL PROTECTED]> writes:
> 
>> On 04 May 2001 10:29:50 -0400, Tom Lane wrote:
>> 
>>> Does this happen with a non-multibyte-compiled database?  If so, I'd
>>> argue that's a serious bug in the JDBC code: it makes JDBC unusable
>>> for non-ASCII 8-bit character sets, unless one puts up with the overhead
>>> of MULTIBYTE support.
>> 
>> I fought with this for a few days. The solution is to dump the database
>> and create a new database with the correct encoding.
> 
>> MULTIBYTE is not necessary; I just set the type to LATIN1 and it works
>> fine.
> 
> 
> But a non-MULTIBYTE backend doesn't even have the concept of "setting
> the encoding" --- it will always just report SQL_ASCII.
> 
> Perhaps what this really says is that it'd be better if the JDBC code
> assumed LATIN1 translations when the backend claims SQL_ASCII.
> Certainly, translating all high-bit-set characters to '?' is about as
> uselessly obstructionist a policy as I can think of...
> 
>                       regards, tom lane
> 
> 


