Re: [HACKERS] [JDBC] Troubles using German Umlauts with JDBC

2001-09-11 Thread Barry Lind

No, this isn't a locale issue.  This is a character set issue.  Java is 
Unicode based.  Therefore it needs to convert data it receives from the 
server into Unicode.  In order to do this, it needs to know the 
character set that the server is sending back the data in.  Locale 
issues like collation sequences, date formats, etc. are unrelated to 
this specific issue.
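
A minimal sketch of the point above, assuming nothing about the driver's
internals: the same byte decodes to different characters depending on which
character set the client assumes. The class and method names are
hypothetical and are not part of the PostgreSQL JDBC driver.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the PostgreSQL JDBC driver.
final class DecodeDemo {
    // Convert raw bytes, as they might arrive from a server, into a
    // Java (Unicode) String using the character set the caller assumes.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // 0xE4 is a-umlaut in ISO-8859-1 (Latin-1).
        byte[] raw = { (byte) 0xE4 };

        // Print code points rather than glyphs so the output does not
        // depend on the console encoding.
        String latin1 = decode(raw, StandardCharsets.ISO_8859_1);
        String ascii = decode(raw, StandardCharsets.US_ASCII);
        System.out.println(Integer.toHexString(latin1.charAt(0))); // e4
        System.out.println(Integer.toHexString(ascii.charAt(0)));  // fffd
    }
}
```

Decoded as Latin-1 the byte becomes U+00E4; decoded as 7-bit ASCII it is
invalid and Java substitutes the replacement character U+FFFD. Collation
and date formats play no part in this step.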

thanks,
--Barry

Bruce Momjian wrote:
 Can I ask, isn't this the meaning of locale?  Is the problem that we
 need locale capability in jdbc?  We have a --enable-locale configure
 option.
 
 
 
 
Is this a jdbc issue or a general backend issue?



Bruce,

I think the TODO item should be:

Ability to set character set for a database without multibyte enabled

Currently createdb -E (and the corresponding create database sql 
command) only works if multibyte is enabled.  However it is useful to 
know which single byte character set is being used even when multibyte 
isn't enabled.  Currently there is no way to specify which single byte 
character set a database is using (unless you compile with multibyte).

thanks,
--Barry


Bruce Momjian wrote:

I can add something if people agree there is an issue here.



I've added a new section Character encoding to
http://lab.applinet.nl/postgresql-jdbc/, based on the
information from Dave and Barry.

I haven't seen a confirmation from pgsql-hackers or Bruce yet
that this issue will be added to the Todo list. I'm under the
impression that the backend developers don't see this as a
problem.

Regards,
René Pijlman

On Tue, 04 Sep 2001 10:40:36 -0700, Barry Lind wrote:


I would like to add one additional comment.  In current sources the jdbc 
driver detects (through a hack) that the server doesn't have multibyte 
enabled and then ignores the SQL_ASCII return value and defaults to the 
JVM's character set instead of using SQL_ASCII.

The problem boils down to the fact that without multibyte enabled, the 
server has no way of specifying which 8-bit character set is being 
used for a particular database.  Thus a client like JDBC doesn't know 
what character set to use when converting to UNICODE.  Thus the best we 
can do in JDBC is use our best guess (JVM character set is probably the 
best default), and allow the user to explicitly specify something else 
if necessary.

thanks,
--Barry

Rene Pijlman wrote:


[forwarding to pgsql-hackers and Bruce as Todo list maintainer,
see comment below]

[insert with JDBC converts Latin-1 umlaut to ?]
On 04 Sep 2001 09:54:27 -0400, Dave Cramer wrote:



You have to set the encoding when you make the connection.

Properties props = new Properties();
props.put("user", user);
props.put("password", password);
props.put("charSet", encoding);
Connection con = DriverManager.getConnection(url, props);
where encoding is the proper encoding for your database
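
The snippet above can be fleshed out as a self-contained sketch. The class
name, the helper methods, and the user, password and encoding values are
placeholders chosen for illustration; the encoding value is assumed to be a
name the JVM recognises, such as ISO-8859-1. Only the charSet property name
comes from the message above.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

// Hypothetical helper class; names and values are placeholders.
final class CharSetConnection {
    // Build the connection properties, including the explicit character
    // set that overrides whatever the server reports.
    static Properties buildProps(String user, String password, String encoding) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        props.setProperty("charSet", encoding);
        return props;
    }

    // Open the connection; requires a JDBC driver on the classpath and a
    // reachable server, so it is not called from main below.
    static Connection open(String url, Properties props) throws SQLException {
        return DriverManager.getConnection(url, props);
    }

    public static void main(String[] args) {
        Properties props = buildProps("demo", "secret", "ISO-8859-1");
        System.out.println(props.getProperty("charSet")); // ISO-8859-1
    }
}
```

Keeping the property construction separate from the connection call makes
it possible to check the settings without a running database.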



For completeness, I quote the answer Barry Lind gave yesterday. 

[the driver] asks the server what character set is being used
for the database.  Unfortunately the server only knows about
character sets if multibyte support is compiled in. If the
server is compiled without multibyte, then it always reports to
the client that the character set is SQL_ASCII (where SQL_ASCII
is 7bit ascii).  Thus if you don't have multibyte enabled on the
server you can't support 8bit characters through the jdbc
driver, unless you specifically tell the connection what
character set to use (i.e. override the default obtained from
the server).

This really is confusing and I think PostgreSQL should be able
to support single byte encoding conversions without enabling
multi-byte. 

At the very least there should be a --enable-encoding-conversion
or something similar, even if it just enables the current
multibyte support.

Bruce, can this be put on the TODO list one way or the other?
This problem has appeared 4 times in two months or so on the
JDBC list.

Regards,
René Pijlman [EMAIL PROTECTED]

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://www.postgresql.org/search.mpl





---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026

 



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] [JDBC] Troubles using German Umlauts with JDBC

2001-09-10 Thread Bruce Momjian


Can I ask, isn't this the meaning of locale?  Is the problem that we
need locale capability in jdbc?  We have a --enable-locale configure
option.



 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026




Re: [HACKERS] [JDBC] Troubles using German Umlauts with JDBC

2001-09-09 Thread Bruce Momjian


I can add something if people agree there is an issue here.


-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026




Re: [HACKERS] [JDBC] Troubles using German Umlauts with JDBC

2001-09-09 Thread Rene Pijlman

On Sun, 9 Sep 2001 10:24:32 -0400 (EDT), Bruce Momjian wrote:
I can add something if people agree there is an issue here.

IMO the issue is twofold. Without multibyte compiled in: 

1) the server cannot tell the client which single byte character
encoding is being used, so a client like JDBC cannot properly
convert to its native encoding

2) it's not possible to create a database with a single byte
encoding other than ASCII (see my posting
http://fts.postgresql.org/db/mw/msg.html?mid=1029462)

I'm not sure to what extent these issues are related.

Also, client/server character conversion is coupled to multibyte
support (see Peter's reply to my posting). This may be a
limitation for other clients, but I'm not sure about that.

Basically, it seems that multibyte support is adding features
that are needed in single byte environments as well. Perhaps the
problem can be solved by documentation (recommending to enable
multibyte support in non-ASCII singlebyte environments), perhaps
by an alias (--enable-character-encoding), perhaps the
functionality needs to be split into a true multibyte part and a
generic part. I don't know what's best, this probably depends on
the price of compiling in multibyte support.

Regards,
René Pijlman



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] [JDBC] Troubles using German Umlauts with JDBC

2001-09-09 Thread Rene Pijlman

I've added a new section Character encoding to
http://lab.applinet.nl/postgresql-jdbc/, based on the
information from Dave and Barry.

I haven't seen a confirmation from pgsql-hackers or Bruce yet
that this issue will be added to the Todo list. I'm under the
impression that the backend developers don't see this as a
problem.

Regards,
René Pijlman




Re: [HACKERS] [JDBC] Troubles using German Umlauts with JDBC

2001-09-09 Thread Bruce Momjian



Added to TODO.



-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026




Re: [HACKERS] [JDBC] Troubles using German Umlauts with JDBC

2001-09-09 Thread Barry Lind

General backend issue.

--Barry

Bruce Momjian wrote:
 Is this a jdbc issue or a general backend issue?
 
 
 

 



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] [JDBC] Troubles using German Umlauts with JDBC

2001-09-04 Thread Barry Lind

Rene,

I would like to add one additional comment.  In current sources the jdbc 
driver detects (through a hack) that the server doesn't have multibyte 
enabled and then ignores the SQL_ASCII return value and defaults to the 
JVM's character set instead of using SQL_ASCII.

The problem boils down to the fact that without multibyte enabled, the 
server has no way of specifying which 8-bit character set is being 
used for a particular database.  Thus a client like JDBC doesn't know 
what character set to use when converting to UNICODE.  Thus the best we 
can do in JDBC is use our best guess (JVM character set is probably the 
best default), and allow the user to explicitly specify something else 
if necessary.
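
The fallback described above can be sketched as follows. This is an
illustration under stated assumptions, not the driver's actual code: the
class and method names are hypothetical, and the two server-name mappings
(LATIN1 and UNICODE) are an incomplete illustrative subset.

```java
import java.nio.charset.Charset;

// Hypothetical sketch of the selection logic; not the driver's source.
final class EncodingChooser {
    // serverEncoding: the encoding name reported by the server.
    // userCharSet: an explicit override supplied by the user, or null.
    static Charset choose(String serverEncoding, String userCharSet) {
        // An explicit user setting always wins.
        if (userCharSet != null) {
            return Charset.forName(userCharSet);
        }
        // Without multibyte the server always reports SQL_ASCII, which
        // says nothing about the real 8-bit character set, so fall back
        // to the JVM default as a best guess.
        if ("SQL_ASCII".equals(serverEncoding)) {
            return Charset.defaultCharset();
        }
        // Illustrative, incomplete mapping of server names to Java names.
        if ("LATIN1".equals(serverEncoding)) {
            return Charset.forName("ISO-8859-1");
        }
        if ("UNICODE".equals(serverEncoding)) {
            return Charset.forName("UTF-8");
        }
        // Unknown name: again fall back to the JVM default.
        return Charset.defaultCharset();
    }

    public static void main(String[] args) {
        System.out.println(choose("SQL_ASCII", "ISO-8859-1").name()); // ISO-8859-1
        System.out.println(choose("LATIN1", null).name());            // ISO-8859-1
    }
}
```

The JVM default is only a guess: it is right when client and database
happen to use the same 8-bit character set, and wrong otherwise, which is
why the explicit override matters.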

thanks,
--Barry

Rene Pijlman wrote:
 [forwarding to pgsql-hackers and Bruce as Todo list maintainer,
 see comment below]
 
 [insert with JDBC converts Latin-1 umlaut to ?]
 On 04 Sep 2001 09:54:27 -0400, Dave Cramer wrote:
 
You have to set the encoding when you make the connection.

Properties props = new Properties();
props.put("user", user);
props.put("password", password);
props.put("charSet", encoding);
Connection con = DriverManager.getConnection(url, props);
where encoding is the proper encoding for your database

 
 For completeness, I quote the answer Barry Lind gave yesterday. 
 
 [the driver] asks the server what character set is being used
 for the database.  Unfortunately the server only knows about
 character sets if multibyte support is compiled in. If the
 server is compiled without multibyte, then it always reports to
 the client that the character set is SQL_ASCII (where SQL_ASCII
 is 7bit ascii).  Thus if you don't have multibyte enabled on the
 server you can't support 8bit characters through the jdbc
 driver, unless you specifically tell the connection what
 character set to use (i.e. override the default obtained from
 the server).
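
 The reported symptom (an umlaut stored as "?") is what Java produces when
 text is encoded with a 7-bit character set. The following is a minimal
 sketch of that effect using only the standard library; the class and
 method names are hypothetical and the sample string is made up.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class showing how an umlaut is lost in 7-bit ASCII.
final class UmlautLoss {
    // Encode text to bytes and decode it again with the same charset,
    // as happens when a string is sent to the server and read back.
    static String roundTrip(String text, Charset charset) {
        byte[] wire = text.getBytes(charset);
        return new String(wire, charset);
    }

    public static void main(String[] args) {
        String name = "M\u00fcller"; // "Mueller" written with u-umlaut

        // US-ASCII cannot represent the umlaut; Java substitutes '?'.
        System.out.println(roundTrip(name, StandardCharsets.US_ASCII)); // M?ller

        // Latin-1 can represent it, so the round trip is lossless.
        System.out.println(roundTrip(name, StandardCharsets.ISO_8859_1).equals(name)); // true
    }
}
```

 The substitution happens on the client while encoding, so the "?" is what
 ends up stored; choosing the right character set afterwards cannot recover
 the original character.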
 
 This really is confusing and I think PostgreSQL should be able
 to support single byte encoding conversions without enabling
 multi-byte. 
 
 At the very least there should be a --enable-encoding-conversion
 or something similar, even if it just enables the current
 multibyte support.
 
 Bruce, can this be put on the TODO list one way or the other?
 This problem has appeared 4 times in two months or so on the
 JDBC list.
 
 Regards,
 René Pijlman [EMAIL PROTECTED]
 
 
 


