Davide Romanini wrote:
Barry Lind wrote:

The charSet= option will no longer work with the 7.3 driver talking to a 7.3 server, since character set translation is now performed by the server (for performance reasons) in that scenario.

The correct solution here is to convert the database to the proper character set for the data it is storing. SQL_ASCII is not a proper character set for storing 8bit data.


Probably I was not clear enough about the problem. I *cannot* change
the charset type. SQL_ASCII really *is* the proper character set for my purposes, because I actually work using psql and the ODBC driver without any problem.

You were clear; however, we disagree. SQL_ASCII is *not* the proper character set for your purposes. The characters you are having problems with do not exist in the SQL_ASCII character set. The fact that psql and ODBC work under this misconfiguration doesn't mean that the configuration is correct. Java deals with all characters internally in Unicode, thus forcing a character set conversion. So the code is converting from SQL_ASCII to UTF8. When it finds characters that are not part of the SQL_ASCII character set, it doesn't know what to do with them (are they LATIN1, LATIN5, LATIN? characters?).

You state that you "*cannot* change" the character set.  Can you explain
why this is the case?

I repeat: psql and ODBC retrieve all data (with the accents) in the correct manner. Also, if I change org.postgresql.core.Encoding.java so that the decodeUTF8 method simply returns a new String(data), JDBC retrieves the data from my SQL_ASCII database correctly! So my question is: why does JDBC call the decodeUTF8 method even when the string is surely *not* a UTF-8 string?
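
For readers following along, a minimal sketch of why the new String(data) change appears to work: that constructor decodes with the JVM's platform default charset, so the result depends on the machine it runs on. This is not the driver's code. The class and method names are hypothetical, and the assumption that the stored bytes are LATIN1 (ISO-8859-1) is only an assumption for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DefaultCharsetSketch {
    // Hypothetical helper: decode bytes that we ASSUME were written as LATIN1.
    static String decodeAsLatin1(byte[] data) {
        return new String(data, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "perch" followed by e-acute stored as the single LATIN1 byte 0xE9.
        byte[] data = { 'p', 'e', 'r', 'c', 'h', (byte) 0xE9 };

        // Uses the platform default charset, so the result varies between machines.
        System.out.println(Charset.defaultCharset() + ": " + new String(data));

        // Names the charset explicitly, so the result is the same everywhere.
        System.out.println("ISO-8859-1: " + decodeAsLatin1(data));
    }
}
```

The explicit form only gives the right answer if the bytes really are LATIN1, which is exactly the information a SQL_ASCII database does not record.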

If you were only storing SQL_ASCII characters it would be a UTF8 string since SQL_ASCII is a subset of UTF8. But since you are storing invalid SQL_ASCII characters this is no longer true.
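
The subset relationship can be checked directly. The following is a small sketch, with hypothetical class and method names, that tests whether a byte array is pure 7-bit data and whether it survives being decoded as UTF-8 and encoded again:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class AsciiSubsetSketch {
    // True when every byte is in the 7-bit range 0x00..0x7F.
    static boolean isSevenBit(byte[] data) {
        for (byte b : data) {
            if (b < 0) { // Java bytes are signed, so 0x80..0xFF are negative
                return false;
            }
        }
        return true;
    }

    // True when decoding as UTF-8 and encoding again gives back the same bytes.
    static boolean survivesUtf8RoundTrip(byte[] data) {
        String decoded = new String(data, StandardCharsets.UTF_8);
        return Arrays.equals(decoded.getBytes(StandardCharsets.UTF_8), data);
    }

    public static void main(String[] args) {
        byte[] plain = { 'c', 'i', 't', 't', 'a' };
        // Same word with a-grave stored as the single LATIN1 byte 0xE0.
        byte[] accented = { 'c', 'i', 't', 't', (byte) 0xE0 };

        System.out.println(isSevenBit(plain) + " " + survivesUtf8RoundTrip(plain));
        System.out.println(isSevenBit(accented) + " " + survivesUtf8RoundTrip(accented));
    }
}
```

The 7-bit array passes both checks; the array containing the 8-bit byte fails both, because a lone 0xE0 is not a complete UTF-8 sequence and is replaced during decoding.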

The logic is as follows:

1. The driver sets the CLIENT_ENCODING parameter to UNICODE, which instructs the server to convert from the character set of the database to UTF8.

2. The server then sends all data to the client encoded in UTF8.

3. The jdbc driver reads the UTF8 data and converts it to Java's internal Unicode representation.

The problem in all of this is that the server has decided, as an optimization, that if the database character set is SQL_ASCII then no conversion to UTF8 is necessary, since SQL_ASCII is a proper subset of UTF8. However, when characters that are not SQL_ASCII are stored in the database (i.e. 8bit characters), this optimization simply sends them on to the client as if they were valid UTF8 characters (which they are not). So the client then tries to read what are supposed to be UTF8 characters and fails, because it is receiving non-UTF8 data even though it asked the server to send it only UTF8 data.
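
The sequence described above can be simulated in a few lines. This is only a sketch under stated assumptions: serverSend and clientDecode are hypothetical stand-ins, not the server's or the driver's actual code, and the strict decoder is used here just to make the invalid input visible as an exception.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class PassThroughSketch {
    // Hypothetical stand-in for the server: SQL_ASCII data is passed through
    // untouched, LATIN1 data is transcoded to UTF-8.
    static byte[] serverSend(byte[] stored, String dbEncoding) {
        if (dbEncoding.equals("SQL_ASCII")) {
            return stored; // the optimization: no conversion at all
        }
        if (dbEncoding.equals("LATIN1")) {
            return new String(stored, StandardCharsets.ISO_8859_1)
                    .getBytes(StandardCharsets.UTF_8);
        }
        throw new IllegalArgumentException("unsupported in this sketch: " + dbEncoding);
    }

    // Hypothetical stand-in for the client: strict UTF-8 decoding that
    // reports malformed input instead of silently replacing it.
    static String clientDecode(byte[] wire) throws CharacterCodingException {
        return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(wire))
                .toString();
    }

    public static void main(String[] args) {
        // "citt" followed by a-grave stored as the single LATIN1 byte 0xE0.
        byte[] stored = { 'c', 'i', 't', 't', (byte) 0xE0 };
        for (String enc : new String[] { "LATIN1", "SQL_ASCII" }) {
            try {
                System.out.println(enc + ": " + clientDecode(serverSend(stored, enc)));
            } catch (CharacterCodingException e) {
                System.out.println(enc + ": not valid UTF-8 (" + e + ")");
            }
        }
    }
}
```

With the database declared as LATIN1 the same stored bytes reach the client as valid UTF8; declared as SQL_ASCII they arrive unconverted and the strict decode fails.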

If jdbc could recognize that the string is *not* a UTF-8 string, then it could simply return a new String, which is the right thing to do.
It's obvious that if JDBC receives from the postgresql server a byte array representing a non-UTF8 string, and it calls a method that expects as a parameter a byte array representing a UTF8 string, then it is a *bug*, because for non-UTF8 strings it must return a new String.



As stated above, the driver tells the server to send all data as UTF8, but because of the optimization and the non-SQL_ASCII characters you are storing, non-UTF8 data ends up being sent to the client.


I hope I am clear enough this time.

As I said earlier, you were clear the first time. I hope my response has been clearer in explaining the issues in greater detail.



Honestly, I'm getting a bit frustrated by the problem, because I have projects to do and it prevents me from doing them :-(

I understand that you are frustrated, but frankly I am frustrated too, because I keep telling you what the solution to your problem is and you keep ignoring it :-)


thanks,
--Barry

