Re: [pgadmin-hackers] pgadmin3 clientencoding

Andreas Pflug Tue, 10 Jun 2003 07:41:22 -0700

Jean-Michel POURE wrote:

I am absolutely sure that we cannot rely on recommendations, such as "create UNICODE database for multi-byte data and SQL_ASCII otherwise".

Jean-Michel,

You got me wrong. Client encoding is only about the data transfer, and that includes not only the transfer from the server to the client but also from the client interface to the user interface. pgsql will happily convert any server encoding to Unicode, and wxWindows will bring this to the user's attention.

PostgreSQL's central feature is the ability to store and manage various encodings. For example, in Japan, many databases are stored under EUC_JP and SJIS. You won't ask users to migrate their database to UTF-8.

Right; we'll be using Unicode for internal communication to the server, whatever the server encoding might be.

Therefore, pgAdmin3 shall manage encodings transparently. This is a ***key feature***. Don't get me wrong, I propose to:

1) Always compile pgAdmin3 with Unicode support. (By the way, I would also be delighted if all .po files were stored in UTF-8).

Yes.

2) Always "set client_encoding=Unicode" in order to recode data streams at backend level. This is 100% safe in case of data viewing. From my point of view, I never had any problem with this feature, which is bug free.

Yes.
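
To illustrate what recoding the data stream means in point 2), here is a minimal Python sketch that imitates the conversion on the client side. It is not PostgreSQL code; the function name recode_to_utf8 and the use of Python codec names are assumptions made only for this illustration.

```python
def recode_to_utf8(raw: bytes, server_codec: str) -> bytes:
    """Imitate the backend conversion: server-encoded bytes in, UTF-8 bytes out.

    server_codec is a Python codec name (for example "euc_jp"), used here
    as a stand-in for the database encoding.
    """
    # Decode from the server encoding to Unicode text, then encode as UTF-8.
    return raw.decode(server_codec).encode("utf-8")


# Text as it might be stored in an EUC_JP database.
stored = "日本語".encode("euc_jp")

# What the client would receive with a Unicode client encoding.
received = recode_to_utf8(stored, "euc_jp")
```

With client_encoding set to Unicode, the client only ever has to handle the UTF-8 form, whatever the database encoding is.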

PostgreSQL is the only database in the world with such on-the-fly conversion at data stream level. So why not use it?

3) We only need to check whether the data entered in the grid can be (a) converted from UTF-8 into the database encoding and (b) back from the database encoding into Unicode.

Yes.

Iconv (http://www.gnu.org/software/libiconv) or recode (http://www.iro.umontreal.ca/contrib/recode/HTML/readme.html) libraries can be used for that. In case of license incompatibilities, we can always use binary executables of iconv and recode. Iconv and recode are installed by default in all GNU/Linux distributions.

Hmmm... lot of work.

Alternatively, we could borrow PostgreSQL backend validation code. I know this code exists because in some cases PostgreSQL refused to enter euro signs in a Latin1 database and returned an error.

Much better!

There is no other way. The only other way would be to add native multi-byte support (SJIS, etc...) to wxWindows widgets,

Uhhh! Horror!

which is impossible. So, the only remaining solution is to view all data in UTF-8 Unicode.

Phew... agreed.

I doubt a user should know that the Euro sign (€) does not belong to Latin1. There are hundreds of examples like that. Therefore, it is impossible to create a list of legal/forbidden characters.
The only way to test for correct entry is:
- to convert the entry as explained in 3)
or,
- use PostgreSQL backend code.
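
The conversion test from 3) can be sketched client-side as a round trip. This is a minimal Python sketch, not the backend's validation code: the function name fits_server_encoding and the table mapping PostgreSQL encoding names to Python codec names are assumptions for illustration, and Python's codecs may differ from the backend's conversion tables in edge cases.

```python
# Hypothetical mapping from PostgreSQL encoding names to Python codec names.
PG_TO_PYTHON_CODEC = {
    "LATIN1": "latin_1",
    "EUC_JP": "euc_jp",
    "SJIS": "shift_jis",
    "UNICODE": "utf_8",
}


def fits_server_encoding(text: str, pg_encoding: str) -> bool:
    """Return True if text survives a round trip through the database encoding."""
    codec = PG_TO_PYTHON_CODEC[pg_encoding]
    try:
        # (a) Unicode -> database encoding
        encoded = text.encode(codec)
        # (b) database encoding -> Unicode, compared with the original entry
        return encoded.decode(codec) == text
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The entry contains a character the database encoding cannot hold.
        return False
```

For example, fits_server_encoding("€", "LATIN1") is False because the Euro sign has no Latin1 code point, while fits_server_encoding("é", "LATIN1") is True. The grid could run such a check before sending the value, with no list of legal characters needed.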

There's probably a function like "bool PG_is_compliant_to_server_encoding(text)", which we should use. If we can't locate it, we can contact pgsql-hackers about this.

Regards,
Andreas

