I am absolutely sure that we cannot rely on recommendations such as "create UNICODE database for multi-byte data and SQL_ASCII otherwise".
Jean-Michel, you got me wrong. Client encoding is only about the data transfer, and that includes not only the transfer from the server to the client but also from the client interface to the user interface. pgsql will happily convert any server encoding to Unicode, and wxWindows will bring this to the user's attention.
Right; we'll be using unicode for internal communication to the server, whatever the server encoding might be.
PostgreSQL's central feature is the ability to store and manage various encodings. For example, in Japan, many databases are stored under EUC_JP and SJIS. You won't ask users to migrate their databases to UTF-8.
Yes.
Therefore, pgAdmin3 shall manage encodings transparently. This is a ***key feature***. Don't get me wrong, I propose to:
1) Always compile pgAdmin3 with Unicode support. (By the way, I would also be delighted if all .po files were stored in UTF-8).
Yes.
2) Always "set client_encoding=Unicode" in order to recode data streams at backend level. This is 100% safe in case of data viewing. From my point of view, I never had any problem with this feature, which is bug free.
Yes.
PostgreSQL is the only database in the world with such on-the-fly conversion at data stream level. So why not use it?
3) We only need to check whether the data entered in the grid can be (a) converted from UTF-8 into the database encoding and (b) back from the database encoding into Unicode.
Hmmm... lot of work.
Iconv (http://www.gnu.org/software/libiconv) or recode (http://www.iro.umontreal.ca/contrib/recode/HTML/readme.html) libraries can be used for that. In case of license incompatibilities, we can always use binary executables of iconv and recode. Iconv and recode are installed by default in all GNU/Linux distributions.
Much better!
Alternatively, we could borrow PostgreSQL backend validation code. I know this code exists because in some cases PostgreSQL refused to enter euro signs in a Latin1 database and returned an error.
Uhhh! Horror!
There is no other way. The only other way would be to add native multi-byte support (SJIS, etc.) to wxWindows widgets, which is impossible. So, the only remaining solution is to view all data in UTF-8 Unicode.
Phew... agreed.
I doubt a user should know that the Euro sign (€) does not belong to Latin1. There are hundreds of examples like that. Therefore, it is impossible to create a list of legal/forbidden characters.
There's probably a function like "bool PG_is_compliant_to_server_encoding(text)", which we should use. If we can't locate it, we can ask pgsql-hackers about this.
The only way to test for correct entry is:
- to convert the entry as explained in 3), or
- to use PostgreSQL backend code.
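To make the round trip from 3) concrete, here is a minimal sketch in Python rather than pgAdmin3's own C++. The function name fits_server_encoding and the PG_TO_PYTHON_CODEC table are hypothetical, and Python's built-in codecs stand in for iconv/recode or the backend's conversion routines, so the result may not match PostgreSQL's own tables byte for byte.

```python
# Hypothetical mapping from PostgreSQL encoding names to Python codec names.
# Only a few encodings are listed; this is an illustration, not a full table.
PG_TO_PYTHON_CODEC = {
    "LATIN1": "iso8859-1",
    "EUC_JP": "euc_jp",
    "SJIS": "shift_jis",
    "UNICODE": "utf-8",
}


def fits_server_encoding(text: str, server_encoding: str) -> bool:
    """Return True if text survives Unicode -> server encoding -> Unicode.

    Raises KeyError for encodings missing from the illustrative table above.
    """
    codec = PG_TO_PYTHON_CODEC[server_encoding.upper()]
    try:
        encoded = text.encode(codec)           # (a) Unicode -> database encoding
        return encoded.decode(codec) == text   # (b) database encoding -> Unicode
    except UnicodeError:
        # The character has no representation in the server encoding.
        return False


if __name__ == "__main__":
    for value, enc in [("€", "LATIN1"), ("é", "LATIN1"), ("日本語", "EUC_JP")]:
        print(repr(value), enc, fits_server_encoding(value, enc))
```

A grid editor could run a check like this before sending the value to the server, so the user gets an error for a Euro sign in a Latin1 database without needing a list of forbidden characters.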
Regards, Andreas
