Victor>We don't have 190 message catalog translations in the PostgreSQL.
Victor>So problem with encoding for messages is quite limited.
Even though the number of translations is limited, there's still a problem
telling one single-byte encoding from another.
It would be so much better if ServerErrorMessages included the encoding
right in the message itself.
For pgjdbc, I've implemented a workaround that relies on the following:
1) It knows what "FATAL" looks like in several translations, and it knows
the encodings commonly used with those translations. For instance, for
Russian it tries CP1251, KOI8, and ALT encodings. It converts "ВАЖНО"
(Russian for FATAL) to bytes using each of those three encodings and
searches for that byte sequence in the error message. If there's a match,
then the encoding is identified.
2) Unfortunately, that does not help for Japanese, as "FATAL" there is
translated as "FATAL". So I hard-coded several typical words like
"database", "user", "role" (see ), so if those byte sequences are
present, the message is assumed to be in Japanese. It would be great if
someone could review those as I do not speak Japanese.
3) Then it tries different LATIN encodings.
Here's the commit
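
To illustrate step 1, here is a minimal Python sketch of the byte-sequence
search. It is not the pgjdbc code (that is Java); the names FATAL_MARKERS
and guess_encoding are made up for the example, the table holds only the
Russian entry, and I'm assuming "ALT" means CP866.

```python
# Illustrative table: translated "FATAL" -> candidate encodings to try.
# Only the Russian entry is shown; "cp866" is an assumption for "ALT".
FATAL_MARKERS = {
    "ВАЖНО": ["cp1251", "koi8_r", "cp866"],
}


def guess_encoding(raw: bytes):
    """Return the first candidate encoding whose rendering of a known
    "FATAL" translation occurs in the raw message bytes, else None."""
    for word, encodings in FATAL_MARKERS.items():
        for enc in encodings:
            if word.encode(enc) in raw:
                return enc
    return None


# Usage: bytes as they might arrive from a server using KOI8-R.
raw = 'ВАЖНО:  роль "test" не существует'.encode("koi8_r")
print(guess_encoding(raw))
```

A None result means none of the known markers matched, which is where the
fallbacks in steps 2 and 3 would take over.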
Kyotaro> Is there any source to know the compatibility for any combination
Kyotaro> of language vs encoding? Maybe we need a ground for the list.
I use "locale -a" for that.
For instance, for Japanese it prints the following on my machine (OS X
locale -a | grep ja