Re: [BUGS] 8.3 can't convert cyrillic text from 'iso-8859-5' to other cyrillic 8-bit encoding

Heikki Linnakangas Thu, 20 Mar 2008 04:20:12 -0700

Sergey Burladyan wrote:

Thursday 20 March 2008 01:16:34 Heikki Linnakangas:

Here's a patch that does the conversion in the other direction as well.
As I'm not too familiar with cyrillic, can you double-check that this
works? I tested it using the convert() function between different
encodings, and it seems ok to me.


Yes, I tested it with a function like this and it works now :)


Ok, patch applied.

Hmm. We use KOI8-R (or rather, MULE_INTERNAL with KOI8-R) as an
intermediate encoding, because there's no direct conversion table
between ISO-8859-5 and the other cyrillic encodings. Ideally there would
be. Another possibility would be to use UTF-8 as the intermediate
encoding; that'd probably be much slower, but UTF-8 should have all the
characters needed.
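
To illustrate the "common intermediate" idea outside PostgreSQL, here is a minimal Python sketch that converts between two 8-bit Cyrillic encodings by pivoting through Unicode. The helper name `transcode` is hypothetical, and the code uses Python's built-in codecs rather than PostgreSQL's conversion tables, so it only shows the principle, not the server's implementation.

```python
def transcode(data: bytes, src: str, dst: str) -> bytes:
    """Convert 8-bit text from src to dst using Unicode as the pivot.

    Hypothetical helper for illustration; it relies on Python's codecs,
    not on PostgreSQL's conversion tables.
    """
    # Step 1: source bytes -> Unicode code points (the intermediate form).
    text = data.decode(src)
    # Step 2: Unicode code points -> target 8-bit encoding.
    # Raises UnicodeEncodeError if dst lacks one of the characters.
    return text.encode(dst)


# CYRILLIC CAPITAL LETTER IO ("YO", U+0401) is byte 0xA1 in ISO-8859-5.
yo_iso = b"\xa1"
for target in ("koi8_r", "cp1251", "cp866"):
    print(target, transcode(yo_iso, "iso8859_5", target).hex())
```

Any character that exists in both the source and target encodings survives the round trip, because Unicode covers all of them; the cost is two table lookups per character instead of one.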

I think that UTF-8 is too complex for translating one 8-bit charset to another 8-bit charset, but the other solution is many, many translation tables... hard question %)

Yeah. It's probably not worth the effort to change/test it. Apparently there aren't many people using these conversion functions, as the bug has been there since 7.3 and you're the first one to notice.

Are there any other characters like "YO" that are missing, that exist in
all the encodings?

If we are talking about alphabet letters, the answer is no, only "YO" was missing.
If we are talking about any character, there is 'NO-BREAK SPACE' (U+00A0): it exists in 1251, 866, KOI8-R and ISO, but I do not think it is widely used...
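
One way to cross-check which non-ASCII characters the four encodings share is to intersect their repertoires. The sketch below does this with Python's built-in codecs; the names `ENCODINGS`, `repertoire` and `common` are hypothetical, and the result reflects Python's codec tables, which are an assumption here and may not match PostgreSQL's conversion tables exactly.

```python
# Python codec names for ISO-8859-5, KOI8-R, Windows-1251 and CP866.
ENCODINGS = ["iso8859_5", "koi8_r", "cp1251", "cp866"]


def repertoire(encoding: str) -> set:
    """Return the characters that bytes 0x80..0xFF map to in `encoding`."""
    chars = set()
    for byte in range(0x80, 0x100):
        try:
            chars.add(bytes([byte]).decode(encoding))
        except UnicodeDecodeError:
            # Some code pages leave a few byte values undefined; skip them.
            pass
    return chars


# Characters representable in every one of the four encodings.
common = set.intersection(*(repertoire(e) for e in ENCODINGS))

print("YO in all encodings:  ", "\u0401" in common)
print("NBSP in all encodings:", "\u00a0" in common)
```

Comparing `common` against the characters a conversion table actually handles would show any shared character that a table omits.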


Ok, good.

Thanks for the report and the patch!

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

