At Tue, 21 Mar 2017 13:10:48 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyot...@lab.ntt.co.jp> wrote in <20170321.131048.150321071.horiguchi.kyot...@lab.ntt.co.jp>
> At Fri, 17 Mar 2017 13:03:35 +0200, Heikki Linnakangas <hlinn...@iki.fi> wrote in <01efd334-b839-0450-1b63-f2dea9326...@iki.fi>
> > On 03/17/2017 07:19 AM, Kyotaro HORIGUCHI wrote:
> > > I would like to use convert() function. It can be a large
> > > PL/PgSQL function or a series of "SELECT convert(...)"s. The
> > > latter is doable on-the-fly (by not generating/storing the whole
> > > script).
> > >
> > > | -- Test for SJIS->UTF-8 conversion
> > > | ...
> > > | SELECT convert('\0000', 'SJIS', 'UTF-8'); -- results in error
> > > | ...
> > > | SELECT convert('\897e', 'SJIS', 'UTF-8');
> >
> > Makes sense.
> >
> > >> You could then run those SQL statements against old and new server
> > >> version, and verify that you get the same results.
> > >
> > > Including the result files in the repository will make this easy
> > > but unacceptably bloats. Put mb/Unicode/README.sanity_check?
> >
> > Yeah, a README with instructions on how to do sounds good. No need to
> > include the results in the repository, you can run the script against
> > an older version when you need something to compare with.
>
> Ok, I'll write a small script to generate a set of "conversion
> dump" and try to write README.sanity_check describing how to use
> it.

I found that there's no way to identify the character domain of a
conversion through the SQL interface. Unconditionally feeding every
value from 0 to 0xffffffff as a bytea string yields a far too
bloated result containing many bogus lines. (If \x40 is a
character, convert() also accepts \x4040, \x404040 and \x40404040.)

One more annoyance is that mappings and conversion procedures are
not in one-to-one correspondence. The correspondence is hidden in
the conversion_procs/*.c files, so we would have to either extract
it from them or provide it as knowledge. Neither seems good.

In the end, it seems I have no choice but to resurrect
map_checker. The exact same one no longer works, but a map_dumper.c
with almost the same structure will. If no one objects to adding
map_dumper.c and gen_mapdumper_header.pl (tentative names, of
course), I'll make a patch to do that.

Any suggestions?

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center