At Tue, 21 Mar 2017 13:10:48 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
<horiguchi.kyot...@lab.ntt.co.jp> wrote in 
<20170321.131048.150321071.horiguchi.kyot...@lab.ntt.co.jp>
> At Fri, 17 Mar 2017 13:03:35 +0200, Heikki Linnakangas <hlinn...@iki.fi> 
> wrote in <01efd334-b839-0450-1b63-f2dea9326...@iki.fi>
> > On 03/17/2017 07:19 AM, Kyotaro HORIGUCHI wrote:
> > > I would like to use convert() function. It can be a large
> > > PL/PgSQL function or a series of "SELECT convert(...)"s. The
> > > latter is doable on-the-fly (by not generating/storing the whole
> > > script).
> > >
> > > | -- Test for SJIS->UTF-8 conversion
> > > | ...
> > > | SELECT convert('\0000', 'SJIS', 'UTF-8'); -- results in error
> > > | ...
> > > | SELECT convert('\897e', 'SJIS', 'UTF-8');
> > 
> > Makes sense.
> > 
> > >> You could then run those SQL statements against old and new server
> > >> version, and verify that you get the same results.
> > >
> > > Including the result files in the repository will make this easy
> > > but unacceptably bloats. Put mb/Unicode/README.sanity_check?
> > 
> > Yeah, a README with instructions on how to do sounds good. No need to
> > include the results in the repository, you can run the script against
> > an older version when you need something to compare with.
> 
> Ok, I'll write a small script to generate a set of "conversion
> dump" and try to write README.sanity_check describing how to use
> it.

I found that the SQL interface offers no way to identify the
character domain of a conversion. Unconditionally feeding every
value from 0 to 0xffffffff as a bytea string bloats the result
with many bogus lines. (If \x40 is a character, convert() also
accepts \x4040, \x404040 and \x40404040.)
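
The bloat can be illustrated outside the server. The following is
only a rough sketch: it uses Python's built-in shift_jis codec as a
stand-in for convert(), which is an assumption (the two need not
agree on every byte sequence), and the helper names are made up for
this example. It counts how many 1- and 2-byte inputs are accepted
versus how many of them are actually a single character.

```python
import itertools

def decoded_length(raw, encoding="shift_jis"):
    """Return the number of characters raw decodes to, or None on error.

    Assumption: Python's codec stands in for the server's convert().
    """
    try:
        return len(raw.decode(encoding))
    except UnicodeDecodeError:
        return None

def count_inputs(max_len=2, encoding="shift_jis"):
    """Count accepted inputs and single-character inputs up to max_len bytes."""
    n_accepted = 0
    n_single = 0
    for length in range(1, max_len + 1):
        for byte_values in itertools.product(range(256), repeat=length):
            n_chars = decoded_length(bytes(byte_values), encoding)
            if n_chars is None:
                continue          # rejected input, like a convert() error
            n_accepted += 1
            if n_chars == 1:
                n_single += 1     # a real member of the character domain
    return n_accepted, n_single

if __name__ == "__main__":
    accepted, single = count_inputs(2)
    print("accepted inputs:", accepted)
    print("single characters:", single)
```

Every accepted input that is not a single character is one of the
bogus lines mentioned above, and the gap only widens at 3 and 4
bytes.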

One more annoyance is that mappings and conversion procedures are
not in one-to-one correspondence. The correspondence is hidden in
the conversion_procs/*.c files, so we would have to either extract
it from them or supply it by hand. Neither seems good.
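
For reference, the "extract it from them" option might look roughly
like the sketch below. It assumes that a conversion procedure pulls
in its mappings through #include lines naming *.map files. The
sample source text and the function name maps_used_by are
hypothetical, written only for this illustration.

```python
import re

# Assumption: map files are referenced by lines such as
#   #include "../../Unicode/xxx.map"
MAP_INCLUDE = re.compile(
    r'^\s*#\s*include\s+"(?:[^"]*/)?([^"/]+\.map)"', re.MULTILINE
)

def maps_used_by(c_source):
    """Return the base names of the .map files included by a C source text."""
    return [m.group(1) for m in MAP_INCLUDE.finditer(c_source)]

# Hypothetical excerpt standing in for one conversion_procs/*.c file.
sample = '''
#include "postgres.h"
#include "../../Unicode/sjis_to_utf8.map"
#include "../../Unicode/utf8_to_sjis.map"
'''

if __name__ == "__main__":
    print(maps_used_by(sample))
```

This only recovers which maps a file includes, not which function
uses which map in which direction, and that is part of why the
approach is unattractive.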

In the end, it seems I have no choice but to resurrect
map_checker. Exactly the same program no longer works, but a
map_dumper.c with almost the same structure will.

If no one objects to adding map_dumper.c and
gen_mapdumper_header.pl (tentative names, of course), I'll make a
patch to do that.

Any suggestions?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


