Hi Heikki,

0001 through 0003 are straightforward, and I think they can be committed now if you like.

0004 is also pretty straightforward. The check you proposed upthread for pg_upgrade seems like the best solution to make that workable. I'll take a look at 0005 soon.

I measured the conversions that were rewritten in 0003, and there is indeed a noticeable speedup:

Big5 to EUC-TW:
head    196ms
0001-3  152ms

EUC-TW to Big5:
head    190ms
0001-3  144ms

I've attached the driver function for reference. Example use:

select drive_conversion(
  1000, 'euc_tw'::name, 'big5'::name,
  convert('a few kB of utf8 text here', 'utf8', 'euc_tw')
);

I took a look at the test suite also, and the only thing to note is a couple places where the comment doesn't match the code:

+  -- JIS X 0201: 2-byte encoded chars starting with 0x8e (SS2)
+  byte1 = hex('0e');
+  for byte2 in hex('a1')..hex('df') loop
+    return next b(byte1, byte2);
+  end loop;
+
+  -- JIS X 0212: 3-byte encoded chars, starting with 0x8f (SS3)
+  byte1 = hex('0f');
+  for byte2 in hex('a1')..hex('fe') loop
+    for byte3 in hex('a1')..hex('fe') loop
+      return next b(byte1, byte2, byte3);
+    end loop;
+  end loop;

Not sure if it matters, but thought I'd mention it anyway.

--
John Naylor
EDB: http://www.enterprisedb.com
Attachment: drive_conversion.c
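
For reference, here is a rough standalone C sketch of the byte sequences that the two test-suite comments describe, using the EUC-JP single-shift bytes SS2 = 0x8e and SS3 = 0x8f as the lead byte. This is not the code from the patch or from the attached driver; the function names and the caller-supplied buffer interface are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* EUC-JP single-shift lead bytes, as named in the test-suite comments. */
enum { SS2 = 0x8e, SS3 = 0x8f };

/*
 * Hypothetical helper: write every 2-byte sequence SS2 + [0xa1..0xdf]
 * (JIS X 0201) into "out", which has room for "cap" bytes.
 * Returns the number of sequences written.
 */
static size_t
euc_jp_ss2_sequences(unsigned char *out, size_t cap)
{
    size_t n = 0;

    for (int b2 = 0xa1; b2 <= 0xdf; b2++)
    {
        assert(2 * (n + 1) <= cap);     /* caller must supply enough room */
        out[2 * n] = SS2;
        out[2 * n + 1] = (unsigned char) b2;
        n++;
    }
    return n;
}

/*
 * Hypothetical helper: write every 3-byte sequence
 * SS3 + [0xa1..0xfe] + [0xa1..0xfe] (JIS X 0212) into "out".
 * Returns the number of sequences written.
 */
static size_t
euc_jp_ss3_sequences(unsigned char *out, size_t cap)
{
    size_t n = 0;

    for (int b2 = 0xa1; b2 <= 0xfe; b2++)
    {
        for (int b3 = 0xa1; b3 <= 0xfe; b3++)
        {
            assert(3 * (n + 1) <= cap); /* caller must supply enough room */
            out[3 * n] = SS3;
            out[3 * n + 1] = (unsigned char) b2;
            out[3 * n + 2] = (unsigned char) b3;
            n++;
        }
    }
    return n;
}
```

With lead bytes 0x8e and 0x8f, the first helper produces 63 two-byte sequences and the second produces 94 * 94 = 8836 three-byte sequences. The quoted test code covers the same trailing-byte ranges but starts them with 0x0e and 0x0f.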