Re: [SPAM] [MessageLimit][lowlimit] Re: [HACKERS] pl/perl and utf-8 in sql_ascii databases
Excerpts from Alvaro Herrera's message of Tue Jul 10 16:23:57 -0400 2012:
> Excerpts from Kyotaro HORIGUCHI's message of Tue Jul 03 04:59:38 -0400 2012:
> > Hello, Here is regression test runs on pg's also built with
> > cygwin-gcc and VC++.
> >
> > The patches attached following,
> >
> > - plperl_sql_ascii-4.patch : fix for pl/perl utf8 vs sql_ascii
> > - plperl_sql_ascii_regress-1.patch : regression test for this patch.
> >   I added some tests on encoding to this.
> >
> > I will mark this patch as 'ready for committer' after this.
>
> I have pushed these changes to HEAD, 9.2 and 9.1. Instead of the games
> with plperl_lc_*.out being copied around, I just used the ASCII version
> as plperl_lc_1.out and the UTF8 one as plperl_lc.out.

... and this story hasn't ended yet, because one of the new tests is failing. See here:

http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=magpie&dt=2012-07-11%2010%3A00%3A04

The interesting part of the diff is:

    ***
    *** 34,41
      return ($str ne $match ? $code.DIFFER : $code.ab\x{5ddd}cd);
      $$ LANGUAGE plperl;
      SELECT encode(perl_utf_inout(E'ab\xe5\xb1\xb1cd')::bytea, 'escape')
    ! encode
    ! --
    ! NotUTF8:ab\345\267\235cd
    ! (1 row)
    !
    --- 34,38
      return ($str ne $match ? $code.DIFFER : $code.ab\x{5ddd}cd);
      $$ LANGUAGE plperl;
      SELECT encode(perl_utf_inout(E'ab\xe5\xb1\xb1cd')::bytea, 'escape')
    ! ERROR: character with byte sequence 0xe5 0xb7 0x9d in encoding UTF8 has no equivalent in encoding LATIN1
    ! CONTEXT: PL/Perl function perl_utf_inout

I am not sure what we can do here other than remove this function and query from the test.

--
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
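For readers matching the two halves of the diff: the octal escapes in the expected output (\345\267\235) and the hex bytes in the error message (0xe5 0xb7 0x9d) are the same three bytes, the UTF-8 encoding of U+5DDD. The Python sketch below illustrates this outside PostgreSQL. The helper pg_escape is a hypothetical name, and it only approximates the documented behaviour of encode(..., 'escape') (zero bytes and high-bit bytes become octal escapes, backslashes are doubled); it is not the server's code.

```python
def pg_escape(data: bytes) -> str:
    """Approximate the text produced by PostgreSQL's encode(bytea, 'escape').

    Hypothetical helper for illustration only: zero bytes and bytes with the
    high bit set become \\nnn octal escapes, and backslashes are doubled.
    """
    out = []
    for b in data:
        if b == 0 or b >= 0x80:
            out.append("\\%03o" % b)   # octal escape, always three digits
        elif b == 0x5C:
            out.append("\\\\")         # a literal backslash is doubled
        else:
            out.append(chr(b))         # other ASCII bytes pass through
    return "".join(out)


river = "\u5ddd"                       # the character used in the test
utf8_bytes = river.encode("utf-8")     # three bytes: e5 b7 9d

print(utf8_bytes.hex(" "))
print(pg_escape(b"NotUTF8:ab" + utf8_bytes + b"cd"))
```

The second line printed has the same shape as the expected row in the first half of the diff.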
Re: [SPAM] [MessageLimit][lowlimit] Re: [HACKERS] pl/perl and utf-8 in sql_ascii databases
On Wed, Jul 11, 2012 at 1:42 PM, Alvaro Herrera alvhe...@commandprompt.com wrote:
> > I have pushed these changes to HEAD, 9.2 and 9.1. Instead of the games
> > with plperl_lc_*.out being copied around, I just used the ASCII version
> > as plperl_lc_1.out and the UTF8 one as plperl_lc.out.
>
> ... and this story hasn't ended yet, because one of the new tests is
> failing. See here:
>
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=magpie&dt=2012-07-11%2010%3A00%3A04
>
> [...]
>   SELECT encode(perl_utf_inout(E'ab\xe5\xb1\xb1cd')::bytea, 'escape')
> ! ERROR: character with byte sequence 0xe5 0xb7 0x9d in encoding UTF8 has no equivalent in encoding LATIN1
> ! CONTEXT: PL/Perl function perl_utf_inout
>
> I am not sure what we can do here other than remove this function and
> query from the test.

Hrm, me neither. I say drop 'em.
Re: [SPAM] [MessageLimit][lowlimit] Re: [HACKERS] pl/perl and utf-8 in sql_ascii databases
Hmm... Sorry for the immature patch.

> ... and this story hasn't ended yet, because one of the new tests is
> failing. See here:
>
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=magpie&dt=2012-07-11%2010%3A00%3A04
>
> The interesting part of the diff is:
> ...
>   SELECT encode(perl_utf_inout(E'ab\xe5\xb1\xb1cd')::bytea, 'escape')
> ! ERROR: character with byte sequence 0xe5 0xb7 0x9d in encoding UTF8 has no equivalent in encoding LATIN1
> ! CONTEXT: PL/Perl function perl_utf_inout
>
> I am not sure what we can do here other than remove this function and
> query from the test.

I ran the regression tests only in environments capable of handling the character U+5ddd (a Japanese character which means river)...

The byte sequences that can be decoded, and the byte sequences that result from encoding a Unicode character, vary among the encodings.

The problem itself which is the aim of this thread could be covered without the additional test. That confirms whether encoding/decoding is done as expected on calling the language handler. I suppose that testing the two cases, plus one additional case that runs pg_do_encoding_conversion(), say latin1, would be enough to confirm that encoding/decoding is properly done, since the concrete conversion scheme is not significant in this case.

So I recommend that we add the test for latin1 and omit the test for encodings other than sql_ascii, utf8 and latin1. This might be achieved by creating empty plperl_lc.sql and plperl_lc.out files for those encodings.

What do you think about that?

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

== My e-mail address has been changed since Apr. 1, 2012.
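The point above, that whether a character survives conversion depends on the target encoding, can be illustrated outside PostgreSQL with a minimal Python sketch. It assumes Python's "utf-8" and "latin-1" codecs as stand-ins for the server's conversion routines, which they are not; the helper name can_encode is hypothetical.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text is representable in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


river = "\u5ddd"  # the character used in the failing test

# Python codecs stand in for the server-side encodings here (an assumption).
for enc in ("utf-8", "latin-1"):
    print(enc, can_encode(river, enc))
```

Under these assumptions the character is representable in UTF-8 but not in Latin-1, which mirrors the "has no equivalent in encoding LATIN1" error reported by the buildfarm member. A SQL_ASCII database performs no conversion at all, so it has no direct analogue in this sketch.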
Re: [SPAM] [MessageLimit][lowlimit] Re: [HACKERS] pl/perl and utf-8 in sql_ascii databases
Excerpts from Kyotaro HORIGUCHI's message of Tue Jul 03 04:59:38 -0400 2012:
> Hello, Here is regression test runs on pg's also built with
> cygwin-gcc and VC++.
>
> The patches attached following,
>
> - plperl_sql_ascii-4.patch : fix for pl/perl utf8 vs sql_ascii
> - plperl_sql_ascii_regress-1.patch : regression test for this patch.
>   I added some tests on encoding to this.
>
> I will mark this patch as 'ready for committer' after this.

I have pushed these changes to HEAD, 9.2 and 9.1. Instead of the games with plperl_lc_*.out being copied around, I just used the ASCII version as plperl_lc_1.out and the UTF8 one as plperl_lc.out.

I chose to backpatch the whole thing instead of cherry-picking parts of it; that was turning into a tedious and pointless exercise. We'll see how the buildfarm likes the whole thing -- including on MSVC, which I did not test at all.

--
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support