[PATCHES] Libpq COPY optimization patch
Here is a patch against today's code 1/24. As discussed in -hackers consumeInput/parse is removed from being called every single time. It's still there for only when the data is sent to the server. Alon. pq_put_copy_data.patch Description: Binary data ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[PATCHES] DBMirror.pl performance change
This attached patch greatly enhances DBMirror.pl performance. DBMirror.pl was known to have problems when replicating bytea columns whose data was binary bytes (non printable). (E.g. tiff, pdf, jpeg, bzip2, etc...) Minutes, (or hours) for 500kb columns were not unusual. There has even been an effort by Peter Wilson (petew ( at ) yellowhawk ( dot ) co ( dot ) uk) to write a C alternative, which tries to overcome these problems. However i think another C program is not really needed at this point. This patch changes the way extractData parses the data field. Please have a look at the patch, if you see any potential problems. Thank you. P.S. I also emailed the -sql,-general,-pgreplication lists with a relevant message. Steven Singer (the original author) is not reachable. However i think that some users might already suffer from poor DBMirror.pl performance. -- -Achilleus *** DBMirror.pl Tue Jan 24 09:36:24 2006 --- DBMirror.pl.new Tue Jan 24 09:41:11 2006 *** *** 874,879 --- 874,880 } + sub extractData($$) { my $pendingResult = $_[0]; my $currentTuple = $_[1]; *** *** 881,886 --- 882,888 my %valuesHash; $fnumber = 4; my $dataField = $pendingResult-getvalue($currentTuple,$fnumber); + my $numofbs; while(length($dataField)0) { # Extract the field name that is surronded by double quotes *** *** 902,929 #Recommended in perlsyn manpage. do { my $matchString; #Find the substring ending with the first ' or first \ ! $dataField =~ m/(.*?[\'\\])?/s; $matchString = $1; - $value .= substr $matchString,0,length($matchString)-1; - - if($matchString =~ m/(\'$)/s) { - # $1 runs to the end of the field value. - $dataField = substr $dataField,length($matchString)+1; - last; - - } - else { - #deal with the escape character. - #It The character following the escape gets appended. - $dataField = substr $dataField,length($matchString); - $dataField =~ s/(^.)//s; - $value .= $1; ! } ! } until(length($dataField)==0); } --- 904,930 #Recommended in perlsyn manpage. do { my $matchString; + my $matchString2; #Find the substring ending with the first ' or first \ ! $dataField =~ m/(.*?[\'])?/s; $matchString = $1; + $numofbs = ($matchString =~ tr/\\//) % 2; ! if ($numofbs == 1) { #// odd number of \, i.e. intermediate ' ! $matchString2 = substr $matchString,0, length($matchString)-2; ! $matchString2 =~ s//\\/g; ! $value .= ($matchString2 . \'); ! $dataField = substr $dataField,length($matchString); } ! else { #// even number of \, i.e. found end of data ! $matchString2 = substr $matchString,0, length($matchString)-1; ! $matchString2 =~ s//\\/g; ! $value .= $matchString2; ! $dataField = substr $dataField,length($matchString)+1; ! last; ! } ! } until(length($dataField)==0); } ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PATCHES] [HACKERS] CIDR/INET improvements
On Mon, Jan 23, 2006 at 11:30:58PM -0500, Tom Lane wrote: Bruce Momjian pgman@candle.pha.pa.us writes: Patch for testing attached. This is an utterly bad idea, because it not only doesn't address the problem (ie, confusion about whether inet and cidr are distinct types or not), but it masks mistakes in that realm by hiding data on output. It'll be almost impossible to debug situations where x is different from y but they display the same. FWIW, I append the patch I've done a few weeks ago. It adds an inettocidr cast function. I updated it to comply to Bruce's recent ip_type - ip_is_cidr change. Joachim -- Joachim Wieland [EMAIL PROTECTED] C/ Usandizaga 12 1°B ICQ: 37225940 20002 Donostia / San Sebastian (Spain) GPG key available diff -cr cvs/pgsql/src/backend/utils/adt/network.c cvs.build/pgsql/src/backend/utils/adt/network.c *** cvs/pgsql/src/backend/utils/adt/network.c 2006-01-23 23:52:36.0 +0100 --- cvs.build/pgsql/src/backend/utils/adt/network.c 2006-01-24 09:53:37.0 +0100 *** *** 325,330 --- 325,387 PG_RETURN_INET_P(dst); } + Datum + inettocidr(PG_FUNCTION_ARGS) + { + inet *src = PG_GETARG_INET_P(0); + inet *dst; + int byte; + int nbits; + int maxbits; + int maxbytes; + unsigned char mask; + + /* make sure any unused bits are zeroed */ + dst = (inet *) palloc0(VARHDRSZ + sizeof(inet_struct)); + + if (ip_family(src) == PGSQL_AF_INET) + { + maxbits = 32; + maxbytes = 4; + } + else + { + maxbits = 128; + maxbytes = 16; + } + Assert(ip_bits(dst) = maxbits); + + /* copy over */ + ip_family(dst) = ip_family(src); + ip_bits(dst) = ip_bits(src); + ip_is_cidr(dst) = 1; /* 1 represents CIDR */ + memcpy(ip_addr(dst), ip_addr(src), maxbytes); + + /* zero out the bits that are covered by the netmask */ + if (ip_bits(dst) maxbits) + { + byte = ip_bits(dst) / 8; + nbits = ip_bits(dst) % 8; + /* reset the first byte, this might be a partial byte */ + mask = 0xff; + if (ip_bits(dst) != 0) + { + mask = nbits; + ip_addr(dst)[byte] = ~mask; + byte++; + } + + /* from now on we reset only complete bytes */ + while (byte maxbytes) + { + ip_addr(dst)[byte] = 0; + byte++; + } + } + + PG_RETURN_INET_P(dst); + } + /* *Basic comparison function for sorting and inet/cidr comparisons. * diff -cr cvs/pgsql/src/include/catalog/pg_cast.h cvs.build/pgsql/src/include/catalog/pg_cast.h *** cvs/pgsql/src/include/catalog/pg_cast.h 2005-10-21 17:45:06.0 +0200 --- cvs.build/pgsql/src/include/catalog/pg_cast.h 2006-01-24 09:38:15.0 +0100 *** *** 249,255 * INET category */ DATA(insert ( 6508690 i )); ! DATA(insert ( 8696500 i )); /* * BitString category --- 249,255 * INET category */ DATA(insert ( 6508690 i )); ! DATA(insert ( 869650 1265 a )); /* * BitString category diff -cr cvs/pgsql/src/include/catalog/pg_proc.h cvs.build/pgsql/src/include/catalog/pg_proc.h *** cvs/pgsql/src/include/catalog/pg_proc.h 2006-01-20 13:52:12.0 +0100 --- cvs.build/pgsql/src/include/catalog/pg_proc.h 2006-01-24 09:38:15.0 +0100 *** *** 2359,2364 --- 2359,2366 DESCR(I/O); DATA(insert OID = 911 ( inet_out PGNSP PGUID 12 f f t f i 1 2275 869 _null_ _null_ _null_ inet_out - _null_ )); DESCR(I/O); + DATA(insert OID = 1265 ( inettocidr PGNSP PGUID 12 f f t f i 1 650 869 _null_ _null_ _null_ inettocidr - _null_ )); + DESCR(convert inet to cidr); /* for cidr type support */ DATA(insert OID = 1267 ( cidr_in PGNSP PGUID 12 f f t f i 1 650 2275 _null_ _null_ _null_ cidr_in - _null_ )); diff -cr cvs/pgsql/src/include/utils/builtins.h cvs.build/pgsql/src/include/utils/builtins.h *** cvs/pgsql/src/include/utils/builtins.h 2006-01-20 13:52:13.0 +0100 --- cvs.build/pgsql/src/include/utils/builtins.h2006-01-24 09:38:15.0 +0100 *** *** 696,701 --- 696,702 extern Datum inet_out(PG_FUNCTION_ARGS); extern Datum inet_recv(PG_FUNCTION_ARGS); extern Datum inet_send(PG_FUNCTION_ARGS); + extern Datum inettocidr(PG_FUNCTION_ARGS); extern Datum cidr_in(PG_FUNCTION_ARGS); extern Datum cidr_out(PG_FUNCTION_ARGS);
Re: [PATCHES] Libpq COPY optimization patch
Alon Goldshuv [EMAIL PROTECTED] writes: Here is a patch against today's code 1/24. As discussed in -hackers consumeInput/parse is removed from being called every single time. It's still there for only when the data is sent to the server. This appears to be the exact same patch you sent before. Did you test my suggestion of simply removing the PQconsumeInput call? I see no reason to add it inside the loop. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PATCHES] [HACKERS] CIDR/INET improvements
Joachim Wieland [EMAIL PROTECTED] writes: FWIW, I append the patch I've done a few weeks ago. It adds an inettocidr cast function. I think we need to take two steps back and look at the larger picture: the INET/CIDR situation is conceptually a mess and it's going to take more than a localized change to clean it up. I have some ideas about this and will try to post a proposal on -hackers later today. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PATCHES] Libpq COPY optimization patch
Tom, Here is a patch against today's code 1/24. As discussed in -hackers consumeInput/parse is removed from being called every single time. It's still there for only when the data is sent to the server. This appears to be the exact same patch you sent before. Did you test my suggestion of simply removing the PQconsumeInput call? I see no reason to add it inside the loop. My mistake. I'll make the correction. I guess that although parseInput is cheap we could still use a conditional to see when data was sent and only then call it (without PQconsumeInput) instead of calling it every single time PQputCopyData is called. Any objection to that? Alon. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PATCHES] Libpq COPY optimization patch
Alon Goldshuv [EMAIL PROTECTED] writes: I guess that although parseInput is cheap we could still use a conditional to see when data was sent and only then call it (without PQconsumeInput) instead of calling it every single time PQputCopyData is called. Any objection to that? You mean something like if (input-buffer-not-empty) parseInput(); ? This still bothers me a bit since it's a mixing of logic levels; PQputCopyData is an output routine, it shouldn't be getting its fingers dirty with input buffer contents. I'm willing to tolerate this if it can be demonstrated that it provides a useful performance gain compared to the unconditional parseInput call, but let's see some numbers. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PATCHES] Libpq COPY optimization patch
You mean something like if (input-buffer-not-empty) parseInput(); ? This still bothers me a bit since it's a mixing of logic levels; PQputCopyData is an output routine, it shouldn't be getting its fingers dirty with input buffer contents. I'm willing to tolerate this if it can be demonstrated that it provides a useful performance gain compared to the unconditional parseInput call, but let's see some numbers. Yes, I understand. We'll see what the performance gain is like and see if it's worth it, I'll report back. Alon. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PATCHES] Uninstall scripts for contrib
On Sun, Jan 15, 2006 at 09:55:39PM -0800, David Fetter wrote: On Mon, Jan 16, 2006 at 12:13:11AM -0500, Neil Conway wrote: On Sun, 2006-01-15 at 20:08 -0800, David Fetter wrote: ifdef USE_PGXS The change to $PostgreSQL$ is bogus (perhaps due to the way you setup cvsup?), as are all the other $PostgreSQL$ changes in the patch. Also, the patch doesn't actually add any files called uninstall.sql. Oops. My FM R'ing skills need some work. This patch includes the files. Next: naming files so they don't clobber each other. Cheers, D -- David Fetter [EMAIL PROTECTED] http://fetter.org/ phone: +1 415 235 3778 Remember to vote! Index: contrib/btree_gist/Makefile === RCS file: /oracle/CVSup/pgsql/contrib/btree_gist/Makefile,v retrieving revision 1.8 diff -c -r1.8 Makefile *** contrib/btree_gist/Makefile 27 Sep 2005 17:12:59 - 1.8 --- contrib/btree_gist/Makefile 24 Jan 2006 23:09:24 - *** *** 7,12 --- 7,13 btree_bytea.o btree_bit.o btree_numeric.o DATA_built = btree_gist.sql + DATA= uninstall_btree_gist.sql DOCS= README.btree_gist REGRESS = init int2 int4 int8 float4 float8 cash oid timestamp timestamptz time timetz \ Index: contrib/chkpass/Makefile === RCS file: /oracle/CVSup/pgsql/contrib/chkpass/Makefile,v retrieving revision 1.7 diff -c -r1.7 Makefile *** contrib/chkpass/Makefile27 Sep 2005 17:13:00 - 1.7 --- contrib/chkpass/Makefile24 Jan 2006 23:09:24 - *** *** 1,9 ! # $PostgreSQL: pgsql/contrib/chkpass/Makefile,v 1.6 2004/08/20 20:13:02 momjian Exp $ MODULE_big = chkpass OBJS = chkpass.o SHLIB_LINK = $(filter -lcrypt, $(LIBS)) DATA_built = chkpass.sql DOCS = README.chkpass ifdef USE_PGXS --- 1,10 ! # $PostgreSQL: pgsql/contrib/chkpass/Makefile,v 1.7 2005/09/27 17:13:00 tgl Exp $ MODULE_big = chkpass OBJS = chkpass.o SHLIB_LINK = $(filter -lcrypt, $(LIBS)) DATA_built = chkpass.sql + DATA = uninstall_chkpass.sql DOCS = README.chkpass ifdef USE_PGXS Index: contrib/cube/Makefile === RCS file: /oracle/CVSup/pgsql/contrib/cube/Makefile,v retrieving revision 1.15 diff -c -r1.15 Makefile *** contrib/cube/Makefile 18 Oct 2005 01:30:48 - 1.15 --- contrib/cube/Makefile 24 Jan 2006 23:09:24 - *** *** 1,9 ! # $PostgreSQL: pgsql/contrib/cube/Makefile,v 1.14 2005/09/27 17:13:00 tgl Exp $ MODULE_big = cube OBJS= cube.o cubeparse.o DATA_built = cube.sql DOCS = README.cube REGRESS = cube --- 1,10 ! # $PostgreSQL: pgsql/contrib/cube/Makefile,v 1.15 2005/10/18 01:30:48 tgl Exp $ MODULE_big = cube OBJS= cube.o cubeparse.o DATA_built = cube.sql + DATA = uninstall_cube.sql DOCS = README.cube REGRESS = cube Index: contrib/dblink/Makefile === RCS file: /oracle/CVSup/pgsql/contrib/dblink/Makefile,v retrieving revision 1.10 diff -c -r1.10 Makefile *** contrib/dblink/Makefile 27 Sep 2005 17:13:01 - 1.10 --- contrib/dblink/Makefile 24 Jan 2006 23:09:24 - *** *** 1,4 ! # $PostgreSQL: pgsql/contrib/dblink/Makefile,v 1.9 2004/08/20 20:13:03 momjian Exp $ MODULE_big = dblink PG_CPPFLAGS = -I$(libpq_srcdir) --- 1,4 ! # $PostgreSQL: pgsql/contrib/dblink/Makefile,v 1.10 2005/09/27 17:13:01 tgl Exp $ MODULE_big = dblink PG_CPPFLAGS = -I$(libpq_srcdir) *** *** 6,11 --- 6,12 SHLIB_LINK = $(libpq) DATA_built = dblink.sql + DATA = uninstall_dblink.sql DOCS = README.dblink REGRESS = dblink Index: contrib/dbmirror/Makefile === RCS file: /oracle/CVSup/pgsql/contrib/dbmirror/Makefile,v retrieving revision 1.5 diff -c -r1.5 Makefile *** contrib/dbmirror/Makefile 27 Sep 2005 17:13:01 - 1.5 --- contrib/dbmirror/Makefile 24 Jan 2006 23:09:24 - *** *** 1,4 ! # $PostgreSQL: pgsql/contrib/dbmirror/Makefile,v 1.4 2004/11/04 06:09:19 neilc Exp $ MODULES = pending SCRIPTS = clean_pending.pl DBMirror.pl --- 1,4 ! # $PostgreSQL: pgsql/contrib/dbmirror/Makefile,v 1.5 2005/09/27 17:13:01 tgl Exp $ MODULES = pending SCRIPTS = clean_pending.pl DBMirror.pl Index: contrib/earthdistance/Makefile === RCS file: /oracle/CVSup/pgsql/contrib/earthdistance/Makefile,v retrieving revision 1.16 diff -c -r1.16 Makefile *** contrib/earthdistance/Makefile 27 Sep 2005 17:13:02 - 1.16 --- contrib/earthdistance/Makefile 24 Jan 2006 23:09:24 - *** *** 1,7 ! # $PostgreSQL: pgsql/contrib/earthdistance/Makefile,v 1.15 2005/07/24 23:30:09 tgl
Re: [PATCHES] plperl / locale / win32
I have now tested the patch. It passes regression and the test case Tom previously posted. Unless there's an objection I will apply it and backport it in the next few days. cheers andrew I wrote: I was reminded today of the outstanding issue with plperl setting the locale on Windows, and our environment workaround not working there. There has been NO response to the bug I filed (#38193) at rt.perl.org about this issue. The attached patch adapts one I previously tested as working, but instead of calling setlocale() directly it gets perl to do it so that perl and postgres have the same idea of what the locale should be, which should meet Greg's and Tom's objection to the previous patch. My Windows machine is currently doing other work, so I can't test right now - if someone else could that would be nice. *** plperl.c2006-01-08 17:27:52.0 -0500 --- plperl.c.locfix 2006-01-20 19:50:04.0 -0500 *** *** 45,50 --- 45,51 #include ctype.h #include fcntl.h #include unistd.h + #include locale.h /* postgreSQL stuff */ #include commands/trigger.h *** *** 252,257 --- 253,297 , -e, PERLBOOT }; + #ifdef WIN32 + + /* + * The perl library on startup does horrible things like call +* setlocale(LC_ALL,). We have protected against that on most +* platforms by setting the environment appropriately. However, on +* Windows, setlocale() does not consult the environment, so we need + * to save the excisting locale settings before perl has a chance to + * mangle them and restore them after its dirty deeds are done. +* +* MSDN ref: +* http://msdn.microsoft.com/library/en-us/vclib/html/_crt_locale.asp +* +* It appaers that we only need to do this on interpreter startup, and +* subsequent calls to the interpreter don't mess with the locale +* settings. +* +* We restore them using Perl's POSIX::setlocale() function so that +* Perl doesn't have a different idea of the locale from Postgres. +* +*/ + + char *loc; + char *save_collate, *save_ctype, *save_monetary, *save_numeric, *save_time; + char buf[1024]; + + loc = setlocale(LC_COLLATE,NULL); + save_collate = loc ? pstrdup(loc) : NULL; + loc = setlocale(LC_CTYPE,NULL); + save_ctype = loc ? pstrdup(loc) : NULL; + loc = setlocale(LC_MONETARY,NULL); + save_monetary = loc ? pstrdup(loc) : NULL; + loc = setlocale(LC_NUMERIC,NULL); + save_numeric = loc ? pstrdup(loc) : NULL; + loc = setlocale(LC_TIME,NULL); + save_time = loc ? pstrdup(loc) : NULL; + + #endif + plperl_interp = perl_alloc(); if (!plperl_interp) elog(ERROR, could not allocate Perl interpreter); *** *** 261,266 --- 301,349 perl_run(plperl_interp); plperl_proc_hash = newHV(); + + #ifdef WIN32 + + eval_pv(use POSIX qw(locale_h);, TRUE); /* croak on failure */ + + if (save_collate != NULL) + { + snprintf(buf, sizeof(buf),setlocale(%s,'%s');, +LC_COLLATE,save_collate); + eval_pv(buf,TRUE); + pfree(save_collate); + } + if (save_ctype != NULL) + { + snprintf(buf, sizeof(buf),setlocale(%s,'%s');, +LC_CTYPE,save_ctype); + eval_pv(buf,TRUE); + pfree(save_ctype); + } + if (save_monetary != NULL) + { + snprintf(buf, sizeof(buf),setlocale(%s,'%s');, +LC_MONETARY,save_monetary); + eval_pv(buf,TRUE); + pfree(save_monetary); + } + if (save_numeric != NULL) + { + snprintf(buf, sizeof(buf),setlocale(%s,'%s');, +LC_NUMERIC,save_numeric); + eval_pv(buf,TRUE); + pfree(save_numeric); + } + if (save_time != NULL) + { + snprintf(buf, sizeof(buf),setlocale(%s,'%s');, +LC_TIME,save_time); + eval_pv(buf,TRUE); + pfree(save_time); + } + + #endif + } ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PATCHES] pgbench: Support Multiple Simultaneous Runs (with Mean and Std. Dev.)
Thomas F. O'Connell [EMAIL PROTECTED] writes: The main addition is the addition of a -x option that allows specification of a number of successive runs of pgbench for use in sanity-checking basic benchmark results to reduce the potential for noise in a single run. What exactly does this do that increasing the number of transactions doesn't do? Not to say that I'm not all for making pgbench results more trustworthy. But adding still another parameter that people don't know what to do with isn't going to create any progress in that direction. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings