Re: [HACKERS] [PATCHES] Trivial patch to double vacuum speed on tables with no indexes
Tom Lane [EMAIL PROTECTED] writes: The reason the patch is so short is that it's a kluge. If we really cared about supporting this case, more wide-ranging changes would be needed (eg, there's no need to eat maintenance_work_mem worth of RAM for the dead-TIDs array); and a decent respect to the opinions of mankind would require some attention to updating the header comments and function descriptions, too. The only part that seems klugy to me is how it releases the lock and reacquires it rather than wait in the first place until it can acquire the lock. Fixed that and changed lazy_space_alloc to allocate only as much space as is really necessary. Gosh, I've never been accused of offending all mankind before. --- vacuumlazy.c 31 Jul 2006 21:09:00 +0100 1.76 +++ vacuumlazy.c 28 Aug 2006 09:58:41 +0100 @@ -16,6 +16,10 @@ * perform a pass of index cleanup and page compaction, then resume the heap * scan with an empty TID array. * + * As a special exception, if we're processing a table with no indexes we can + * vacuum each page as we go, so we don't need to allocate more space than + * enough to hold as many heap tuples as fit on one page. + * * We can limit the storage for page free space to MaxFSMPages entries, * since that's the most the free space map will be willing to remember * anyway. 
If the relation has fewer than that many pages with free space, @@ -106,7 +110,7 @@ TransactionId OldestXmin); static BlockNumber count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats, TransactionId OldestXmin); -static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks); +static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks, unsigned nindexes); static void lazy_record_dead_tuple(LVRelStats *vacrelstats, ItemPointer itemptr); static void lazy_record_free_space(LVRelStats *vacrelstats, @@ -206,7 +210,8 @@ * This routine sets commit status bits, builds lists of dead tuples * and pages with free space, and calculates statistics on the number * of live tuples in the heap. When done, or when we run low on space - * for dead-tuple TIDs, invoke vacuuming of indexes and heap. + * for dead-tuple TIDs, or after every page if the table has no indexes + * invoke vacuuming of indexes and heap. * * It also updates the minimum Xid found anywhere on the table in * vacrelstats->minxid, for later storing it in pg_class.relminxid. 
@@ -247,7 +252,7 @@ vacrelstats->rel_pages = nblocks; vacrelstats->nonempty_pages = 0; - lazy_space_alloc(vacrelstats, nblocks); + lazy_space_alloc(vacrelstats, nblocks, nindexes); for (blkno = 0; blkno < nblocks; blkno++) { @@ -282,8 +287,14 @@ buf = ReadBuffer(onerel, blkno); - /* In this phase we only need shared access to the buffer */ - LockBuffer(buf, BUFFER_LOCK_SHARE); + /* In this phase we only need shared access to the buffer unless we're + * going to do the vacuuming now which we do if there are no indexes + */ + + if (nindexes) + LockBuffer(buf, BUFFER_LOCK_SHARE); + else + LockBufferForCleanup(buf); page = BufferGetPage(buf); @@ -450,6 +461,12 @@ { lazy_record_free_space(vacrelstats, blkno, PageGetFreeSpace(page)); + } else if (!nindexes) { + /* If there are no indexes we can vacuum the page right now instead + * of doing a second scan */ + lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats); + lazy_record_free_space(vacrelstats, blkno, PageGetFreeSpace(BufferGetPage(buf))); + vacrelstats->num_dead_tuples = 0; } /* Remember the location of the last page with nonremovable tuples */ @@ -891,16 +908,20 @@ * See the comments at the head of this file for rationale. */ static void -lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks) +lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks, unsigned nindexes) { long maxtuples; int maxpages; - maxtuples = (maintenance_work_mem * 1024L) / sizeof(ItemPointerData); - maxtuples = Min(maxtuples, INT_MAX); - maxtuples = Min(maxtuples, MaxAllocSize / sizeof(ItemPointerData)); - /* stay sane if small maintenance_work_mem */ - maxtuples = Max(maxtuples, MaxHeapTuplesPerPage); + if (nindexes) { +
Re: [HACKERS] CSStorm occurred again by postgreSQL8.2
Bruce Momjian [EMAIL PROTECTED] wrote: Is there anything to do for 8.2 here? I'm working on Tom's idea. It is not a feature and does not change the on-disk structures, so I hope it meets the 8.2 deadline... Tom Lane [EMAIL PROTECTED] wrote: I'm wondering about doing something similar to what TransactionIdIsInProgress does, ie, make use of the PGPROC lists of live subtransactions. Suppose that GetSnapshotData copies not only top xids but live subxids into the snapshot, and adds a flag indicating whether the subxids are complete (ie, none of the subxid lists have overflowed). Then if the flag is set, tqual.c doesn't need to do SubTransGetTopmostTransaction() before searching the list. I added a subxid array to Snapshot; running subxids are gathered from the PGPROC->subxids cache. There are two overflow cases: any of the PGPROC->subxids arrays are overflowed, or the total number of subxids exceeds the pre-allocated buffers. If overflowed, we cannot avoid calling SubTransGetTopmostTransaction. I tested the patch and did not see any context switch storm coming from pg_subtrans, but there may be some bugs in the visibility checking. It would be very nice if you could review or test the patch. Regards, --- ITAGAKI Takahiro NTT Open Source Software Center snapshot_subtrans.patch Description: Binary data ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] pg_upgrade: What is changed?
Zdenek Kotala wrote: Martijn van Oosterhout wrote: On Wed, Aug 23, 2006 at 10:26:19AM +0200, Zdenek Kotala wrote: 5) Tuples question: Do data types have some disk representation? Does the tuple header have the same structure? I think only the inet/cidr types changed format this release. Ways to handle that have been discussed: 1. Have the server contain code for old versions under the old OID. This was mentioned as a possibility. 2. ALTER TYPE from the old to the new type, possibly using text as an intermediary. 3. Fiddle with bits on disk (not sure if this is even possible) It looks like the name is the same but the data representation is different. I'm not sure whether it's possible to use the same data type names with different OIDs; I think there is a unique index. We can rename this type to inet_old, but some applications would be confused. I quickly looked in network.c and it looks like the data size is the same (4 or 16). Option 3 is possible in this case and it should be easy to implement. Does anybody know what exactly has been changed? I examined the ODS (on-disk structure) and inet/cidr use one byte less to store the information. PG8.1 uses two bytes to store an INET type value whereas PG8.2 uses only one byte. I did not find any other change related to the ODS in these types. Because the attribute size is smaller than in the previous version, direct tuple compaction is possible. We can move part of the tuple one byte left and decrease the size of the tuple in the ItemIdData list for each non-null occurrence of an inet/cidr attribute in the tuple. I'm working on a prototype. Indexes containing the inet/cidr data types will be recreated by the REINDEX command, or dropped on the old version and created again on 8.2. 
Zdenek PS: There is attribute difference between 8.1 and 8.2 tuple1 = 1.2.3.4/24 tuple2 = 2001:4f8:3:ba:2e0:81ff:fe22:d1f1/128 Postgres 8.2 1) Offset=8152 Size= 38 xmin=1217 cmin=0 xmax=0 cmax=0 natts=1 info=902h hoff=28 ctid=(0/1) 0a 00 00 00 02 18 01 02 03 04 2) Offset=8100 Size= 50 xmin=1219 cmin=0 xmax=0 cmax=0 natts=1 info=902h hoff=28 ctid=(0/2) 16 00 00 00 03 80 20 01 04 f8 00 03 00 ba 02 e0 81 ff fe 22 d1 f1 Postgres 8.1 block= 0 pd_lsn=16278166570008576 pd_tli=1 pd_lower=32 pd_upper=8048 pd_special=8192 page_size=8192 page_version=3 1) Offset=8152 Size= 39 xmin=594 cmin=0 xmax=0 cmax=0 natts=1 info=902h hoff=28 ctid=(0/1) 0b 00 00 00 02 18 00 01 02 03 04 3) Offset=8048 Size= 51 xmin=1338 cmin=0 xmax=0 cmax=0 natts=1 info=902h hoff=28 ctid=(0/3) 17 00 00 00 03 80 00 20 01 04 f8 00 03 00 ba 02 e0 81 ff fe 22 d1 f1 ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] tsvector/tsearch equality and/or portability issue
On Thu, Aug 24, 2006 at 09:40:13PM +0400, Teodor Sigaev wrote: devel=# select 'blah foo bar'::tsvector = 'blah foo bar'::tsvector; ?column? -- f (1 row) Fixed in 8.1 and HEAD. Thank you Things still seem to be broken for me. Among other things, the script at http://unununium.org/~indigo/testvectors.sql.bz2 fails. It performs two tests, comparing 1000 random vectors with positions and random weights, and comparing the same vectors, but stripped. Oddly, the unstripped comparisons all pass, which is not consistent with what I am seeing in my database. However, I'm as yet unable to reproduce those problems. It's worth noting that in running this script I have seen the number of failures change, which seems to indicate that some uninitialized memory is still being compared.

test=# \i testvectors.sql
BEGIN
CREATE FUNCTION
CREATE TABLE
 total vectors in test set
---------------------------
                      1000
(1 row)

 failing unstripped equality
-----------------------------
                           0
(1 row)

 failing stripped equality
---------------------------
                       389
(1 row)

ROLLBACK
test=# ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] integration of pgcluster into postgresql
Adding -hackers back in... -Original Message- From: Chahine Hamila [mailto:[EMAIL PROTECTED] Sent: Fri 8/25/2006 8:36 PM To: Jim Nasby Subject: Re: [HACKERS] integration of pgcluster into postgresql First, you need to review all the past discussion about the very intentional decision not to build any replication into the core database. I would gladly do so. Can you send me any pointer? I don't really have any handy, but try searching the hackers archive for 'replication'. Second, pgcluster is (AFAIK) command-based replication, which has some very, very serious drawbacks. If PostgreSQL were to include a replication solution, I'd certainly hope it wouldn't be command-based. It's better than no replication at all... It's good enough for many uses. As is Slony. And dbmirror. And pgpool. So where do we draw the line? Should we include all four? ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[HACKERS] Rtree circle ops
Hi, Is there any reason for Rtree circle ops not being included in PostgreSQL? -- Tatsuo Ishii SRA OSS, Inc. Japan ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] [PATCHES] Trivial patch to double vacuum speed on tables with no indexes
Gregory Stark wrote: Tom Lane [EMAIL PROTECTED] writes: The reason the patch is so short is that it's a kluge. If we really cared about supporting this case, more wide-ranging changes would be needed (eg, there's no need to eat maintenance_work_mem worth of RAM for the dead-TIDs array); and a decent respect to the opinions of mankind would require some attention to updating the header comments and function descriptions, too. The only part that seems klugy to me is how it releases the lock and reacquires it rather than wait in the first place until it can acquire the lock. Fixed that and changed lazy_space_alloc to allocate only as much space as is really necessary. Gosh, I've never been accused of offending all mankind before. Does that feel good or bad? I won't comment on the spirit of the patch but I'll observe that you should respect mankind a little more by observing brace position in if/else ;-) -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] selecting large result sets in psql using
To cut the Gordian knot I'm going to suggest we use: \set CURSOR_FETCH fetch_count and \g and ; are modified such that when they see this variable set to fetch_count > 0 and the buffer is a select they would use the modified fetch/output code. Does this sound reasonable to everyone? OK with me, but maybe call the variable FETCH_COUNT, to avoid the presupposition that the implementation uses a cursor. As I mentioned before, I expect we'll someday rework it to not use that. regards, tom lane Ok, sounds good. I'm travelling this week, but can send an updated patch during the weekend. I've just submitted an updated patch to pgsql-patches in a new thread :) Bye, Chris. -- Chris Mair http://www.1006.org ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Rtree circle ops
Tatsuo Ishii [EMAIL PROTECTED] writes: Is there any reason for Rtree circle ops not being included in PostgreSQL? rtree is going away (has gone away in HEAD awhile ago, in fact). regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] [PATCHES] Another VPATH patch for ecpg
Andrew Dunstan wrote: Tom Lane wrote: Alvaro Herrera [EMAIL PROTECTED] writes: I just detected another problem with building ecpg in a VPATH environment. This patch fixes it for me. Can't we get some of the buildfarm machines exercising VPATH? This kinda stuff really ought to be found immediately. I will set one up tomorrow. Done. New machine is 'bustard'. But I couldn't get Alvaro's patch nor Peter's suggestion to work :-( Maybe someone with more vpath-fu than me can fix it. cheers andrew ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[HACKERS] Good news on CPU scaling for latest 8.2
Tom, So, I finally have new SpecJ tests, and between June 8th's 8.2 CVS and last Wednesday's we're seeing a 10% drop in CPU utilization for a fixed-size workload, which could mean that the buffer lock partitioning resulted in less lock conflicts. We'll see once we have the Dtrace reports and try to scale up the workload. --Josh ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] [PATCHES] Another VPATH patch for ecpg
Andrew Dunstan wrote: Andrew Dunstan wrote: Tom Lane wrote: Alvaro Herrera [EMAIL PROTECTED] writes: I just detected another problem with building ecpg in a VPATH environment. This patch fixes it for me. Can't we get some of the buildfarm machines exercising VPATH? This kinda stuff really ought to be found immediately. I will set one up tomorrow. Done. New machine is 'bustard'. But I couldn't get Alvaro's patch nor Peter's suggestion to work :-( Maybe someone with more vpath-fu than me can fix it. That's because this is a different problem. The patch I provided was to fix a problem in the regression tests -- the problem you are seeing is in the `/home/andrew/bf/root/HEAD/pgsql.3694/src/interfaces/ecpg/pgtypeslib' directory. I'll take a look. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [PATCHES] Another VPATH patch for ecpg
Andrew Dunstan [EMAIL PROTECTED] writes: Done. New machine is 'bustard'. But I couldn't get Alvaro's patch nor Peter's suggestion to work :-( Maybe someone with more vpath-fu than me can fix it. I got it to build fairly easily, but the ecpg regression tests are a mess. After some fooling around, I'm down to diffs like these in the preprocessor output files: 116c116 < #line 36 show.pgc --- > #line 36 /home/tgl/pgsql/src/interfaces/ecpg/test/sql/show.pgc 119c119 < #line 36 show.pgc --- > #line 36 /home/tgl/pgsql/src/interfaces/ecpg/test/sql/show.pgc 122c122 < #line 36 show.pgc --- > #line 36 /home/tgl/pgsql/src/interfaces/ecpg/test/sql/show.pgc AFAICS there is no very good way to deal with this. I'd suggest providing a way to suppress #line output from the ecpg preprocessor, but perhaps there is another answer. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] [PATCHES] updated patch for selecting large results sets in psql using cursors
Peter Eisentraut [EMAIL PROTECTED] writes: Tom Lane wrote: Wait a minute. What I thought we had agreed to was a patch to make commands sent with \g use a cursor. This patch changes SendQuery so that *every* command executed via psql is treated this way. That's what I remembered. I don't think we want to introduce a difference between ; and \g. Have we measured the performance impact, then? The last time I profiled psql, GetVariable was already a hotspot, and this introduces another call of it into the basic query loop whether you use the feature or not. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] [PATCHES] updated patch for selecting large results sets in psql using cursors
[EMAIL PROTECTED] writes: here comes the latest version (version 7) of the patch to handle large result sets with psql. As previously discussed, a cursor is used for SELECT queries when \set FETCH_COUNT some_value > 0 Wait a minute. What I thought we had agreed to was a patch to make commands sent with \g use a cursor. This patch changes SendQuery so that *every* command executed via psql is treated this way. That's a whole lot bigger behavioral change than I think is warranted. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [PATCHES] updated patch for selecting large results sets in psql using cursors
Tom Lane wrote: Wait a minute. What I thought we had agreed to was a patch to make commands sent with \g use a cursor. This patch changes SendQuery so that *every* command executed via psql is treated this way. That's what I remembered. I don't think we want to introduce a difference between ; and \g. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[HACKERS] autovacuum causing numerous regression-test failures
I think we shall have to reconsider that patch to turn it on by default. So far I've seen two categories of failure: * manual ANALYZE issued by regression tests fails because autovac is analyzing the same table concurrently. * contrib tests fail in their repeated drop/create database operations because autovac is connected to that database. (pl tests presumably have same issue.) There are probably more symptoms we have not seen yet. In the long run it would be good to figure out fixes to make these problems not happen, but I'm not putting that on the must-fix-for-8.2 list. BTW, it would sure be nice to know what happened here: http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=wasp&dt=2006-08-28%2017:05:01 LOG: autovacuum process (PID 26315) was terminated by signal 11 LOG: terminating any other active server processes but even if there was a core file, it got wiped out immediately by the next DROP DATABASE command :-(. This one does look like a must-fix, if we can find out what happened. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] autovacuum causing numerous regression-test failures
Tom Lane wrote: I think we shall have to reconsider that patch to turn it on by default. So far I've seen two categories of failure: So we turn autovacuum off for regression test instance. * manual ANALYZE issued by regression tests fails because autovac is analyzing the same table concurrently. Or we put manual exceptions for the affected tables into pg_autovacuum. * contrib tests fail in their repeated drop/create database operations because autovac is connected to that database. (pl tests presumably have same issue.) I opine that when a database is to be dropped, the connections should be cut. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 6: explain analyze is your friend
[HACKERS] Unnecessary rescan for non scrollable holdable cursors
Hi, When persisting a holdable cursor at COMMIT time we currently choose to rewind the executor and re-scan the whole result set into the tuplestore in order to be able to scroll backwards later on. And then, we reposition the cursor to the position we been in. However, unless I am missing something, this seems to be done always, even if the cursor is not scrollable. I suppose adding a simple conditional or two in PersistHoldablePortal() in portalcmds.c could save the rescan and filling up the tuplestore with tuples that will never be looked at, in the case that we never want to scroll back. Anyway, definitely not critical, but should save some time and space in those specific situations. Regards, Alon. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [PATCHES] updated patch for selecting large results sets in
On Mon, 2006-08-28 at 13:45 -0400, Tom Lane wrote: Peter Eisentraut [EMAIL PROTECTED] writes: Tom Lane wrote: Wait a minute. What I thought we had agreed to was a patch to make commands sent with \g use a cursor. This patch changes SendQuery so that *every* command executed via psql is treated this way. That's what I remembered. I don't think we want to introduce a difference between ; and \g. Have we measured the performance impact, then? The last time I profiled psql, GetVariable was already a hotspot, and this introduces another call of it into the basic query loop whether you use the feature or not. regards, tom lane Hi, after agreeing on using a \set variable, I proposed to have it influence \g as well as ;, because I thought that would be the most expected behaviour. IMHO I'm with Peter, that introducing a difference between \g and ; would go against the principle of least surprise. Performance-wise I took for granted without checking that GetVariable's running time would be negligible. [looks at the code] I see it's basically two function calls with a loop over a linked list of values (on the order of 10) doing strcmps and one strtol. It is not quite clear to me what the impact of this is. I could imagine it would show up only if you perform lots of trivial queries through psql. I'm going to benchmark something now and report back. Anyway, regardless of the benchmark, I feel it's somehow not clean to have a variable introduce a difference between \g and ;. [goes benchmarking...] Bye, Chris. -- Chris Mair http://www.1006.org ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] autovacuum causing numerous regression-test failures
Peter Eisentraut [EMAIL PROTECTED] writes: Tom Lane wrote: I think we shall have to reconsider that patch to turn it on by default. So far I've seen two categories of failure: So we turn autovacuum off for regression test instance. Not a solution for make installcheck, unless you are proposing adding the ability to suppress autovac per-database. Which would be a good new feature ... for 8.3. * manual ANALYZE issued by regression tests fails because autovac is analyzing the same table concurrently. Or we put manual exceptions for the affected tables into pg_autovacuum. New feature? Or does that capability exist already? * contrib tests fail in their repeated drop/create database operations because autovac is connected to that database. (pl tests presumably have same issue.) I opine that when a database is to be dropped, the connections should be cut. Sure, but that's another thing that we're not going to start designing and implementing four weeks after feature freeze. I didn't complain about your proposing two weeks after feature freeze that we turn autovac on by default, because I assumed (same as you no doubt) that it would be a trivial one-liner change. It is becoming clear that that is not the case, and I don't think it makes any sense from a project-management standpoint to try to flush the problems out at this time in the release cycle. We have more than enough problems to fix for 8.2 already. Let's try to do this early in the 8.3 cycle instead. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] autovacuum causing numerous regression-test failures
Tom Lane wrote: So we turn autovacuum off for regression test instance. Not a solution for make installcheck, Well, for make installcheck we don't have any control over whether autovacuum has been turned on or off manually anyway. If you are concerned about build farm reliability, the build farm scripts can surely be made to initialize or start the instance in a particular way. Another option might be to turn off stats_row_level on the fly. Or we put manual exceptions for the affected tables into pg_autovacuum. New feature? Or does that capability exist already? I haven't ever used the pg_autovacuum table but the documentation certainly makes one believe that this is possible. I opine that when a database is to be dropped, the connections should be cut. Sure, but that's another thing that we're not going to start designing and implementing four weeks after feature freeze. Right. clear that that is not the case, and I don't think it makes any sense from a project-management standpoint to try to flush the problems out at this time in the release cycle. We have more than enough problems to fix for 8.2 already. Let's try to do this early in the 8.3 cycle instead. Let's just consider some of the options a bit more closely, and if they don't work, we'll revert it. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] autovacuum causing numerous regression-test failures
I wrote: BTW, it would sure be nice to know what happened here: http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=wasp&dt=2006-08-28%2017:05:01 LOG: autovacuum process (PID 26315) was terminated by signal 11 I was able to cause autovac to crash by repeating contrib/intarray regression test enough times in a row. The cause is not specific to autovac, it's a generic bug created by my recent patch to add waiting status to pg_stat_activity. If we block on a lock during InitPostgres then the stats stuff isn't ready yet ... oops. Patch committed. The other issues remain problems however. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] autovacuum causing numerous regression-test failures
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=osprey&dt=2006-08-28%2016:00:17 shows another autovac-induced failure mode: ! psql: FATAL: sorry, too many clients already initdb is choosing max_connections = 20 on this machine, which is sufficient to run the parallel regression tests by themselves, but not regression tests plus autovac. IIRC initdb will go down to 10 or so connections before deciding it's hopeless. I don't really want to change that behavior because it might make it impossible to initdb at all on a small machine. But probably there needs to be a way for pg_regress to set a floor on the acceptable max_connections setting while initializing the test instance for make check. This also ties into the recent discussions about whether autovac needs its own reserved backend slots. Which, again, sounds to me like a fine idea for 8.3 work. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] [PATCHES] Another VPATH patch for ecpg
On Mon, Aug 28, 2006 at 12:16:40PM -0400, Tom Lane wrote: 122c122 < #line 36 show.pgc AFAICS there is no very good way to deal with this. I'd suggest providing a way to suppress #line output from the ecpg preprocessor, but perhaps there is another answer. What about changing those lines before diffing the files? This is already done for different default port settings in order to keep output files in sync. I append a small (untested) patch against pg_regress.sh. Unfortunately, lines like this one in a non-VPATH build: #line 1 ./../../include/sql3types.h would get stripped to `#line 1 sql3types.h' as well but I think this is acceptable. Joachim -- Joachim Wieland [EMAIL PROTECTED] GPG key available Index: pg_regress.sh === RCS file: /projects/cvsroot/pgsql/src/interfaces/ecpg/test/pg_regress.sh,v retrieving revision 1.7 diff -c -r1.7 pg_regress.sh *** pg_regress.sh 28 Aug 2006 16:13:11 -0000 1.7 --- pg_regress.sh 28 Aug 2006 20:40:56 -0000 *** *** 745,750 --- 745,754 done fi + mv $outfile_source $outfile_source.tmp + cat $outfile_source.tmp | sed -e 's,^\(#line [0-9]*\) .*/\([^/]*\),\1 \2,' > $outfile_source + rm $outfile_source.tmp + DIFFER= diff $DIFFFLAGS expected/$outprg.stderr $outfile_stderr > /dev/null 2>&1 if [ $? != 0 ]; then ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] tsvector/tsearch equality and/or portability issue
Phil Frost [EMAIL PROTECTED] writes: Things still seem to be broken for me. Among other things, the script at http://unununium.org/~indigo/testvectors.sql.bz2 fails. It performs two tests, comparing 1000 random vectors with positions and random weights, and comparing the same vectors, but stripped. Oddly, the unstripped comparisons all pass, which is not consistant with what I am seeing in my database. However, I'm yet unable to reproduce those problems. It looks to me like tsvector comparison may be too strong. The strip() function evidently thinks that it's OK to rearrange the string chunks into the same order as the WordEntry items, which suggests to me that the pos fields are not really semantically significant. But silly_cmp_tsvector() considers that a difference in pos values is important. I don't understand the data structure well enough to know which one to believe, but something's not consistent here. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] [PATCHES] updated patch for selecting large results
Performance-wise I took for granted without checking that GetVariable's running time would be negligible. [looks at the code] I see it's basically two function calls with a loop over a linked list of values (on the order of 10) doing strcmps and one strtol. It is not quite clear to me what the impact of this is. I could imagine it would show up only if you perform lots of trivial queries through psql. I'm going to benchmark something now and report back. Anyway, regardless of the benchmark, I feel it's somehow not clean to have a variable introduce a difference between \g and ;. [goes benchmarking...] Ok, so I ran a file containing 1 million lines of select 1; through psql (discarding the output). 5 runs each with the patch and with the patch removed (the if() in SendQuery commented out). These are the results in seconds of user time of psql on a Pentium M 2.0 GHz (real time was longer, since the postmaster process was on the same machine).

 patch | count |   avg   |      stddev
-------+-------+---------+-------------------
 f     |     5 | 16.6722 | 0.359759919946455
 t     |     5 | 17.2762 | 0.259528803796329

The conclusion is that, yes, the overhead is measurable, albeit with a very synthetic benchmark (of the type MySQL wins ;). Basically I'm losing 0.6 usec on each query line (when FETCH_COUNT is not set and therefore psql needs to scan the whole variables list in GetVariable() for nothing). Not sure if that's acceptable (I'd say yes, but then again, I'm not a cycle counter type of programmer *cough* Java *cough* ;)... Opinions? Bye, Chris. -- Chris Mair http://www.1006.org ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] autovacuum causing numerous regression-test failures
On Mon, 2006-08-28 at 15:21 -0400, Tom Lane wrote: We have more than enough problems to fix for 8.2 already. Let's try to do this early in the 8.3 cycle instead. I agree -- I think this is exactly the sort of change that is best made at the beginning of a development cycle, so that there's a whole cycle's worth of testing to ensure it plays nicely with the rest of the system. -Neil
Re: [HACKERS] Rtree circle ops
Tatsuo Ishii [EMAIL PROTECTED] writes: Is there any reason for Rtree circle ops not being included in PostgreSQL? rtree is going away (has gone away in HEAD awhile ago, in fact). I know. I just want to know whether there was any technical reason it hadn't been supported for a long time. BTW, it seems there are some users who are willing to continue to use Rtree. -- Tatsuo Ishii SRA OSS, Inc. Japan
Re: [HACKERS] [PATCHES] updated patch for selecting large results sets in
Chris Mair [EMAIL PROTECTED] writes: The conclusion is that, yes, the overhead is measurable, albeit with a very synthetic benchmark (of the type MySQL wins ;). OK, so about 3 or 4% overhead added to extremely short queries. That's not enough to kill this patch, but it's still annoying ... and as I mentioned, there are some existing calls of GetVariable that are executed often enough to be a problem too. It strikes me that having to do GetVariable *and* strtol and so on for these special variables is pretty silly; the work should be done during the infrequent times they are set, rather than the frequent times they are read. I propose that we add the equivalent of a GUC assign_hook to psql's variable facility, attach an assign hook function to each special variable, and have the assign hook transpose the value into an internal variable that can be read with no overhead. If we do that then the cost of the FETCH_COUNT patch will be unmeasurable, and I think we'll see a few percent savings overall in psql runtime from improving the existing hotspot uses of GetVariable. Barring objections, I'll hack on this this evening ... regards, tom lane
Re: [HACKERS] autovacuum causing numerous regression-test failures
Neil Conway wrote: On Mon, 2006-08-28 at 15:21 -0400, Tom Lane wrote: We have more than enough problems to fix for 8.2 already. Let's try to do this early in the 8.3 cycle instead. I agree -- I think this is exactly the sort of change that is best made at the beginning of a development cycle, so that there's a whole cycle's worth of testing to ensure it plays nicely with the rest of the system. On the other hand, the bug Tom found on DROP OWNED a couple of weeks ago was introduced right at the start of this development cycle, which tells us that our testing of the development branch is not very exhaustive. But I agree anyway. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] Rtree circle ops
Tatsuo Ishii [EMAIL PROTECTED] writes: Tatsuo Ishii [EMAIL PROTECTED] writes: Is there any reason for Rtree circle ops not being included in PostgreSQL? rtree is going away (has gone away in HEAD awhile ago, in fact). I know. I just want to know whether there was any technical reason it hadn't been supported for a long time. AFAIR, the code simply wasn't there before 8.1. I threw in an opclass definition for circles while adding the rtree-equivalent opclasses to GIST, because the use case for it was pretty obvious (distance-to-a-point type queries). We could have added the same opclass to rtree at the same time, but I saw no point given that we were planning to drop rtree in the next release cycle. BTW, it seems there are some users who are willing to continue to use Rtree. It's a bit late to be objecting to that decision. regards, tom lane
Re: [HACKERS] [PATCHES] updated patch for selecting large results
The conclusion is that, yes, the overhead is measurable, albeit with a very synthetic benchmark (of the type MySQL wins ;). OK, so about 3 or 4% overhead added to extremely short queries. More accurately, that 3 or 4% overhead is added to just about all queries (we're talking *psql*-only running time). It's just that for anything but short queries, psql running time is totally dwarfed by postmaster running time anyway. That's not enough to kill this patch, but it's still annoying ... and as I mentioned, there are some existing calls of GetVariable that are executed often enough to be a problem too. It strikes me that having to do GetVariable *and* strtol and so on for these special variables is pretty silly; the work should be done during the infrequent times they are set, rather than the frequent times they are read. I propose that we add the equivalent of a GUC assign_hook to psql's variable facility, attach an assign hook function to each special variable, and have the assign hook transpose the value into an internal variable that can be read with no overhead. If we do that then the cost of the FETCH_COUNT patch will be unmeasurable, and I think we'll see a few percent savings overall in psql runtime from improving the existing hotspot uses of GetVariable. Barring objections, I'll hack on this this evening ... Might help. Take into account that the strtol is not critical at all for FETCH_COUNT, since when it's actually set, we're supposed to be retrieving big data where a strtol doesn't matter anyway. The overhead comes from scanning the linked list for nothing in the normal case (when it's not set). I don't know how the other uses factor in here, but I see it's called at least twice more on an average call to SendQuery. Bye, Chris. -- Chris Mair http://www.1006.org
Re: [HACKERS] autovacuum causing numerous regression-test failures
Peter Eisentraut wrote: Tom Lane wrote: Not a solution for make installcheck, Well, for make installcheck we don't have any control over whether autovacuum has been turned on or off manually anyway. If you are concerned about build farm reliability, the build farm scripts can surely be made to initialize or start the instance in a particular way. Another option might be to turn off stats_row_level on the fly. I'm sure I'm missing some of the subtleties of make installcheck issues, but autovacuum can be enabled / disabled on the fly just as easily as stats_row_level, so I don't see the difference? Or we put manual exceptions for the affected tables into pg_autovacuum. New feature? Or does that capability exist already? I haven't ever used the pg_autovacuum table but the documentation certainly makes one believe that this is possible. Right, if it doesn't work, that would certainly be a bug. This feature was included during the original integration into the backend during the 8.0 dev cycle. Let's just consider some of the options a bit more closely, and if they don't work, we'll revert it. Agreed.
Re: [HACKERS] [PATCHES] updated patch for selecting large results sets
Peter Eisentraut wrote: Tom Lane wrote: Wait a minute. What I thought we had agreed to was a patch to make commands sent with \g use a cursor. This patch changes SendQuery so that *every* command executed via psql is treated this way. That's what I remembered. I don't think we want to introduce a difference between ; and \g. I am confused. I assume \g and ; should be affected, like Peter says. Tom, what *every* command are you talking about? You mean \d? -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] [PATCHES] updated patch for selecting large results sets in psql using cursors
Bruce Momjian [EMAIL PROTECTED] writes: Peter Eisentraut wrote: Tom Lane wrote: Wait a minute. What I thought we had agreed to was a patch to make commands sent with \g use a cursor. I am confused. I assume \g and ; should be affected, like Peter says. Tom, what *every* command are you talking about? You mean \d? Like I said, I thought we were intending to modify \g's behavior only; that was certainly the implication of the discussion of \gc. regards, tom lane
Re: [HACKERS] [PATCHES] updated patch for selecting large results sets
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Peter Eisentraut wrote: Tom Lane wrote: Wait a minute. What I thought we had agreed to was a patch to make commands sent with \g use a cursor. I am confused. I assume \g and ; should be affected, like Peter says. Tom, what *every* command are you talking about? You mean \d? Like I said, I thought we were intending to modify \g's behavior only; that was certainly the implication of the discussion of \gc. OK, got it. I just don't see the value of doing \g and not ;. I think the \gc case was a hack when he didn't have \set. Now that we have \set, we should be consistent. -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] autovacuum causing numerous regression-test failures
Matthew T. O'Connor matthew@zeut.net writes: Tom Lane wrote: Not a solution for make installcheck, I'm sure I'm missing some of the subtleties of make installcheck issues, but autovacuum can be enabled / disabled on the fly just as easily as stats_row_level, so I don't see the difference? Well, just as easily means edit postgresql.conf and SIGHUP, which is not an option available to make installcheck, even if we thought that an invasive change of the server configuration would be acceptable for it to do. It's conceivable that we could invent a per-database autovac-off variable controlled by, say, ALTER DATABASE SET ... but we haven't got one today. My objection here is basically that this proposal passed on the assumption that it would be very nearly zero effort to make it happen. We are now finding out that we have a fair amount of work to do if we want autovac to not mess up the regression tests, and I think that has to mean that the proposal goes back on the shelf until 8.3 development starts. We are already overcommitted in terms of the stuff that was submitted *before* feature freeze. regards, tom lane
Re: [HACKERS] [PATCHES] updated patch for selecting large results sets
Bruce Momjian [EMAIL PROTECTED] writes: OK, got it. I just don't see the value of doing \g and not ;. I think the \gc case was a hack when he didn't have \set. Now that we have \set, we should be consistent. I'm willing to accept this if we can make sure we aren't adding any overhead --- see my proposal elsewhere in the thread for fixing that. regards, tom lane
Re: [HACKERS] [PATCHES] updated patch for selecting large results sets in
I am confused. I assume \g and ; should be affected, like Peter says. Tom, what *every* command are you talking about? You mean \d? Like I said, I thought we were intending to modify \g's behavior only; that was certainly the implication of the discussion of \gc. At some point you OKed the \g and ; proposal. I admit I was quick adding the and ; part, but it seemed so natural once we agreed on using a variable. OK, got it. I just don't see the value to doing \g and not ;. I think the \gc case was a hack when he didn't have \set. Now that we have \set, we should be consistent. I agree with this statement. If we have a variable that switches just between two versions of \g, we could have gone with using \u (or whatever) in the first place. In the mean time I have been converted by the variable camp, and I think the variable should change \g and ; together, consistently. If we find we can't live with the performance overhead of that if(FETCH_COUNT), it is still not clear why we would be better off moving it into the \g code path only. Is it because presumably \g is used less often in existing psql scripts? Bye, Chris. -- Chris Mair http://www.1006.org
Re: [HACKERS] [PATCHES] updated patch for selecting large results sets
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: OK, got it. I just don't see the value to doing \g and not ;. I think the \gc case was a hack when he didn't have \set. Now that we have \set, we should be consistent. I'm willing to accept this if we can make sure we aren't adding any overhead --- see my proposal elsewhere in the thread for fixing that. Right, if \g has overhead, I don't want people to start using ; because it is faster. That is the kind of behavior that makes us look sloppy. -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Rtree circle ops
Tatsuo Ishii [EMAIL PROTECTED] writes: Tatsuo Ishii [EMAIL PROTECTED] writes: Is there any reason for Rtree circle ops not being included in PostgreSQL? rtree is going away (has gone away in HEAD awhile ago, in fact). I know. I just want to know whether there was any technical reason it hadn't been supported for a long time. AFAIR, the code simply wasn't there before 8.1. I threw in an opclass definition for circles while adding the rtree-equivalent opclasses to GIST, because the use case for it was pretty obvious (distance-to-a-point type queries). We could have added the same opclass to rtree at the same time, but I saw no point given that we were planning to drop rtree in the next release cycle. BTW, it seems there are some users who are willing to continue to use Rtree. It's a bit late to be objecting to that decision. Yes, probably the only way to help the user would be 1) build a new pgfoundry or whatever open source project 2) provide commercial support for him. -- Tatsuo Ishii SRA OSS, Inc. Japan
Re: [HACKERS] Rtree circle ops
Tatsuo Ishii wrote: Tatsuo Ishii [EMAIL PROTECTED] writes: Tatsuo Ishii [EMAIL PROTECTED] writes: Is there any reason for Rtree circle ops not being included in PostgreSQL? rtree is going away (has gone away in HEAD awhile ago, in fact). I know. I just want to know whether there was any technical reason it hadn't been supported for a long time. AFAIR, the code simply wasn't there before 8.1. I threw in an opclass definition for circles while adding the rtree-equivalent opclasses to GIST, because the use case for it was pretty obvious (distance-to-a-point type queries). We could have added the same opclass to rtree at the same time, but I saw no point given that we were planning to drop rtree in the next release cycle. BTW, it seems there are some users who are willing to continue to use Rtree. It's a bit late to be objecting to that decision. Yes, probably the only way to help the user would be 1) build a new pgfoundry or whatever open source project 2) provide commercial support for him. I wonder why migrating to GiST isn't an option? -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] Rtree circle ops
Alvaro Herrera [EMAIL PROTECTED] writes: Tatsuo Ishii wrote: Yes, probably the only way to help the user would be 1) build a new pgfoundry or whatever open source project 2) provide commercial support for him. I wonder why migrating to GiST isn't an option? Indeed. The reason we removed rtree was that we could not see any case where it was superior to GIST, and there were plenty of reasons why it was inferior. What's the specific motivation for wanting to stay with rtree? regards, tom lane
[HACKERS] stats test on Windows is now failing repeatably?
I just looked over the buildfarm results and was struck by the observation that the stats regression test, which lately had been failing once-in-a-while on Windows and never anywhere else, has a batting average of 0-for-10-or-so over the past 24 hours on the Windows buildfarm machines. I still have no idea what the real problem is there --- but since it suddenly seems to have gotten very repeatable, I trust someone with a Windows box and a debugger will get after it before the source code drifts again. [ urk ... must ... resist ... temptation ... failing ... AUTOVACUUM? ] regards, tom lane
Re: [HACKERS] Rtree circle ops
I wonder why migrating to GiST isn't an option? Indeed. The reason we removed rtree was that we could not see any case where it was superior to GIST, and there were plenty of reasons why it was inferior. What's the specific motivation for wanting to stay with rtree? Don't know. I just heard that they evaluated GIST and decided not to go with it. I'll get back if I get more info. -- Tatsuo Ishii SRA OSS, Inc. Japan