[PATCHES] plpgpsm
Hello, I have a problem with sending a probably too large patch, so I am sending only a link to the patch. Is that correct? I sent this patch two times without success. This patch adds a new PL language to PostgreSQL. You can enable it by passing --with-sqlpsm to configure. Proposal: http://archives.postgresql.org/pgsql-hackers/2007-03/msg01487.php link to patch: http://www.pgsql.cz/patches/plpgpsm.diff.gz Regards Pavel Stehule ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] [PATCH] add CLUSTER table USING index (take 2)
Holger Schurig wrote: Index: src/doc/src/sgml/ref/cluster.sgml === *** src.orig/doc/src/sgml/ref/cluster.sgml 2007-03-28 23:02:12.0 +0200 --- src/doc/src/sgml/ref/cluster.sgml 2007-03-28 23:03:14.0 +0200 *** *** 20,27 <refsynopsisdiv> <synopsis> ! CLUSTER <replaceable class="PARAMETER">indexname</replaceable> ON <replaceable class="PARAMETER">tablename</replaceable> ! CLUSTER <replaceable class="PARAMETER">tablename</replaceable> CLUSTER </synopsis> </refsynopsisdiv> --- 20,26 <refsynopsisdiv> <synopsis> ! CLUSTER <replaceable class="PARAMETER">tablename</replaceable> [ USING <replaceable class="PARAMETER">indexname</replaceable> ] CLUSTER </synopsis> </refsynopsisdiv> We still need to document the old syntax, especially if we don't change the example as well. Index: src/src/backend/parser/gram.y === *** src.orig/src/backend/parser/gram.y 2007-03-28 22:58:48.0 +0200 --- src/src/backend/parser/gram.y 2007-03-28 22:59:15.0 +0200 *** *** 209,215 %type <str> relation_name copy_file_name database_name access_method_clause access_method attr_name ! index_name name file_name %type <list> func_name handler_name qual_Op qual_all_Op subquery_Op opt_class opt_validator --- 209,215 %type <str> relation_name copy_file_name database_name access_method_clause access_method attr_name ! index_name name file_name opt_cluster_using %type <list> func_name handler_name qual_Op qual_all_Op subquery_Op opt_class opt_validator Is the placement of opt_cluster_using completely arbitrary? I'm not very familiar with the parser; it really looks like those type definitions are in random order. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
[PATCHES] Small addition to PGInterval
Hi all, please find attached a small extension to PGInterval: - getTimeInMills - returns the number of milliseconds in the interval - copy constructor PGInterval(PGInterval) for type-safe, cast-free cloning - now implements Comparable<PGInterval> -- Hartmut Benz, TI-WMC, http://www.ti-wmc.nl PGInterval.diff.gz Description: application/gzip
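For readers unfamiliar with what a getTimeInMills-style accessor has to do, here is a minimal sketch (the actual patch is attached as a gzip and not shown here, so the field handling below is an assumption). Months and years have no fixed length in milliseconds, so a real implementation must pick a convention for them; this sketch simply ignores them and sums the day and time fields:

```c
#include <assert.h>

/* Hypothetical sketch of computing the milliseconds in an interval.
 * Month and year fields are deliberately left out: converting them
 * requires a convention (e.g. 30-day months) that the sketch does not
 * assume.  Only the day and time-of-day fields are summed. */
long long interval_time_in_millis(int days, int hours, int mins, double secs)
{
    long long ms = (long long) days * 24 * 60 * 60 * 1000;

    ms += (long long) hours * 60 * 60 * 1000;
    ms += (long long) mins * 60 * 1000;
    ms += (long long) (secs * 1000.0);
    return ms;
}
```

A copy constructor plus such an accessor makes interval arithmetic in client code possible without string parsing, which is presumably the motivation for the patch.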
Re: [PATCHES] Make CLUSTER MVCC-safe
Here's an update, fixing a conflict caused by Tom's recent commit of Simon's patch to skip WAL-inserts when archiving is not enabled. Heikki Linnakangas wrote: This patch makes CLUSTER MVCC-safe. Visibility information and update chains are preserved like in VACUUM FULL. I created a new generic rewriteheap facility to handle rewriting tables in a visibility-preserving manner. All the update chain tracking is done in rewriteheap.c; the caller is responsible for supplying the stream of tuples. CLUSTER is currently the only user of the facility, but I'm envisioning we might have other users in the future. For example, a version of VACUUM FULL that rewrites the whole table. We could also use it to make ALTER TABLE MVCC-safe, but there are some issues with that. For example, what to do if RECENTLY_DEAD tuples don't satisfy a newly added constraint. One complication in the implementation was the fact that heap_insert overwrites the visibility information, and it doesn't write the full tuple header to WAL. I ended up implementing a special-purpose raw_heap_insert function instead, which is optimized for bulk inserting a lot of tuples, knowing that we have exclusive access to the heap. raw_heap_insert keeps the current buffer locked over calls, until it gets full, and inserts the whole page to WAL as a single record using the existing XLOG_HEAP_NEWPAGE record type. This makes CLUSTER a more viable alternative to VACUUM FULL. One motivation for making CLUSTER MVCC-safe is that if some poor soul runs pg_dump to make a backup concurrently with CLUSTER, the clustered tables will appear to be empty in the dump file. The documentation doesn't say anything about CLUSTER not being MVCC-safe, so I suppose there's no need to touch the docs. I sent a doc patch earlier to add a note about it; that doc patch should still be applied to older release branches, IMO.
-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com Index: src/backend/access/heap/heapam.c === RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/heap/heapam.c,v retrieving revision 1.230 diff -c -r1.230 heapam.c *** src/backend/access/heap/heapam.c 29 Mar 2007 00:15:37 - 1.230 --- src/backend/access/heap/heapam.c 29 Mar 2007 08:27:21 - *** *** 3280,3285 --- 3280,3322 return log_heap_update(reln, oldbuf, from, newbuf, newtup, true); } + /* + * Perform XLogInsert of a full page image to WAL. Caller must make sure + * the page contents on disk are consistent with the page inserted to WAL. + */ + XLogRecPtr + log_newpage(RelFileNode *rnode, BlockNumber blkno, Page page) + { + xl_heap_newpage xlrec; + XLogRecPtr recptr; + XLogRecData rdata[2]; + + /* NO ELOG(ERROR) from here till newpage op is logged */ + START_CRIT_SECTION(); + + xlrec.node = *rnode; + xlrec.blkno = blkno; + + rdata[0].data = (char *) &xlrec; + rdata[0].len = SizeOfHeapNewpage; + rdata[0].buffer = InvalidBuffer; + rdata[0].next = &(rdata[1]); + + rdata[1].data = (char *) page; + rdata[1].len = BLCKSZ; + rdata[1].buffer = InvalidBuffer; + rdata[1].next = NULL; + + recptr = XLogInsert(RM_HEAP_ID, XLOG_HEAP_NEWPAGE, rdata); + + PageSetLSN(page, recptr); + PageSetTLI(page, ThisTimeLineID); + + END_CRIT_SECTION(); + + return recptr; + } + static void heap_xlog_clean(XLogRecPtr lsn, XLogRecord *record) { Index: src/backend/access/nbtree/nbtsort.c === RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/nbtree/nbtsort.c,v retrieving revision 1.110 diff -c -r1.110 nbtsort.c *** src/backend/access/nbtree/nbtsort.c 9 Jan 2007 02:14:10 - 1.110 --- src/backend/access/nbtree/nbtsort.c 20 Mar 2007 17:58:20 - *** *** 64,69 --- 64,70 #include "postgres.h" + #include "access/heapam.h" #include "access/nbtree.h" #include "miscadmin.h" #include "storage/smgr.h" *** *** 265,296 if (wstate->btws_use_wal) { /* We use the heap NEWPAGE record type for this */ ! xl_heap_newpage xlrec; ! 
XLogRecPtr recptr; ! XLogRecData rdata[2]; ! ! /* NO ELOG(ERROR) from here till newpage op is logged */ ! START_CRIT_SECTION(); ! ! xlrec.node = wstate->index->rd_node; ! xlrec.blkno = blkno; ! ! rdata[0].data = (char *) &xlrec; ! rdata[0].len = SizeOfHeapNewpage; ! rdata[0].buffer = InvalidBuffer; ! rdata[0].next = &(rdata[1]); ! ! rdata[1].data = (char *) page; ! rdata[1].len = BLCKSZ; ! rdata[1].buffer = InvalidBuffer; ! rdata[1].next = NULL; ! ! recptr = XLogInsert(RM_HEAP_ID, XLOG_HEAP_NEWPAGE, rdata); ! ! PageSetLSN(page, recptr); ! PageSetTLI(page, ThisTimeLineID); ! ! END_CRIT_SECTION(); } else { --- 266,272 if (wstate->btws_use_wal) { /* We use the heap NEWPAGE record type for this */ ! log_newpage(&wstate->index->rd_node, blkno, page); } else
Re: [PATCHES] Small addition to PGInterval
Hartmut Benz wrote: please find attached a small extension to PGInterval: - getTimeInMills - returns the number of milliseconds in the interval - copy constructor PGInterval(PGInterval) for type-safe, cast-free cloning - now implements Comparable<PGInterval> This list is for patches to the PostgreSQL server. Please send patches to the JDBC driver to the pgsql-jdbc mailing list. PS. Thanks for the patch! -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [PATCHES] [HACKERS] Full page writes improvement, code update
Hi, Here is some feedback on the comments: Simon Riggs wrote: On Wed, 2007-03-28 at 10:54 +0900, Koichi Suzuki wrote: As written below, full page writes can be categorized as follows: 1) Needed for crash recovery: first page update after each checkpoint. This has to be kept in WAL. 2) Needed for archive recovery: page update between pg_start_backup and pg_stop_backup. This has to be kept in archive log. 3) For log-shipping slave such as pg_standby: no full page writes will be needed for this purpose. My proposal deals with 2). So, if we mark each full_page_write, I'd rather mark when this is needed. Still need only one bit because the case 3) does not need any mark. I'm very happy with this proposal, though I do still have some points in detailed areas. If you accept that 1 & 2 are valid goals, then 1 & 3 or 1, 2 & 3 are also valid goals, ISTM. i.e. you might choose to use full_page_writes on the primary and yet would like to see optimised data transfer to the standby server. In that case, you would need the mark. Yes, I need the mark. In my proposal, only unmarked full-page-writes, which were written as the first update after a checkpoint, are to be removed offline (pg_compresslog). - Not sure why we need full_page_compress, why not just mark them always? That harms no one. (Did someone else ask for that? If so, keep it) No, no one asked to have a separate option. There would be no harm in doing so. So, if we mark each full_page_write, I'd rather mark when this is needed. Still need only one bit because the case 3) does not need any mark. OK, different question: Why would anyone ever set full_page_compress = off? Why have a parameter that does so little? ISTM this is: i) one more thing to get wrong ii) cheaper to mark the block when appropriate than to perform the if() test each time. That can be done only in the path where backup blocks are present. iii) If we mark the blocks every time, it allows us to do an offline WAL compression. 
If the blocks aren't marked, that option is lost. The bit is useful information, so we should have it in all cases. Not only full-page-writes are written as WAL records. In my proposal, both full-page-writes and the logical log are written in a WAL record, which will make the WAL size slightly bigger (five percent or so). If full_page_compress = off, only a full-page-write will be written in a WAL record. I thought someone would not be happy with this size growth. I agree to make this mandatory if everybody is happy with extra logical log in WAL records with full page writes. I'd like to have your opinion. - OTOH I'd like to see an explicit parameter set during recovery since you're asking the main recovery path to act differently in case a single bit is set/unset. If you are using that form of recovery, we should say so explicitly, to keep everybody else safe. The only thing I had to do was to create dummy full page writes to maintain LSNs. Full page writes are omitted in the archive log. We have to keep the LSNs the same as those in the original WAL. In this case, recovery has to read the logical log, not dummy full page writes. On the other hand, if both a logical log and real full page writes are found in a log record, recovery has to use the real full page writes. I apologise for not understanding your reply, perhaps my original request was unclear. In recovery.conf, I'd like to see a parameter such as dummy_backup_blocks = off (default) | on to explicitly indicate to the recovery process that backup blocks are present, yet they are garbage and should be ignored. Having garbage data within the system is potentially dangerous and I want to be told by the user that they were expecting that and it's OK to ignore that data. Otherwise I want to throw informative errors. Maybe it seems OK now, but the next change to the system may have unintended consequences and it may not be us making the change. 
"It's OK, the Alien will never escape from the lab" is the starting premise for many good sci-fi horrors and I want to watch them, not be in one myself. :-) We can call it other things, of course, e.g. ignore_dummy_blocks, decompressed_blocks, apply_backup_blocks. So far, we don't need any modification to the recovery and redo functions. They ignore the dummy and apply the logical logs. Also, if there are both full page writes and a logical log, current recovery selects the full page writes to apply. I agree to introduce this option if 8.3 code introduces any conflict with the current behavior. Or, we could introduce this option for future safety. Do you think we should introduce this option? If this should be introduced now, what we should do is to check this option when a dummy full-page-write appears. Yes I believe so. As pg_standby does not include any chance to meet partial writes of pages, I believe you can omit all the full page writes. Of course, as Tom Lane suggested in http://archives.postgresql.org/pgsql-hackers/2007-02/msg00034.php removing full
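The one-bit marking being discussed can be sketched as follows. The flag name, header layout, and helper functions here are invented for illustration (the actual patch is not shown in this thread); the point is only that a single record-header bit suffices to tell offline compression which full-page images it may strip:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bit in a WAL record's info byte.  Full-page writes
 * taken only for crash recovery (the first update after a checkpoint,
 * outside any online backup) can be stripped by an offline tool such as
 * pg_compresslog; those taken during an online backup cannot. */
#define XLR_BKP_REMOVABLE 0x01

static uint8_t mark_backup_block(uint8_t info, int in_online_backup)
{
    /* Only full-page writes outside an online backup get the bit,
     * so they alone are eligible for offline removal. */
    if (!in_online_backup)
        info |= XLR_BKP_REMOVABLE;
    return info;
}

static int is_removable(uint8_t info)
{
    return (info & XLR_BKP_REMOVABLE) != 0;
}
```

This also shows why only one bit is needed: case 3) in the categorization (log-shipping standbys) never needs the full-page image at all, so it requires no mark of its own.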
Re: [PATCHES] ecpg threading vs win32
On Mon, Mar 19, 2007 at 09:48:19AM +0100, Magnus Hagander wrote: Q2. Do we need to use PQescapeStringConn() instead of PQescapeString()? PQescapeString() is used to escape literals, and the documentation says PQescapeStringConn() should be used in multi-threaded client programs. http://momjian.us/main/writings/pgsql/sgml/libpq-exec.html#LIBPQ-EXEC-ESCAPE-STRING | PQescapeString can be used safely in single-threaded client programs | that work with only one PostgreSQL connection at a time Seems so, but that's unrelated to this patch ;-) I'll leave the final comment on that up to Michael. Looking at the source code it seems to me that the connection argument is only used once in PQescapeStringInternal, which both functions call. Here's the snippet: if (conn) printfPQExpBuffer(&conn->errorMessage, libpq_gettext("incomplete multibyte character\n")); So this essentially is only used to get an error message into the connection structure. However, having an empty connection makes PQescapeStringConn return an error and an empty string, which I consider a problem. I have to check that in detail, but we may run into a usage of PQescapeString without an open connection which then would fail. Michael -- Michael Meskes Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: [EMAIL PROTECTED] Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!
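To illustrate why the connection argument matters, here is a naive sketch of the quote-doubling that literal escaping performs. This is not libpq's implementation; it is a toy showing the byte-at-a-time scan that PQescapeStringConn improves on, since only the connection knows the client encoding, and with some multibyte encodings a quote byte can appear inside a multibyte character, making a scan like this unsafe:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy escape function: double single quotes and backslashes so the
 * result can be embedded in a SQL literal.  Unlike PQescapeStringConn,
 * it has no idea what encoding `from` is in, which is exactly the
 * weakness discussed in the thread.  `to` must have room for
 * 2*length + 1 bytes. */
static size_t naive_escape(char *to, const char *from, size_t length)
{
    const char *src = from;
    char *dst = to;

    while (length-- > 0 && *src)
    {
        if (*src == '\'' || *src == '\\')
            *dst++ = *src;      /* emit the character once extra */
        *dst++ = *src++;
    }
    *dst = '\0';
    return (size_t) (dst - to);
}
```

In multi-threaded programs, or whenever a connection is available, PQescapeStringConn is the right call; the question raised above is what should happen in the legacy no-connection case.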
Re: [PATCHES] [HACKERS] Full page writes improvement, code update
On Thu, 2007-03-29 at 17:50 +0900, Koichi Suzuki wrote: Not only full-page-writes are written as WAL records. In my proposal, both full-page-writes and the logical log are written in a WAL record, which will make the WAL size slightly bigger (five percent or so). If full_page_compress = off, only a full-page-write will be written in a WAL record. I thought someone would not be happy with this size growth. OK, I see what you're doing now and agree with you that we do need a parameter. Not sure about the name you've chosen though - it certainly confused me until you explained. A parameter called ..._compress indicates to me that it would reduce something in size, whereas what it actually does is increase the size of WAL slightly. We should have a parameter name that indicates what it actually does, otherwise some people will choose to use this parameter even when they are not using archive_command with pg_compresslog. Some possible names... additional_wal_info = 'COMPRESS' add_wal_info wal_additional_info wal_auxiliary_info wal_extra_data attach_wal_info ... others? I've got some ideas for the future for adding additional WAL info for various purposes, so it might be useful to have a parameter that can cater for multiple types of additional WAL data. Or maybe we go for something more specific like wal_add_compress_info = on wal_add__info ... In recovery.conf, I'd like to see a parameter such as dummy_backup_blocks = off (default) | on to explicitly indicate to the recovery process that backup blocks are present, yet they are garbage and should be ignored. Having garbage data within the system is potentially dangerous and I want to be told by the user that they were expecting that and it's OK to ignore that data. Otherwise I want to throw informative errors. Maybe it seems OK now, but the next change to the system may have unintended consequences and it may not be us making the change. 
"It's OK, the Alien will never escape from the lab" is the starting premise for many good sci-fi horrors and I want to watch them, not be in one myself. :-) We can call it other things, of course, e.g. ignore_dummy_blocks, decompressed_blocks, apply_backup_blocks. So far, we don't need any modification to the recovery and redo functions. They ignore the dummy and apply the logical logs. Also, if there are both full page writes and a logical log, current recovery selects the full page writes to apply. I agree to introduce this option if 8.3 code introduces any conflict with the current behavior. Or, we could introduce this option for future safety. Do you think we should introduce this option? Yes. You are skipping a correctness test and that should be by explicit command only. It's no problem to include that as well, since you are already having to specify pg_... decompress... but the recovery process doesn't know whether or not you've done that. Anyway, could you try to run pg_standby with pg_compresslog and pg_decompresslog? After freeze, yes. An additional comment on page header removal: I found that it is not simple to keep the page header in the compressed archive log. Because we eliminate unmarked full page writes and shift the rest of the WAL file data, it is not simple to keep the page header as-is in the compressed archive log. It is much simpler to remove page headers as well and rebuild them. I'd like to keep the current implementation on this point. OK. This is a good feature. Thanks for your patience with my comments. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
[PATCHES] bgwriter stats
Attached is a new version of the bgwriter stats patch. Per previous discussion, now uses the stats system only. Introduces a new stats message for bgwriter, and also introduces a global stats part of the stats file for statistics not bound to a database or table. I've included a couple more counters per ideas from Greg Smith in his logging patch. Unless there are objections, I'll go ahead and apply this version of the patch (once the documentation is written, of course). //Magnus Index: src/backend/catalog/system_views.sql === RCS file: /cvsroot/pgsql/src/backend/catalog/system_views.sql,v retrieving revision 1.36 diff -c -r1.36 system_views.sql *** src/backend/catalog/system_views.sql 16 Mar 2007 17:57:36 - 1.36 --- src/backend/catalog/system_views.sql 29 Mar 2007 13:21:54 - *** *** 364,366 --- 364,376 pg_stat_get_db_tuples_updated(D.oid) AS tup_updated, pg_stat_get_db_tuples_deleted(D.oid) AS tup_deleted FROM pg_database D; + + CREATE VIEW pg_stat_bgwriter AS + SELECT + pg_stat_get_bgwriter_timed_checkpoints() AS checkpoints_timed, + pg_stat_get_bgwriter_requested_checkpoints() AS checkpoints_req, + pg_stat_get_bgwriter_buf_written_checkpoints() AS buffers_checkpoint, + pg_stat_get_bgwriter_buf_written_lru() AS buffers_lru, + pg_stat_get_bgwriter_buf_written_all() AS buffers_all, + pg_stat_get_bgwriter_maxwritten_lru() AS maxwritten_lru, + pg_stat_get_bgwriter_maxwritten_all() AS maxwritten_all; Index: src/backend/postmaster/bgwriter.c === RCS file: /cvsroot/pgsql/src/backend/postmaster/bgwriter.c,v retrieving revision 1.36 diff -c -r1.36 bgwriter.c *** src/backend/postmaster/bgwriter.c 17 Jan 2007 16:25:01 - 1.36 --- src/backend/postmaster/bgwriter.c 29 Mar 2007 13:21:57 - *** *** 50,55 --- 50,56 #include "access/xlog_internal.h" #include "libpq/pqsignal.h" #include "miscadmin.h" + #include "pgstat.h" #include "postmaster/bgwriter.h" #include "storage/fd.h" #include "storage/freespace.h" *** *** 125,130 --- 126,138 static BgWriterShmemStruct *BgWriterShmem; /* + * BgWriter 
statistic counters, as sent to the stats collector + * Stored directly in a stats message structure so it can be sent + * without needing to copy things around. + */ + PgStat_MsgBgWriter BgWriterStats; + + /* * GUC parameters */ int BgWriterDelay = 200; *** *** 243,248 --- 251,261 MemoryContextSwitchTo(bgwriter_context); /* +* Initialize statistics counters to zero +*/ + memset(&BgWriterStats, 0, sizeof(BgWriterStats)); + + /* * If an exception is encountered, processing resumes here. * * See notes in postgres.c about the design of this coding. *** *** 354,359 --- 367,373 checkpoint_requested = false; do_checkpoint = true; force_checkpoint = true; + BgWriterStats.m_requested_checkpoints++; } if (shutdown_requested) { *** *** 376,382 --- 390,400 now = time(NULL); elapsed_secs = now - last_checkpoint_time; if (elapsed_secs >= CheckPointTimeout) + { do_checkpoint = true; + if (!force_checkpoint) + BgWriterStats.m_timed_checkpoints++; + } /* * Do a checkpoint if requested, otherwise do one cycle of *** *** 474,479 --- 492,502 } /* +* Send off activity statistics to the stats collector +*/ + pgstat_send_bgwriter(); + + /* * Nap for the configured time, or sleep for 10 seconds if there is no * bgwriter activity configured. * *** *** 789,791 --- 812,815 END_CRIT_SECTION(); } + Index: src/backend/postmaster/pgstat.c === RCS file: /cvsroot/pgsql/src/backend/postmaster/pgstat.c,v retrieving revision 1.151 diff -c -r1.151 pgstat.c *** src/backend/postmaster/pgstat.c 28 Mar 2007 22:17:12 - 1.151 --- src/backend/postmaster/pgstat.c 29 Mar 2007 13:22:00 - *** *** 135,140 --- 135,150 static PgBackendStatus *localBackendStatusTable = NULL; static int localNumBackends = 0; + /* + * BgWriter global
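The counter pattern the patch uses — a struct laid out like the stats message itself, zeroed once at startup with memset, bumped in place as events happen, then shipped to the collector in one send — can be sketched in miniature. The struct and function names below are illustrative; only the m_timed_checkpoints / m_requested_checkpoints fields follow the patch, and the actual send step is stubbed out:

```c
#include <assert.h>
#include <string.h>

/* Miniature stand-in for PgStat_MsgBgWriter: counters live directly in
 * the message layout so no copying is needed when sending. */
typedef struct
{
    long m_timed_checkpoints;      /* checkpoints started by timeout */
    long m_requested_checkpoints;  /* checkpoints explicitly requested */
} BgWriterStatsSketch;

static BgWriterStatsSketch BgWriterStats;

static void stats_init(void)
{
    /* One memset at process start replaces per-field initialization. */
    memset(&BgWriterStats, 0, sizeof(BgWriterStats));
}

static void checkpoint_requested_event(void)
{
    /* Bump in place; the whole struct is later sent as one message. */
    BgWriterStats.m_requested_checkpoints++;
}
```

Keeping the counters in message layout is the design choice the patch comment describes: the send is a single copy of the struct into a stats message, with no field-by-field marshalling.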
Re: [PATCHES] Make CLUSTER MVCC-safe
Your patch has been added to the PostgreSQL unapplied patches list at: http://momjian.postgresql.org/cgi-bin/pgpatches It will be applied as soon as one of the PostgreSQL committers reviews and approves it. --- Heikki Linnakangas wrote: Here's an update, fixing a conflict caused by Tom's recent commit of Simon's patch to skip WAL-inserts when archiving is not enabled. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [PATCHES] bgwriter stats
Magnus Hagander [EMAIL PROTECTED] writes: Attached is a new version of the bgwriter stats patch. Per previous discussion, now uses the stats system only. Introduces a new stats message for bgwriter, and also introduces a global stats part of the stats file for statistics not bound to a database or table. It looks sane in a quick once-over, but the comments need some love --- too many typos and misspellings for my taste. regards, tom lane
Re: [PATCHES] bgwriter stats
Tom Lane wrote: Magnus Hagander [EMAIL PROTECTED] writes: Attached is a new version of the bgwriter stats patch. Per previous discussion, now uses the stats system only. Introduces a new stats message for bgwriter, and also introduces a global stats part of the stats file for statistics not bound to a database or table. It looks sane in a quick once-over, but the comments need some love --- too many typos and misspellings for my taste. Ok. I'll be sure to work them over before I commit. Thanks for the quick check. //Magnus
Re: [PATCHES] LIMIT/SORT optimization
Updated patch attached: 1) Removes #if 0 optimizations 2) Changes #if 0 to #if NOT_USED for code that's there for completeness and to keep the code self-documenting, but isn't needed by anything currently 3) Fixed cost model to represent bounded sorts sort-limit-v7.patch.gz Description: Binary data Gregory Stark [EMAIL PROTECTED] writes: Heikki Linnakangas [EMAIL PROTECTED] writes: There's a few blocks of code surrounded with #if 0 - #endif. Are those just leftovers that should be removed, or are they things that still need to be finished and enabled? Uhm, I don't remember, will go look, thanks. Ok, they were leftover code from an optimization that I decided wasn't very important to pursue. The code that was ifdef'd out detected when disk sorts could abort a disk sort merge because it had already generated enough tuples to satisfy the limit. But I never wrote the code to actually abort the run and it looks a bit tricky. In any case the disk sort use case is extremely narrow; you would need something like LIMIT 5 or more to do it and it would have to be an input table huge enough to cause multiple rounds of merges. I think I've figured out how to adjust the cost model. It turns out that it doesn't usually matter whether the cost model is correct, since any case where the optimization kicks in is a case where you're reading a small portion of the input, so it's a case where an index would be *much* better if available. So the only times the optimization is used is when there's no index available. Nonetheless it's nice to get the estimates right so that higher levels in the plan get reasonable values. I think I figured out how to do the cost model. At least the results are reasonable. I'm not sure if I've done it the right way though. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
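The bounded-sort idea behind the patch can be sketched with a toy top-k routine: when the executor knows only the first `bound` rows of the sorted output are wanted, it can keep a working set of that size instead of sorting all input. The real patch does this inside tuplesort (and with a proper cost model), so this insertion-based version is only an illustration of the principle:

```c
#include <assert.h>

/* Toy bounded sort: keep the `bound` smallest values of `input` in
 * ascending order in `out`.  Values that cannot make the top `bound`
 * are discarded immediately, which is the whole point of the
 * optimization: memory and comparisons scale with the bound, not with
 * the input size. */
static void bounded_sort(const int *input, int n, int *out, int bound)
{
    int held = 0;               /* how many of `out` are filled */

    for (int i = 0; i < n; i++)
    {
        int v = input[i];

        /* Working set full and v is no better than the current worst:
         * skip without touching the working set. */
        if (held == bound && v >= out[held - 1])
            continue;
        if (held < bound)
            held++;

        /* Insert v into the sorted prefix, shifting larger values
         * right (the old worst value falls off the end when full). */
        int j = held - 1;
        while (j > 0 && out[j - 1] > v)
        {
            out[j] = out[j - 1];
            j--;
        }
        out[j] = v;
    }
}
```

This also makes the cost-model remark concrete: the work done is roughly one comparison per input row plus occasional small insertions, so a bounded sort of a big input is far cheaper than a full sort, and the planner's estimate should reflect that.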
[PATCHES] tsearch_core patch for inclusion
http://www.sigaev.ru/misc/tsearch_core-0.43.gz Changes: 1 Ispell dictionary now supports the hunspell dictionary format which is used by OpenOffice >= 2.0.2 http://wiki.services.openoffice.org/wiki/Dictionaries The format changes are addressed, basically, to better support compound words (German, Norwegian). So please test it - we don't know those languages at all. 2 added recent fixes of contrib/tsearch2 3 fix usage of fopen/fclose -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/
Re: [PATCHES] [PATCH] add CLUSTER table USING index (take 2)
FYI, this is a great example of valuable patch review. --- Heikki Linnakangas wrote: We still need to document the old syntax, especially if we don't change the example as well. Is the placement of opt_cluster_using completely arbitrary? I'm not very familiar with the parser; it really looks like those type definitions are in random order. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com
Re: [PATCHES] [HACKERS] tsearch_core patch for inclusion
Your patch has been added to the PostgreSQL unapplied patches list at: http://momjian.postgresql.org/cgi-bin/pgpatches It will be applied as soon as one of the PostgreSQL committers reviews and approves it. --- Teodor Sigaev wrote: http://www.sigaev.ru/misc/tsearch_core-0.43.gz -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com
Re: [PATCHES] [PATCH] add CLUSTER table USING index (take 2)
FYI, this is a great example of valuable patch review. It would have been better if the TODO entry had been right :-)
Re: [PATCHES] [PATCH] add CLUSTER table USING index (take 2)
We still need to document the old syntax, especially if we don't change the example as well. I agree that the example should be re-written. But I'm not sure if I need to have a paragraph about the old syntax. There are two reasons: - I haven't seen any other SQL command where an old syntax was documented - I thought I could get away without writing doc. After all, I'm not a native English speaker. That's a point where I could need some help ... (maybe my English is good enough, but it's not worth making a take-4 through take-17 patch just for English grammar, typos, subtle meanings, whatever.) Index: src/src/backend/parser/gram.y === *** src.orig/src/backend/parser/gram.y 2007-03-28 22:58:48.0 +0200 --- src/src/backend/parser/gram.y 2007-03-28 22:59:15.0 +0200 *** *** 209,215 %type <str> relation_name copy_file_name database_name access_method_clause access_method attr_name ! index_name name file_name %type <list> func_name handler_name qual_Op qual_all_Op subquery_Op opt_class opt_validator --- 209,215 %type <str> relation_name copy_file_name database_name access_method_clause access_method attr_name ! index_name name file_name opt_cluster_using %type <list> func_name handler_name qual_Op qual_all_Op subquery_Op opt_class opt_validator Is the placement of opt_cluster_using completely arbitrary? I'm not very familiar with the parser; it really looks like those type definitions are in random order. I thought so. As you can see in the above patch, there are things like opt_validator in the next %type <list> section. There are many other %type <str> sections in gram.y, but I haven't found a structure yet. For example, some tokens are named OptSchemaName, some are named opt_encoding. Let's look at this one. It's used in line 1090, defined in 1218. Before and after the usage there are transaction_mode_list and ColId_or_Sconst. Before and after the definition are zone_value and again ColId_or_Sconst. But none of these three is defined at the same %type <str> as opt_encoding is. 
---(end of broadcast)--- TIP 6: explain analyze is your friend
[PATCHES] [PATCH] add CLUSTER table USING index (take 3)
SGML ref text (swapped parameter list, changed example text). Also, I noticed that the text of the example spoke about a table "employees", but the example used the table "emp". I fixed this inconsistency.

Index: src/doc/src/sgml/ref/cluster.sgml
===================================================================
*** src.orig/doc/src/sgml/ref/cluster.sgml	2007-03-28 23:03:20.0 +0200
--- src/doc/src/sgml/ref/cluster.sgml	2007-03-29 21:32:26.0 +0200
***************
*** 20,27 ****
   <refsynopsisdiv>
  <synopsis>
! CLUSTER <replaceable class="PARAMETER">indexname</replaceable> ON <replaceable class="PARAMETER">tablename</replaceable>
! CLUSTER <replaceable class="PARAMETER">tablename</replaceable>
  CLUSTER
  </synopsis>
  </refsynopsisdiv>
--- 20,26 ----
   <refsynopsisdiv>
  <synopsis>
! CLUSTER <replaceable class="PARAMETER">tablename</replaceable> [ USING <replaceable class="PARAMETER">indexname</replaceable> ]
  CLUSTER
  </synopsis>
  </refsynopsisdiv>
***************
*** 77,95 ****
   <variablelist>
    <varlistentry>
!    <term><replaceable class="PARAMETER">indexname</replaceable></term>
     <listitem>
      <para>
!      The name of an index.
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
!    <term><replaceable class="PARAMETER">tablename</replaceable></term>
     <listitem>
      <para>
!      The name (possibly schema-qualified) of a table.
      </para>
     </listitem>
    </varlistentry>
--- 76,94 ----
   <variablelist>
    <varlistentry>
!    <term><replaceable class="PARAMETER">tablename</replaceable></term>
     <listitem>
      <para>
!      The name (possibly schema-qualified) of a table.
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
!    <term><replaceable class="PARAMETER">indexname</replaceable></term>
     <listitem>
      <para>
!      The name of an index.
      </para>
     </listitem>
    </varlistentry>
***************
*** 172,180 ****
   <para>
    Cluster the table <literal>employees</literal> on the basis of
!   its index <literal>emp_ind</literal>:
  <programlisting>
! CLUSTER emp_ind ON emp;
  </programlisting>
   </para>
--- 171,179 ----
   <para>
    Cluster the table <literal>employees</literal> on the basis of
!   its index <literal>employees_ind</literal>:
  <programlisting>
! CLUSTER employees USING employees_ind;
  </programlisting>
   </para>
***************
*** 182,188 ****
   Cluster the <literal>employees</literal> table using the same
   index that was used before:
  <programlisting>
! CLUSTER emp;
  </programlisting>
   </para>
--- 181,187 ----
   Cluster the <literal>employees</literal> table using the same
   index that was used before:
  <programlisting>
! CLUSTER employees;
  </programlisting>
   </para>

Index: src/src/backend/parser/gram.y
===================================================================
*** src.orig/src/backend/parser/gram.y	2007-03-28 23:03:20.0 +0200
--- src/src/backend/parser/gram.y	2007-03-28 23:03:35.0 +0200
***************
*** 209,215 ****
  %type <str>	relation_name copy_file_name
  		database_name access_method_clause access_method attr_name
! 		index_name name file_name

  %type <list>	func_name handler_name qual_Op qual_all_Op subquery_Op
  		opt_class opt_validator
--- 209,215 ----
  %type <str>	relation_name copy_file_name
  		database_name access_method_clause access_method attr_name
! 		index_name name file_name opt_cluster_using

  %type <list>	func_name handler_name qual_Op qual_all_Op subquery_Op
  		opt_class opt_validator
***************
*** 5327,5332 ****
--- 5327,5333 ----
   *
   * QUERY:
   *	cluster index_name on qualified_name
+  *	cluster qualified_name USING index_name
   *	cluster qualified_name
   *	cluster
   *
***************
*** 5340,5350 ****
  				n->indexname = $2;
  				$$ = (Node*)n;
  			}
! 			| CLUSTER qualified_name
  			{
  				ClusterStmt *n = makeNode(ClusterStmt);
  				n->relation = $2;
! 				n->indexname = NULL;
  				$$ = (Node*)n;
  			}
  			| CLUSTER
--- 5341,5351 ----
  				n->indexname = $2;
  				$$ = (Node*)n;
  			}
! 			| CLUSTER qualified_name opt_cluster_using
  			{
  				ClusterStmt *n = makeNode(ClusterStmt);
  				n->relation = $2;
! 				n->indexname = $3;
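For quick reference, here is how the syntax forms touched by this patch would be exercised, sketched against the employees table and employees_ind index used in the updated example (the old form is retained by the grammar):

```sql
-- New syntax added by the patch:
CLUSTER employees USING employees_ind;

-- Re-cluster using the index chosen by a previous CLUSTER:
CLUSTER employees;

-- Re-cluster all previously clustered tables:
CLUSTER;

-- Old syntax, still accepted:
CLUSTER employees_ind ON employees;
```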
Re: [PATCHES] [HACKERS] Full page writes improvement, code update
Simon,

OK, different question: Why would anyone ever set full_page_compress = off?

The only reason I can see is if compression costs us CPU but gains RAM and I/O. I can think of a lot of applications ... benchmarks included ... which are CPU-bound but not RAM- or I/O-bound. For those applications, compression is a bad tradeoff.

If, however, the CPU used for compression is made up elsewhere through smaller file processing, then I'd agree that we don't need a switch.

--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco
Re: [PATCHES] [HACKERS] Full page writes improvement, code update
On Thu, 2007-03-29 at 11:45 -0700, Josh Berkus wrote:

OK, different question: Why would anyone ever set full_page_compress = off? The only reason I can see is if compression costs us CPU but gains RAM and I/O. I can think of a lot of applications ... benchmarks included ... which are CPU-bound but not RAM- or I/O-bound. For those applications, compression is a bad tradeoff. If, however, the CPU used for compression is made up elsewhere through smaller file processing, then I'd agree that we don't need a switch.

Koichi-san has explained things for me now. I misunderstood what the parameter did and, reading your post, ISTM you have as well. I do hope Koichi-san will alter the name to allow everybody to understand what it does.

-- 
Simon Riggs
EnterpriseDB   http://www.enterprisedb.com
Re: [PATCHES] [PATCH] add CLUSTER table USING index (take 2)
Holger Schurig wrote:

FYI, this is a great example of valuable patch review.

It would have been better if the TODO entry had been right :-)

It was right when I wrote it. ;-)  I have updated it now.

-- 
Bruce Momjian   [EMAIL PROTECTED]
http://momjian.us
EnterpriseDB    http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [PATCHES] DEALLOCATE ALL
Marko Kreen wrote:

Then a pooler argument: there is one pooler where RandomJoe executes queries, and another for a specific app where the subset of SQL it uses is known. I want to RESET only specific things in the app case. So it would be good if the RESETs for specific areas were available.

Also, the objections to Hans' patch give the impression that different pooling solutions want different RESET EVERYTHING behavior, so again, it would be good if RESETs for different areas were available and the all-encompassing RESET EVERYTHING just tied all the specific RESETs together.

Totally agree. Please make the adjustments to DEALLOCATE ALL, roll that into the 2004 patch for RESET SESSION, and post an updated version. Thanks.

-- 
Bruce Momjian   [EMAIL PROTECTED]
http://momjian.us
EnterpriseDB    http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [PATCHES] [PATCH] add CLUSTER table USING index (take 3)
Your patch has been added to the PostgreSQL unapplied patches list at:

	http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews and approves it.

---------------------------------------------------------------------------

Holger Schurig wrote:

SGML ref text (swapped parameter list, changed example text). Also, I noticed that the text of the example spoke about a table "employees", but the example used the table "emp". I fixed this inconsistency.

[ Quoted patch omitted; it is identical to the take 3 posting above. ]
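The quoted diffs break off before the definition of opt_cluster_using itself. Judging from its use (as $3 it supplies the index name, or NULL when absent) and its place in the %type <str> list, a plausible sketch of the production is the following; this is a reconstruction, not the actual patch text:

```
opt_cluster_using:
			USING index_name		{ $$ = $2; }
			| /* EMPTY */			{ $$ = NULL; }
		;
```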
Re: [PATCHES] DEALLOCATE ALL
Marko Kreen wrote:

When pooling connections where prepared statements are in use, it is hard to give a new client a totally clean connection, as there may be allocated statements that give errors when the new client starts preparing statements again.

I agree with the other comments that RESET SESSION is the right API for this, although we can also provide DEALLOCATE ALL, I suppose.

As to the implementation, calling hash_remove() in a loop seems a pretty unfortunate way to clear a hash table -- adding a hash_reset() function to the dynahash API would be cleaner and faster.

-Neil
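To make the pooler scenario concrete, the cleanup the patch enables might look like this; the statement name and table are hypothetical, and RESET SESSION is only a proposal at this point:

```sql
-- Client A, on a pooled connection:
PREPARE fetch_user (int) AS SELECT * FROM users WHERE id = $1;
EXECUTE fetch_user(42);

-- Pooler, before handing the same backend connection to client B:
DEALLOCATE ALL;  -- drops fetch_user and any other prepared statements,
                 -- so client B can PREPARE the same names without errors
```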
Re: [PATCHES] Small code clean-up
I have applied the initdb part of this patch, and Magnus did the win32_shmem part. Thanks.

---------------------------------------------------------------------------

ITAGAKI Takahiro wrote:

Here are two small code clean-ups in initdb and win32_shmem. pg_char_to_encoding() was redundant in initdb, because pg_valid_server_encoding() returns the same result if the encoding is valid. The changes in win32_shmem suppress the following warnings:

| pg_shmem.c: In function `PGSharedMemoryCreate':
| pg_shmem.c:137: warning: long unsigned int format, Size arg (arg 2)
| pg_shmem.c:159: warning: long unsigned int format, Size arg (arg 2)

*** initdb.c.orig	Mon Mar 19 01:50:43 2007
--- initdb.c	Wed Mar 28 10:02:42 2007
*** get_encoding_id(char *encoding_name)
*** 709,716 ****
  	if (encoding_name && *encoding_name)
  	{
! 		if ((enc = pg_char_to_encoding(encoding_name)) >= 0 &&
! 			pg_valid_server_encoding(encoding_name) >= 0)
  			return encodingid_to_string(enc);
  	}
  	fprintf(stderr, _("%s: \"%s\" is not a valid server encoding name\n"),
--- 709,715 ----
  	if (encoding_name && *encoding_name)
  	{
! 		if ((enc = pg_valid_server_encoding(encoding_name)) >= 0)
  			return encodingid_to_string(enc);
  	}
  	fprintf(stderr, _("%s: \"%s\" is not a valid server encoding name\n"),

*** win32_shmem.c.orig	Wed Mar 28 10:17:14 2007
--- win32_shmem.c	Wed Mar 28 10:16:36 2007
*** PGSharedMemoryCreate(Size size, bool mak
*** 136,142 ****
  	if (!hmap)
  		ereport(FATAL,
  				(errmsg("could not create shared memory segment: %lu", GetLastError()),
! 				 errdetail("Failed system call was CreateFileMapping(size=%lu, name=%s)",
  						   size, szShareMem)));

  	/*
  	 * If the segment already existed, CreateFileMapping() will return a
--- 136,142 ----
  	if (!hmap)
  		ereport(FATAL,
  				(errmsg("could not create shared memory segment: %lu", GetLastError()),
! 				 errdetail("Failed system call was CreateFileMapping(size=%lu, name=%s)",
  						   (unsigned long) size, szShareMem)));

  	/*
  	 * If the segment already existed, CreateFileMapping() will return a
*** PGSharedMemoryCreate(Size size, bool mak
*** 158,164 ****
  	if (!hmap)
  		ereport(FATAL,
  				(errmsg("could not create shared memory segment: %lu", GetLastError()),
! 				 errdetail("Failed system call was CreateFileMapping(size=%lu, name=%s)",
  						   size, szShareMem)));

  	if (GetLastError() == ERROR_ALREADY_EXISTS)
  		ereport(FATAL,
--- 158,164 ----
  	if (!hmap)
  		ereport(FATAL,
  				(errmsg("could not create shared memory segment: %lu", GetLastError()),
! 				 errdetail("Failed system call was CreateFileMapping(size=%lu, name=%s)",
  						   (unsigned long) size, szShareMem)));

  	if (GetLastError() == ERROR_ALREADY_EXISTS)
  		ereport(FATAL,

Regards,

---
ITAGAKI Takahiro
NTT Open Source Software Center

-- 
Bruce Momjian   [EMAIL PROTECTED]
http://momjian.us
EnterpriseDB    http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [PATCHES] Patch for circular buffer in tuplestore to optimize merge joins (v1)
Via IM, the author says it is ready.

Your patch has been added to the PostgreSQL unapplied patches list at:

	http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews and approves it.

---------------------------------------------------------------------------

stark wrote:

This patch implements a circular buffer in tuplestore which drops old tuples as they're no longer needed. It uses this for merge joins to avoid having to spill the tuplestore if no single value exceeds work_mem. It is also what's needed for both recursive query support and OLAP window functions (hence why it implements the more complex circular buffer rather than just moving the single tuple up to the head of the buffer).

This was mostly already done by Simon; I just finished the logic in tuplesort.c. This is actually not quite polished, so I guess it's still a WIP, but it's certainly ready to be reviewed. All that remains is polishing. If there's anything in there people object to, I would like to know now.

[ Attachment, skipping... ]

-- 
Gregory Stark
EnterpriseDB   http://www.enterprisedb.com

-- 
Bruce Momjian   [EMAIL PROTECTED]
http://momjian.us
EnterpriseDB    http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [PATCHES] [PATCH] add CLUSTER table USING index (take 2)
Holger Schurig [EMAIL PROTECTED] writes:

I agree that the example should be re-written. But I'm not sure if I need to have a paragraph about the old syntax. There are two reasons: I haven't seen any other SQL command where an old syntax was documented.

If we were deprecating the old syntax with the intention of removing it, that might be a defensible position, but I didn't think we were doing that. IMHO both forms seriously do need to be documented, so that people will understand that the index/table order is different. Otherwise there'll be enormous confusion.

I thought I could get away without writing documentation. After all, I'm not a native English speaker. That's a point where I could use some help ... (maybe my English is good enough, but it's not worth making a take 4 through take 17 patch just for English grammar, typos, subtle meanings, whatever).

Your English seems fine to me, certainly more than good enough to produce first-draft documentation. Whoever reviews/commits it will help out as needed.

Is the placement of opt_cluster_using completely arbitrary? I'm not very familiar with the parser; it really looks like those type definitions are in random order. I thought so.

Yeah, it's just a mess :-(. Somebody might go through and sort them into alphabetical order or something someday, but not today.

			regards, tom lane
Re: [PATCHES] Current enums patch
Your patch has been added to the PostgreSQL unapplied patches list at:

	http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews and approves it.

---------------------------------------------------------------------------

Tom Dunstan wrote:

Hi all,

Here's the current version of the enums patch. Not much change from last time; the only thought-inducing stuff was fixing up some macros that changed with the VARLENA changes, and adding a regression test to do basic checking of RI behavior, after the discussions we had recently on the ri_trigger stuff with generic types. The actual behavior was fixed by Tom's earlier patch, so this is just a sanity check.

Cheers

Tom

[ application/x-gzip is not supported, skipping... ]

-- 
Bruce Momjian   [EMAIL PROTECTED]
http://momjian.us
EnterpriseDB    http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [PATCHES] [HACKERS] Full page writes improvement, code update
Josh;

I'd like to explain again what the term "compression" in my proposal means, and to show the resource consumption comparison with cp and gzip.

My proposal is to remove unnecessary full page writes (they are needed in crash recovery from inconsistent or partial writes) when we copy WAL to the archive log, and to rebuild them as dummies when we restore from the archive log. The dummy is needed to maintain the LSN. So it is very, very different from general-purpose compression such as gzip, although pg_compresslog compresses the archive log as a result.

As to CPU and I/O consumption, I've already evaluated it as follows:

1) Collect all the WAL segments.
2) Copy them by different means (cp, pg_compresslog and gzip) and compare the elapsed time as well as other resource consumption.

Benchmark: DBT-2
Database size: 120WH (12.3GB)
Total WAL size: 4.2GB (after 60 min. run)

Elapsed time:
  cp: 120.6 sec
  gzip: 590.0 sec
  pg_compresslog: 79.4 sec

Resultant archive log size:
  cp: 4.2GB
  gzip: 2.2GB
  pg_compresslog: 0.3GB

Resource consumption:
  cp:             user: 0.5 sec    system: 15.8 sec  idle: 16.9 sec   I/O wait: 87.7 sec
  gzip:           user: 286.2 sec  system: 8.6 sec   idle: 260.5 sec  I/O wait: 36.0 sec
  pg_compresslog: user: 7.9 sec    system: 5.5 sec   idle: 37.8 sec   I/O wait: 28.4 sec

Because the resultant log size is considerably smaller than with cp or gzip, pg_compresslog needs much less I/O, and because its logic is much simpler than gzip's, it does not consume much CPU. The term "compress" may not be appropriate; we may call this log "optimization" instead. So I don't see any reason why this (at least an optimization mark in each log record) can't be integrated.

Simon Riggs wrote:

On Thu, 2007-03-29 at 11:45 -0700, Josh Berkus wrote:

OK, different question: Why would anyone ever set full_page_compress = off? The only reason I can see is if compression costs us CPU but gains RAM and I/O. I can think of a lot of applications ... benchmarks included ... which are CPU-bound but not RAM- or I/O-bound. For those applications, compression is a bad tradeoff.

If, however, the CPU used for compression is made up elsewhere through smaller file processing, then I'd agree that we don't need a switch.

As I wrote in reply to Simon's comment, I have only one concern. Without a switch, because both the full page writes and the corresponding logical log are included in WAL, this will increase WAL size slightly (maybe about five percent or so). If everybody is happy with this, we don't need a switch.

Koichi-san has explained things for me now. I misunderstood what the parameter did and, reading your post, ISTM you have as well. I do hope Koichi-san will alter the name to allow everybody to understand what it does.

Here are some candidates:

full_page_writes_optimize
full_page_writes_mark: means it marks each full_page_write as "needed in crash recovery", "needed in archive recovery" and so on.

I don't insist on these names. It would be very helpful if you have any suggestion that reflects what it really means.

Regards;

-- 
Koichi Suzuki
Re: [PATCHES] Make CLUSTER MVCC-safe
Heikki Linnakangas wrote:

Here's an update, fixing a conflict with Tom's recent commit of Simon's patch to skip WAL inserts when archiving is not enabled.

Hmm, wouldn't it be better if the rewriteheap.c file were in access/heap/ instead of commands/?

-- 
Alvaro Herrera                         http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [PATCHES] [HACKERS] Full page writes improvement, code update
Hi,

Here's a patch reflecting some of Simon's comments:

1) Removed an elog call in a critical section.
2) Changed the names of the commands to pg_compresslog and pg_decompresslog.
3) Changed the diff options used to make the patch.

-- 
Koichi Suzuki

pg_lesslog.tgz
Description: Binary data