Re: [HACKERS] Inserting heap tuples in bulk in COPY
On 08/08/12 21:34, Robert Haas wrote: I think we need to implement buffering both to end of statement or end of transaction, not just one or the other. Another (not necessarily better) idea is to use a buffer that's part of the index, like the GIN fastupdate stuff, so that there's no particular constraint on when the buffer has to be flushed, but competing index scans may be slower until it is.

Whether it is an implementation artifact or a result of this approach I don't know, but currently, when the GIN fastupdate code finally decides to flush the buffer, it is going to stall all other processes doing updates while doing it. If you only have one update process then this doesn't matter. But if you're trying to get user-interactive updates to flow in with batch updates from background processes, then you'd better kill off this feature, since you're guaranteed that the user-interactive process is either going to flush the buffer or wait on someone else doing it.

I haven't done the benchmarking, but I'm actually fairly sure that fastupdate isn't overall faster if you bump concurrency slightly and run on memory- or SSD-based backends, due to this cross-backend contention of the buffer. A buffer that is backend-local, so you can use transactions to batch up changes, would get around this, but that may have another huge set of consequences I don't know of. ... based on my own real-world experience with this feature.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inserting heap tuples in bulk in COPY
On Thu, Aug 9, 2012 at 2:59 AM, Jesper Krogh jes...@krogh.cc wrote: Whether it is an implementation artifact or a result of this approach I don't know, but currently, when the GIN fastupdate code finally decides to flush the buffer, it is going to stall all other processes doing updates while doing it. If you only have one update process then this doesn't matter. But if you're trying to get user-interactive updates to flow in with batch updates from background processes, then you'd better kill off this feature, since you're guaranteed that the user-interactive process is either going to flush the buffer or wait on someone else doing it. I haven't done the benchmarking, but I'm actually fairly sure that fastupdate isn't overall faster if you bump concurrency slightly and run on memory- or SSD-based backends, due to this cross-backend contention of the buffer.

Yeah, I've noticed that there are some things that are a little wonky about GIN fastupdate. On the other hand, I believe that MySQL has something along these lines called secondary index buffering which apparently does very good things for random I/O. I am not sure of the details or the implementation, though.

A buffer that is backend-local, so you can use transactions to batch up changes, would get around this, but that may have another huge set of consequences I don't know of. ... based on my own real-world experience with this feature.

Well, the main thing to worry about is transactional consistency. If a backend which has postponed doing the index-inserts does an index scan after the command-counter-id has been bumped, it'll see inconsistent results. We could avoid that by only using the optimization when some set of sanity checks passes and doing the deferred inserts at the end of the statement, or something like that. The other tricky part is figuring out how to actually get a performance improvement out of it.
I think Simon's probably right that a lot of the cost is in repeatedly walking the btree, looking up and pinning/unpinning/locking/unlocking buffers along the way. Maybe we could sort the data in index order, walk down to the first insertion point, and then insert as many tuples in a row as precede the next key in the index. Then lather, rinse, repeat. If you're actually just adding everything at the tail of the index, this ought to work pretty well. But if the inserts are all over the place it seems like it might not be any better, or actually a little worse. Of course it's probably premature to speculate too much until someone actually codes something up and tests it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Inserting heap tuples in bulk in COPY
On 8 August 2012 03:44, Jeff Janes jeff.ja...@gmail.com wrote: On Tue, Aug 7, 2012 at 1:52 PM, Simon Riggs si...@2ndquadrant.com wrote: On 7 August 2012 20:58, Jeff Janes jeff.ja...@gmail.com wrote: Hi Heikki, Is the bulk index insert still an active area for you? If not, is there some kind of summary of design or analysis work already done, which would be useful for someone else interested in picking it up?

The main cost comes from repeated re-seeking down the index tree to find the insertion point, but repeated lock and pin operations on index buffers are also high. That is optimisable if the index inserts are grouped, as they will be with monotonic indexes (e.g. SERIAL), or with partial monotonic (i.e. with a Parent/Child relationship, on the Child table many inserts will be of the form (x,1), (x,2), (x,3) etc., e.g. Order/OrderLine). All we need do is buffer the inserts before inserting them into the main index. As long as we flush the buffer before end of transaction, we're good.

Incidentally, we can also optimise repeated inserts within a normal transaction using this method, by implementing deferred unique constraints. At present we say that unique constraints aren't deferrable, but there's no reason they can't be, if we allow buffering in the index. (Implementing a deferred constraint that won't fail if we do UPDATE foo SET pk = pk+1 requires a buffer the size of foo, which is probably a bad plan, plus you'd need to sort the inputs, so that particular nut is another issue altogether, AFAICS.)

Suggested implementation is to buffer index tuples at the generic index API, then implement a bulk insert index API call that can then be implemented for each AM separately. Suggested buffer size is work_mem. Note we must extract index tuples from heap tuples - refetching heap blocks to get rows is too costly.

OK, thanks for the summary. It looks like your plans are to improve the case where the index maintenance is already CPU limited.
I was more interested in cases where the regions of the index(es) undergoing active insertion do not fit into usable RAM, so the limit is the random IO needed to fetch the leaf pages in order to update them or to write them out once dirtied. Here too buffering is probably the answer, but the size of the buffer needed to turn that IO from effectively random to effectively sequential is probably much larger than the size of the buffer you are contemplating.

The buffer size can be variable, yes. I was imagining a mechanism that worked for normal INSERTs as well as COPY. Perhaps we could say the buffer is work_mem with INSERT and maintenance_work_mem with COPY. Very large index appends are useful, but currently not very easily usable because of the transactional nature of COPY. If we could reject rows without ERROR it would be more practical. I'm not planning to work on this, so all comments are for your assistance.

I think we need to implement buffering both to end of statement or end of transaction, not just one or the other. With the planner deciding which would be better, or explicit user action?

Probably both: on/off/choose. Deferring the unique check would change the point at which errors were reported in a transaction, which might not be desirable for some. I think the SQL standard has something to say about this also, so that needs care. But in general, if your tables use generated PK values they should be able to benefit from this, so I would suggest a default setting of choose. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Inserting heap tuples in bulk in COPY
On Tue, Aug 7, 2012 at 4:52 PM, Simon Riggs si...@2ndquadrant.com wrote: Incidentally, we can also optimise repeated inserts within a normal transaction using this method, by implementing deferred unique constraints. At present we say that unique constraints aren't deferrable, but there's no reason they can't be, if we allow buffering in the index. (Implementing a deferred constraint that won't fail if we do UPDATE foo SET pk = pk+1 requires a buffer the size of foo, which is probably a bad plan, plus you'd need to sort the inputs, so that particular nut is another issue altogether, AFAICS). We've had deferrable unique constraints since 9.0, courtesy of Dean Rasheed. I think we need to implement buffering both to end of statement or end of transaction, not just one or the other. Another (not necessarily better) idea is to use a buffer that's part of the index, like the GIN fastupdate stuff, so that there's no particular constraint on when the buffer has to be flushed, but competing index scans may be slower until it is. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Inserting heap tuples in bulk in COPY
On 8 August 2012 20:34, Robert Haas robertmh...@gmail.com wrote: On Tue, Aug 7, 2012 at 4:52 PM, Simon Riggs si...@2ndquadrant.com wrote: Incidentally, we can also optimise repeated inserts within a normal transaction using this method, by implementing deferred unique constraints. At present we say that unique constraints aren't deferrable, but there's no reason they can't be, if we allow buffering in the index. (Implementing a deferred constraint that won't fail if we do UPDATE foo SET pk = pk+1 requires a buffer the size of foo, which is probably a bad plan, plus you'd need to sort the inputs, so that particular nut is another issue altogether, AFAICS). We've had deferrable unique constraints since 9.0, courtesy of Dean Rasheed.

Yeh, but IIRC there was some issue, which I can't seem to find detail on, about it not working quite right in production. Not sure now.

I think we need to implement buffering both to end of statement or end of transaction, not just one or the other. Another (not necessarily better) idea is to use a buffer that's part of the index, like the GIN fastupdate stuff, so that there's no particular constraint on when the buffer has to be flushed, but competing index scans may be slower until it is.

I think that works very well for non-unique indexes, though it does increase WAL traffic since you need to do a double hop into the index. It's not possible for unique constraints/pk indexes, since they need to fail the transaction in case of duplicates. The buffer I was imagining would be a private buffer within a transaction, so wouldn't increase WAL. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Inserting heap tuples in bulk in COPY
On Fri, Aug 12, 2011 at 2:59 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 13.08.2011 00:17, Simon Riggs wrote: Also, we discussed that you would work on buffering the index inserts, which is where the main problem lies. The main heap is only a small part of the overhead if we have multiple indexes already built on a table - which is the use case that causes the most problem. Sure, if you have indexes on the table already, then the overhead of updating them is more significant. I am actually working on that too, I will make a separate post about that.

Hi Heikki, Is the bulk index insert still an active area for you? If not, is there some kind of summary of design or analysis work already done, which would be useful for someone else interested in picking it up? Thanks, Jeff
Re: [HACKERS] Inserting heap tuples in bulk in COPY
On 7 August 2012 20:58, Jeff Janes jeff.ja...@gmail.com wrote: On Fri, Aug 12, 2011 at 2:59 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 13.08.2011 00:17, Simon Riggs wrote: Also, we discussed that you would work on buffering the index inserts, which is where the main problem lies. The main heap is only a small part of the overhead if we have multiple indexes already built on a table - which is the use case that causes the most problem. Sure, if you have indexes on the table already, then the overhead of updating them is more significant. I am actually working on that too, I will make a separate post about that. Hi Heikki, Is the bulk index insert still an active area for you? If not, is there some kind of summary of design or analysis work already done, which would be useful for someone else interested in picking it up?

The main cost comes from repeated re-seeking down the index tree to find the insertion point, but repeated lock and pin operations on index buffers are also high. That is optimisable if the index inserts are grouped, as they will be with monotonic indexes (e.g. SERIAL), or with partial monotonic (i.e. with a Parent/Child relationship, on the Child table many inserts will be of the form (x,1), (x,2), (x,3) etc., e.g. Order/OrderLine). All we need do is buffer the inserts before inserting them into the main index. As long as we flush the buffer before end of transaction, we're good.

Incidentally, we can also optimise repeated inserts within a normal transaction using this method, by implementing deferred unique constraints. At present we say that unique constraints aren't deferrable, but there's no reason they can't be, if we allow buffering in the index. (Implementing a deferred constraint that won't fail if we do UPDATE foo SET pk = pk+1 requires a buffer the size of foo, which is probably a bad plan, plus you'd need to sort the inputs, so that particular nut is another issue altogether, AFAICS.)
Suggested implementation is to buffer index tuples at the generic index API, then implement a bulk insert index API call that can then be implemented for each AM separately. Suggested buffer size is work_mem. Note we must extract index tuple from heap tuples - refetching heap blocks to get rows is too costly. I think we need to implement buffering both to end of statement or end of transaction, not just one or the other. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Inserting heap tuples in bulk in COPY
On Tue, Aug 7, 2012 at 1:52 PM, Simon Riggs si...@2ndquadrant.com wrote: On 7 August 2012 20:58, Jeff Janes jeff.ja...@gmail.com wrote: Hi Heikki, Is the bulk index insert still an active area for you? If not, is there some kind of summary of design or analysis work already done, which would be useful for someone else interested in picking it up? The main cost comes from repeated re-seeking down the index tree to find the insertion point, but repeated lock and pin operations on index buffers are also high. That is optimisable if the index inserts are grouped, as they will be with monotonic indexes (e.g. SERIAL), or with partial monotonic (i.e. with a Parent/Child relationship, on the Child table many inserts will be of the form (x,1), (x,2), (x,3) etc., e.g. Order/OrderLine). All we need do is buffer the inserts before inserting them into the main index. As long as we flush the buffer before end of transaction, we're good. Incidentally, we can also optimise repeated inserts within a normal transaction using this method, by implementing deferred unique constraints. At present we say that unique constraints aren't deferrable, but there's no reason they can't be, if we allow buffering in the index. (Implementing a deferred constraint that won't fail if we do UPDATE foo SET pk = pk+1 requires a buffer the size of foo, which is probably a bad plan, plus you'd need to sort the inputs, so that particular nut is another issue altogether, AFAICS.) Suggested implementation is to buffer index tuples at the generic index API, then implement a bulk insert index API call that can then be implemented for each AM separately. Suggested buffer size is work_mem. Note we must extract index tuples from heap tuples - refetching heap blocks to get rows is too costly.

OK, thanks for the summary. It looks like your plans are to improve the case where the index maintenance is already CPU limited.
I was more interested in cases where the regions of the index(es) undergoing active insertion do not fit into usable RAM, so the limit is the random IO needed to fetch the leaf pages in order to update them or to write them out once dirtied. Here too buffering is probably the answer, but the size of the buffer needed to turn that IO from effectively random to effectively sequential is probably much larger than the size of the buffer you are contemplating. I think we need to implement buffering both to end of statement or end of transaction, not just one or the other. With the planner deciding which would be better, or explicit user action? Thanks, Jeff
Re: [HACKERS] Inserting heap tuples in bulk in COPY
On 25.11.2011 23:32, Jeff Janes wrote: On Fri, Nov 25, 2011 at 12:53 PM, Jeff Janes jeff.ja...@gmail.com wrote: Thanks for this patch. Doing bulk copies in parallel for me is now limited by the IO subsystem rather than the CPU. This patch, commit number d326d9e8ea1d69, causes fillfactor to be ignored for the copy command. Is this acceptable collateral damage? Having looked into it a little bit, I think this might be an acceptable fix.

Thanks, applied. It was an oversight. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Inserting heap tuples in bulk in COPY
On Mon, Oct 24, 2011 at 7:46 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Thanks! Here's an updated version of the patch, fixing that, and all the other issues pointed out this far. I extracted the code that sets oid and tuple headers, and invokes the toaster, into a new function that's shared by heap_insert() and heap_multi_insert(). Tom objected to merging heap_insert() and heap_multi_insert() into one complicated function, and I think he was right on that, but sharing this code to prepare a tuple still makes sense. IMHO it makes heap_insert() slightly more readable too.

Hi Heikki, Thanks for this patch. Doing bulk copies in parallel for me is now limited by the IO subsystem rather than the CPU. This patch, commit number d326d9e8ea1d69, causes fillfactor to be ignored for the copy command. Is this acceptable collateral damage? This can be seen by using pgbench -i -s50 -F50 to create the table, combined with select pg_size_pretty(pg_table_size('pgbench_accounts')), or by using the relation_free_space extension to pageinspect proposed elsewhere. Thanks, Jeff
Re: [HACKERS] Inserting heap tuples in bulk in COPY
On Fri, Nov 25, 2011 at 12:53 PM, Jeff Janes jeff.ja...@gmail.com wrote: Hi Heikki, Thanks for this patch. Doing bulk copies in parallel for me is now limited by the IO subsystem rather than the CPU. This patch, commit number d326d9e8ea1d69, causes fillfactor to be ignored for the copy command. Is this acceptable collateral damage?

Having looked into it a little bit, I think this might be an acceptable fix. Cheers, Jeff

[Attachment: bulkwal_copy_1.patch]
Re: [HACKERS] Inserting heap tuples in bulk in COPY
On 25.09.2011 16:03, Dean Rasheed wrote: On 25 September 2011 09:43, Kohei KaiGai kai...@kaigai.gr.jp wrote: Hi Heikki, I checked your patch, then I have a comment and two questions here. 2011/9/14 Heikki Linnakangas heikki.linnakan...@enterprisedb.com: Attached is a new version of the patch. It is now complete, including WAL replay code. Hi, I had a quick look at the patch as well and spotted an oversight: the multi-insert code branch fails to invoke AFTER ROW triggers.

Thanks! Here's an updated version of the patch, fixing that, and all the other issues pointed out this far. I extracted the code that sets oid and tuple headers, and invokes the toaster, into a new function that's shared by heap_insert() and heap_multi_insert(). Tom objected to merging heap_insert() and heap_multi_insert() into one complicated function, and I think he was right on that, but sharing this code to prepare a tuple still makes sense. IMHO it makes heap_insert() slightly more readable too.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com

*** a/src/backend/access/heap/heapam.c
--- b/src/backend/access/heap/heapam.c
***************
*** 24,29 ****
--- 24,30 ----
   *	heap_getnext	- retrieve next tuple in scan
   *	heap_fetch	- retrieve tuple with given tid
   *	heap_insert	- insert tuple into a relation
+  *	heap_multi_insert - insert multiple tuples into a relation
   *	heap_delete	- delete a tuple from a relation
   *	heap_update	- replace a tuple in a relation with another tuple
   *	heap_markpos	- mark scan position
***************
*** 79,84 ****
--- 80,87 ----
  static HeapScanDesc heap_beginscan_internal(Relation relation,
  				int nkeys, ScanKey key,
  				bool allow_strat, bool allow_sync,
  				bool is_bitmapscan);
+ static HeapTuple heap_prepare_insert(Relation relation, HeapTuple tup,
+ 				TransactionId xid, CommandId cid, int options);
  static XLogRecPtr log_heap_update(Relation reln, Buffer oldbuf,
  				ItemPointerData from, Buffer newbuf, HeapTuple newtup,
  				bool all_visible_cleared, bool new_all_visible_cleared);
***************
*** 1866,1920 **** heap_insert(Relation relation, HeapTuple tup, CommandId cid,
  	Buffer		vmbuffer = InvalidBuffer;
  	bool		all_visible_cleared = false;
  
- 	if (relation->rd_rel->relhasoids)
- 	{
- #ifdef NOT_USED
- 		/* this is redundant with an Assert in HeapTupleSetOid */
- 		Assert(tup->t_data->t_infomask & HEAP_HASOID);
- #endif
- 
- 		/*
- 		 * If the object id of this tuple has already been assigned, trust the
- 		 * caller. There are a couple of ways this can happen. At initial db
- 		 * creation, the backend program sets oids for tuples. When we define
- 		 * an index, we set the oid. Finally, in the future, we may allow
- 		 * users to set their own object ids in order to support a persistent
- 		 * object store (objects need to contain pointers to one another).
- 		 */
- 		if (!OidIsValid(HeapTupleGetOid(tup)))
- 			HeapTupleSetOid(tup, GetNewOid(relation));
- 	}
- 	else
- 	{
- 		/* check there is not space for an OID */
- 		Assert(!(tup->t_data->t_infomask & HEAP_HASOID));
- 	}
- 
- 	tup->t_data->t_infomask &= ~(HEAP_XACT_MASK);
- 	tup->t_data->t_infomask2 &= ~(HEAP2_XACT_MASK);
- 	tup->t_data->t_infomask |= HEAP_XMAX_INVALID;
- 	HeapTupleHeaderSetXmin(tup->t_data, xid);
- 	HeapTupleHeaderSetCmin(tup->t_data, cid);
- 	HeapTupleHeaderSetXmax(tup->t_data, 0);	/* for cleanliness */
- 	tup->t_tableOid = RelationGetRelid(relation);
- 
  	/*
! 	 * If the new tuple is too big for storage or contains already toasted
! 	 * out-of-line attributes from some other relation, invoke the toaster.
  	 *
  	 * Note: below this point, heaptup is the data we actually intend to store
  	 * into the relation; tup is the caller's original untoasted data.
  	 */
! 	if (relation->rd_rel->relkind != RELKIND_RELATION)
! 	{
! 		/* toast table entries should never be recursively toasted */
! 		Assert(!HeapTupleHasExternal(tup));
! 		heaptup = tup;
! 	}
! 	else if (HeapTupleHasExternal(tup) || tup->t_len > TOAST_TUPLE_THRESHOLD)
! 		heaptup = toast_insert_or_update(relation, tup, NULL, options);
! 	else
! 		heaptup = tup;
--- 1869,1882 ----
  	Buffer		vmbuffer = InvalidBuffer;
  	bool		all_visible_cleared = false;
  
  	/*
! 	 * Fill in tuple header fields, assign an OID, and toast the tuple if
! 	 * necessary.
  	 *
  	 * Note: below this point, heaptup is the data we actually intend to store
  	 * into the relation; tup is the caller's original untoasted data.
  	 */
! 	heaptup = heap_prepare_insert(relation, tup, xid, cid, options);
***************
*** 2035,2041 **** heap_insert(Relation relation, HeapTuple tup, CommandId cid,
  	 */
  	CacheInvalidateHeapTuple(relation, heaptup, NULL);
  
! 	pgstat_count_heap_insert(relation);
  
  	/*
  	 * If heaptup is a private copy, release it. Don't
Re: [HACKERS] Inserting heap tuples in bulk in COPY
On 25.09.2011 19:01, Robert Haas wrote: On Wed, Sep 14, 2011 at 6:52 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Why do you need new WAL replay routines? Can't you just use the existing XLOG_HEAP_NEWPAGE support? By and large, I think we should be avoiding special-purpose WAL entries as much as possible. I tried that, but most of the reduction in WAL-size melts away with that. And if the page you're copying to is not empty, logging the whole page is even more expensive. You'd need to fall back to retail inserts in that case, which complicates the logic. Where does it go? I understand why it'd be a problem for partially filled pages, but it seems like it ought to be efficient for pages that are initially empty.

A regular heap_insert record leaves out a lot of information that can be deduced at replay time. It can leave out all the headers, including just the null bitmap + data. In addition to that, there's just the location of the tuple (RelFileNode+ItemPointer). At replay, xmin is taken from the WAL record header. For a multi-insert record, you don't even need to store the RelFileNode and the block number for every tuple, just the offsets. In comparison, a full-page image will include the full tuple header, and also the line pointers. If I'm doing my math right, a full-page image takes 25 bytes more data per tuple than the special-purpose multi-insert record. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Inserting heap tuples in bulk in COPY
On Thu, Oct 6, 2011 at 7:33 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: A regular heap_insert record leaves out a lot of information that can be deduced at replay time. It can leave out all the headers, including just the null bitmap + data. In addition to that, there's just the location of the tuple (RelFileNode+ItemPointer). At replay, xmin is taken from the WAL record header. For a multi-insert record, you don't even need to store the RelFileNode and the block number for every tuple, just the offsets. In comparison, a full-page image will include the full tuple header, and also the line pointers. If I'm doing my math right, a full-page image takes 25 bytes more data per tuple, than the special-purpose multi-insert record. Interesting. It's always seemed to me fairly inefficient in general to store the whole RelFileNode. For many people, the database and tablespace OID will be constants, and even if they aren't, there certainly aren't going to be 96 bits of entropy in the relfilenode. I thought about whether we could create some sort of mapping layer, where say once per checkpoint we'd allocate a 4-byte integer to denote a relfilenode, and WAL-log that mapping. Then after that everyone could just refer to the 4-byte integer instead of the whole relfilenode. But it seems like a lot of work for 8 bytes per record. Then again, if you're getting that much benefit from shaving off 25 bytes per tuple, maybe it is, although I feel like FPW is the elephant in the room. But I digress... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Inserting heap tuples in bulk in COPY
On 06.10.2011 15:11, Robert Haas wrote: On Thu, Oct 6, 2011 at 7:33 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: A regular heap_insert record leaves out a lot of information that can be deduced at replay time. It can leave out all the headers, including just the null bitmap + data. In addition to that, there's just the location of the tuple (RelFileNode+ItemPointer). At replay, xmin is taken from the WAL record header. For a multi-insert record, you don't even need to store the RelFileNode and the block number for every tuple, just the offsets. In comparison, a full-page image will include the full tuple header, and also the line pointers. If I'm doing my math right, a full-page image takes 25 bytes more data per tuple, than the special-purpose multi-insert record. Interesting. It's always seemed to me fairly inefficient in general to store the whole RelFileNode. For many people, the database and tablespace OID will be constants, and even if they aren't, there certainly aren't going to be 96 bits of entropy in the relfilenode. I thought about whether we could create some sort of mapping layer, where say once per checkpoint we'd allocate a 4-byte integer to denote a relfilenode, and WAL-log that mapping. Then after that everyone could just refer to the 4-byte integer instead of the whole relfilenode. But it seems like a lot of work for 8 bytes per record. Then again, if you're getting that much benefit from shaving off 25 bytes per tuple, maybe it is, although I feel like FPW is the elephant in the room. A very simple optimization would be to leave out tablespace OID altogether if it's DEFAULTTABLESPACE_OID, and just set a flag somewhere. Then again, we could also just compress the WAL wholesale. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Inserting heap tuples in bulk in COPY
Hi Heikki, I checked your patch, then I have a comment and two questions here.

The heap_prepare_insert() seems to duplicate code from the earlier half of the existing heap_insert(). I think it is a good idea to consolidate this portion of the code.

I'm not clear on the reason why the argument of CheckForSerializableConflictIn() was changed from the one in heap_insert(). Its source code comment describes it as:

: * For a heap insert, we only need to check for table-level SSI locks.
: * Our new tuple can't possibly conflict with existing tuple locks, and
: * heap page locks are only consolidated versions of tuple locks; they do
: * not lock gaps as index page locks do. So we don't need to identify
: * a buffer before making the call.
: */

Is it feasible that newly inserted tuples conflict with existing tuple locks when we expand insert to support multiple tuples at once? It is a bit confusing for me.

This patch disallows multiple-insertion not only when a before-row trigger is configured, but also when volatile functions are used to compute a default value. Is that a reasonable condition for avoiding multiple-insertion? All the datums to be delivered to heap_form_tuple() are calculated in NextCopyFrom, and default values are also constructed there, sequentially. So, it seems to me we need not worry that volatile functions are not invoked for each tuple. Or, am I missing something?

Thanks,

2011/9/14 Heikki Linnakangas heikki.linnakan...@enterprisedb.com: On 13.08.2011 17:33, Tom Lane wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: The patch is WIP, mainly because I didn't write the WAL replay routines yet, but please let me know if you see any issues. Why do you need new WAL replay routines? Can't you just use the existing XLOG_HEAP_NEWPAGE support? By and large, I think we should be avoiding special-purpose WAL entries as much as possible. I tried that, but most of the reduction in WAL-size melts away with that.
And if the page you're copying to is not empty, logging the whole page is even more expensive. You'd need to fall back to retail inserts in that case, which complicates the logic. Also, I strongly object to turning regular heap_insert into a wrapper around some other more complicated operation. That will likely have bad consequences for the performance of every other operation. Ok. I doubt it makes any noticeable difference to performance, and having duplicated the code, it's more readable, too. Attached is a new version of the patch. It is now complete, including WAL replay code. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- KaiGai Kohei kai...@kaigai.gr.jp
Re: [HACKERS] Inserting heap tuples in bulk in COPY
On 25 September 2011 09:43, Kohei KaiGai kai...@kaigai.gr.jp wrote: Hi Heikki, I checked your patch, then I have a comment and two questions here. 2011/9/14 Heikki Linnakangas heikki.linnakan...@enterprisedb.com: Attached is a new version of the patch. It is now complete, including WAL replay code. Hi, I had a quick look at the patch as well and spotted an oversight: the multi-insert code branch fails to invoke AFTER ROW triggers. Regards, Dean
Re: [HACKERS] Inserting heap tuples in bulk in COPY
On Wed, Sep 14, 2011 at 6:52 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Why do you need new WAL replay routines? Can't you just use the existing XLOG_HEAP_NEWPAGE support? By and large, I think we should be avoiding special-purpose WAL entries as much as possible. I tried that, but most of the reduction in WAL-size melts away with that. And if the page you're copying to is not empty, logging the whole page is even more expensive. You'd need to fall back to retail inserts in that case, which complicates the logic. Where does it go? I understand why it'd be a problem for partially filled pages, but it seems like it ought to be efficient for pages that are initially empty. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Inserting heap tuples in bulk in COPY
Kohei KaiGai wrote: I'm not clear on the reason why the argument of CheckForSerializableConflictIn() was changed from the one in heap_insert(). The code was probably just based on heap_insert() before this recent commit: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=9d306c66e63eb7f45eab9475b3f96c3134bacac6 Is it feasible that newly inserted tuples conflict with existing tuple locks when we expand insert to support multiple tuples at once? No. -Kevin
Re: [HACKERS] Inserting heap tuples in bulk in COPY
On 13.08.2011 17:33, Tom Lane wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: The patch is WIP, mainly because I didn't write the WAL replay routines yet, but please let me know if you see any issues. Why do you need new WAL replay routines? Can't you just use the existing XLOG_HEAP_NEWPAGE support? By and large, I think we should be avoiding special-purpose WAL entries as much as possible. I tried that, but most of the reduction in WAL-size melts away with that. And if the page you're copying to is not empty, logging the whole page is even more expensive. You'd need to fall back to retail inserts in that case, which complicates the logic. Also, I strongly object to turning regular heap_insert into a wrapper around some other more complicated operation. That will likely have bad consequences for the performance of every other operation. Ok. I doubt it makes any noticeable difference to performance, and having duplicated the code, it's more readable, too. Attached is a new version of the patch. It is now complete, including WAL replay code. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 06db65d..b479613 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -24,6 +24,7 @@
  *		heap_getnext	- retrieve next tuple in scan
  *		heap_fetch		- retrieve tuple with given tid
  *		heap_insert		- insert tuple into a relation
+ *		heap_multi_insert - insert multiple tuples into a relation
  *		heap_delete		- delete a tuple from a relation
  *		heap_update		- replace a tuple in a relation with another tuple
  *		heap_markpos	- mark scan position
@@ -2030,7 +2031,7 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
 	 */
 	CacheInvalidateHeapTuple(relation, heaptup, NULL);
 
-	pgstat_count_heap_insert(relation);
+	pgstat_count_heap_insert(relation, 1);
 
 	/*
 	 * If heaptup is a private copy, release it.
Don't forget to copy t_self
@@ -2046,6 +2047,276 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
 }
 
 /*
+ * Subroutine for heap_multi_insert(). Prepares a tuple for insertion. This
+ * sets the tuple header fields, assigns an OID, and toasts the tuple if
+ * necessary. Returns a toasted version of the tuple if it was toasted, or
+ * the original tuple if not.
+ */
+static HeapTuple
+heap_prepare_insert(Relation relation, HeapTuple tup, CommandId cid,
+					int options)
+{
+	TransactionId xid = GetCurrentTransactionId();
+	HeapTuple	heaptup;
+
+	if (relation->rd_rel->relhasoids)
+	{
+#ifdef NOT_USED
+		/* this is redundant with an Assert in HeapTupleSetOid */
+		Assert(tup->t_data->t_infomask & HEAP_HASOID);
+#endif
+
+		/*
+		 * If the object id of this tuple has already been assigned, trust the
+		 * caller. There are a couple of ways this can happen. At initial db
+		 * creation, the backend program sets oids for tuples. When we define
+		 * an index, we set the oid. Finally, in the future, we may allow
+		 * users to set their own object ids in order to support a persistent
+		 * object store (objects need to contain pointers to one another).
+		 */
+		if (!OidIsValid(HeapTupleGetOid(tup)))
+			HeapTupleSetOid(tup, GetNewOid(relation));
+	}
+	else
+	{
+		/* check there is not space for an OID */
+		Assert(!(tup->t_data->t_infomask & HEAP_HASOID));
+	}
+
+	tup->t_data->t_infomask &= ~(HEAP_XACT_MASK);
+	tup->t_data->t_infomask2 &= ~(HEAP2_XACT_MASK);
+	tup->t_data->t_infomask |= HEAP_XMAX_INVALID;
+	HeapTupleHeaderSetXmin(tup->t_data, xid);
+	HeapTupleHeaderSetCmin(tup->t_data, cid);
+	HeapTupleHeaderSetXmax(tup->t_data, 0);		/* for cleanliness */
+	tup->t_tableOid = RelationGetRelid(relation);
+
+	/*
+	 * If the new tuple is too big for storage or contains already toasted
+	 * out-of-line attributes from some other relation, invoke the toaster.
+	 *
+	 * Note: below this point, heaptup is the data we actually intend to store
+	 * into the relation; tup is the caller's original untoasted data.
+	 */
+	if (relation->rd_rel->relkind != RELKIND_RELATION)
+	{
+		/* toast table entries should never be recursively toasted */
+		Assert(!HeapTupleHasExternal(tup));
+		heaptup = tup;
+	}
+	else if (HeapTupleHasExternal(tup) || tup->t_len > TOAST_TUPLE_THRESHOLD)
+		heaptup = toast_insert_or_update(relation, tup, NULL, options);
+	else
+		heaptup = tup;
+
+	return heaptup;
+}
+
+/*
+ * heap_multi_insert	- insert multiple tuples into a heap
+ *
+ * This is like heap_insert(), but inserts multiple tuples in one operation.
+ * That's faster than calling heap_insert() in a loop, because when multiple
+ * tuples can be inserted on a single page, we can write just a single WAL
+ * record covering all of them, and only need to lock/unlock the page once.
+ *
+ * Note: this leaks memory into the current memory context. You can create a
+ * temporary context before calling this, if that's a problem.
+ */
+void
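The per-page batching described in the heap_multi_insert() comment above can be modelled with a toy sketch: tuples are crammed onto the current page until one no longer fits, and each filled page costs a single WAL record instead of one per tuple. This is not the patch's actual code; the usable page space and line-pointer size are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define USABLE_PAGE_SPACE 8100  /* assumed: 8 kB page minus header, roughly */
#define LINE_POINTER_SZ      4  /* assumed per-tuple line-pointer cost */

/* Count how many WAL records a multi-insert-style loop would emit
 * for the given tuple lengths, one record per page it fills. */
static int
count_wal_records(const size_t *tuplens, int ntuples)
{
    int     nrecords = 0;
    size_t  freespace = 0;

    for (int i = 0; i < ntuples; i++)
    {
        size_t need = tuplens[i] + LINE_POINTER_SZ;

        if (need > freespace)
        {
            /* start a new page: one WAL record covers everything on it */
            freespace = USABLE_PAGE_SPACE;
            nrecords++;
        }
        freespace -= need;
    }
    return nrecords;
}
```

With these assumed sizes, a hundred 76-byte tuples fit on one page and need a single WAL record, where a heap_insert() loop would have written a hundred; this is why the speedup is biggest for narrow tables.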
Re: [HACKERS] Inserting heap tuples in bulk in COPY
On 12 August 2011 23:19, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Triggers complicate this. I believe it is only safe to group tuples together like this if the table has no triggers. A BEFORE ROW trigger might run a SELECT on the table being copied to, and check if some of the tuples we're about to insert exist. If we run BEFORE ROW triggers for a bunch of tuples first, and only then insert them, none of the trigger invocations will see the other rows as inserted yet. Similarly, if we run AFTER ROW triggers after inserting a bunch of tuples, the trigger for each of the insertions would see all the inserted rows. So at least for now, the patch simply falls back to inserting one row at a time if there are any triggers on the table. I guess DEFAULT values could also suffer from a similar problem to BEFORE ROW triggers though (in theory a DEFAULT could be based on a function that SELECTs from the table being copied to). Regards, Dean
Re: [HACKERS] Inserting heap tuples in bulk in COPY
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: The patch is WIP, mainly because I didn't write the WAL replay routines yet, but please let me know if you see any issues. Why do you need new WAL replay routines? Can't you just use the existing XLOG_HEAP_NEWPAGE support? By and large, I think we should be avoiding special-purpose WAL entries as much as possible. Also, I strongly object to turning regular heap_insert into a wrapper around some other more complicated operation. That will likely have bad consequences for the performance of every other operation. regards, tom lane
Re: [HACKERS] Inserting heap tuples in bulk in COPY
On Fri, Aug 12, 2011 at 3:16 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: COPY is slow. No kidding! So at least for now, the patch simply falls back to inserting one row at a time if there are any triggers on the table. Maybe we want to change that to fall back to old ways if there are any FOR EACH ROW triggers, since FOR EACH STATEMENT triggers won't be bothered by this optimization. -- Gurjeet Singh EnterpriseDB Corporation The Enterprise PostgreSQL Company
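A hypothetical version of that fallback test might look like the sketch below. The struct and field names are invented for illustration, not taken from the patch; per the discussion, only BEFORE ROW triggers (and, per Dean's later point, volatile DEFAULT expressions that could SELECT from the table) need to force the row-at-a-time path, since AFTER ROW triggers are queued until end of statement anyway and statement-level triggers never see individual rows:

```c
#include <assert.h>
#include <stdbool.h>

/* Invented summary of the relevant table properties. */
typedef struct
{
    bool has_before_row_triggers;
    bool has_after_row_triggers;
    bool has_statement_triggers;
    bool has_volatile_defaults;
} TableInfo;

static bool
can_use_multi_insert(const TableInfo *t)
{
    if (t->has_before_row_triggers)
        return false;   /* each invocation must see prior rows inserted */
    if (t->has_volatile_defaults)
        return false;   /* a default could SELECT from the table */
    /* AFTER ROW triggers fire after all rows are in, and statement-level
     * triggers never see individual rows, so neither blocks batching. */
    return true;
}
```

This matches the thread's eventual conclusion: RI triggers, being AFTER ROW, would still allow the optimization.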
Re: [HACKERS] Inserting heap tuples in bulk in COPY
On Aug12, 2011, at 21:16 , Heikki Linnakangas wrote: Triggers complicate this. I believe it is only safe to group tuples together like this if the table has no triggers. A BEFORE ROW trigger might run a SELECT on the table being copied to, and check if some of the tuples we're about to insert exist. If we run BEFORE ROW triggers for a bunch of tuples first, and only then insert them, none of the trigger invocations will see the other rows as inserted yet. Similarly, if we run AFTER ROW triggers after inserting a bunch of tuples, the trigger for each of the insertions would see all the inserted rows. Don't we run AFTER ROW triggers after inserting *all* the tuples anyway? At least this is what we do in the case of INSERT/UPDATE/DELETE if I'm not mistaken. best regards, Florian Pflug
Re: [HACKERS] Inserting heap tuples in bulk in COPY
On 12.08.2011 22:57, Florian Pflug wrote: On Aug12, 2011, at 21:16 , Heikki Linnakangas wrote: Triggers complicate this. I believe it is only safe to group tuples together like this if the table has no triggers. A BEFORE ROW trigger might run a SELECT on the table being copied to, and check if some of the tuples we're about to insert exist. If we run BEFORE ROW triggers for a bunch of tuples first, and only then insert them, none of the trigger invocations will see the other rows as inserted yet. Similarly, if we run AFTER ROW triggers after inserting a bunch of tuples, the trigger for each of the insertions would see all the inserted rows. Don't we run AFTER ROW triggers after inserting *all* the tuples anyway? At least this is what we do in the case of INSERT/UPDATE/DELETE if I'm not mistaken. Um, yes, you're right. Now I feel silly. The above still applies to BEFORE ROW triggers, though. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Inserting heap tuples in bulk in COPY
On Fri, Aug 12, 2011 at 3:16 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: COPY is slow. Let's make it faster. One obvious optimization is to insert heap tuples in bigger chunks, instead of calling heap_insert() separately for every tuple. That saves the overhead of pinning and locking the buffer for every tuple, and you only need to write one WAL record for all the tuples written to the same page, instead of one for each tuple. Attached is a WIP patch to do that. It adds a new function, heap_multi_insert, which does the same thing as heap_insert, but works in bulk. It takes an array of tuples as argument, and tries to cram as many of them into the chosen target page as it can, and only writes a single WAL record for the operation. This gives a significant speedup to COPY, particularly for narrow tables, with small tuples. Grouping multiple tuples into one WAL record reduces the WAL volume significantly, and the time spent in writing that WAL. The reduced overhead of repeatedly locking the buffer is also most noticeable on narrow tables. On wider tables, the effects are smaller. See copytest-results.txt, containing test results with three tables of different widths. The scripts used to get those numbers are also attached. Triggers complicate this. I believe it is only safe to group tuples together like this if the table has no triggers. A BEFORE ROW trigger might run a SELECT on the table being copied to, and check if some of the tuples we're about to insert exist. If we run BEFORE ROW triggers for a bunch of tuples first, and only then insert them, none of the trigger invocations will see the other rows as inserted yet. Similarly, if we run AFTER ROW triggers after inserting a bunch of tuples, the trigger for each of the insertions would see all the inserted rows. So at least for now, the patch simply falls back to inserting one row at a time if there are any triggers on the table. 
The patch is WIP, mainly because I didn't write the WAL replay routines yet, but please let me know if you see any issues. I thought about trying to do this at one point in the past, but I couldn't figure out exactly how to make it work. I think the approach you've taken here is good. Aside from the point already raised about needing to worry only about BEFORE ROW triggers, I don't see any showstoppers. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Inserting heap tuples in bulk in COPY
On 08/12/2011 04:57 PM, Robert Haas wrote: I thought about trying to do this at one point in the past, but I couldn't figure out exactly how to make it work. I think the approach you've taken here is good. Aside from the point already raised about needing to worry only about BEFORE ROW triggers, I don't see any showstoppers. Yeah, this looks very promising indeed. Well done. In fact, I'm asking myself how hard it will be to backport for one particular customer of ours, for whom the WAL load is a major hotspot. cheers andrew
Re: [HACKERS] Inserting heap tuples in bulk in COPY
On Fri, Aug 12, 2011 at 8:16 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: COPY is slow. Let's make it faster. One obvious optimization is to insert heap tuples in bigger chunks, instead of calling heap_insert() separately for every tuple. That saves the overhead of pinning and locking the buffer for every tuple, and you only need to write one WAL record for all the tuples written to the same page, instead of one for each tuple. We don't pin the buffer for every tuple; that optimisation is already done... When we discussed this before, you said that it wasn't worth trying to do this additional work - it was certainly a smaller gain than the one we achieved by removing the pinning overhead. Also, we discussed that you would work on buffering the index inserts, which is where the main problem lies. The main heap is only a small part of the overhead if we have multiple indexes already built on a table - which is the use case that causes the most problem. So I'm a little surprised to see you working on this, and I'm guessing that the COPY improvement with indexes is barely noticeable. This would be a nice improvement, but not until the bulk index inserts are done. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Inserting heap tuples in bulk in COPY
On Fri, Aug 12, 2011 at 2:16 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: COPY is slow. Let's make it faster. One obvious optimization is to insert heap tuples in bigger chunks, instead of calling heap_insert() separately for every tuple. That saves the overhead of pinning and locking the buffer for every tuple, and you only need to write one WAL record for all the tuples written to the same page, instead of one for each tuple. Attached is a WIP patch to do that. It adds a new function, heap_multi_insert, which does the same thing as heap_insert, but works in bulk. It takes an array of tuples as argument, and tries to cram as many of them into the chosen target page as it can, and only writes a single WAL record for the operation. This gives a significant speedup to COPY, particularly for narrow tables, with small tuples. Grouping multiple tuples into one WAL record reduces the WAL volume significantly, and the time spent in writing that WAL. The reduced overhead of repeatedly locking the buffer is also most noticeable on narrow tables. On wider tables, the effects are smaller. See copytest-results.txt, containing test results with three tables of different widths. The scripts used to get those numbers are also attached. Triggers complicate this. I believe it is only safe to group tuples together like this if the table has no triggers. A BEFORE ROW trigger might run a SELECT on the table being copied to, and check if some of the tuples we're about to insert exist. If we run BEFORE ROW triggers for a bunch of tuples first, and only then insert them, none of the trigger invocations will see the other rows as inserted yet. Similarly, if we run AFTER ROW triggers after inserting a bunch of tuples, the trigger for each of the insertions would see all the inserted rows. So at least for now, the patch simply falls back to inserting one row at a time if there are any triggers on the table. But generic RI triggers would be ok, right? 
merlin
Re: [HACKERS] Inserting heap tuples in bulk in COPY
On 13.08.2011 00:17, Simon Riggs wrote: Also, we discussed that you would work on buffering the index inserts, which is where the main problem lies. The main heap is only a small part of the overhead if we have multiple indexes already built on a table - which is the use case that causes the most problem. Sure, if you have indexes on the table already, then the overhead of updating them is more significant. I am actually working on that too, I will make a separate post about that. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Inserting heap tuples in bulk in COPY
On 13.08.2011 00:26, Merlin Moncure wrote: On Fri, Aug 12, 2011 at 2:16 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Triggers complicate this. I believe it is only safe to group tuples together like this if the table has no triggers. A BEFORE ROW trigger might run a SELECT on the table being copied to, and check if some of the tuples we're about to insert exist. If we run BEFORE ROW triggers for a bunch of tuples first, and only then insert them, none of the trigger invocations will see the other rows as inserted yet. Similarly, if we run AFTER ROW triggers after inserting a bunch of tuples, the trigger for each of the insertions would see all the inserted rows. So at least for now, the patch simply falls back to inserting one row at a time if there are any triggers on the table. But generic RI triggers would be ok, right? RI triggers are AFTER ROW triggers, which we concluded to be OK after all, so they would be ok. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com