Re: [HACKERS] Inserting heap tuples in bulk in COPY

2012-08-09 Thread Jesper Krogh
On 08/08/12 21:34, Robert Haas wrote: I think we need to implement buffering both to end of statement or end of transaction, not just one or the other. Another (not necessarily better) idea is to use a buffer that's part of the index, like the GIN fastupdate stuff, so that there's no particular

Re: [HACKERS] Inserting heap tuples in bulk in COPY

2012-08-09 Thread Robert Haas
On Thu, Aug 9, 2012 at 2:59 AM, Jesper Krogh jes...@krogh.cc wrote: If it is an implementation artifact or a result of this approach I don't know. But currently, when the GIN fastupdate code finally decides to flush the buffer, it is going to stall all other processes doing updates while doing

Re: [HACKERS] Inserting heap tuples in bulk in COPY

2012-08-08 Thread Simon Riggs
On 8 August 2012 03:44, Jeff Janes jeff.ja...@gmail.com wrote: On Tue, Aug 7, 2012 at 1:52 PM, Simon Riggs si...@2ndquadrant.com wrote: On 7 August 2012 20:58, Jeff Janes jeff.ja...@gmail.com wrote: Hi Heikki, Is the bulk index insert still an active area for you? If not, is there some kind

Re: [HACKERS] Inserting heap tuples in bulk in COPY

2012-08-08 Thread Robert Haas
On Tue, Aug 7, 2012 at 4:52 PM, Simon Riggs si...@2ndquadrant.com wrote: Incidentally, we can also optimise repeated inserts within a normal transaction using this method, by implementing deferred unique constraints. At present we say that unique constraints aren't deferrable, but there's no

Re: [HACKERS] Inserting heap tuples in bulk in COPY

2012-08-08 Thread Simon Riggs
On 8 August 2012 20:34, Robert Haas robertmh...@gmail.com wrote: On Tue, Aug 7, 2012 at 4:52 PM, Simon Riggs si...@2ndquadrant.com wrote: Incidentally, we can also optimise repeated inserts within a normal transaction using this method, by implementing deferred unique constraints. At present

Re: [HACKERS] Inserting heap tuples in bulk in COPY

2012-08-07 Thread Jeff Janes
On Fri, Aug 12, 2011 at 2:59 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 13.08.2011 00:17, Simon Riggs wrote: Also, we discussed that you would work on buffering the index inserts, which is where the main problem lies. The main heap is only a small part of the

Re: [HACKERS] Inserting heap tuples in bulk in COPY

2012-08-07 Thread Simon Riggs
On 7 August 2012 20:58, Jeff Janes jeff.ja...@gmail.com wrote: On Fri, Aug 12, 2011 at 2:59 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 13.08.2011 00:17, Simon Riggs wrote: Also, we discussed that you would work on buffering the index inserts, which is where the main

Re: [HACKERS] Inserting heap tuples in bulk in COPY

2012-08-07 Thread Jeff Janes
On Tue, Aug 7, 2012 at 1:52 PM, Simon Riggs si...@2ndquadrant.com wrote: On 7 August 2012 20:58, Jeff Janes jeff.ja...@gmail.com wrote: Hi Heikki, Is the bulk index insert still an active area for you? If not, is there some kind of summary of design or analysis work already done, which

Re: [HACKERS] Inserting heap tuples in bulk in COPY

2011-11-26 Thread Heikki Linnakangas
On 25.11.2011 23:32, Jeff Janes wrote: On Fri, Nov 25, 2011 at 12:53 PM, Jeff Janes jeff.ja...@gmail.com wrote: Thanks for this patch. Doing bulk copies in parallel for me is now limited by the IO subsystem rather than the CPU. This patch, commit number d326d9e8ea1d69, causes fillfactor to

Re: [HACKERS] Inserting heap tuples in bulk in COPY

2011-11-25 Thread Jeff Janes
On Mon, Oct 24, 2011 at 7:46 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Thanks! Here's an updated version of the patch, fixing that, and all the other issues pointed out this far. I extracted the code that sets oid and tuple headers, and invokes the toaster, into a new

Re: [HACKERS] Inserting heap tuples in bulk in COPY

2011-11-25 Thread Jeff Janes
On Fri, Nov 25, 2011 at 12:53 PM, Jeff Janes jeff.ja...@gmail.com wrote: Hi Heikki, Thanks for this patch. Doing bulk copies in parallel for me is now limited by the IO subsystem rather than the CPU. This patch, commit number d326d9e8ea1d69, causes fillfactor to be ignored for the copy

Re: [HACKERS] Inserting heap tuples in bulk in COPY

2011-10-24 Thread Heikki Linnakangas
On 25.09.2011 16:03, Dean Rasheed wrote: On 25 September 2011 09:43, Kohei KaiGai kai...@kaigai.gr.jp wrote: Hi Heikki, I checked your patch, then I have a comment and two questions here. 2011/9/14 Heikki Linnakangas heikki.linnakan...@enterprisedb.com: Attached is a new version of the

Re: [HACKERS] Inserting heap tuples in bulk in COPY

2011-10-06 Thread Heikki Linnakangas
On 25.09.2011 19:01, Robert Haas wrote: On Wed, Sep 14, 2011 at 6:52 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Why do you need new WAL replay routines? Can't you just use the existing XLOG_HEAP_NEWPAGE support? By and large, I think we should be avoiding

Re: [HACKERS] Inserting heap tuples in bulk in COPY

2011-10-06 Thread Robert Haas
On Thu, Oct 6, 2011 at 7:33 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: A regular heap_insert record leaves out a lot of information that can be deduced at replay time. It can leave out all the headers, including just the null bitmap + data. In addition to that, there's

Re: [HACKERS] Inserting heap tuples in bulk in COPY

2011-10-06 Thread Heikki Linnakangas
On 06.10.2011 15:11, Robert Haas wrote: On Thu, Oct 6, 2011 at 7:33 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: A regular heap_insert record leaves out a lot of information that can be deduced at replay time. It can leave out all the headers, including just the null

Re: [HACKERS] Inserting heap tuples in bulk in COPY

2011-09-25 Thread Kohei KaiGai
Hi Heikki, I checked your patch, then I have a comment and two questions here. heap_prepare_insert() seems to duplicate the earlier half of the existing heap_insert(). I think it would be good to consolidate these portions of the code. I'm not clear the reason why the argument of

Re: [HACKERS] Inserting heap tuples in bulk in COPY

2011-09-25 Thread Dean Rasheed
On 25 September 2011 09:43, Kohei KaiGai kai...@kaigai.gr.jp wrote: Hi Heikki, I checked your patch, then I have a comment and two questions here. 2011/9/14 Heikki Linnakangas heikki.linnakan...@enterprisedb.com: Attached is a new version of the patch. It is now complete, including WAL

Re: [HACKERS] Inserting heap tuples in bulk in COPY

2011-09-25 Thread Robert Haas
On Wed, Sep 14, 2011 at 6:52 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Why do you need new WAL replay routines? Can't you just use the existing XLOG_HEAP_NEWPAGE support? By and large, I think we should be avoiding special-purpose WAL entries as much as possible. I

Re: [HACKERS] Inserting heap tuples in bulk in COPY

2011-09-25 Thread Kevin Grittner
Kohei KaiGai wrote: I'm not clear the reason why the argument of CheckForSerializableConflictIn() was changed from the one in heap_insert(). The code was probably just based on heap_insert() before this recent commit:

Re: [HACKERS] Inserting heap tuples in bulk in COPY

2011-09-14 Thread Heikki Linnakangas
On 13.08.2011 17:33, Tom Lane wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: The patch is WIP, mainly because I didn't write the WAL replay routines yet, but please let me know if you see any issues. Why do you need new WAL replay routines? Can't you just use the

Re: [HACKERS] Inserting heap tuples in bulk in COPY

2011-08-13 Thread Dean Rasheed
On 12 August 2011 23:19, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Triggers complicate this. I believe it is only safe to group tuples together like this if the table has no triggers. A BEFORE ROW trigger might run a SELECT on the table being copied to, and check if some of

Re: [HACKERS] Inserting heap tuples in bulk in COPY

2011-08-13 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: The patch is WIP, mainly because I didn't write the WAL replay routines yet, but please let me know if you see any issues. Why do you need new WAL replay routines? Can't you just use the existing XLOG_HEAP_NEWPAGE support? By and

[HACKERS] Inserting heap tuples in bulk in COPY

2011-08-12 Thread Heikki Linnakangas
COPY is slow. Let's make it faster. One obvious optimization is to insert heap tuples in bigger chunks, instead of calling heap_insert() separately for every tuple. That saves the overhead of pinning and locking the buffer for every tuple, and you only need to write one WAL record for all the
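
[Editor's sketch, not part of the original message.] The batching idea in the snippet above can be illustrated with a toy model. This is not PostgreSQL code: ToyHeap, PAGE_CAPACITY, insert_one and insert_batch are hypothetical names, and the model assumes that the batched path takes one buffer lock and writes one WAL record per page touched, which the truncated snippet does not spell out.

```python
from dataclasses import dataclass, field

# Hypothetical: number of tuples that fit on one page in this toy model.
PAGE_CAPACITY = 4


@dataclass
class ToyHeap:
    """Toy stand-in for a heap relation that counts per-call overheads."""
    pages: list = field(default_factory=lambda: [[]])
    buffer_locks: int = 0  # times a page was pinned and locked
    wal_records: int = 0   # WAL records written

    def _current_page(self):
        # Start a new page when the last one is full.
        if len(self.pages[-1]) >= PAGE_CAPACITY:
            self.pages.append([])
        return self.pages[-1]

    def insert_one(self, tup):
        """Per-tuple path: one lock and one WAL record for every tuple."""
        page = self._current_page()
        self.buffer_locks += 1
        page.append(tup)
        self.wal_records += 1

    def insert_batch(self, tuples):
        """Batched path (assumed): one lock and one WAL record per page."""
        i = 0
        while i < len(tuples):
            page = self._current_page()
            self.buffer_locks += 1
            free = PAGE_CAPACITY - len(page)
            chunk = tuples[i:i + free]  # as many tuples as fit on this page
            page.extend(chunk)
            self.wal_records += 1
            i += len(chunk)


if __name__ == "__main__":
    rows = list(range(10))

    per_tuple = ToyHeap()
    for row in rows:
        per_tuple.insert_one(row)

    batched = ToyHeap()
    batched.insert_batch(rows)

    print("per-tuple:", per_tuple.buffer_locks, "locks,",
          per_tuple.wal_records, "WAL records")
    print("batched:  ", batched.buffer_locks, "locks,",
          batched.wal_records, "WAL records")
```

Both paths leave the same tuples on the same pages; only the number of lock acquisitions and WAL records differs, which is the overhead the message proposes to save.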

Re: [HACKERS] Inserting heap tuples in bulk in COPY

2011-08-12 Thread Gurjeet Singh
On Fri, Aug 12, 2011 at 3:16 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: COPY is slow. No kidding! So at least for now, the patch simply falls back to inserting one row at a time if there are any triggers on the table. Maybe we want to change that to fall back to

Re: [HACKERS] Inserting heap tuples in bulk in COPY

2011-08-12 Thread Florian Pflug
On Aug 12, 2011, at 21:16, Heikki Linnakangas wrote: Triggers complicate this. I believe it is only safe to group tuples together like this if the table has no triggers. A BEFORE ROW trigger might run a SELECT on the table being copied to, and check if some of the tuples we're about to

Re: [HACKERS] Inserting heap tuples in bulk in COPY

2011-08-12 Thread Heikki Linnakangas
On 12.08.2011 22:57, Florian Pflug wrote: On Aug 12, 2011, at 21:16, Heikki Linnakangas wrote: Triggers complicate this. I believe it is only safe to group tuples together like this if the table has no triggers. A BEFORE ROW trigger might run a SELECT on the table being copied to, and check

Re: [HACKERS] Inserting heap tuples in bulk in COPY

2011-08-12 Thread Robert Haas
On Fri, Aug 12, 2011 at 3:16 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: COPY is slow. Let's make it faster. One obvious optimization is to insert heap tuples in bigger chunks, instead of calling heap_insert() separately for every tuple. That saves the overhead of pinning

Re: [HACKERS] Inserting heap tuples in bulk in COPY

2011-08-12 Thread Andrew Dunstan
On 08/12/2011 04:57 PM, Robert Haas wrote: I thought about trying to do this at one point in the past, but I couldn't figure out exactly how to make it work. I think the approach you've taken here is good. Aside from the point already raised about needing to worry only about BEFORE ROW

Re: [HACKERS] Inserting heap tuples in bulk in COPY

2011-08-12 Thread Simon Riggs
On Fri, Aug 12, 2011 at 8:16 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: COPY is slow. Let's make it faster. One obvious optimization is to insert heap tuples in bigger chunks, instead of calling heap_insert() separately for every tuple. That saves the overhead of pinning

Re: [HACKERS] Inserting heap tuples in bulk in COPY

2011-08-12 Thread Merlin Moncure
On Fri, Aug 12, 2011 at 2:16 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: COPY is slow. Let's make it faster. One obvious optimization is to insert heap tuples in bigger chunks, instead of calling heap_insert() separately for every tuple. That saves the overhead of pinning

Re: [HACKERS] Inserting heap tuples in bulk in COPY

2011-08-12 Thread Heikki Linnakangas
On 13.08.2011 00:17, Simon Riggs wrote: Also, we discussed that you would work on buffering the index inserts, which is where the main problem lies. The main heap is only a small part of the overhead if we have multiple indexes already built on a table - which is the use case that causes the

Re: [HACKERS] Inserting heap tuples in bulk in COPY

2011-08-12 Thread Heikki Linnakangas
On 13.08.2011 00:26, Merlin Moncure wrote: On Fri, Aug 12, 2011 at 2:16 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Triggers complicate this. I believe it is only safe to group tuples together like this if the table has no triggers. A BEFORE ROW trigger might run a SELECT