On Wed, 27 Mar 2019 at 18:49, Andres Freund <and...@anarazel.de> wrote: > Here's a version that applies onto HEAD. It needs a bit more cleanup > (I'm not sure I like the addition of ri_batchInsertSlots to store slots > for each partition, there's some duplicated code around slot array and > slot creation, docs for the multi insert callback). > > When measuring inserting into unlogged partitions (to remove the > overhead of WAL logging, which makes it harder to see the raw > performance difference), I see a few percent gains of the slotified > version over the current code. Same with insertions that have fewer > partition switches. > > Sorry for posting this here - David, it'd be cool if you could take a > look anyway. You've played a lot with this code.
Hi, I had a look at this and performance has improved again, thanks. However, I'm not sure if the patch is exactly what we need, let me explain. When I was working on 0d5f05cde01, I think my original idea was just to use the bufferedTuples[] array and always multi insert into partitioned tables unless something like on insert triggers stopped that. The reason I ended up with all this CIM_MULTI and CIM_MULTI_CONDITIONAL stuff is due to Melanie finding a regression when the target partition changed on each tuple. At the time I wondered if it might be worth trying to have multiple buffers or to partition the existing buffer and store tuples belonging to different partitions, then just flush them when they're full. Looking at the patch it seems that you have added an array of slots per partition, so does that not kinda make all that CIM_MULTI / CIM_MULTI_CONDITIONAL code redundant? ... we could just flush each partition's buffer when it gets full. There's not much of a need to flush the buffer when the partition changes since we're using a different buffer for the next partition anyway. I also wonder about memory usage here. If we're copying to a partitioned table with 10k partitions and we get over MAX_BUFFERED_TUPLES in a row for each partition, we'll end up with quite a lot of slots. So, in short. I think the patch either is going to cause possible memory problems during COPY FROM, or it leaves code that is pretty much redundant. ... Or I've just misunderstood the patch :) Does this make sense? -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services