Hi,
On 2019-04-02 04:23:20 +1300, David Rowley wrote:
> On Mon, 1 Apr 2019 at 08:05, Andres Freund <[email protected]> wrote:
> > I'll work on pushing all the other pending tableam patches today -
> > leaving COPY the last non pluggable part. You'd written in a private
> > email that you might try to work on this on Monday, so I think I'll give
> > this a shot on Tuesday if you've not gotten around till then? I'd like
> > to push this sooner than the exact end of the freeze...
>
> I worked on this, but I've not got the patch finished yet.
Thanks! I'm not quite clear on whether you're planning to continue working
on this, or whether this is a handoff? Either is fine with me, I'm just
trying to avoid unnecessary work / delay.
> Of course, it is possible that some of the bugs account for some of
> the improved time... but the rows did seem to be in the table
> afterwards.
The speedup presumably comes from having much larger batches, right?
Since tuples are now buffered per partition, we use multi_insert
efficiently even when there are no consecutive tuples for the same
partition; batching only stops helping once the tuples are spread across
so many partitions that a buffer never accumulates two tuples.
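
To illustrate (made-up distribution, not your benchmark): with tuples
alternating between two partitions, the old consecutive-only scheme
degenerated to single-row inserts, while per-partition buffers still
fill up:

  incoming tuples:  t0->p0, t1->p1, t2->p0, t3->p1, ...
  old scheme:       partition changes on every tuple, so no batching at all
  new scheme:       buffer[p0] = {t0, t2, ...}, buffer[p1] = {t1, t3, ...},
                    each flushed in one multi-insert once
                    MAX_BUFFERED_TUPLES or MAX_BUFFERED_BYTES is reached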
>     if (cstate->rel)
>     {
> -        Datum      *values;
> -        bool       *nulls;
> +        TupleTableSlot *slot;
>          TableScanDesc scandesc;
> -        HeapTuple   tuple;
> -
> -        values = (Datum *) palloc(num_phys_attrs * sizeof(Datum));
> -        nulls = (bool *) palloc(num_phys_attrs * sizeof(bool));
>
>          scandesc = table_beginscan(cstate->rel, GetActiveSnapshot(), 0, NULL);
> +        slot = table_slot_create(cstate->rel, NULL);
>
>          processed = 0;
> -        while ((tuple = heap_getnext(scandesc, ForwardScanDirection)) != NULL)
> +        while (table_scan_getnextslot(scandesc, ForwardScanDirection, slot))
>          {
>              CHECK_FOR_INTERRUPTS();
>
> -            /* Deconstruct the tuple ... faster than repeated heap_getattr */
> -            heap_deform_tuple(tuple, tupDesc, values, nulls);
> +            /* Deconstruct the tuple ... */
> +            slot_getallattrs(slot);
>
>              /* Format and send the data */
> -            CopyOneRowTo(cstate, values, nulls);
> +            CopyOneRowTo(cstate, slot);
>              processed++;
>          }
>
> +        ExecDropSingleTupleTableSlot(slot);
>          table_endscan(scandesc);
> -
> -        pfree(values);
> -        pfree(nulls);
>      }
Hm, I should just have committed this part separately. Not sure why I
didn't...
> +#define MAX_BUFFERED_TUPLES 1000
> +#define MAX_BUFFERED_BYTES 65535
> +#define MAX_PARTITION_BUFFERS 16
> +
> +typedef struct {
> + Oid relid;
> + TupleTableSlot **slots;
> + ResultRelInfo *resultRelInfo;
> + int nused;
> +} CopyMultiInsertBuffer;
> +
> +typedef struct {
> + HTAB *multiInsertBufferTab;
> + CopyMultiInsertBuffer *buffer;
> + int bufferedTuples;
> + int bufferedBytes;
> + int nbuffers;
> +} CopyMultiInsertInfo;
To me it seems a bit better to have this as a generic facility - we really
should use multi_insert in more places (like inserting the pg_attribute
rows). But admittedly that can just be done later.
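
E.g. something very roughly like the following - an entirely invented API,
just to sketch the shape I have in mind, not a concrete proposal:

    /* names invented; not proposing this exact interface */
    MultiInsertState *mistate = multi_insert_begin(rel, cid, ti_options);

    for (i = 0; i < natts; i++)
    {
        TupleTableSlot *slot = multi_insert_next_free_slot(mistate);

        /* fill the slot's tts_values / tts_isnull for one pg_attribute row */
        ...
        multi_insert_queue(mistate, slot);  /* flushes internally when full */
    }
    multi_insert_end(mistate);  /* flushes any remaining buffered tuples */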
> +static inline void
> +CopyMultiInsertBuffer_RemoveBuffer(CopyMultiInsertInfo *miinfo,
> +                                   CopyMultiInsertBuffer *buffer)
> +{
> + Oid relid = buffer->relid;
> + int i = 0;
> +
> + while (buffer->slots[i] != NULL)
> + ExecClearTuple(buffer->slots[i++]);
> + pfree(buffer->slots);
> +
> + hash_search(miinfo->multiInsertBufferTab, (void *) &relid, HASH_REMOVE,
> + NULL);
> + miinfo->nbuffers--;
> +}
Hm, this leaves dangling slots, right? ExecClearTuple() only resets a
slot's contents, it doesn't free the slot itself.
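
I.e. I think the slots need to be dropped, not just cleared; something
like this (untested sketch):

    static inline void
    CopyMultiInsertBuffer_RemoveBuffer(CopyMultiInsertInfo *miinfo,
                                       CopyMultiInsertBuffer *buffer)
    {
        Oid         relid = buffer->relid;
        int         i = 0;

        /* release the slots themselves, not just their contents */
        while (buffer->slots[i] != NULL)
            ExecDropSingleTupleTableSlot(buffer->slots[i++]);
        pfree(buffer->slots);

        hash_search(miinfo->multiInsertBufferTab, (void *) &relid, HASH_REMOVE,
                    NULL);
        miinfo->nbuffers--;
    }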
> +static inline void
> +CopyMultiInsertInfo_SetCurrentBuffer(CopyMultiInsertInfo *miinfo,
> +                                     ResultRelInfo *rri)
> +{
> + CopyMultiInsertBuffer *buffer;
> + Oid relid = RelationGetRelid(rri->ri_RelationDesc);
> + bool found;
> +
> + Assert(miinfo->multiInsertBufferTab != NULL);
> +
> +    buffer = hash_search(miinfo->multiInsertBufferTab, &relid, HASH_ENTER,
> +                         &found);
> +
> + if (!found)
> + {
> + buffer->relid = relid;
> +        buffer->slots = palloc0(MAX_BUFFERED_TUPLES * sizeof(TupleTableSlot *));
> + buffer->resultRelInfo = rri;
> + buffer->nused = 0;
> + miinfo->nbuffers++;
> + }
> +
> + miinfo->buffer = buffer;
> +}
It still seems wrong to me to just perform a second hashtable search
here, given that we've already done the partition dispatch.
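
We could instead do the lookup just once, when the buffer for a partition
is first created, and stash the pointer in state we already have at hand
after dispatch; roughly like this (ri_CopyMultiInsertBuffer is an invented
field):

    /* once, when the partition's buffer is first created: */
    buffer = hash_search(miinfo->multiInsertBufferTab, &relid, HASH_ENTER,
                         &found);
    ...
    resultRelInfo->ri_CopyMultiInsertBuffer = buffer;   /* invented field */

    /* per tuple, right after partition dispatch - no second hash lookup: */
    miinfo->buffer = resultRelInfo->ri_CopyMultiInsertBuffer;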
>          if (proute)
>              insertMethod = CIM_MULTI_CONDITIONAL;
>          else
>              insertMethod = CIM_MULTI;
Hm, why do we still have CIM_MULTI and CIM_MULTI_CONDITIONAL?
>              /*
> -             * Tests have shown that using multi-inserts when the
> -             * partition changes on every tuple slightly decreases the
> -             * performance, however, there are benefits even when only
> -             * some batches have just 2 tuples, so let's enable
> -             * multi-inserts even when the average is quite low.
> +             * Disable multi-inserts when the partition has BEFORE/INSTEAD
> +             * OF triggers, or if the partition is a foreign partition.
>               */
>              leafpart_use_multi_insert = insertMethod == CIM_MULTI_CONDITIONAL &&
> -                avgTuplesPerPartChange >= 1.3 &&
> -                !has_before_insert_row_trig &&
> -                !has_instead_insert_row_trig &&
> -                resultRelInfo->ri_FdwRoutine == NULL;
> +                !has_before_insert_row_trig &&
> +                !has_instead_insert_row_trig &&
> +                resultRelInfo->ri_FdwRoutine == NULL;
Not for now / this commit, but I kinda feel like we should determine
whether it's safe to multi-insert at a more centralized location.
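
E.g. a small helper that every multi-insert user consults; the name and
placement are invented, and there would presumably be more conditions:

    /* hypothetical helper, just sketching the idea */
    static bool
    ExecRelationSupportsMultiInsert(ResultRelInfo *resultRelInfo)
    {
        TriggerDesc *trigDesc = resultRelInfo->ri_TrigDesc;

        /* BEFORE/INSTEAD OF row triggers can modify or redirect each tuple */
        if (trigDesc != NULL &&
            (trigDesc->trig_insert_before_row ||
             trigDesc->trig_insert_instead_row))
            return false;

        /* FDWs have no multi-insert support */
        if (resultRelInfo->ri_FdwRoutine != NULL)
            return false;

        return true;
    }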
Greetings,
Andres Freund