On 2018-06-27 14:46:26 +0900, Amit Langote wrote:
> Hi Andres,
>
> On 2018/06/27 14:09, Andres Freund wrote:
> > Hi,
> >
> > (sorry if I CCed the wrong folks, the code has changed a fair bit)
> >
> > I noticed that several places in the partitioning code look like:
> >
> > /*
> > * If the tuple has been routed, it's been converted to the
> > * partition's rowtype, which might differ from the root
> > * table's. We must convert it back to the root table's
> > * rowtype so that val_desc shown error message matches the
> > * input tuple.
> > */
> > if (resultRelInfo->ri_PartitionRoot)
> > {
> > TableTuple tuple = ExecFetchSlotTuple(slot);
> > TupleConversionMap *map;
> >
> > rel = resultRelInfo->ri_PartitionRoot;
> > tupdesc = RelationGetDescr(rel);
> > /* a reverse map */
> > map = convert_tuples_by_name(orig_tupdesc, tupdesc,
> > gettext_noop("could not convert row
> > type"));
> > if (map != NULL)
> > {
> > tuple = do_convert_tuple(tuple, map);
> > ExecSetSlotDescriptor(slot, tupdesc);
> > ExecStoreTuple(tuple, slot, InvalidBuffer, false);
> > }
> > }
>
> This particular code block (and a couple of others like it) are executed
> only if a tuple fails partition constraint check, so maybe not a very
> frequently executed piece of code.
>
> There is however similar code that runs in non-error paths, such as in
> ExecPrepareTupleRouting(), that is executed *if* the leaf partition and
> the root parent have differing attribute numbers. So, one might say that
> that code's there to handle the special cases which may not arise much in
> practice, but it may still be a good idea to make improvements if we can
> do so.
Yea, I was referring to all do_convert_tuple() callers, and some of them
are more hot-path than the specific one above. E.g. the
ConvertPartitionTupleSlot() call in CopyFrom().
> > Unfortunately calling ExecSetSlotDescriptor() is far from cheap, it has
> > to reallocate the values/isnull arrays (and potentially do desc pinning
> > etc). Nor is it always cheap to call ExecFetchSlotTuple(slot) - it might
> > be a virtual slot. Calling heap_deform_tuple(), as done in
> > do_convert_tuple(), might be superfluous work, because the input tuple
> > might already be entirely deformed in the input slot.
> >
> > I think it'd be good to rewrite the code so there's an input and an
> > output slot that each will keep their slot descriptors set.
> >
> > Besides performance as the code stands, this'll also be important for
> > pluggable storage (as we'd otherwise get a *lot* of back/forth
> > conversions between tuple formats).
> >
> > Anybody interesting in writing a patch that redoes this?
>
> Thanks for the pointers above, I will see if I can produce a patch and
> will also be interested in hearing what others may have to say.
Cool.
Greetings,
Andres Freund