I've previously mentioned (e.g. [1]) that tuple deforming is a serious
bottleneck. I've also experimented successfully [2] with making
slot_deform_tuple() faster.

But nonetheless, tuple deforming is still a *major* bottleneck in many
cases, if not *the* major bottleneck.

We could partially address that by JITing the work slot_deform_tuple
does. Various people have, with good but not raving success, played with

Alternatively/Additionally we can change the tuple format to make
deforming faster.

But I think the bigger issue than the above is actually that we're just
performing a lot of useless work in a number of common scenarios. We're
always deforming all columns up to the one needed. Very often that's a
lot of useless work.  I've experimented with selectively replacing
slot_getattr calls with heap_getattr(), and for some queries that can
yield massive speedups. And obviously significant slowdowns in others.  That's
the case even when preceding columns are varlena and/or contain nulls.
I.e. a good chunk of the problem is storing the results of deforming,
not accessing the data.
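
To illustrate the difference between the two access styles, here is a
minimal sketch on a toy tuple format (one length byte followed by that
many payload bytes per column). This is not the real heap tuple layout,
and ToyDatum, toy_deform_upto() and toy_fetch_one() are hypothetical
names made up for this example. Both functions have to walk the
preceding columns to find the offset, but only the first one stores a
result for every column on the way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a deformed column value. */
typedef struct ToyDatum
{
    const uint8_t *ptr;         /* start of the payload */
    uint8_t     len;            /* payload length in bytes */
} ToyDatum;

/*
 * Deform columns 0..target and store each of them into values[].
 * Returns the number of stores performed.
 */
static int
toy_deform_upto(const uint8_t *tuple, int target, ToyDatum *values)
{
    const uint8_t *p = tuple;
    int         stores = 0;

    for (int i = 0; i <= target; i++)
    {
        values[i].len = p[0];
        values[i].ptr = p + 1;
        stores++;
        p += 1 + p[0];          /* skip length byte and payload */
    }
    return stores;
}

/*
 * Fetch only column 'target'.  The preceding columns are still walked
 * to compute the offset, but nothing is stored for them.
 */
static ToyDatum
toy_fetch_one(const uint8_t *tuple, int target)
{
    const uint8_t *p = tuple;
    ToyDatum    result;

    for (int i = 0; i < target; i++)
        p += 1 + p[0];

    result.len = p[0];
    result.ptr = p + 1;
    return result;
}
```

In this toy the offset walk is identical in both functions; the only
extra work in toy_deform_upto() is filling values[] for columns nobody
asked for.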

ISTM, we need to change slots so that they contain information about
which columns are interesting. For the hot paths we'd then only ever
allow access to those columns, and we'd only ever deform them.  Combined
with the approach in [2] that allows us to deform tuples a lot more

What I'm basically thinking is that expression evaluation would always
make sure the slots have computed the relevant column set, and deform at
the beginning. There are some cases where we likely would still need to
fall back to a slower path (e.g. whole row refs), but that seems fine.

That then also allows us to nearly always avoid the slot_getattr() call,
and instead look at tts_values/nulls directly. The checks slot_getattr()
performs, and the call itself, are quite expensive.

What I'm thinking about is
a) a new ExecInitExpr()/ExecBuildProjectionInfo() which always computes a
   set of interesting columns.
b) replacing all accesses to tts_values/isnull with an inline
   function. In optimized builds that function won't do anything but
   reference the relevant element, but in assert enabled builds it'd
   check whether said column is actually known to be accessed.
c) Make ExecEvalExpr(), ExecProject(), ExecQual() (and perhaps some
   other places) call the new deforming function which ensures the
   relevant columns are available.
d) Replace nearly all slot_getattr/slot_getsomeattrs calls with the
   function introduced in b).
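
A rough sketch of how a) through d) could fit together, written against
a toy model rather than the real executor structures. Everything here
is an assumption for illustration: ToyTuple, ToySlot and the toy_slot_*
functions are hypothetical names, columns are fixed-width ints, and the
column set is a plain 64-bit mask (so at most 64 columns).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ATTS 64

/* Hypothetical stored tuple: fixed-width ints plus per-column null flags. */
typedef struct ToyTuple
{
    int         natts;
    const int  *data;
    const bool *isnull;
} ToyTuple;

/* Hypothetical slot, extended with a set of interesting columns. */
typedef struct ToySlot
{
    uint64_t    interesting;    /* bit i set: column i may be accessed */
    uint64_t    valid;          /* bit i set: values[i]/nulls[i] are filled */
    int         values[TOY_MAX_ATTS];
    bool        nulls[TOY_MAX_ATTS];
} ToySlot;

/* a) done once at expression initialization: record the column set. */
static void
toy_slot_set_interesting(ToySlot *slot, const int *cols, int ncols)
{
    slot->interesting = 0;
    for (int i = 0; i < ncols; i++)
    {
        assert(cols[i] >= 0 && cols[i] < TOY_MAX_ATTS);
        slot->interesting |= UINT64_C(1) << cols[i];
    }
}

/*
 * c) done once per tuple: store only the interesting columns.
 * Returns the number of columns stored.
 */
static int
toy_slot_deform_interesting(ToySlot *slot, const ToyTuple *tuple)
{
    int         stored = 0;

    assert(tuple->natts <= TOY_MAX_ATTS);
    slot->valid = 0;
    for (int i = 0; i < tuple->natts; i++)
    {
        if ((slot->interesting & (UINT64_C(1) << i)) == 0)
            continue;           /* nobody will look at this column */
        slot->nulls[i] = tuple->isnull[i];
        slot->values[i] = tuple->isnull[i] ? 0 : tuple->data[i];
        slot->valid |= UINT64_C(1) << i;
        stored++;
    }
    return stored;
}

/*
 * b)/d) accessor: with NDEBUG defined this is just two array
 * references; otherwise it checks the column was actually deformed.
 */
static inline int
toy_slot_get(const ToySlot *slot, int col, bool *isnull)
{
    assert(col >= 0 && col < TOY_MAX_ATTS);
    assert((slot->valid & (UINT64_C(1) << col)) != 0);
    *isnull = slot->nulls[col];
    return slot->values[col];
}
```

With a real tuple format the deforming loop would still have to compute
offsets for skipped varlena or nullable columns; the toy only shows
that nothing gets stored for them.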

To me it seems this work will be a good bit easier once [2] is actually
implemented instead of prototyped, because treating ExecInitExpr()
non-recursively allows us to build such 'column sets' more easily /
Comments? Alternative suggestions?


Andres Freund


Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)