Hi,

I've previously mentioned (e.g. [1]) that tuple deforming is a serious bottleneck. I've also experimented successfully [2] with making slot_deform_tuple() faster.
But nonetheless, tuple deforming is still a *major* bottleneck in many cases, if not *the* major bottleneck.

We could partially address that by JITing the work slot_deform_tuple() does. Various people have played with that, with good but not raving success. Alternatively/additionally we can change the tuple format to make deforming faster.

But I think the bigger issue than the above is actually that we're just performing a lot of useless work in a number of common scenarios. We're always deforming all columns up to the one needed. Very often that's a lot of useless work. I've experimented with selectively replacing slot_getattr() calls with heap_getattr(), and for some queries that can yield massive speedups. And obviously significant slowdowns in others. That's the case even when preceding columns are varlena and/or contain nulls. I.e. a good chunk of the problem is storing the results of deforming, not accessing the data.

ISTM, we need to change slots so that they contain information about which columns are interesting. For the hot paths we'd then only ever allow access to those columns, and we'd only ever deform them. Combined with the approach in [2], that allows us to deform tuples a lot more efficiently.

What I'm basically thinking is that expression evaluation would always make sure the slots have computed the relevant column set, and deform at the beginning. There are some cases where we likely would still need to fall back to a slower path (e.g. whole row refs), but that seems fine.

That then also allows us to nearly always avoid the slot_getattr() call, and instead look at tts_values/nulls directly. The checks slot_getattr() performs, and the call itself, are quite expensive.

What I'm thinking about is
a) a new ExecInitExpr()/ExecBuildProjectionInfo() which always computes a set of interesting columns.
b) replacing all accesses to tts_values/isnull with an inline function.
   In optimized builds that function won't do anything but reference the relevant element, but in assert-enabled builds it'd check whether said column is actually known to be accessed.
c) Make ExecEvalExpr(), ExecProject(), ExecQual() (and perhaps some other places) call the new deforming function which ensures the relevant columns are available.
d) Replace nearly all slot_getattr()/slot_getsomeattrs() calls with the function introduced in b).

To me it seems this work will be a good bit easier once [1] is actually implemented instead of prototyped, because treating ExecInitExpr() non-recursively allows building such 'column sets' more easily / naturally.

Comments? Alternative suggestions?

Greetings,

Andres Freund

[1] http://archives.postgresql.org/20160624232953.beub22r6yqux4...@alap3.anarazel.de
[2] http://archives.postgresql.org/message-id/20160714011850.bd5zhu35szle3n3c%40alap3.anarazel.de
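
To make points a) through d) a bit more concrete, here is a rough, self-contained sketch of the idea. Everything in it is hypothetical: DemoSlot and the demo_slot_*() functions are made-up names for illustration and are not PostgreSQL's TupleTableSlot or any existing executor API. The "tuple" is just an array of fixed-width ints, so the sketch says nothing about the cost of walking over preceding varlena columns; it only shows a slot that knows its interesting column set, deforms exactly those columns once, and is then read through an accessor that asserts in assert-enabled builds and is a plain array reference otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define DEMO_MAX_ATTRS 8

/* Hypothetical stand-in for a tuple slot; NOT PostgreSQL's TupleTableSlot. */
typedef struct DemoSlot
{
    const int  *raw;                    /* mock stored tuple: fixed-width ints */
    const bool *raw_isnull;             /* mock null flags of the stored tuple */
    int         natts;                  /* number of columns in the tuple */
    bool        needed[DEMO_MAX_ATTRS]; /* the "interesting" column set, cf. a) */
    bool        valid[DEMO_MAX_ATTRS];  /* deformed for the current tuple? */
    long        values[DEMO_MAX_ATTRS]; /* deformed values */
    bool        isnull[DEMO_MAX_ATTRS]; /* deformed null flags */
    int         ndeformed;              /* columns deformed for the current tuple */
} DemoSlot;

/* Set up an empty slot for tuples with natts columns. */
static inline void
demo_slot_init(DemoSlot *slot, int natts)
{
    assert(natts >= 0 && natts <= DEMO_MAX_ATTRS);
    memset(slot, 0, sizeof(*slot));
    slot->natts = natts;
}

/* cf. a): expression setup registers each column it will reference (0-based). */
static inline void
demo_slot_mark_needed(DemoSlot *slot, int attno)
{
    assert(attno >= 0 && attno < slot->natts);
    slot->needed[attno] = true;
}

/* Store a new tuple in the slot; earlier deforming results become invalid. */
static inline void
demo_slot_store(DemoSlot *slot, const int *raw, const bool *raw_isnull)
{
    slot->raw = raw;
    slot->raw_isnull = raw_isnull;
    memset(slot->valid, 0, sizeof(slot->valid));
    slot->ndeformed = 0;
}

/* cf. c): called once before evaluation; deforms only the needed columns. */
static inline void
demo_slot_deform_needed(DemoSlot *slot)
{
    for (int attno = 0; attno < slot->natts; attno++)
    {
        if (!slot->needed[attno] || slot->valid[attno])
            continue;           /* uninteresting or already deformed */
        slot->isnull[attno] = slot->raw_isnull[attno];
        slot->values[attno] = slot->isnull[attno] ? 0 : slot->raw[attno];
        slot->valid[attno] = true;
        slot->ndeformed++;
    }
}

/*
 * cf. b): accessor used in place of a slot_getattr()-style call. With NDEBUG
 * defined the asserts vanish and only the array references remain.
 */
static inline long
demo_slot_attr(const DemoSlot *slot, int attno, bool *isnull)
{
    assert(attno >= 0 && attno < slot->natts);
    assert(slot->valid[attno]); /* column must be in the deformed set */
    *isnull = slot->isnull[attno];
    return slot->values[attno];
}
```

A caller would mark its columns once at setup, then per tuple call demo_slot_store() and demo_slot_deform_needed(), and read values only through demo_slot_attr(). Accessing a column that was never marked trips the assertion in an assert-enabled build, which is the kind of check b) describes.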