Attached is the current state of a patch to reduce the overhead of passing tuple data up through many levels of plan nodes. It's not tested enough to apply yet, but I thought I'd put it out for comment. It seems to get about a factor of 4 speedup on Miroslav's nested-joins example (above and beyond what we got from Atsushi Ogawa's patch).
The basic point of the patch is to allow a TupleTableSlot to contain a "virtual" tuple instead of a regular heap tuple. The virtual tuple is just an array of Datums, with any pass-by-reference Datums pointing at original storage (either a lower-level slot or expression result storage). This representation is essentially the raw output of ExecProject. This not only avoids the overhead of forming the data into a tuple (heap_formtuple) but also saves cycles when extracting the data at the next level up, since we can just grab the Datums directly. (This behavior builds on and shares code with Ogawa's patch to cache extracted Datums in TupleTableSlots. When a slot contains a physical tuple, the same Datum arrays cache any Datums extracted from it.)

Since a slot may or may not contain a regular tuple, you can't just grab slot->val anymore; there are new API functions ExecCopySlotTuple() and ExecFetchSlotTuple() (the former when you want to make your own copy, the latter when you don't). These force construction of a real tuple if the slot is virtual. I also made an ExecCopySlot() convenience routine for the common case of copying one slot's contents into another slot.

A related API modification is to change tuple receivers (DestReceivers) to receive a TupleTableSlot instead of separate tuple and tuple descriptor parameters. This makes it possible to avoid an unnecessary tuple construction/deconstruction at the final output phase as well.

It also turned out to be useful to make a short-circuit path for ExecProject when the targetlist is entirely simple Vars. This only requires copying Datums from lower to upper slots, and we can implement it that way instead of going through ExecEvalExpr.

Finally, I have made some progress towards making the tuple access routines consistently use "bool isNull" arrays as null markers, instead of the char 'n' or ' ' convention that was previously used in some but not all contexts.
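
To make the virtual-tuple idea concrete, here is a rough standalone sketch. It is a toy model, not code from the patch: ToySlot, ToyTuple, TOY_MAX_ATTS and every toy_* function are hypothetical names invented for illustration, Datum is just a local typedef, and the "physical tuple" is a plain heap-allocated copy of the arrays rather than a real heap tuple. It only shows the shape of the behavior described above: a slot holds Datum and bool-null arrays, reading an attribute touches only those arrays, a projection of simple Vars is a plain Datum copy, and a real tuple is built only when someone asks for one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAX_ATTS 8            /* hypothetical fixed limit for the sketch */

typedef uintptr_t Datum;          /* toy stand-in for the real Datum type */

/* Toy "physical" tuple: an owned copy of the values (not a real heap tuple). */
typedef struct ToyTuple {
    int   natts;
    Datum values[TOY_MAX_ATTS];
    bool  isnull[TOY_MAX_ATTS];
} ToyTuple;

/* Toy slot: always has the arrays; the tuple exists only once requested. */
typedef struct ToySlot {
    int       natts;
    Datum     values[TOY_MAX_ATTS];
    bool      isnull[TOY_MAX_ATTS];   /* bool null markers, not 'n' / ' ' */
    ToyTuple *tuple;                  /* NULL while the slot is virtual */
    int       nformed;                /* how many tuples we had to build */
} ToySlot;

/* Build a tuple from the slot's arrays; this is the cost we try to avoid. */
static ToyTuple *toy_form_tuple(ToySlot *slot)
{
    ToyTuple *tup = malloc(sizeof(ToyTuple));

    assert(tup != NULL);
    tup->natts = slot->natts;
    memcpy(tup->values, slot->values, slot->natts * sizeof(Datum));
    memcpy(tup->isnull, slot->isnull, slot->natts * sizeof(bool));
    slot->nformed++;
    return tup;
}

/* Drop any materialized tuple, leaving the slot empty. */
void toy_clear(ToySlot *slot)
{
    free(slot->tuple);
    slot->tuple = NULL;
    slot->natts = 0;
}

/* Store a virtual tuple: just copy the Datums and null flags into the slot. */
void toy_store_virtual(ToySlot *slot, int natts,
                       const Datum *values, const bool *isnull)
{
    assert(natts > 0 && natts <= TOY_MAX_ATTS);
    toy_clear(slot);
    slot->natts = natts;
    memcpy(slot->values, values, natts * sizeof(Datum));
    memcpy(slot->isnull, isnull, natts * sizeof(bool));
}

/* Attribute access (0-based here) reads the arrays; no tuple is needed. */
Datum toy_getattr(const ToySlot *slot, int attno, bool *isnull)
{
    assert(attno >= 0 && attno < slot->natts);
    *isnull = slot->isnull[attno];
    return slot->values[attno];
}

/* Short-circuit projection when every target is a simple Var: copy Datums. */
void toy_project_simple_vars(ToySlot *dst, const ToySlot *src,
                             const int *varattnos, int ntargets)
{
    assert(dst != src && ntargets > 0 && ntargets <= TOY_MAX_ATTS);
    toy_clear(dst);
    for (int i = 0; i < ntargets; i++)
        dst->values[i] = toy_getattr(src, varattnos[i], &dst->isnull[i]);
    dst->natts = ntargets;
}

/* "Fetch": materialize on first call; the slot keeps ownership. */
ToyTuple *toy_fetch_tuple(ToySlot *slot)
{
    if (slot->tuple == NULL)
        slot->tuple = toy_form_tuple(slot);
    return slot->tuple;
}

/* "Copy": always build a fresh tuple that the caller must free(). */
ToyTuple *toy_copy_tuple(ToySlot *slot)
{
    return toy_form_tuple(slot);
}
```

The split between toy_fetch_tuple and toy_copy_tuple mirrors the fetch/copy distinction described above (borrow the slot's tuple versus get your own copy); the real functions' signatures and memory-management rules are not shown in this message, so treat the ownership details here as assumptions.
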
I don't think we can retire heap_formtuple or heap_modifytuple for a long time, if ever, but we can deprecate them in favor of the parallel new routines with the bool interface.

Comments?

regards, tom lane