On 2017-03-15 16:07:14 -0400, Tom Lane wrote:
> Andres Freund <and...@anarazel.de> writes:
> > On 2017-03-15 15:41:22 -0400, Tom Lane wrote:
> >> Color me dubious.  Which specific other places have you got in mind, and
> >> do they have expression trees at hand that would tell them which columns
> >> they really need to pull out?
>
> > I was thinking of execGrouping.c's execTuplesMatch(),
> > TupleHashTableHash() (and unequal, but doubt that matters
> > performancewise).  There's also nodeHash.c's ExecHashGetValue(), but I
> > think that'd possibly better fixed differently.
>
> The execGrouping.c functions don't have access to an expression tree
> instructing them which columns to pull out of the tuple, so I fail to see
> how get_last_attnums() would be of any use to them.

I presume most of the callers do.  We'd have to change the API somewhat,
unless we just have a small loop in execTuplesMatch() determining the
biggest column index (which might be worthwhile / acceptable).
TupleHashTableHash() should be able to have that pre-computed in
BuildTupleHashTable().  Might be more viable to go that way.

> As for ExecHashGetHashValue, it's most likely going to be working from
> virtual tuples passed up to the join, which won't benefit from
> predetermination of the last column to be accessed.  The
> tuple-deconstruction would have happened while projecting in the scan
> node below.

I think the physical tuple stuff commonly thwarts that argument?  On
master for tpch's Q5 you can e.g. see the following profile:

+   29.38%  postgres  postgres  [.] ExecScanHashBucket
+   16.72%  postgres  postgres  [.] slot_getattr
+    5.51%  postgres  postgres  [.] heap_getnext
-    5.50%  postgres  postgres  [.] slot_deform_tuple
   - 98.07% slot_deform_tuple
      - 85.98% slot_getattr
         - 96.59% ExecHashGetHashValue
            - ExecHashJoin
               - ExecProcNode
                  + 85.12% ExecHashJoin
                  + 14.88% MultiExecHash
         + 3.41% ExecMakeFunctionResultNoSets
      + 14.02% slot_getsomeattrs
   + 1.58% ExecEvalScalarVarFast

I.e. nearly all calls for slot_deform_tuple are from slot_getattr() calls
in ExecHashGetHashValue().  And nearly all the time in slot_getattr is
spent on code only executed for actual tuples:

       │     if (tuple == NULL)          /* internal error */
  0.18 │       test   %rax,%rax
       │     ↓ je     223
       │      *
       │      * (We have to check this separately because of various inheritance and
       │      * table-alteration scenarios: the tuple could be either longer or shorter
       │      * than the tupdesc.)
       │      */
       │     tup = tuple->t_data;
  0.47 │       mov    0x10(%rax),%rsi
       │     if (attnum > HeapTupleHeaderGetNatts(tup))
 75.42 │       movzwl 0x12(%rsi),%eax
  0.70 │       and    $0x7ff,%eax
  0.47 │       cmp    %eax,%ebx
       │     ↓ jg     e8

- Andres

-- 
Sent via pgsql-hackers mailing list (email@example.com)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
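
For illustration only, here is a rough sketch of the "small loop" idea mentioned above for execTuplesMatch(): find the biggest key column index once, deform each tuple up to that column in a single pass, then compare. Everything named Demo*/demo_* is a hypothetical stand-in invented for this sketch; it does not use the real TupleTableSlot, slot_getsomeattrs() or execTuplesMatch() signatures, and plain ints stand in for Datums.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_ATTS 16

/* Hypothetical stand-in for a tuple slot; NOT PostgreSQL's TupleTableSlot. */
typedef struct DemoSlot
{
    const int  *raw;            /* stand-in for the physical tuple's columns */
    int         natts;          /* columns physically present in the tuple */
    int         nvalid;         /* columns already deformed into values[] */
    int         values[DEMO_MAX_ATTS];
    int         deform_calls;   /* how often a deforming pass was started */
} DemoSlot;

/*
 * Deform columns [nvalid, upto) in one pass.  Roughly analogous in spirit
 * to what a "get some attrs" call does, but this is an assumption-laden toy.
 */
static void
demo_getsomeattrs(DemoSlot *slot, int upto)
{
    if (upto <= slot->nvalid)
        return;                 /* already deformed far enough */
    assert(upto <= slot->natts && upto <= DEMO_MAX_ATTS);
    slot->deform_calls++;
    for (int i = slot->nvalid; i < upto; i++)
        slot->values[i] = slot->raw[i];
    slot->nvalid = upto;
}

/* The "small loop": biggest 1-based column number among the key columns. */
static int
demo_last_attnum(const int *keycols, int nkeys)
{
    int         last = 0;

    for (int i = 0; i < nkeys; i++)
    {
        if (keycols[i] > last)
            last = keycols[i];
    }
    return last;
}

/*
 * Compare two slots on the key columns.  Each slot is deformed at most once,
 * up to the last needed column, instead of once per column fetched.
 */
static int
demo_tuples_match(DemoSlot *a, DemoSlot *b, const int *keycols, int nkeys)
{
    int         last = demo_last_attnum(keycols, nkeys);

    demo_getsomeattrs(a, last);
    demo_getsomeattrs(b, last);

    for (int i = 0; i < nkeys; i++)
    {
        int         att = keycols[i] - 1;   /* 1-based to 0-based */

        if (a->values[att] != b->values[att])
            return 0;
    }
    return 1;
}
```

For the hash table case the same demo_last_attnum() result could be computed once at table-build time and stored, rather than recomputed per comparison, which is the pre-computation suggested for BuildTupleHashTable() above.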