On 2017-03-15 16:07:14 -0400, Tom Lane wrote:
> Andres Freund <and...@anarazel.de> writes:
> > On 2017-03-15 15:41:22 -0400, Tom Lane wrote:
> >> Color me dubious.  Which specific other places have you got in mind, and
> >> do they have expression trees at hand that would tell them which columns
> >> they really need to pull out?
>
> > I was thinking of execGrouping.c's execTuplesMatch(),
> > TupleHashTableHash() (and unequal, but doubt that matters
> > performancewise).  There's also nodeHash.c's ExecHashGetHashValue(), but I
> > think that'd possibly be better fixed differently.
>
> The execGrouping.c functions don't have access to an expression tree
> instructing them which columns to pull out of the tuple, so I fail to see
> how get_last_attnums() would be of any use to them.

I presume most of the callers do.  We'd have to change the API somewhat,
unless we just have a small loop in execTuplesMatch() determining the
biggest column index (which might be worthwhile / acceptable).
TupleHashTableHash() should be able to have that pre-computed in
BuildTupleHashTable().  Might be more viable to go that way.
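
To illustrate the "small loop" idea, here is a minimal sketch, not actual
PostgreSQL code.  The AttrNumber typedef is a local stand-in, and the helper
name last_needed_attnum() and its keyColIdx/numCols parameters are
hypothetical, chosen only to resemble the grouping-column arrays the
execGrouping.c functions are handed:

```c
/* Local stand-in for PostgreSQL's AttrNumber (1-based column index). */
typedef short AttrNumber;

/*
 * Hypothetical helper: return the largest column number among the
 * grouping/hash key columns, or 0 if there are none.  A caller could
 * compute this once and then deform each tuple only up to that column.
 */
static AttrNumber
last_needed_attnum(const AttrNumber *keyColIdx, int numCols)
{
    AttrNumber  last = 0;
    int         i;

    for (i = 0; i < numCols; i++)
    {
        if (keyColIdx[i] > last)
            last = keyColIdx[i];
    }
    return last;
}
```

With key columns {3, 7, 2} this yields 7, so a single slot_getsomeattrs()
call up to column 7 would presumably cover all later per-column lookups.
Whether the value is computed per call or cached at table-build time is the
API question raised above.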


> As for ExecHashGetHashValue, it's most likely going to be working from
> virtual tuples passed up to the join, which won't benefit from
> predetermination of the last column to be accessed.  The
> tuple-deconstruction would have happened while projecting in the scan
> node below.

I think the physical tuple stuff commonly thwarts that argument?  On
master, for TPC-H's Q5, you can e.g. see the following profile:

+   29.38%  postgres  postgres          [.] ExecScanHashBucket
+   16.72%  postgres  postgres          [.] slot_getattr
+    5.51%  postgres  postgres          [.] heap_getnext
-    5.50%  postgres  postgres          [.] slot_deform_tuple
   - 98.07% slot_deform_tuple
      - 85.98% slot_getattr
         - 96.59% ExecHashGetHashValue
            - ExecHashJoin
               - ExecProcNode
                  + 85.12% ExecHashJoin
                  + 14.88% MultiExecHash
         + 3.41% ExecMakeFunctionResultNoSets
      + 14.02% slot_getsomeattrs
   + 1.58% ExecEvalScalarVarFast

I.e. nearly all calls to slot_deform_tuple come from slot_getattr in
ExecHashGetHashValue().  And nearly all the time in slot_getattr is
spent on code only executed for actual tuples:

       │               if (tuple == NULL)                      /* internal error */
  0.18 │         test   %rax,%rax
       │       ↓ je     223
       │                *
       │                * (We have to check this separately because of various inheritance and
       │                * table-alteration scenarios: the tuple could be either longer or shorter
       │                * than the tupdesc.)
       │                */
       │               tup = tuple->t_data;
  0.47 │         mov    0x10(%rax),%rsi
       │               if (attnum > HeapTupleHeaderGetNatts(tup))
 75.42 │         movzwl 0x12(%rsi),%eax
  0.70 │         and    $0x7ff,%eax
  0.47 │         cmp    %eax,%ebx
       │       ↓ jg     e8

- Andres

