Thank you for the review, I will address those shortly, but will answer some
questions in the meantime.
> > First, the changes are lacking any explanatory comments. Probably we
> > should follow how nodeAgg does this and add both comments to the
> > ExecSort function header as well as specific comments above the "if"
> > around the new tuplesort_begin_datum explaining the specific
> > conditions that are required for the optimization to be useful and
> > safe.
Done, since I lifted the restrictions following your questions, there isn't
much left to comment. (see below)
> >
> > That leads to a question I had: I don't follow why bounded mode (when
> > using byval) needs to be excluded. Comments should be added if there's
> > a good reason (as noted above), but maybe it's a case we can handle
> > safely?
I had test failures when trying to move the Datum around when performing a
bounded sort, but did not look into it at first.
Now I've looked into it, and the switch to a heapsort when using bounded mode
just unconditionnaly tried to free a tuple that was never there to begin with.
So if the SortTuple does not contain an actual tuple, but only a single datum,
do not do that.
I've updated the patch to fix this and enable the optimization in the case of
bounded sort.
> >
> > A second question: at first glance it's intuitively the case we might
> > not be able to handle byref values. But nodeAgg doesn't seem to have
> > that restriction. What's the difference here?
>
> I think tuplesort_begin_datum, doesn't have any such limitation, it
> can handle any type of Datum so I think we don't need to consider the
> only attbyval, we can consider any type of attribute for this
> optimization.
I've restricted the optimization to byval types because of the following
comment in nodeAgg.c:
/*
* Note: if input type is pass-by-ref, the datums returned by the
sort are
* freshly palloc'd in the per-query context, so we must be careful
to
* pfree them when they are no longer needed.
*/
As I was not sure how to handle that, I prefered the safety of not enabling
it. Since you both told me it should be safe, I've lifted that restriction
too.
> A few small code observations:
> - In my view the addition of unlikely() in ExecSort is unlikely to be
> of benefit because it's a single call for the entire node's execution
> (not in the tuple loop).
Done.
> - It seems clearer to change the "if (!node->is_single_val)" to flip
> the true/false cases so we don't need the negation.
Agreed, done.
> - I assume there are tests that likely already cover this case, but
> it'd be worth verifying that.
Yes many test cases cover that, but maybe it would be better to explictly
check for it on some cases: do you think we could add a debug message that can
be checked for ?
> Finally, I believe the same optimization likely ought to be added to
> nodeIncrementalSort. It's less likely the tests there are sufficient
> for both this and the original case, but we'll see.
I will look into it, but isn't incrementalsort used to sort tuples on several
keys, when they are already sorted on the first ? In that case, I doubt we
would ever have a single-valued tuple here, except if there is a projection to
strip the tuple from extraneous attributes.
--
Ronan Dunklau
diff --git a/src/backend/executor/nodeSort.c b/src/backend/executor/nodeSort.c
index b99027e0d7..ff443d15a9 100644
--- a/src/backend/executor/nodeSort.c
+++ b/src/backend/executor/nodeSort.c
@@ -29,6 +29,10 @@
* which saves the results in a temporary file or memory. After the
* initial call, returns a tuple from the file with each call.
*
+ * The tuplesort can either occur on the whole tuple (this is the nominal
+ * case) or, when the input / output tuple consists of only one attribute,
+ * we switch to the tuplesort_*_datum API, optimized for that specific case.
+ *
* Conditions:
* -- none.
*
@@ -85,33 +89,59 @@ ExecSort(PlanState *pstate)
outerNode = outerPlanState(node);
tupDesc = ExecGetResultType(outerNode);
-
- tuplesortstate = tuplesort_begin_heap(tupDesc,
- plannode->numCols,
- plannode->sortColIdx,
- plannode->sortOperators,
- plannode->collations,
- plannode->nullsFirst,
- work_mem,
- NULL,
- node->randomAccess);
+ /*
+ * Switch to the tuplesort_*_datum interface when we have only one key,
+ * as it is much more efficient especially when the type is pass-by-value.
+ */
+ if (tupDesc->natts == 1)
+ {
+ node->is_single_val = true;
+ tuplesortstate = tuplesort_begin_datum(TupleDescAttr(tupDesc, 0)->atttypid,
+ plannode->sortOperators[0],
+ plannode->collations[0],
+ plannode->nullsFirst[0],
+ work_mem,
+ NULL,
+ node->randomAccess);
+ } else
+ tuplesortstate = tuplesort_begin_heap(tupDesc,
+ plannode->numCols,
+ plannode->sortColIdx,
+ plannode->sortOperators,
+ plannode->collations,
+ plannode->nullsFirst,
+ work_mem,
+ NULL,
+ node->randomAccess);
if (node->bounded)
tuplesort_set_bound(tuplesortstate, node->bound);
node->tuplesortstate = (void *) tuplesortstate;
/*
- * Scan the subplan and feed all the tuples to tuplesort.
+ * Scan the subplan and feed all the tuples to tuplesort,
+ * using either the _putdatum or _puttupleslot API as appropriate.
*/
-
- for (;;)
- {
- slot = ExecProcNode(outerNode);
-
- if (TupIsNull(slot))
- break;
-
- tuplesort_puttupleslot(tuplesortstate, slot);
- }
+ if (node->is_single_val)
+ for (;;)
+ {
+ slot = ExecProcNode(outerNode);
+
+ if (TupIsNull(slot))
+ break;
+ slot_getsomeattrs(slot, 1);
+ tuplesort_putdatum(tuplesortstate,
+ slot->tts_values[0],
+ slot->tts_isnull[0]);
+ }
+ else
+ for (;;)
+ {
+ slot = ExecProcNode(outerNode);
+
+ if (TupIsNull(slot))
+ break;
+ tuplesort_puttupleslot(tuplesortstate, slot);
+ }
/*
* Complete the sort.
@@ -150,9 +180,18 @@ ExecSort(PlanState *pstate)
* next fetch from the tuplesort.
*/
slot = node->ss.ps.ps_ResultTupleSlot;
- (void) tuplesort_gettupleslot(tuplesortstate,
- ScanDirectionIsForward(dir),
- false, slot, NULL);
+ if (node->is_single_val)
+ {
+ ExecClearTuple(slot);
+ if (tuplesort_getdatum(tuplesortstate, ScanDirectionIsForward(dir),
+ &(slot->tts_values[0]), &(slot->tts_isnull[0]), NULL))
+ ExecStoreVirtualTuple(slot);
+ }
+ else
+ (void) tuplesort_gettupleslot(tuplesortstate,
+ ScanDirectionIsForward(dir),
+ false, slot, NULL);
+
return slot;
}
@@ -191,6 +230,7 @@ ExecInitSort(Sort *node, EState *estate, int eflags)
sortstate->bounded = false;
sortstate->sort_Done = false;
sortstate->tuplesortstate = NULL;
+ sortstate->is_single_val = false;
/*
* Miscellaneous initialization
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 22972071ff..226a723167 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -4773,6 +4773,13 @@ leader_takeover_tapes(Tuplesortstate *state)
static void
free_sort_tuple(Tuplesortstate *state, SortTuple *stup)
{
- FREEMEM(state, GetMemoryChunkSpace(stup->tuple));
- pfree(stup->tuple);
+ /* If the SortTuple is actually only a single Datum, which was not copied as
+ * it is a byval type, do not try to free it nor account for it in memory
+ * used.
+ */
+ if(stup->tuple)
+ {
+ FREEMEM(state, GetMemoryChunkSpace(stup->tuple));
+ pfree(stup->tuple);
+ }
}
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 0ec5509e7e..643f416c54 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -2151,6 +2151,7 @@ typedef struct SortState
int64 bound_Done; /* value of bound we did the sort with */
void *tuplesortstate; /* private state of tuplesort.c */
bool am_worker; /* are we a worker? */
+ bool is_single_val; /* are we using the single value optimization ? */
SharedSortInfo *shared_info; /* one entry per worker */
} SortState;