On Wed, Dec 23, 2015 at 9:37 AM, Robert Haas <robertmh...@gmail.com> wrote:
> The point is, nobody can tell WHAT effects this is modeling.
> Increasing the tuple size makes the crossover go up.  Or down.

There are multiple, competing considerations.

> This analogy is faulty.  It's true that when we run across a qual
> whose selectivity we cannot estimate in any meaningful way, we have to
> just take a stab in the dark and hope for the best.  Similarly, if we
> have no information about what the crossover point for a given sort
> is, we'd have to take some arbitrary estimate, like 75%, and hope for
> the best.  But in this case, we DO have information.  We have an
> estimated row count and an estimated row width.  And those values are
> not being ignored, they are getting used.  The problem is that they
> are being used in an arbitrary way that is not justified by any chain
> of reasoning.

There is a chain of reasoning. It's not particularly satisfactory that
it's so fuzzy, certainly, but the competing considerations here are
substantive (and include erring towards not proceeding with
replacement selection/"quicksort with spillover" when the benefits are
low relative to the costs, which, to repeat myself, is itself novel).

I am more than open to suggestions on alternatives. As I said, I don't
particularly care for my current approach, either. But doing something
analogous to cost_sort() for our private "Do we quicksort with
spillover?"/useselection() model is going to be strictly worse than
what I have proposed.

Any cost model will have to be sensitive to different types of CPU
costs at the level that matters here -- such as the size of the heap,
and its cache efficiency. That's really important, but very
complicated, and variable enough that erring against replacement
selection seems like a good idea, especially with bigger heaps. Cache
efficiency is theoretically the only difference that matters here
(other than I/O, of course, but avoiding I/O is only the upside of
proceeding, and if we weigh only that then the cost model always gives
the same answer).

Perhaps you can suggest an alternative model that weighs these
factors. Most sorts are less than 1GB, and it seems worthwhile to
avoid I/O at the level where an internal sort is just out of reach.
Really big CREATE INDEX sorts are not really what I have in mind with
"quicksort with spillover".

This cost_sort() code seems pretty bogus to me, FWIW:

        /* Assume 3/4ths of accesses are sequential, 1/4th are not */
        startup_cost += npageaccesses *
            (seq_page_cost * 0.75 + random_page_cost * 0.25);

I think we can afford to be a lot more optimistic about the proportion
of sequential accesses.
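To make that concrete, here's a quick back-of-the-envelope sketch of
how much the per-page estimate shifts if we assume 90% sequential
access instead of 75% (the 90% figure is just a number I picked for
illustration; the page costs are the stock postgresql.conf defaults):

```python
# Illustrative only: per-page I/O cost under cost_sort()'s current
# 3/4-sequential assumption versus a more optimistic split.
SEQ_PAGE_COST = 1.0     # postgresql.conf default
RANDOM_PAGE_COST = 4.0  # postgresql.conf default

def per_page_cost(seq_fraction):
    return (SEQ_PAGE_COST * seq_fraction
            + RANDOM_PAGE_COST * (1.0 - seq_fraction))

current = per_page_cost(0.75)     # the 3/4 sequential assumption -> 1.75
optimistic = per_page_cost(0.90)  # hypothetical 90% sequential   -> 1.3
```

So even a modest bump in the assumed sequential fraction cuts the
estimated I/O cost per page by about a quarter, which matters when
npageaccesses is large.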

>> Merging is still sorting. The 10 tuples are not very cheap to merge
>> against the 1000 tuples, because you'll probably still end up reading
>> most of the 1000 tuples to do so.
> You're going to read all of the 1000 tuples no matter what, because
> you need to return them, but you will also need to make comparisons on
> most of them, unless the data distribution is favorable.   Assuming no
> special good luck, it'll take something close to X + Y - 1 comparisons
> to do the merge, so something around 1009 comparisons here.
> Maintaining the heap property is not free either, but it might be
> cheaper.

I'm pretty sure that it's cheaper. Some of the really good cases for
"quicksort with spillover" were only a little bit slower than a fully
internal sort when the work_mem threshold was just crossed.
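For what it's worth, a toy merge that counts comparisons bears out
both ends of the 10-vs-1000 example: close to X + Y - 1 comparisons
when the small run interleaves with the big one, and only about X when
the distribution is favorable. This is just an illustration, not
tuplesort's actual merge code:

```python
# Toy merge that counts comparisons between two sorted runs.
def merge_count(a, b):
    out, i, j, comparisons = [], 0, 0, 0
    while i < len(a) and j < len(b):
        comparisons += 1
        if a[i] <= b[j]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out, comparisons

big = list(range(1000))
interleaved = [x + 0.5 for x in range(99, 1099, 100)]  # 10 values spread across big
favorable = list(range(-10, 0))                        # 10 values below all of big

_, worst = merge_count(interleaved, big)  # -> 1009, i.e. X + Y - 1
_, best = merge_count(favorable, big)     # -> 10, i.e. X
```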

>> Another factor is that the heap could be useful for other stuff in the
>> future. As Simon Riggs pointed out, for deduplicating values as
>> they're read in by tuplesort. (Okay, that's really the only other
>> thing, but it's a good one).
> Not sure how that would work?

Tuplesort would have license to discard tuples with matching existing
values, because the caller gave it permission to. This is something
that you can easily imagine occurring with ordered set aggregates, for
example. It would work in a way not unlike a top-N heapsort does
today. This would work well when it can substantially lower memory
use (initial heapification when the threshold is crossed would
probably measure the number of duplicates, and proceed only when that
looked like a promising strategy).
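A minimal sketch of what that discarding might look like
(hypothetical interface; the function name and the set-based duplicate
check are mine for illustration, not anything in tuplesort):

```python
import heapq

def heapify_discarding_duplicates(values):
    # Sketch: a heap build that drops tuples whose sort key matches one
    # already in memory, as a caller (say, an ordered-set aggregate)
    # might permit.  The "seen" set stands in for whatever duplicate
    # detection the real thing would use.
    heap, seen = [], set()
    for v in values:
        if v in seen:
            continue  # caller gave us license to discard this tuple
        seen.add(v)
        heapq.heappush(heap, v)
    return heap

h = heapify_discarding_duplicates([5, 3, 5, 1, 3, 1, 2])
result = [heapq.heappop(h) for _ in range(len(h))]  # -> [1, 2, 3, 5]
```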

By the way, I think the heap currently does quite badly with many
duplicated values. That case seemed significantly slower than a
similar case with high cardinality tuples.

Peter Geoghegan

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)