Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create

Ron Thu, 16 Feb 2006 07:53:54 -0800

At 09:48 AM 2/16/2006, Martijn van Oosterhout wrote:

On Thu, Feb 16, 2006 at 08:22:55AM -0500, Ron wrote:
> 3= Especially in modern systems where the gap between internal CPU
> bandwidth and memory bandwidth is so great, the overhead of memory
> accesses for comparisons and moves is the majority of the overhead
> for both the pivot choosing and the partitioning algorithms within
> quicksort.  Particularly random memory accesses.  The reason (#GPRs -
> 1) is a magic constant is that it's the most you can compare and move
> using only register-to-register operations.


But how much of this applies to us? We're not sorting arrays of
integers, we're sorting pointers to tuples. So while moves cost very
little, a comparison costs hundreds, maybe thousands of cycles. A tuple
can easily be two or three cachelines and you're probably going to
access all of it, not to mention the Fmgr structures and the Datums
themselves.

Pointers are simply fixed size 32b or 64b quantities. They areessentially integers. Comparing and moving pointers or fixed sizekeys to those pointers is exactly the same problem as comparing andmoving integers.

Comparing =or= moving the actual data structures is a much moreexpensive and variable cost proposition. I'm sure that pg's sortfunctionality is written intelligently enough that the only real datamoves are done in a final pass after the exact desired order has beenfound using pointer compares and (re)assignments during the sortingprocess. That's a standard technique for sorting data whose "key" orpointer is much smaller than a datum.

Your cost comment basically agrees with mine regarding the cost ofrandom memory accesses. The good news is that the number of datumsto be examined during the pivot choosing process is small enough thatthe datums can fit into CPU cache while the pointers to them can beassigned to registers: making pivot choosing +very+ fast when done correctly.

As you've noted, actual partitioning is going to be more expensivesince it involves accessing enough actual datums that they can't allfit into CPU cache. The good news is that QuickSort has a verysequential access pattern within its inner loop. So while we must goto memory for compares, we are at least keeping the cost for it downit a minimum. In addition, said access is nice enough to be veryprefetch and CPU cache hierarchy friendly.

None of this is cache friendly. The actual tuples themselves could be
spread all over memory (I don't think any particular effort is expended
trying to minimize fragmentation).

It probably would be worth it to spend some effort on memory layoutjust as we do for HD layout.

Do these algorithms discuss the case where a comparison is more than
1000 times the cost of a move?

A move is always more expensive than a compare when the datum islarger than its pointer or key. A move is always more expensive thana compare when it involves memory to memory movement rather than CPUlocation to CPU location movement. A move is especially moreexpensive than a compare when it involves both factors. Most movesdo involve both.

What I suspect you meant is that a key comparison that involvesaccessing the data in memory is more expensive than reassigning thepointers associated with those keys. That is certainly true.


Yes.  The problem has been extensively studied. ;-)

Where this does become interesting is where we can convert a datum to
an integer such that if f(A) > f(B) then A > B. Then we can sort on
f(X) first with just integer comparisons and then do a full tuple
comparison only if f(A) = f(B). This would be much more cache-coherent
and make these algorithms much more applicable in my mind.

In fact we can do better.

Using hash codes or what-not to map datums to keys and then sortingjust the keys and the pointers to those datums followed by anoptional final pass where we do the actual data movement is also astandard technique for handling large data structures.

Regardless of what tweaks beyond the basic algorithms we use, thealgorithms themselves have been well studied and their performancewell established. QuickSort is the best performing of the O(nlgn)comparison based sorts and it uses less resources than HeapSort or MergeSort.


Ron



---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create

Reply via email to