Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-03-21 Thread Tom Lane
Last month I wrote: It seems clear that our qsort.c is doing a pretty awful job of picking qsort pivots, while glibc is mostly managing not to make that mistake. I re-ran Gary's test script using the just-committed improvements to qsort.c, and got pretty nice numbers (attached --- compare to

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-03-02 Thread Bruce Momjian
: -Original Message- From: [EMAIL PROTECTED] [mailto:pgsql-hackers- [EMAIL PROTECTED] On Behalf Of Tom Lane Sent: Wednesday, February 15, 2006 5:22 PM To: Ron Cc: pgsql-performance@postgresql.org; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] qsort again (was Re: [PERFORM

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-03-02 Thread Jonah H. Harris
, February 15, 2006 5:22 PM To: Ron Cc: pgsql-performance@postgresql.org; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour) Ron [EMAIL PROTECTED] writes: How are we choosing our pivots? See qsort.c: it looks like median of nine

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create

2006-02-18 Thread Ron
At 08:37 PM 2/15/2006, Dann Corbit wrote: Adding some randomness to the selection of the pivot is a known technique to fix the oddball partitions problem. True, but it makes QuickSort slower than say MergeSort because of the expense of the PRNG being called ~O(lgN) times during a sort.

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-17 Thread Jens-Wolfhard Schicke
--On Donnerstag, Februar 16, 2006 10:39:45 -0800 Dann Corbit [EMAIL PROTECTED] wrote: He refers to counting sort and radix sort (which comes in most significant digit and least significant digit format). These are also called distribution (as opposed to comparison) sorts. These sorts are

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-17 Thread Martijn van Oosterhout
On Fri, Feb 17, 2006 at 09:18:39AM +0100, Jens-Wolfhard Schicke wrote: What I think as the biggest problem is the digit representation necessary for Radix-Sort in cases of locales which sort without looking at spaces. I assume that would be hard to implement. The same goes for the proposed

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-17 Thread Markus Schaber
Hi, David, David Lang schrieb: In SQL_ASCII, just take the first 4 characters (or 8, if using a 64-bit sortKey as elsewhere suggested). The sorting key doesn't need to be a one-to-one mapping. that would violate your second contraint ( f(a)==f(b) iff (a==b) ) no, it doesn't. When both

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create

2006-02-17 Thread Markus Schaber
Hi, Ron, Ron schrieb: OK, so here's _a_ way (there are others) to obtain a mapping such that if a b then f(a) f (b) and if a == b then f(a) == f(b) Pretend each row is a integer of row size (so a 2KB row becomes a 16Kb integer; a 4KB row becomes a 32Kb integer; etc) Since even a 1TB

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create

2006-02-17 Thread Ron
At 04:24 AM 2/17/2006, Ragnar wrote: On fös, 2006-02-17 at 01:20 -0500, Ron wrote: OK, so here's _a_ way (there are others) to obtain a mapping such that if a b then f(a) f (b) and if a == b then f(a) == f(b) By scanning the table once, we can map say 001h (Hex used to ease

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create

2006-02-17 Thread Ron
At 05:19 AM 2/17/2006, Markus Schaber wrote: Hi, Ron, Ron schrieb: OK, so here's _a_ way (there are others) to obtain a mapping such that if a b then f(a) f (b) and if a == b then f(a) == f(b) Pretend each row is a integer of row size (so a 2KB row becomes a 16Kb integer; a 4KB row

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create

2006-02-17 Thread Martijn van Oosterhout
On Fri, Feb 17, 2006 at 08:23:40AM -0500, Ron wrote: For this mapping, you need a full table sort. One physical IO pass should be all that's needed. However, let's pretend you are correct and that we do need to sort the table to get the key mapping. Even so, we would only need to do it

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-17 Thread Scott Lamb
On Feb 16, 2006, at 2:17 PM, Mark Lewis wrote:Data types which could probably provide a useful function for f would be int2, int4, oid, and possibly int8 and text (at least for SQL_ASCII). ...and with some work, floats (I think just the exponent would work, if nothing else). bytea. Probably just

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-17 Thread Martijn van Oosterhout
On Fri, Feb 17, 2006 at 08:18:41AM -0800, Scott Lamb wrote: On Feb 16, 2006, at 2:17 PM, Mark Lewis wrote: Data types which could probably provide a useful function for f would be int2, int4, oid, and possibly int8 and text (at least for SQL_ASCII). ...and with some work, floats (I think

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create

2006-02-17 Thread Ron
At 10:53 AM 2/17/2006, Martijn van Oosterhout wrote: On Fri, Feb 17, 2006 at 08:23:40AM -0500, Ron wrote: For this mapping, you need a full table sort. One physical IO pass should be all that's needed. However, let's pretend you are correct and that we do need to sort the table to get the

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-17 Thread Mark Lewis
On Thu, 2006-02-16 at 21:33 -0800, David Lang wrote: In SQL_ASCII, just take the first 4 characters (or 8, if using a 64-bit sortKey as elsewhere suggested). The sorting key doesn't need to be a one-to-one mapping. that would violate your second contraint ( f(a)==f(b) iff (a==b) ) if

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create

2006-02-17 Thread Ragnar
On fös, 2006-02-17 at 08:01 -0500, Ron wrote: At 04:24 AM 2/17/2006, Ragnar wrote: On fös, 2006-02-17 at 01:20 -0500, Ron wrote: OK, so here's _a_ way (there are others) to obtain a mapping such that if a b then f(a) f (b) and if a == b then f(a) == f(b) By scanning the

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-17 Thread Tom Lane
Mark Lewis [EMAIL PROTECTED] writes: I think we're actually on the same page here; you're right that the constraint above ( f(a)==f(b) iff a==b ) can't be extended to data types with more than 32 bits of value space. But the constraint I listed was actually: if a==b then f(a)==f(b) I

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create

2006-02-17 Thread Gregory Maxwell
On 2/17/06, Ragnar [EMAIL PROTECTED] wrote: Say again ? Let us say you have 1 billion rows, where the column in question contains strings like baaaaaa baaaaab baaaaac ... not necessarily in this order on disc of course The minimum value

Re: [HACKERS] qsort again

2006-02-16 Thread Florian Weimer
* Neil Conway: On Wed, 2006-02-15 at 18:28 -0500, Tom Lane wrote: It seems clear that our qsort.c is doing a pretty awful job of picking qsort pivots, while glibc is mostly managing not to make that mistake. I haven't looked at the glibc code yet to see what they are doing differently.

Re: [HACKERS] qsort again

2006-02-16 Thread Martijn van Oosterhout
On Thu, Feb 16, 2006 at 01:10:48PM +0100, Florian Weimer wrote: * Neil Conway: On Wed, 2006-02-15 at 18:28 -0500, Tom Lane wrote: It seems clear that our qsort.c is doing a pretty awful job of picking qsort pivots, while glibc is mostly managing not to make that mistake. I haven't

Re: [PERFORM] [HACKERS] qsort again

2006-02-16 Thread Sven Geisler
Martijn van Oosterhout schrieb: Last time around there were a number of different algorithms tested. Did anyone run those tests while getting it to count the number of actual comparisons (which could easily swamp the time taken to do the actual sort in some cases)? The last time I did such

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-16 Thread Ron
At 06:35 AM 2/16/2006, Steinar H. Gunderson wrote: On Wed, Feb 15, 2006 at 11:30:54PM -0500, Ron wrote: Even better (and more easily scaled as the number of GPR's in the CPU changes) is to use the set {L; L+1; L+2; t1; R-2; R-1; R} This means that instead of 7 random memory accesses, we have

Re: [PERFORM] [HACKERS] qsort again

2006-02-16 Thread Ron
At 07:10 AM 2/16/2006, Florian Weimer wrote: * Neil Conway: On Wed, 2006-02-15 at 18:28 -0500, Tom Lane wrote: It seems clear that our qsort.c is doing a pretty awful job of picking qsort pivots, while glibc is mostly managing not to make that mistake. I haven't looked at the glibc code yet

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-16 Thread Markus Schaber
Hi, Ron, Ron wrote: ...and of course if you know enough about the data to be sorted so as to constrain it appropriately, one should use a non comparison based O(N) sorting algorithm rather than any of the general comparison based O(NlgN) methods. Sounds interesting, could you give us some

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-16 Thread Jonah H. Harris
Last night I implemented a non-recursive introsort in C... let me test it a bit more and then I'll post it here for everyone else to try out.On 2/16/06, Markus Schaber [EMAIL PROTECTED] wrote:Hi, Ron, Ron wrote: ...and of course if you know enough about the data to be sorted so as to constrain it

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-16 Thread Martijn van Oosterhout
On Thu, Feb 16, 2006 at 08:22:55AM -0500, Ron wrote: 3= Especially in modern systems where the gap between internal CPU bandwidth and memory bandwidth is so great, the overhead of memory accesses for comparisons and moves is the majority of the overhead for both the pivot choosing and the

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create

2006-02-16 Thread Ron
At 09:48 AM 2/16/2006, Martijn van Oosterhout wrote: On Thu, Feb 16, 2006 at 08:22:55AM -0500, Ron wrote: 3= Especially in modern systems where the gap between internal CPU bandwidth and memory bandwidth is so great, the overhead of memory accesses for comparisons and moves is the majority of

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create

2006-02-16 Thread Tom Lane
Ron [EMAIL PROTECTED] writes: Your cost comment basically agrees with mine regarding the cost of random memory accesses. The good news is that the number of datums to be examined during the pivot choosing process is small enough that the datums can fit into CPU cache while the pointers to

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-16 Thread Craig A. James
Markus Schaber wrote: Ron wrote: ...and of course if you know enough about the data to be sorted so as to constrain it appropriately, one should use a non comparison based O(N) sorting algorithm rather than any of the general comparison based O(NlgN) methods. Sounds interesting, could you

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create

2006-02-16 Thread Ron
At 10:52 AM 2/16/2006, Ron wrote: At 09:48 AM 2/16/2006, Martijn van Oosterhout wrote: Where this does become interesting is where we can convert a datum to an integer such that if f(A) f(B) then A B. Then we can sort on f(X) first with just integer comparisons and then do a full tuple

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create

2006-02-16 Thread Martijn van Oosterhout
On Thu, Feb 16, 2006 at 11:32:55AM -0500, Ron wrote: At 10:52 AM 2/16/2006, Ron wrote: In fact we can do better. Using hash codes or what-not to map datums to keys and then sorting just the keys and the pointers to those datums followed by an optional final pass where we do the actual data

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-16 Thread Tom Lane
Craig A. James [EMAIL PROTECTED] writes: You can also use this trick when the optimizer is asked for fastest first result. Say you have a cursor on a column of numbers with good distribution. If you do a bucket sort on the first two or three digits only, you know the first page of results

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create

2006-02-16 Thread Scott Lamb
On Feb 16, 2006, at 8:32 AM, Ron wrote: Let's pretend that we have the typical DB table where rows are ~2-4KB apiece. 1TB of storage will let us have 256M-512M rows in such a table. A 32b hash code can be assigned to each row value such that only exactly equal rows will have the same

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-16 Thread Dann Corbit
-Original Message- From: [EMAIL PROTECTED] [mailto:pgsql-hackers- [EMAIL PROTECTED] On Behalf Of Markus Schaber Sent: Thursday, February 16, 2006 5:45 AM To: pgsql-performance@postgresql.org; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] qsort again (was Re: [PERFORM] Strange

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create

2006-02-16 Thread Ron
At 12:19 PM 2/16/2006, Scott Lamb wrote: On Feb 16, 2006, at 8:32 AM, Ron wrote: Let's pretend that we have the typical DB table where rows are ~2-4KB apiece. 1TB of storage will let us have 256M-512M rows in such a table. A 32b hash code can be assigned to each row value such that only

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-16 Thread Neil Conway
On Thu, 2006-02-16 at 12:35 +0100, Steinar H. Gunderson wrote: glibc-2.3.5/stdlib/qsort.c: /* Order size using quicksort. This implementation incorporates four optimizations discussed in Sedgewick: I can't see any references to merge sort in there at all. stdlib/qsort.c defines

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-16 Thread Mark Lewis
On Thu, 2006-02-16 at 12:15 -0500, Tom Lane wrote: Once or twice we've kicked around the idea of having some datatype-specific sorting code paths alongside the general-purpose one, but I can't honestly see this as being workable from a code maintenance standpoint.

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-16 Thread Markus Schaber
Hi, Mark, Mark Lewis schrieb: It seems that instead of maintaining a different sorting code path for each data type, you could get away with one generic path and one (hopefully faster) path if you allowed data types to optionally support a 'sortKey' interface by providing a function f which

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-16 Thread Martijn van Oosterhout
On Thu, Feb 16, 2006 at 02:17:36PM -0800, Mark Lewis wrote: It seems that instead of maintaining a different sorting code path for each data type, you could get away with one generic path and one (hopefully faster) path if you allowed data types to optionally support a 'sortKey' interface by

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-16 Thread Greg Stark
Markus Schaber [EMAIL PROTECTED] writes: Hmm, to remove redundancy, I'd change the = to a and define: if a==b then f(a)==f(b) if ab then f(a)=f(b) Data types which could probably provide a useful function for f would be int2, int4, oid, and possibly int8 and text (at least for

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-16 Thread Mark Lewis
On Thu, 2006-02-16 at 17:51 -0500, Greg Stark wrote: Data types which could probably provide a useful function for f would be int2, int4, oid, and possibly int8 and text (at least for SQL_ASCII). How exactly do you imagine doing this for text? I could see doing it for

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-16 Thread David Lang
On Thu, 16 Feb 2006, Mark Lewis wrote: On Thu, 2006-02-16 at 17:51 -0500, Greg Stark wrote: Data types which could probably provide a useful function for f would be int2, int4, oid, and possibly int8 and text (at least for SQL_ASCII). How exactly do you imagine doing this for text? I could

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create

2006-02-16 Thread Ron
At 01:47 PM 2/16/2006, Ron wrote: At 12:19 PM 2/16/2006, Scott Lamb wrote: On Feb 16, 2006, at 8:32 AM, Ron wrote: Let's pretend that we have the typical DB table where rows are ~2-4KB apiece. 1TB of storage will let us have 256M-512M rows in such a table. A 32b hash code can be assigned to

[HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-02-15 Thread Tom Lane
Gary Doades [EMAIL PROTECTED] writes: If I run the script again, it is not always the first case that is slow, it varies from run to run, which is why I repeated it quite a few times for the test. For some reason I hadn't immediately twigged to the fact that your test script is just N

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-02-15 Thread Gary Doades
Tom Lane wrote: For some reason I hadn't immediately twigged to the fact that your test script is just N repetitions of the exact same structure with random data. So it's not so surprising that you get random variations in behavior with different test data sets. It seems clear that our

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-02-15 Thread Tom Lane
Gary Doades [EMAIL PROTECTED] writes: Is this likely to hit me in a random fashion during normal operation, joins, sorts, order by for example? Yup, anytime you're passing data with that kind of distribution through a sort. So the options are: 1) Fix the included qsort.c code and use that

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-02-15 Thread Tom Lane
Gary Doades [EMAIL PROTECTED] writes: Ouch! That confirms my problem. I generated the random test case because it was easier than including the dump of my tables, but you can appreciate that tables 20 times the size are basically crippled when it comes to creating an index on them.

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-15 Thread Ron
This behavior is consistent with the pivot choosing algorithm assuming certain distribution(s) for the data. For instance, median-of-three partitioning is known to be pessimal when the data is geometrically or hyper-geometrically distributed. Also, care must be taken that sometimes is not

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-02-15 Thread Tom Lane
I wrote: Gary Doades [EMAIL PROTECTED] writes: Ouch! That confirms my problem. I generated the random test case because it was easier than including the dump of my tables, but you can appreciate that tables 20 times the size are basically crippled when it comes to creating an index on

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-02-15 Thread Tom Lane
Ron [EMAIL PROTECTED] writes: How are we choosing our pivots? See qsort.c: it looks like median of nine equally spaced inputs (ie, the 1/8th points of the initial input array, plus the end points), implemented as two rounds of median-of-three choices. With half of the data inputs zero, it's not

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-02-15 Thread Dann Corbit
-Original Message- From: [EMAIL PROTECTED] [mailto:pgsql-hackers- [EMAIL PROTECTED] On Behalf Of Tom Lane Sent: Wednesday, February 15, 2006 5:22 PM To: Ron Cc: pgsql-performance@postgresql.org; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] qsort again (was Re: [PERFORM

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-02-15 Thread Christopher Kings-Lynne
Ouch! That confirms my problem. I generated the random test case because it was easier than including the dump of my tables, but you can appreciate that tables 20 times the size are basically crippled when it comes to creating an index on them. I have to say that I restored a few gigabyte

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-15 Thread Simon Riggs
On Wed, 2006-02-15 at 19:59 -0500, Tom Lane wrote: I get amazingly stable runtimes now --- I didn't have the patience to run 100 trials, but in 30 trials I have slowest 11538 msec and fastest 11144 msec. So this code path is definitely not very sensitive to this data distribution. The

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-15 Thread Neil Conway
On Wed, 2006-02-15 at 18:28 -0500, Tom Lane wrote: It seems clear that our qsort.c is doing a pretty awful job of picking qsort pivots, while glibc is mostly managing not to make that mistake. I haven't looked at the glibc code yet to see what they are doing differently. glibc qsort is

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-02-15 Thread Qingqing Zhou
Tom Lane [EMAIL PROTECTED] wrote I did this 100 times and sorted the reported runtimes. I'd say this puts a considerable damper on my enthusiasm for using our qsort all the time, as was recently debated in this thread: http://archives.postgresql.org/pgsql-hackers/2005-12/msg00610.php 100

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-02-15 Thread Tom Lane
Qingqing Zhou [EMAIL PROTECTED] writes: By did this 100 times do you mean generate a sequence of at most 20*100 numbers, and for every 20 numbers, the first half are all zeros and the other half are uniform random numbers? No, I mean I ran the bit of SQL script I gave 100 separate

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-02-15 Thread Qingqing Zhou
Tom Lane [EMAIL PROTECTED] wrote Qingqing Zhou [EMAIL PROTECTED] writes: By did this 100 times do you mean generate a sequence of at most 20*100 numbers, and for every 20 numbers, the first half are all zeros and the other half are uniform random numbers? No, I mean I ran the bit

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-02-15 Thread Qingqing Zhou
Qingqing Zhou [EMAIL PROTECTED] wrote I must misunderstand something here -- I can't figure out that why the cost of the same procedure keep climbing? Ooops, I mis-intepret the sentence -- you sorted the results ... Regards, Qingqing ---(end of

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-02-15 Thread Tom Lane
Qingqing Zhou [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] wrote No, I mean I ran the bit of SQL script I gave 100 separate times. I must misunderstand something here -- I can't figure out that why the cost of the same procedure keep climbing? No, the run cost varies randomly

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-15 Thread Ron
At 08:21 PM 2/15/2006, Tom Lane wrote: Ron [EMAIL PROTECTED] writes: How are we choosing our pivots? See qsort.c: it looks like median of nine equally spaced inputs (ie, the 1/8th points of the initial input array, plus the end points), implemented as two rounds of median-of-three choices.