Re: [HACKERS] optimizer questions
--On Dienstag, Februar 14, 2006 10:35:12 -0600 hector Corrada Bravo [EMAIL PROTECTED] wrote: Before I start trying this (creating aggregate paths seems the reasonable thing to do) I would like your counsel. 1) Regardless of the optimization problem, is the executor able to execute aggregate nodes within join trees (that is, not as the result of subqueries)? 2) Has anyone tried something like this before? I did and failed because I did not quite understand how the Postgres internal variables should be initialized. My approach was to supply other join pathes if one of the two tables was an aggregate. Mit freundlichem Gruß Jens Schicke -- Jens Schicke [EMAIL PROTECTED] asco GmbH http://www.asco.de Mittelweg 7 Tel 0531/3906-127 38106 BraunschweigFax 0531/3906-400 ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Patch Submission Guidelines
On Tue, Feb 14, 2006 at 05:28:54PM -0500, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: How much time would you need? I think having every patch built before anyone even looks at the code would sort out most of the issues I mentioned. If I ran a buildfarm machine, I'd turn it off immediately if anyone proposed setting up a system that would cause it to run code no one had vetted... so I don't think the above will fly. It might or might not be worth doing something with patches that have passed some kind of initial review but aren't yet applied. Ofcourse not totally unvetted code, but something like Bruce's patch queue. Something that would compile them and tell you if they pass regression, or even note when a patch no longer applied cleanly to -HEAD. I was thinking it might be useful to have a level between committer and just a regular person. Sort of like we don't trust this guy to commit to -HEAD but enough to run basic tests on the patches. IMHO the thing we are really seriously short of is patch reviewers. Neil and Bruce and I seem to be the only ones doing that much at all, and the main burden is falling on Bruce. More eyeballs would help much more than throwing machines at the problem. Yeah. Unfortunatly the parts of the code I am familiar with are not the parts people submit patches on :(. There a lot of code there... Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them. signature.asc Description: Digital signature
Re: [HACKERS] Patch Submission Guidelines
On Tue, 2006-02-14 at 21:47 -0500, Neil Conway wrote: On Tue, 2006-02-14 at 22:54 +, Simon Riggs wrote: On Tue, 2006-02-14 at 17:28 -0500, Tom Lane wrote: IMHO the thing we are really seriously short of is patch reviewers. [...] Well that was the basis of my original suggestion. Publish some guidelines and everybody becomes a patch reviewer. I agree guidelines would be help, but I hope (and doubt!) that is not what is stopping people from reviewing patches. Anyone with the time and inclination can review patches, guidelines or not Yes, anyone can review patches, but will the patch submitter listen to what has been said by the reviewer? Will a committer need to correct the review comments? If there is a park with a rule like Keep Off the Grass then it seems most sensible to put up a sign that says that, rather than increase the number of park keepers to explain the rules. Not everybody will take notice of the sign, true, but it does allow non-park keepers to point out that a guideline has not been followed. (Fairly sure that KOtG should not be part of the PostgreSQL FAQ though). [BTW, your patch reviewers guidelines were very good - FAQ also...] Best Regards, Simon Riggs ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Patch Submission Guidelines
[EMAIL PROTECTED] (Robert Treat) writes: On Tuesday 14 February 2006 16:00, Martijn van Oosterhout wrote: I would like to suggest that we increase substantially the FAQ entries relating to patch submission. By we, I actually mean please could the committers sit down and agree some clarified written guidelines? As I remember, there is a disinclination to increase the size of the FAQ very much. This suggests maintaining it as a seperate document. Or alternatively attach it as an appendix to the main documentation. Huh? The current developers FAQ is at least 1/2 the size of the main FAQ. I think adding a submission on patch submission guidelines is a great idea. I'll have a patch based on Simon's post to -patches ready in the next 24 hours unless someone is really going to object. If it were to be a new document, it would seem pretty sweet to call it The Hitchhiker's Guide To Getting Patches Accepted. One of the preface points would be along the lines of... Here are some guidelines as to what things to do to make it as easy as possible for proposed patches to be accepted with minimal change. To not follow them all does not forcibly guarantee rejection; it just increases the likelihood that the the amount of time and effort it takes to handle it increases... -- select 'cbbrowne' || '@' || 'acm.org'; http://cbbrowne.com/info/spiritual.html When campaigning, be swift as the wind; in leisurely march, majestic as the forest; in raiding and plundering, like fire; in standing, firm as the mountains. As unfathomable as the clouds, move like a thunderbolt. -- Sun Tzu, The Art of War ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)
Gary Doades [EMAIL PROTECTED] writes: If I run the script again, it is not always the first case that is slow, it varies from run to run, which is why I repeated it quite a few times for the test. For some reason I hadn't immediately twigged to the fact that your test script is just N repetitions of the exact same structure with random data. So it's not so surprising that you get random variations in behavior with different test data sets. I did some experimentation comparing the qsort from Fedora Core 4 (glibc-2.3.5-10.3) with our src/port/qsort.c. For those who weren't following the pgsql-performance thread, the test case is just this repeated a lot of times: create table atest(i int4, r int4); insert into atest (i,r) select generate_series(1,10), 0; insert into atest (i,r) select generate_series(1,10), random()*10; \timing create index idx on atest(r); \timing drop table atest; I did this 100 times and sorted the reported runtimes. (Investigation with trace_sort = on confirms that the runtime is almost entirely spent in qsort() called from our performsort --- the Postgres overhead is about 100msec on this machine.) Results are below. It seems clear that our qsort.c is doing a pretty awful job of picking qsort pivots, while glibc is mostly managing not to make that mistake. I haven't looked at the glibc code yet to see what they are doing differently. I'd say this puts a considerable damper on my enthusiasm for using our qsort all the time, as was recently debated in this thread: http://archives.postgresql.org/pgsql-hackers/2005-12/msg00610.php We need to fix our qsort.c before pushing ahead with that idea. regards, tom lane 100 runtimes for glibc qsort, sorted ascending: Time: 459.860 ms Time: 460.209 ms Time: 460.704 ms Time: 461.317 ms Time: 461.538 ms Time: 461.652 ms Time: 461.988 ms Time: 462.573 ms Time: 462.638 ms Time: 462.716 ms Time: 462.917 ms Time: 463.219 ms Time: 463.455 ms Time: 463.650 ms Time: 463.723 ms Time: 463.737 ms Time: 463.750 ms Time: 463.852 ms Time: 463.964 ms Time: 463.988 ms Time: 464.003 ms Time: 464.135 ms Time: 464.372 ms Time: 464.458 ms Time: 464.496 ms Time: 464.551 ms Time: 464.599 ms Time: 464.655 ms Time: 464.656 ms Time: 464.722 ms Time: 464.814 ms Time: 464.827 ms Time: 464.878 ms Time: 464.899 ms Time: 464.905 ms Time: 464.987 ms Time: 465.055 ms Time: 465.138 ms Time: 465.159 ms Time: 465.194 ms Time: 465.310 ms Time: 465.316 ms Time: 465.375 ms Time: 465.450 ms Time: 465.535 ms Time: 465.595 ms Time: 465.680 ms Time: 465.769 ms Time: 465.865 ms Time: 465.892 ms Time: 465.903 ms Time: 466.003 ms Time: 466.154 ms Time: 466.164 ms Time: 466.203 ms Time: 466.305 ms Time: 466.344 ms Time: 466.364 ms Time: 466.388 ms Time: 466.502 ms Time: 466.593 ms Time: 466.725 ms Time: 466.794 ms Time: 466.798 ms Time: 466.904 ms Time: 466.971 ms Time: 466.997 ms Time: 467.122 ms Time: 467.146 ms Time: 467.221 ms Time: 467.224 ms Time: 467.244 ms Time: 467.277 ms Time: 467.587 ms Time: 468.142 ms Time: 468.207 ms Time: 468.237 ms Time: 468.471 ms Time: 468.663 ms Time: 468.700 ms Time: 469.235 ms Time: 469.840 ms Time: 470.472 ms Time: 471.140 ms Time: 472.811 ms Time: 472.959 ms Time: 474.858 ms Time: 477.210 ms Time: 479.571 ms Time: 479.671 ms Time: 482.797 ms Time: 488.852 ms Time: 514.639 ms Time: 529.287 ms Time: 612.185 ms Time: 660.748 ms Time: 742.227 ms Time: 866.814 ms Time: 1234.848 ms Time: 1267.398 ms 100 runtimes for port/qsort.c, sorted ascending: Time: 418.905 ms Time: 420.611 ms Time: 420.764 ms Time: 420.904 ms Time: 421.706 ms Time: 422.466 ms Time: 422.627 ms Time: 423.189 ms Time: 423.302 ms Time: 425.096 ms Time: 425.731 ms Time: 425.851 ms Time: 427.253 ms Time: 430.113 ms Time: 432.756 ms Time: 432.963 ms Time: 440.502 ms Time: 440.640 ms Time: 450.452 ms Time: 458.143 ms Time: 459.212 ms Time: 467.706 ms Time: 468.006 ms Time: 468.574 ms Time: 470.003 ms Time: 472.313 ms Time: 483.622 ms Time: 492.395 ms Time: 509.564 ms Time: 531.037 ms Time: 533.366 ms Time: 535.610 ms Time: 575.523 ms Time: 582.688 ms Time: 593.545 ms Time: 647.364 ms Time: 660.612 ms Time: 677.312 ms Time: 680.288 ms Time: 697.626 ms Time: 833.066 ms Time: 834.511 ms Time: 851.819 ms Time: 920.443 ms Time: 926.731 ms Time: 954.289 ms Time: 1045.214 ms Time: 1059.200 ms Time: 1062.328 ms Time: 1136.018 ms Time: 1260.091 ms Time: 1276.883 ms Time: 1319.351 ms Time: 1438.854 ms Time: 1475.457 ms Time: 1538.211 ms Time: 1549.004 ms Time: 1744.642 ms Time: 1771.258 ms Time: 1959.530 ms Time: 2300.140 ms Time: 2589.641 ms Time: 2612.780 ms Time: 3100.024 ms Time: 3284.125 ms Time: 3379.792 ms Time: 3750.278 ms Time: 4302.278 ms Time: 4780.624 ms Time: 5000.056 ms Time: 5092.604 ms Time: 5168.722 ms Time: 5292.941 ms Time: 5895.964 ms Time: 7003.164 ms Time: 7099.449 ms Time: 7115.083 ms Time: 7384.940 ms Time: 8214.010 ms Time: 8700.771 ms Time: 9331.225 ms Time: 10503.360 ms Time: 12496.026 ms Time: 12982.474 ms Time: 15192.390 ms
Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)
Tom Lane wrote: For some reason I hadn't immediately twigged to the fact that your test script is just N repetitions of the exact same structure with random data. So it's not so surprising that you get random variations in behavior with different test data sets. It seems clear that our qsort.c is doing a pretty awful job of picking qsort pivots, while glibc is mostly managing not to make that mistake. I haven't looked at the glibc code yet to see what they are doing differently. I'd say this puts a considerable damper on my enthusiasm for using our qsort all the time, as was recently debated in this thread: http://archives.postgresql.org/pgsql-hackers/2005-12/msg00610.php We need to fix our qsort.c before pushing ahead with that idea. [snip] Time: 28314.182 ms Time: 29400.278 ms Time: 34142.534 ms Ouch! That confirms my problem. I generated the random test case because it was easier than including the dump of my tables, but you can appreciate that tables 20 times the size are basically crippled when it comes to creating an index on them. Examining the dump and the associated times during restore it looks like I have 7 tables with this approximate distribution, thus the ridiculously long restore time. Better not re-index soon! Is this likely to hit me in a random fashion during normal operation, joins, sorts, order by for example? So the options are: 1) Fix the included qsort.c code and use that 2) Get FreeBSD to fix their qsort code 3) Both I guess that 1 is the real solution in case anyone else's qsort is broken in the same way. Then at least you *could* use it all the time :) Regards, Gary. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)
Gary Doades [EMAIL PROTECTED] writes: Is this likely to hit me in a random fashion during normal operation, joins, sorts, order by for example? Yup, anytime you're passing data with that kind of distribution through a sort. So the options are: 1) Fix the included qsort.c code and use that 2) Get FreeBSD to fix their qsort code 3) Both I guess that 1 is the real solution in case anyone else's qsort is broken in the same way. Then at least you *could* use it all the time :) It's reasonable to assume that most of the *BSDen have basically the same qsort code. Ours claims to have come from NetBSD sources, but I don't doubt that they all trace back to a common ancestor. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)
Gary Doades [EMAIL PROTECTED] writes: Ouch! That confirms my problem. I generated the random test case because it was easier than including the dump of my tables, but you can appreciate that tables 20 times the size are basically crippled when it comes to creating an index on them. Actually... we only use qsort when we have a sorting problem that fits within the allowed sort memory. The external-sort logic doesn't go through that code at all. So all the analysis we just did on your test case doesn't necessarily apply to sort problems that are too large for the sort_mem setting. The test case would be sorting 20 index entries, which'd probably occupy at least 24 bytes apiece of sort memory, so probably about 5 meg. A problem 20 times that size would definitely not fit in the default 16MB maintenance_work_mem. Were you using a large value of maintenance_work_mem for your restore? regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index
This behavior is consistent with the pivot choosing algorithm assuming certain distribution(s) for the data. For instance, median-of-three partitioning is known to be pessimal when the data is geometrically or hyper-geometrically distributed. Also, care must be taken that sometimes is not when there are many equal values in the data. Even pseudo random number generator based pivot choosing algorithms are not immune if the PRNG is flawed in some way. How are we choosing our pivots? At 06:28 PM 2/15/2006, Tom Lane wrote: I did some experimentation comparing the qsort from Fedora Core 4 (glibc-2.3.5-10.3) with our src/port/qsort.c. For those who weren't following the pgsql-performance thread, the test case is just this repeated a lot of times: create table atest(i int4, r int4); insert into atest (i,r) select generate_series(1,10), 0; insert into atest (i,r) select generate_series(1,10), random()*10; \timing create index idx on atest(r); \timing drop table atest; I did this 100 times and sorted the reported runtimes. (Investigation with trace_sort = on confirms that the runtime is almost entirely spent in qsort() called from our performsort --- the Postgres overhead is about 100msec on this machine.) Results are below. It seems clear that our qsort.c is doing a pretty awful job of picking qsort pivots, while glibc is mostly managing not to make that mistake. I haven't looked at the glibc code yet to see what they are doing differently. I'd say this puts a considerable damper on my enthusiasm for using our qsort all the time, as was recently debated in this thread: http://archives.postgresql.org/pgsql-hackers/2005-12/msg00610.php We need to fix our qsort.c before pushing ahead with that idea. regards, tom lane 100 runtimes for glibc qsort, sorted ascending: Time: 459.860 ms snip Time: 488.852 ms Time: 514.639 ms Time: 529.287 ms Time: 612.185 ms Time: 660.748 ms Time: 742.227 ms Time: 866.814 ms Time: 1234.848 ms Time: 1267.398 ms 100 runtimes for port/qsort.c, sorted ascending: Time: 418.905 ms snip Time: 20865.979 ms Time: 21000.907 ms Time: 21297.585 ms Time: 21714.518 ms Time: 25423.235 ms Time: 27543.052 ms Time: 28314.182 ms Time: 29400.278 ms Time: 34142.534 ms ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)
I wrote: Gary Doades [EMAIL PROTECTED] writes: Ouch! That confirms my problem. I generated the random test case because it was easier than including the dump of my tables, but you can appreciate that tables 20 times the size are basically crippled when it comes to creating an index on them. Actually... we only use qsort when we have a sorting problem that fits within the allowed sort memory. The external-sort logic doesn't go through that code at all. So all the analysis we just did on your test case doesn't necessarily apply to sort problems that are too large for the sort_mem setting. I increased the size of the test case by 10x (basically s/10/100/) which is enough to push it into the external-sort regime. I get amazingly stable runtimes now --- I didn't have the patience to run 100 trials, but in 30 trials I have slowest 11538 msec and fastest 11144 msec. So this code path is definitely not very sensitive to this data distribution. While these numbers aren't glittering in comparison to the best-case qsort times (~450 msec to sort 10% as much data), they are sure a lot better than the worst-case times. So maybe a workaround for you is to decrease maintenance_work_mem, counterintuitive though that be. (Now, if you *weren't* using maintenance_work_mem of 100MB or more for your problem restore, then I'm not sure I know what's going on...) We still ought to try to fix qsort of course. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)
Ron [EMAIL PROTECTED] writes: How are we choosing our pivots? See qsort.c: it looks like median of nine equally spaced inputs (ie, the 1/8th points of the initial input array, plus the end points), implemented as two rounds of median-of-three choices. With half of the data inputs zero, it's not too improbable for two out of the three samples to be zeroes in which case I think the med3 result will be zero --- so choosing a pivot of zero is much more probable than one would like, and doing so in many levels of recursion causes the problem. I think. I'm not too sure if the code isn't just being sloppy about the case where many data values are equal to the pivot --- there's a special case there to switch to insertion sort, and maybe that's getting invoked too soon. It'd be useful to get a line-level profile of the behavior of this code in the slow cases... regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[HACKERS] Generating config stuff from single source
We are currently maintaining information about configuration parameters in at least three places: the documentation, guc.c, and postgresql.conf.sample. I would like to generate these from a single source. Computationally, this is not very challenging, it's just a bit of work. I imagine as the source an XML file with a custom schema; see below for an example. I think this is the best source format because it allows integrating the DocBook-formatted descriptions without too much trouble and it allows for file format validation. An alternative might be m4 but that would not offer these features. To process this we'd use XSLT stylesheets run through xsltproc. We'd run this part during the tarball building phase, so users would not need it. Obviously, all of this will need some fine-tuning, but can we agree on this general direction? parameters group titleQuery Tuning/title subgroup titlePlaner Method Configuration/title parameter nameenable_hashagg/name contextuserset/context shortdescEnables the planner's use of hashed aggregation plans./shortdesc longdescblah/longdesc vartypebool/vartype variableenable_hashagg/variable resetvaltrue/resetval min.../min max.../max assignhook.../assignhook showhook.../showhook /parameter /subgroup /group /parameters -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)
-Original Message- From: [EMAIL PROTECTED] [mailto:pgsql-hackers- [EMAIL PROTECTED] On Behalf Of Tom Lane Sent: Wednesday, February 15, 2006 5:22 PM To: Ron Cc: pgsql-performance@postgresql.org; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour) Ron [EMAIL PROTECTED] writes: How are we choosing our pivots? See qsort.c: it looks like median of nine equally spaced inputs (ie, the 1/8th points of the initial input array, plus the end points), implemented as two rounds of median-of-three choices. With half of the data inputs zero, it's not too improbable for two out of the three samples to be zeroes in which case I think the med3 result will be zero --- so choosing a pivot of zero is much more probable than one would like, and doing so in many levels of recursion causes the problem. Adding some randomness to the selection of the pivot is a known technique to fix the oddball partitions problem. However, Bentley and Sedgewick proved that every quick sort algorithm has some input set that makes it go quadratic (hence the recent popularity of introspective sort, which switches to heapsort if quadratic behavior is detected. The C++ template I submitted was an example of introspective sort, but PostgreSQL does not use C++ so it was not helpful). I think. I'm not too sure if the code isn't just being sloppy about the case where many data values are equal to the pivot --- there's a special case there to switch to insertion sort, and maybe that's getting invoked too soon. Here are some cases known to make qsort go quadratic: 1. Data already sorted 2. Data reverse sorted 3. Data organ-pipe sorted or ramp 4. Almost all data of the same value There are probably other cases. Randomizing the pivot helps some, as does check for in-order or reverse order partitions. Imagine if 1/3 of the partitions fall into a category that causes quadratic behavior (have one of the above formats and have more than CUTOFF elements in them). It is doubtful that the switch to insertion sort is causing any sort of problems. It is only going to be invoked on tiny sets, for which it has a fixed cost that is probably less that qsort() function calls on sets of the same size. It'd be useful to get a line-level profile of the behavior of this code in the slow cases... I guess that my in-order or presorted tests [which often arise when there are very few distinct values] may solve the bad partition problems. Don't forget that the algorithm is called recursively. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)
Ouch! That confirms my problem. I generated the random test case because it was easier than including the dump of my tables, but you can appreciate that tables 20 times the size are basically crippled when it comes to creating an index on them. I have to say that I restored a few gigabyte dump on freebsd the other day, and most of the restore time was in index creation - I didn't think too much of it though at the time. FreeBSD 4.x. Chris ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index
On Wed, 2006-02-15 at 19:59 -0500, Tom Lane wrote: I get amazingly stable runtimes now --- I didn't have the patience to run 100 trials, but in 30 trials I have slowest 11538 msec and fastest 11144 msec. So this code path is definitely not very sensitive to this data distribution. The worst-case behavior of replacement-selection is very close to its average behavior, while the worst-case behavior of QuickSort is terrible (N2) – a strong argument in favor of replacement-selection. Despite this risk, QuickSort is widely used because, in practice, it has superior performance. p.8, AlphaSort: A Cache-Sensitive Parallel External Sort, Nyberg et al, VLDB Journal 4(4): 603-627 (1995) I think your other comment about flipping to insertion sort too early (and not returning...) is a plausible cause for the poor pg qsort behaviour, but the overall spread of values seems as expected. Some test results I've seen seem consistent with the view that increasing memory also increases run-time for larger settings of work_mem/maintenance_work_mem. Certainly, as I observed a while back, having a large memory settings doesn't help you at all when you are doing final run merging on the external sort. Whatever we do, we should look at the value high memory settings bring to each phase of a sort separately from the other phases. There is work underway on improving external sorts, so I hear (not me). Plus my WIP on randomAccess requirements. Best Regards, Simon Riggs ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index
On Wed, 2006-02-15 at 18:28 -0500, Tom Lane wrote: It seems clear that our qsort.c is doing a pretty awful job of picking qsort pivots, while glibc is mostly managing not to make that mistake. I haven't looked at the glibc code yet to see what they are doing differently. glibc qsort is actually merge sort, so I'm not surprised it avoids this problem. -Neil ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)
Tom Lane [EMAIL PROTECTED] wrote I did this 100 times and sorted the reported runtimes. I'd say this puts a considerable damper on my enthusiasm for using our qsort all the time, as was recently debated in this thread: http://archives.postgresql.org/pgsql-hackers/2005-12/msg00610.php 100 runtimes for glibc qsort, sorted ascending: Time: 866.814 ms Time: 1234.848 ms Time: 1267.398 ms 100 runtimes for port/qsort.c, sorted ascending: Time: 28314.182 ms Time: 29400.278 ms Time: 34142.534 ms By did this 100 times do you mean generate a sequence of at most 20*100 numbers, and for every 20 numbers, the first half are all zeros and the other half are uniform random numbers? I tried to confirm it by patching the program mentioned in the link, but seems BSDqsort is still a little bit leading. Regards, Qingqing --- Result sort#./sort [3] [glibc qsort]: nelem(2000), range(4294901760) distr(halfhalf) ccost(2) : 18887.285000 ms [3] [BSD qsort]: nelem(2000), range(4294901760) distr(halfhalf) ccost(2) : 18801.018000 ms [3] [qsortG]: nelem(2000), range(4294901760) distr(halfhalf) ccost(2) : 22997.004000 ms --- Patch to sort.c sort#diff -c sort.c sort1.c *** sort.c Thu Dec 15 12:18:59 2005 --- sort1.c Wed Feb 15 22:21:15 2006 *** *** 35,43 {BSD qsort, qsortB}, {qsortG, qsortG} }; ! static const size_t d_nelem[] = {1000, 1, 10, 100, 500}; ! static const size_t d_range[] = {2, 32, 1024, 0xL}; ! static const char *d_distr[] = {uniform, gaussian, 95sorted, 95reversed}; static const size_t d_ccost[] = {2}; /* factor index */ --- 35,43 {BSD qsort, qsortB}, {qsortG, qsortG} }; ! static const size_t d_nelem[] = {500, 1000, 2000}; ! static const size_t d_range[] = {0xL}; ! static const char *d_distr[] = {halfhalf}; static const size_t d_ccost[] = {2}; /* factor index */ *** *** 180,185 --- 180,192 swap(karray[i], karray[nelem-i-1]); } } + else if (!strcmp(distr, halfhalf)) + { + int j; + for (i = 0; i nelem/20; i++) + for (j = 0; j 10; j++) + karray[i*20 + j] = 0; + } return array; } ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)
Qingqing Zhou [EMAIL PROTECTED] writes: By did this 100 times do you mean generate a sequence of at most 20*100 numbers, and for every 20 numbers, the first half are all zeros and the other half are uniform random numbers? No, I mean I ran the bit of SQL script I gave 100 separate times. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)
Tom Lane [EMAIL PROTECTED] wrote Qingqing Zhou [EMAIL PROTECTED] writes: By did this 100 times do you mean generate a sequence of at most 20*100 numbers, and for every 20 numbers, the first half are all zeros and the other half are uniform random numbers? No, I mean I ran the bit of SQL script I gave 100 separate times. I must misunderstand something here -- I can't figure out that why the cost of the same procedure keep climbing? Regards, Qingqing ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)
Qingqing Zhou [EMAIL PROTECTED] wrote I must misunderstand something here -- I can't figure out that why the cost of the same procedure keep climbing? Ooops, I mis-intepret the sentence -- you sorted the results ... Regards, Qingqing ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)
Qingqing Zhou [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] wrote No, I mean I ran the bit of SQL script I gave 100 separate times. I must misunderstand something here -- I can't figure out that why the cost of the same procedure keep climbing? No, the run cost varies randomly depending on the random data supplied by the test script. The reason the numbers are increasing is that I sorted them for ease of inspection. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index
At 08:21 PM 2/15/2006, Tom Lane wrote: Ron [EMAIL PROTECTED] writes: How are we choosing our pivots? See qsort.c: it looks like median of nine equally spaced inputs (ie, the 1/8th points of the initial input array, plus the end points), implemented as two rounds of median-of-three choices. OK, this is a bad way to do median-of-n partitioning for a few reasons. See Sedgewick's PhD thesis for details. Basically, if one is using median-of-n partitioning to choose a pivot, one should do it in =one= pass, and n for that pass should be = the numbers of registers in the CPU. Since the x86 ISA has 8 GPR's, n should be = 8. 7 for instance. Special purposing the median-of-n code so that the minimal number of comparisons and moves is used to sort the sample and then partitioning in place is the best way to do it. In addition, care must be taken to deal with the possibility that many of the keys may be equal. The (pseudo) code looks something like this: qs(a[],L,R){ if((R-L) SAMPLE_SIZE){ // Not worth using qs for too few elements SortSample(SAMPLE_SIZE,a[],L,R); // Sorts SAMPLE_SIZE= n elements and does median-of-n partitioning for small n // using the minimal number of comparisons and moves. // In the process it ends up partitioning the first n/2 and last n/2 elements // SAMPLE_SIZE is a constant chosen to work best for a given CPU. // #GPRs - 1 is a good initial guess. // For the x86 ISA, #GPRs - 1 = 7. For native x86-64, it's 15. // For most RISC CPUs it's 31 or 63. For Itanium, it's 127 (!) pivot= a[(L+R)1]; i= L+(SAMPLE_SIZE1); j= R-(SAMPLE_SIZE1); for(;;){ while(a[++i] pivot); while(a[--j] pivot); if(i = j) break; if(a[i] a[j]) swap(a[i],a[j]); } if((i-R) = (j-L)){qs(a,L,i-1);} else{qs(a,i,R);} else{OofN^2_Sort(a,L,R);} // SelectSort may be better than InsertSort if KeySize in bits RecordSize in bits } // End of qs Given that the most common CPU ISA in existence has 8 GPRs, SAMPLE_SIZE= 7 is probably optimal: t= (L+R); the set would be {L; t/8; t/4; t/2; 3*t/4; 7*t/8; R;} == {L; t3; t2; t1; (3*t)2; (7*t)3; R} as the locations. Even better (and more easily scaled as the number of GPR's in the CPU changes) is to use the set {L; L+1; L+2; t1; R-2; R-1; R} This means that instead of 7 random memory accesses, we have 3; two of which result in a burst access for three elements each. That's much faster; _and_ using a sample of 9, 15, 31, 63, etc (to max of ~GPRs -1) elements is more easily done. It also means that the work we do sorting the sample can be taken advantage of when starting inner loop of quicksort: items L..L+2, t, and R-2..R are already partitioned by SortSample(). Insuring that the minimum number of comparisons and moves is done in SortSample can be down by using a code generator to create a comparison tree that identifies which permutation(s) of n we are dealing with and then moving them into place with the minimal number of moves. SIDE NOTE: IIRC glibc's qsort is actually merge sort. Merge sort performance is insensitive to all inputs, and there are way to optimize it as well. I'll leave the actual coding to someone who knows the pg source better than I do. Ron ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] Blog post on EnterpriseDB...maybe off topic
http://www.flamingspork.com/blog/2006/02/16/enterprisedb-where-is-the-source/ Any comments on this? Is he referring to EnterpriseDB extensions that they don't make public? Chris ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly