Re: [HACKERS] Sorting Improvements for 8.4

2008-03-27 Thread Bruce Momjian
Added to TODO: > * Consider being smarter about memory and external files used during > sorts > > http://archives.postgresql.org/pgsql-hackers/2007-11/msg01101.php > http://archives.postgresql.org/pgsql-hackers/2007-12/msg00045.php -

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-27 Thread James Mansion
Ron Mayer wrote: Or do you mean being able to perform parts of the query plan fully in parallel? If this, then one would need a lot more than ParallelSort... I wouldn't recommend that - it seems like a Hard Problem. Isn't it the case that the implicit unions from processing partitioned

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-27 Thread Joris Dobbelsteen
>-Original Message- >From: [EMAIL PROTECTED] >[mailto:[EMAIL PROTECTED] On Behalf Of Ron Mayer >Sent: Wednesday, 19 December 2007 19:26 >To: Mark Mielke; pgsql-hackers@postgresql.org >Subject: Re: [HACKERS] Sorting Improvements for 8.4 > >> Or do you mean bei

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-22 Thread Gregory Stark
"Brian Hurt" <[EMAIL PROTECTED]> writes: > 3) It's possible to perform the sort lazily. You have the initial O(N) pass > over the list, but then each block is only O(log N) cost. If it's likely that > only the first part of the result is needed, then much of the work can be > avoided. Now that'

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-21 Thread Brian Hurt
Brian Hurt wrote: While we're blue skying things, I've had an idea for a sorting algorithm kicking around for a couple of years that might be interesting. It's a variation on heapsort to make it significantly more block-friendly. I have no idea if the idea would work, or how well it'd work,

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-20 Thread Mark Mielke
Jeff Davis wrote: Where parallel processing like this becomes attractive is when you're running a 2 hour query on a machine sequentially running scheduled batch jobs which can be sped up to 30 minutes. But in that case you're almost certainly being limited by your disk bandwidth, not your cpu spe

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-20 Thread Jeff Davis
On Thu, 2007-12-20 at 01:26 +, Gregory Stark wrote: > I suspect most of that is spent just copying the data around. Which would not > be helped by having multiple threads doing the copying -- and in fact might be > exacerbated if it required an extra copy to consolidate all the data in the > en

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-20 Thread Mark Mielke
Jeff Davis wrote: On Wed, 2007-12-19 at 15:51 -0500, Mark Mielke wrote: That sounds possible, but I still feel myself suspecting that disk reads will be much slower than localized text comparison. Perhaps I am overestimating the performance of the comparison function? I think this simpl

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-20 Thread Ron Mayer
Gregory Stark wrote: > Note that speeding up a query from 20s to 5s isn't terribly useful. I disagree totally with that. That is the difference between no chance of someone waiting for a web page to load; vs. a good chance they'd wait. And 2s vs 0.5s is the difference between a web site that f

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-20 Thread Dann Corbit
> -Original Message- > From: [EMAIL PROTECTED] [mailto:pgsql-hackers- > [EMAIL PROTECTED] On Behalf Of Brian Hurt > Sent: Thursday, December 20, 2007 6:42 AM > To: pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Sorting Improvements for 8.4 > > While we

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-20 Thread Greg Smith
On Thu, 20 Dec 2007, Martijn van Oosterhout wrote: The way around this is a NUMA architecture, but that's a whole other ball of wax. Quick note for those reading Ulrich's paper: he refers in a couple of places to Intel's upcoming CSI approach to NUMA. This has now been renamed QuickPath, a

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-20 Thread Brian Hurt
While we're blue skying things, I've had an idea for a sorting algorithm kicking around for a couple of years that might be interesting. It's a variation on heapsort to make it significantly more block-friendly. I have no idea if the idea would work, or how well it'd work, but it might be wor

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-20 Thread Martijn van Oosterhout
On Wed, Dec 19, 2007 at 07:17:21PM -0800, Ron Mayer wrote: > > but it doesn't follow that you will get > > 10X improvement with 10 threads, or even 4X with 4. > > Yeah - unless those 10 cores have additional I/O to the > memories compared to a 1 core system (which I'd hope > would be the case or e

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Greg Smith
On Thu, 20 Dec 2007, Gregory Stark wrote: Surely such machines have kickass memory backplanes too though? How could it ever be reasonable to have an i/o controller with more bandwidth than your memory? Dann had the right general numbers here--max of 6.4GB/s between processors and you might co

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Ron Mayer
Tom Lane wrote: > ...I can believe that suitable test cases would show > 2X improvement for 2 threads, One other thing I found interesting is that their test case showed a near 2X improvement for hyperthreading; where I haven't heard of many other ways to get hyperthreading to show improvements f

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Gregory Stark
"Greg Smith" <[EMAIL PROTECTED]> writes: > On Wed, 19 Dec 2007, Dann Corbit wrote: > >> Benchmarking a single system will really only explain that system. >> Someone may have a disk farm with 2GB/Sec throughput >> But such a configuration is very unlikely. > > If you believe comments like those at

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Gregory Stark
"Dann Corbit" <[EMAIL PROTECTED]> writes: >> Note that speeding up a query from 20s to 5s isn't terribly useful. If it's >> OLTP you can't be using all your cores for each user anyways. And if it's >> DSS 20s isn't a problem. > > Unless (of course) there are 20,000 users doing the queries that wou

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Greg Smith
On Wed, 19 Dec 2007, Dann Corbit wrote: Benchmarking a single system will really only explain that system. Someone may have a disk farm with 2GB/Sec throughput But such a configuration is very unlikely. If you believe comments like those at http://www.c0t0d0s0.org/archives/1792-Do-it-yourself

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Dann Corbit
TECTED] > Subject: Re: [HACKERS] Sorting Improvements for 8.4 > > > "Jeff Davis" <[EMAIL PROTECTED]> writes: > > > test=> explain analyze select * from sorter order by t; > > test=> explain analyze select * from sorter order by b; > > test=> ex

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Gregory Stark
"Jeff Davis" <[EMAIL PROTECTED]> writes: > test=> explain analyze select * from sorter order by t; > test=> explain analyze select * from sorter order by b; > test=> explain analyze select * from sorter order by f; > > On my machine this table fits easily in memory (so there aren't any disk > re

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Tom Lane
Mark Mielke <[EMAIL PROTECTED]> writes: > Jeff Davis wrote: >> Also, there is probably a lot of memory copying going on, and that >> probably destroys a lot of the effectiveness of L2 caching. When L2 >> caching is ineffective, the CPU spends a lot of time just waiting on >> memory. In that case, i

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Jeff Davis
On Wed, 2007-12-19 at 15:19 -0800, Dann Corbit wrote: > The algorithm that I am suggesting will take exactly one pass to merge > all of the files. > >From tuplesort.c: "In the current code we determine the number of tapes M on the basis of workMem: we want workMem/M to be large enough that we re

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Dann Corbit
P.S. A beautiful paper on replacement selection is found here: http://students.fim.uni-passau.de/~fickensc/Proseminar/Proseminar.pdf ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Dann Corbit
> -Original Message- > From: Jeff Davis [mailto:[EMAIL PROTECTED] > Sent: Wednesday, December 19, 2007 3:10 PM > To: Dann Corbit > Cc: pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Sorting Improvements for 8.4 > > On Wed, 2007-12-19 at 14:41 -0800, Dann C

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Jeff Davis
On Wed, 2007-12-19 at 14:41 -0800, Dann Corbit wrote: > As long as sorting improvements are being considered, may I suggest an > experiment that uses a very simple model? > > Assuming that you have K subfiles created by the initial sorting pass, > insert the top record of each file into a priority

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Jeff Davis
On Wed, 2007-12-19 at 15:51 -0500, Mark Mielke wrote: > That sounds possible, but I still feel myself suspecting that disk > reads will be much slower than localized text comparison. Perhaps I am > overestimating the performance of the comparison function? I think this simple test will change you

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Dann Corbit
As long as sorting improvements are being considered, may I suggest an experiment that uses a very simple model? Assuming that you have K subfiles created by the initial sorting pass, insert the top record of each file into a priority queue. Then, emit records from the queue until the priority qu

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Mark Mielke
Jeff Davis wrote: On Wed, 2007-12-19 at 12:08 -0500, Mark Mielke wrote: Stupid question #2: Is it well recognized that the CPU is the bottleneck in the PostgreSQL sorting mechanism? Or might it be memory bandwidth and I/O? I think it depends a lot on several factors. It's probably a dif

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Mark Mielke
Ron Mayer wrote: The link in the beginning of the thread points to articles that seem to describe one such algorithm; along with benchmarks. (http://tinyurl.com/3bvu4u, http://tinyurl.com/32wg2m) The improvements were pretty consistent from set sizes ranging from very small sets (hundreds) to qui

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Ron Mayer
Mark Mielke wrote: > I am curious - what algorithms exist to efficiently do a parallel sort? > Do you mean if sorting 1 million items, it is possible to separate this > into 2 sets of 500 thousand each, execute them in separate threads > (with task administration and synchronization overhead) , co

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Jeff Davis
On Wed, 2007-12-19 at 12:08 -0500, Mark Mielke wrote: > Stupid question #2: Is it well recognized that the CPU is the > bottleneck in the PostgreSQL sorting mechanism? Or might it be memory > bandwidth and I/O? > I think it depends a lot on several factors. It's probably a different bottleneck fo

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Mark Mielke
Michał Zaborowski wrote: Ok - we want to sort table with quick sort and we want to do it on - N threads. Every thread - gets begin and end of indices of the table. First step starts at 0 and lasts with count -1. Single step: find medium value and move lover before it and bigger after. In normal

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Tom Lane
Andreas Joseph Krogh <[EMAIL PROTECTED]> writes: > And remember; Users don't care about portability-issues, they care about > performance. Nonsense. If Postgres stops working on their machine, they'll care. regards, tom lane ---(end of broadcast)

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Andreas Joseph Krogh
On Tuesday 18 December 2007 10:03:25 Dimitri Fontaine wrote: > Hi, > > Le mardi 18 décembre 2007, Ron Mayer a écrit : > > Has anyone looked into sorting algorithms that could use > > more than one CPU or core at a time? > > [...] > > > PS: Yeah, I know multi-threading is a hot-button on these > > l

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Michał Zaborowski
2007/12/19, Mark Mielke <[EMAIL PROTECTED]>: > Jeff Davis wrote: > > My first thought would be that we would need a new executor node (e.g. > > "ParallelSort") that would only be chosen when the cost of the sort is > > large enough to outweigh other factors (such as process creation time, > > divid

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Mark Mielke
Jeff Davis wrote: My first thought would be that we would need a new executor node (e.g. "ParallelSort") that would only be chosen when the cost of the sort is large enough to outweigh other factors (such as process creation time, dividing available work_mem, and any necessary IPC). It seems to

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-18 Thread Jeff Davis
On Tue, 2007-12-18 at 09:31 +, Simon Riggs wrote: > On Mon, 2007-12-17 at 16:34 -0800, Ron Mayer wrote: > > > PS: Yeah, I know multi-threading is a hot-button on these > > lists; but sorting seems a relatively isolated of the code > > and I'd wonder if it'd be isolate-able enough that multiple

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-18 Thread Simon Riggs
On Mon, 2007-12-17 at 16:34 -0800, Ron Mayer wrote: > PS: Yeah, I know multi-threading is a hot-button on these > lists; but sorting seems a relatively isolated of the code > and I'd wonder if it'd be isolate-able enough that multiple > CPUs could be used there. I'm not sure multi-threading is th

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-18 Thread Dimitri Fontaine
Hi, Le mardi 18 décembre 2007, Ron Mayer a écrit : > Has anyone looked into sorting algorithms that could use > more than one CPU or core at a time? [...] > PS: Yeah, I know multi-threading is a hot-button on these > lists; but sorting seems a relatively isolated of the code > and I'd wonder if it

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-17 Thread Ron Mayer
Has anyone looked into sorting algorithms that could use more than one CPU or core at a time? Benchmarks I see[1][2] suggest that sorting is an area that improves greatly with multiple processors and even with multi-threading on a single core processor. "For 1-processor and 2-threads (1p2t), t

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-03 Thread Jeff Davis
On Mon, 2007-12-03 at 20:40 +, Gregory Stark wrote: > I'm not sure where the idea of keeping the current bounds of the input tapes > comes into it. We only preread when we run out of tuples anyways and then we > don't really have a choice about which tape we want to preread from. And it's > a g

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-03 Thread Jeff Davis
On Mon, 2007-12-03 at 20:40 +, Gregory Stark wrote: > So the question is just how many seeks are we doing during sorting. If we're > doing 0.1% seeks and 99.9% sequential i/o then eliminating the 1% entirely > (which we can't do) isn't going to speed up seeking all that much. If we're > doing 2

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-03 Thread Gregory Stark
"Simon Riggs" <[EMAIL PROTECTED]> writes: > The buffer size at max tapes is an optimum - a trade off between > avoiding intermediate merging and merging efficiently. Freeing more > memory is definitely going to help in the case of low work_mem and lots > of runs. I can't follow these abstract arg

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-03 Thread Simon Riggs
On Mon, 2007-12-03 at 20:40 +, Gregory Stark wrote: > I think the way to do what you're proposing is... Don't understand that. Algorithm F covers that already doesn't it? > So the question is just how many seeks are we doing during sorting. If we're > doing 0.1% seeks and 99.9% sequential i/

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-03 Thread Gregory Stark
"Simon Riggs" <[EMAIL PROTECTED]> writes: > On Mon, 2007-12-03 at 10:32 -0800, Jeff Davis wrote: > >> If I understand correctly, we just keep track of the maximum value of >> the last block read from each run, and then always read from the run in >> which the last block read has the lowest maximu

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-03 Thread Simon Riggs
On Mon, 2007-12-03 at 10:32 -0800, Jeff Davis wrote: > If I understand correctly, we just keep track of the maximum value of > the last block read from each run, and then always read from the run in > which the last block read has the lowest maximum. Yep, sounds like Algorithm F > It seems as if

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-03 Thread Jeff Davis
On Mon, 2007-12-03 at 11:51 +, Simon Riggs wrote: > So Algorithm F is somewhat orthogonal to what I've proposed, though > maybe still interesting. What we do now is fairly close, but you should > look at the code in tuplesort.c and logtape.c to see how well it > matches. That might lead to an i

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-03 Thread Simon Riggs
On Fri, 2007-11-30 at 12:07 -0800, Jeff Davis wrote: > On Tue, 2007-11-27 at 18:03 +, Simon Riggs wrote: > > 5. DYNAMIC RUN HANDLING (in Final Merge) > > > > Another way of addressing a) is to simply make better use of memory > > itself. Let's look at that in more detail: > > > > Number of r

Re: [HACKERS] Sorting Improvements for 8.4

2007-11-30 Thread Jeff Davis
On Tue, 2007-11-27 at 18:03 +, Simon Riggs wrote: > 5. DYNAMIC RUN HANDLING (in Final Merge) > > Another way of addressing a) is to simply make better use of memory > itself. Let's look at that in more detail: > > Number of runs that can be merged at once is currently fixed, based upon > avai

Re: [HACKERS] Sorting Improvements for 8.4

2007-11-28 Thread Sam Mason
On Tue, Nov 27, 2007 at 06:03:46PM +, Simon Riggs wrote: > Just wanted to review a few thoughts and ideas around improving external > sorts, as recently encouraged to do by Jim Nasby. Is there any way of PG knowing that having an index on a subset of the sorted columns is sometimes a win? Fo

[HACKERS] Sorting Improvements for 8.4

2007-11-27 Thread Simon Riggs
Just wanted to review a few thoughts and ideas around improving external sorts, as recently encouraged to do by Jim Nasby. Current issues/opportunities are these: ISSUES a) Memory is always in short supply, so using what we have more effectively is going to be welcome. b) Heap sort has a reaso