Re: [HACKERS] Polyphase merge is obsolete

2017-09-11 Thread Tomas Vondra
Hi, I planned to do some benchmarking on this patch, but apparently the patch no longer applies. Rebase please? regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list

Re: [HACKERS] Polyphase merge is obsolete

2017-08-30 Thread Peter Geoghegan
On Wed, Aug 30, 2017 at 12:48 PM, Robert Haas wrote: > These are separate topics. They should each be discussed on their own > thread. Please don't hijack this thread to talk about something else. I don't think that that is a fair summary. Heikki has done a number of

Re: [HACKERS] Polyphase merge is obsolete

2017-08-30 Thread Robert Haas
On Wed, Aug 30, 2017 at 2:54 PM, Peter Geoghegan wrote: > I noticed that this is in the upcoming CF 1 for v11. I'm signed up to review. > > I'd like to point out that replacement selection is also obsolete, > which is something I brought up recently [1]. I don't actually have > any

Re: [HACKERS] Polyphase merge is obsolete

2017-08-30 Thread Peter Geoghegan
On Mon, Feb 27, 2017 at 2:45 PM, Peter Geoghegan wrote: > Since we have an awful lot of stuff in the last CF, and this patch > doesn't seem particularly strategic, I've marked it "Returned with > Feedback". I noticed that this is in the upcoming CF 1 for v11. I'm signed up to

Re: [HACKERS] Polyphase merge is obsolete

2017-02-27 Thread Peter Geoghegan
On Mon, Jan 16, 2017 at 5:56 PM, Peter Geoghegan wrote: > On Wed, Oct 12, 2016 at 10:16 AM, Heikki Linnakangas wrote: >> The number of *input* tapes we can use in each merge pass is still limited, >> by the memory needed for the tape buffers and the merge heap,

Re: [HACKERS] Polyphase merge is obsolete

2017-01-16 Thread Peter Geoghegan
On Wed, Oct 12, 2016 at 10:16 AM, Heikki Linnakangas wrote: > The number of *input* tapes we can use in each merge pass is still limited, > by the memory needed for the tape buffers and the merge heap, but only one > output tape is active at any time. The inactive output tapes

Re: [HACKERS] Polyphase merge is obsolete

2016-10-14 Thread Peter Geoghegan
On Wed, Oct 12, 2016 at 10:16 AM, Heikki Linnakangas wrote: > Let's switch over to a simple k-way balanced merge. Because it's simpler. If > you're severely limited on memory, like when sorting 1GB of data with > work_mem='1MB' or less, it's also slightly faster. I'm not too

Re: [HACKERS] Polyphase merge is obsolete

2016-10-12 Thread Heikki Linnakangas
On 10/12/2016 08:27 PM, Tom Lane wrote: Heikki Linnakangas writes: The beauty of the polyphase merge algorithm is that it allows reusing input tapes as output tapes efficiently ... So the whole idea of trying to efficiently reuse input tapes as output tapes is pointless.

Re: [HACKERS] Polyphase merge is obsolete

2016-10-12 Thread Tom Lane
Heikki Linnakangas writes: > The beauty of the polyphase merge algorithm is that it allows reusing > input tapes as output tapes efficiently ... So the whole idea of trying to > efficiently reuse input tapes as output tapes is pointless. It's been awhile since I looked at that

[HACKERS] Polyphase merge is obsolete

2016-10-12 Thread Heikki Linnakangas
The beauty of the polyphase merge algorithm is that it allows reusing input tapes as output tapes efficiently. So if you have N tape drives, you can keep them all busy throughout the merge. That doesn't matter, when we can easily have as many "tape drives" as we want. In PostgreSQL, a tape

Re: [HACKERS] polyphase merge?

2009-02-04 Thread Simon Riggs
On Wed, 2009-02-04 at 14:42 +0800, Don Marvick wrote: 4. any other issue needs consideration? Most attempts to improve sorting further have fallen to nothing because of the lack of repeatable test results. In particular, coming up with test cases *after* writing the code is a good way to get

Re: [HACKERS] polyphase merge?

2009-02-04 Thread Simon Riggs
On Wed, 2009-02-04 at 10:55 +, Greg Stark wrote: Is this basically the same as our current algorithm but without multiplexing the tapes onto single files? I have been wondering whether we multiplex the tapes any better than filesystems can lay out separate files actually. I don't think

Re: [HACKERS] polyphase merge?

2009-02-04 Thread Tom Lane
Greg Stark st...@enterprisedb.com writes: Is this basically the same as our current algorithm but without multiplexing the tapes onto single files? I have been wondering whether we multiplex the tapes any better than filesystems can lay out separate files actually. The reason for the

Re: [HACKERS] polyphase merge?

2009-02-04 Thread Tom Lane
Don Marvick donmarv...@gmail.com writes: Since nobody replied, I would give it a try. I am going to implement the merge pattern described in Knuth Page 365 (5.4.9), essentially it is as follow: - create initial runs using replacement selection (basically this is as in the current

Re: [HACKERS] polyphase merge?

2009-02-04 Thread Gregory Stark
Simon Riggs si...@2ndquadrant.com writes: On Wed, 2009-02-04 at 10:55 +, Greg Stark wrote: Is this basically the same as our current algorithm but without multiplexing the tapes onto single files? I have been wondering whether we multiplex the tapes any better than filesystems can lay out

Re: [HACKERS] polyphase merge?

2009-02-04 Thread Greg Stark
Is this basically the same as our current algorithm but without multiplexing the tapes onto single files? I have been wondering whether we multiplex the tapes any better than filesystems can lay out separate files actually. Or is there something else that's different here? -- greg -- Sent via

Re: [HACKERS] polyphase merge?

2009-02-04 Thread Don Marvick
Good point. I would note this issue. Thanks for the advice :). Regards, Don On Wed, Feb 4, 2009 at 6:09 PM, Simon Riggs si...@2ndquadrant.com wrote: On Wed, 2009-02-04 at 14:42 +0800, Don Marvick wrote: 4. any other issue needs consideration? Most attempts to improve sorting further

Re: [HACKERS] polyphase merge?

2009-02-04 Thread Don Marvick
This is what I understand from existing implementation: - fix the merge fan in F, and number of tapes=F+1 - for each sorted run generated, we have a formula to determine which tape it belongs to - the sorted run is added to the end of the tape - all sorted runs have been added to tape. There is

Re: [HACKERS] polyphase merge?

2009-02-04 Thread Don Marvick
Hmm probably it's worth to try this. Has anybody try this before? If not, I am interested to give it a try. Regards, Don On Wed, Feb 4, 2009 at 11:25 PM, Gregory Stark st...@enterprisedb.comwrote: Simon Riggs si...@2ndquadrant.com writes: On Wed, 2009-02-04 at 10:55 +, Greg Stark

Re: [HACKERS] polyphase merge?

2009-02-03 Thread Don Marvick
Dear All, Since nobody replied, I would give it a try. I am going to implement the merge pattern described in Knuth Page 365 (5.4.9), essentially it is as follow: - create initial runs using replacement selection (basically this is as in the current implementation) - add enough dummy runs of size

[HACKERS] polyphase merge?

2009-01-29 Thread Don Marvick
Dear All, I apologize if this has been discussed before. I have tried to search to the mailing list and could not find an exact answer. Currently, Postgres uses Knuth's Algorithm 5.4.2D to perform external merge sort. IMHO the algorithm is designed for tapes. Why don't the simple text book

Re: [HACKERS] Polyphase Merge

2008-01-22 Thread Sam Mason
On Mon, Jan 21, 2008 at 04:13:32PM -0500, Tom Lane wrote: Sam Mason [EMAIL PROTECTED] writes: It's really up to you to find answers to these questions, especially the first one. Once you've designed an efficient algorithm then the second point (which I'm interpreting as how you'd go about

Re: [HACKERS] Polyphase Merge

2008-01-22 Thread mac_man2005
-- From: Tom Lane [EMAIL PROTECTED] Sent: Monday, January 21, 2008 10:13 PM To: Sam Mason [EMAIL PROTECTED] Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Polyphase Merge I agree --- having to read the run back from external storage

[HACKERS] Polyphase Merge

2008-01-21 Thread mac_man2005
I'm trying to refine the sorting module of tuplesort.c During run creations I use two heaps instead of just one (yeah, it's still me... the one of the two heaps still trying to get some answer/help from -hackers) Those two runs are built in a way such that if we would concatenate one of them

Re: [HACKERS] Polyphase Merge

2008-01-21 Thread Sam Mason
On Mon, Jan 21, 2008 at 07:42:24PM +0100, [EMAIL PROTECTED] wrote: During run creations I use two heaps instead of just one (yeah, it's still me... the one of the two heaps still trying to get some answer/help from -hackers) Hi again! ISSUES a) how to distribute logical runs (that is both

Re: [HACKERS] Polyphase Merge

2008-01-21 Thread Tom Lane
Sam Mason [EMAIL PROTECTED] writes: It's really up to you to find answers to these questions, especially the first one. Once you've designed an efficient algorithm then the second point (which I'm interpreting as how you'd go about changing tuplestore(?) so that things can be read in reverse