Re: [HACKERS] Replacement Selection
in puttuple_common(), the transition from an internal to external sort is performed at the bottom of the TSS_INITIAL case in the main switch statement. The transition? Do we internal sort somewhere else and then external sort here in tuplesort.c? The function dumptuples() heapifies the in-core tuples (divides the in-core tuples into initial runs and then advances the state to TSS_BUILDRUNS). Cannot see where dumptuples() advances the state to TSS_BUILDRUNS. I expected something like state-status = TSS_BUILDRUNS; executed through dumptuples() I recommend you run the code in the debugger on a external-sorting query: watch two or three tuples go into the heap and you'll get the idea. The top of the heap is at state-memtuples[0] the heap goes down from there. New tuples are added there and the heap is adjusted (Using the tuplesort_heap_siftup() function). -Tim ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Replacement Selection
[EMAIL PROTECTED] writes: The function dumptuples() heapifies the in-core tuples (divides the in-core tuples into initial runs and then advances the state to TSS_BUILDRUNS). Cannot see where dumptuples() advances the state to TSS_BUILDRUNS. I expected something like state-status = TSS_BUILDRUNS; executed through dumptuples() There's only one state-status = TSS_BUILDRUNS in the whole file. It's called by inittapes which is called in one place, just before dumptuples. Seriously, please try a bit harder before giving up. The code in this file is quite interdependent which means you'll have to read through the whole file (except perhaps the last section which just contains the interface functions to feed different types of datums or tuples) to understand any of it. But it's quite self-contained which makes it one of the easier modules in the system to get a functional grasp of. The hard part is understanding the algorithm itself and working out the details of the array management. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training! ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Replacement Selection
Hi to all. It seems a previous mail of mine with following body hasn't been sent. Sorry for possibly getting it twice. Actually I have now modified that body, so it's worth to read it once again. Thanks for your attention. Regards. PREVIOUS MAIL-- Well, the refinements are the followings: Using 2 heaps instead of just one: one heap creating a descending run and the other one creating an ascending run. Both associated to the same logical run. Suppose we want the input elements to be finally sorted in an ascending order. To do this we could QuickSort the first M initialization elements into RAM and then divide it into 2 parts. Suppose the first heap creates the following run: 10 9 8 And suppose the second heap creates the following run: 3 5 7 Those two runs can be seen as just one by mergesort... since they could be physically merged into one single run: at first we could write the elements 3,5,7 and then the elements of the other run, red upside down. Possible advantages: Having two heaps of that kinds lets RS better adapt to local variations of the input trend. This technique can be called Two Ways Replacement Selection (2WRS) just because of those 2 heaps. As an extreme example, we can say that having the input already sort in reverse order no more leads us to the worst case: with 2WRS no matter the input is already sort in ascending/descending order... in this case we'll produce just one run instead of producing the maximum number of runs as in RS worst case (input in reverse order). Moreover it lets us to grow the current run in 2 ways: just imagine we would output runs in a regular file. With 2WRS this could be seen as start outputting elements from the middle of such a regular file, the descending heap outputting elements from the middle upwards while the ascending one outputting from the middle downward. This could imply getting a smaller number of dead records (as I said in previous mails, a dear record is an element that won't form part of the current run) and so having longer runs. Others optimizations, for example, can be done with the virtual concatenation technique: storing a cache of couples (first_element,last_element) for each created run. This could be useful in case we can find 2 couples (first_element_1, last_element_1) and (first_element_2, last_element_2) with last_element_1 = first_element_2. In this case, those runs too can be seen as belonging to the same logical run (actually they are 2 RS different physical runs, or even 4 in 2WRS but can be seen as just one by mergesort). Of course, once those 2 (or 4) runs are logically merged into that only one, this last one in turn could be merged to other runs. What does all that imply? Mergesort would actually consider a smaller number of runs (since it should just work on logical runs). This means less jumps between runs on disk. Now... to test those refinements I should integrate my code into PostgreSQL... but it's not that easy for me... Thanks for your attention. PREVIOUS MAIL-- ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Replacement Selection
On Tue, 2007-11-27 at 09:25 +0100, [EMAIL PROTECTED] wrote: Others optimizations, for example, can be done with the virtual concatenation technique: storing a cache of couples (first_element,last_element) for each created run. This could be useful in case we can find 2 couples (first_element_1, last_element_1) and (first_element_2, last_element_2) with last_element_1 = first_element_2. In this case, those runs too can be seen as belonging to the same logical run (actually they are 2 RS different physical runs, or even 4 in 2WRS but can be seen as just one by mergesort). Of course, once those 2 (or 4) runs are logically merged into that only one, this last one in turn could be merged to other runs. What does all that imply? Mergesort would actually consider a smaller number of runs (since it should just work on logical runs). This means less jumps between runs on disk. That's actually a refinement of an idea I've been working on for optimizing sort. I'll post those separately. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Replacement Selection
Any comment about Two Ways Replacement Selection (two heaps instead of just one) ? -- From: Simon Riggs [EMAIL PROTECTED] Sent: Tuesday, November 27, 2007 1:03 PM To: [EMAIL PROTECTED] Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Replacement Selection On Tue, 2007-11-27 at 09:25 +0100, [EMAIL PROTECTED] wrote: Others optimizations, for example, can be done with the virtual concatenation technique: storing a cache of couples (first_element,last_element) for each created run. This could be useful in case we can find 2 couples (first_element_1, last_element_1) and (first_element_2, last_element_2) with last_element_1 = first_element_2. In this case, those runs too can be seen as belonging to the same logical run (actually they are 2 RS different physical runs, or even 4 in 2WRS but can be seen as just one by mergesort). Of course, once those 2 (or 4) runs are logically merged into that only one, this last one in turn could be merged to other runs. What does all that imply? Mergesort would actually consider a smaller number of runs (since it should just work on logical runs). This means less jumps between runs on disk. That's actually a refinement of an idea I've been working on for optimizing sort. I'll post those separately. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Replacement Selection
On Tue, 2007-11-27 at 17:49 +0100, [EMAIL PROTECTED] wrote: Any comment about Two Ways Replacement Selection (two heaps instead of just one) ? It might allow dynamic heap size management more easily than with a single heap. If you really think it will be better, try it. You'll learn loads, right or wrong. It's difficult to forecast ahead of time what's a good idea and what's a bad idea. The real truth of these things is that you need to pop the hood and start tinkering and its's quite hard to make a plan for that. If you have a bad idea, just move on to the next one; they're just ideas. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
[HACKERS] Replacement Selection
Hi to all. I'm new. I'd like to integrate my code into PostgreSQL. It's the implementation of some refinements of Replacement Selection algorithm used for External Sorting. I have got some issue and preferibly I'd like to be supported by some developers that have something to do with it. Who can I talk to? Thanks for your attentions. Good Luck! Manolo. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Replacement Selection
[EMAIL PROTECTED] wrote: I'm new. I'd like to integrate my code into PostgreSQL. It's the implementation of some refinements of Replacement Selection algorithm used for External Sorting. I have got some issue and preferibly I'd like to be supported by some developers that have something to do with it. Who can I talk to? This mailing list is the right place to discuss that. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Replacement Selection
Thanks for your support. I downloaded the source code of the last stable version of PostgreSQL. Where can I find the part related to the External Sorting algorithm (supposed to be Replacement Selection)? I mean, which is the file to be studied and/or modified and/or substituted? Thanks for your attention. -- From: Heikki Linnakangas [EMAIL PROTECTED] Sent: Monday, November 26, 2007 1:35 PM To: [EMAIL PROTECTED] Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Replacement Selection [EMAIL PROTECTED] wrote: I'm new. I'd like to integrate my code into PostgreSQL. It's the implementation of some refinements of Replacement Selection algorithm used for External Sorting. I have got some issue and preferibly I'd like to be supported by some developers that have something to do with it. Who can I talk to? This mailing list is the right place to discuss that. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Replacement Selection
[EMAIL PROTECTED] wrote: Thanks for your support. I downloaded the source code of the last stable version of PostgreSQL. Where can I find the part related to the External Sorting algorithm (supposed to be Replacement Selection)? I mean, which is the file to be studied and/or modified and/or substituted? src/backend/utils/sort/tuplesort.c -- Alvaro Herrera Developer, http://www.PostgreSQL.org/ I would rather have GNU than GNOT. (ccchips, lwn.net/Articles/37595/) ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Replacement Selection
[EMAIL PROTECTED] wrote: I downloaded the source code of the last stable version of PostgreSQL. Where can I find the part related to the External Sorting algorithm (supposed to be Replacement Selection)? I mean, which is the file to be studied and/or modified and/or substituted? In src/backend/utils/sort/tuplesort.c. The comments at the top of that file is a good place to start. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Replacement Selection
Ok guys! Thanks for your help. Unfortunately I'm lost into the code... any good soul helping me to understand what should be the precise part to be modified? Thanks for your time! -- From: Heikki Linnakangas [EMAIL PROTECTED] Sent: Monday, November 26, 2007 2:34 PM To: [EMAIL PROTECTED] Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Replacement Selection [EMAIL PROTECTED] wrote: I downloaded the source code of the last stable version of PostgreSQL. Where can I find the part related to the External Sorting algorithm (supposed to be Replacement Selection)? I mean, which is the file to be studied and/or modified and/or substituted? In src/backend/utils/sort/tuplesort.c. The comments at the top of that file is a good place to start. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Replacement Selection
[EMAIL PROTECTED] wrote: Unfortunately I'm lost into the code... any good soul helping me to understand what should be the precise part to be modified? You haven't given any details on what you're trying to do. What are you trying to do? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Replacement Selection
[EMAIL PROTECTED] wrote: Ok guys! Thanks for your help. Unfortunately I'm lost into the code... any good soul helping me to understand what should be the precise part to be modified? I think you should print the file and read it several times until you understand what's going on. Then you can start thinking where and how to modify it. -- Alvaro Herrera http://www.amazon.com/gp/registry/DXLWNGRJD34J Oh, great altar of passive entertainment, bestow upon me thy discordant images at such speed as to render linear thought impossible (Calvin a la TV) ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Replacement Selection
Alvaro Herrera [EMAIL PROTECTED] writes: [EMAIL PROTECTED] wrote: Unfortunately I'm lost into the code... any good soul helping me to understand what should be the precise part to be modified? I think you should print the file and read it several times until you understand what's going on. Then you can start thinking where and how to modify it. Also, go find a copy of Knuth volume 3, because a whole lot of the comments assume you've read Knuth's discussion of external sorting. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Fw: [HACKERS] Replacement Selection
Thanks for your advice. The developement of this integration is part of my final project. And fo course my initial bibliografy includes the Knuth reference as you can see 1. Vladimir Estivill-Castro and Derick Wood. A survey of adaptive sorting algorithms. ACM Computing Surveys, 24(4):441{476, 1992. 2. Donald E. Knuth. The art of computer programming, volume 3: sorting and searching. Addison-Wesley, Reading, 2nd edition, 1998. 3. P. Larson and G. Graefe. Memory management during run generation in external sorting. In ACM, editor, SIGMOD98, pages 472{483, 1998. 4. Per-Ake Larson. External sorting: Run formation revisited. IEEE Transactions on Knowledge and Data Engineering, 15(4):961{972, 2003. 5. Je®rey Scott Vitter and David A. Hutchinson. Distribution sort with randomized cycling. pages 77-86. -- From: Tom Lane [EMAIL PROTECTED] Sent: Monday, November 26, 2007 6:00 PM To: Alvaro Herrera [EMAIL PROTECTED] Cc: [EMAIL PROTECTED]; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Replacement Selection Alvaro Herrera [EMAIL PROTECTED] writes: [EMAIL PROTECTED] wrote: Unfortunately I'm lost into the code... any good soul helping me to understand what should be the precise part to be modified? I think you should print the file and read it several times until you understand what's going on. Then you can start thinking where and how to modify it. Also, go find a copy of Knuth volume 3, because a whole lot of the comments assume you've read Knuth's discussion of external sorting. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Replacement Selection
Sorry. I'm trying to integrate my code into PostgreSQL. At the moment I have got my working code, with my own main() etc etc. The code is supposed to perform run generation during external sorting. That's all, my code won't do any mergesort. Just run generation. I'm studing the code and I don't know where to put my code into. Which part I need to substitute and which other are absolutely untouchables. I admit I'm not an excellent programmer. I've always been writing my own codes, simple codes. Now I have got some ideas that can possibly help postgreSQL to get better. And for the first time I'm to integrate code into others code. I say it just to apologize in case some things that could be obvious for someone else, maybe are not for me. Anyway... back to work. My code has the following structure. 1) Generates a random input stream to sort. As for this part, i just generate an integer input stream, not a stream of db records. I talk about stream because I'm in a general case in which the input source can be unknown and we cannot even know how much elements to sort 2)Fill the available memory with the first M elements from stream. They will be arranged into an heap structure. 3) Start run generation. As for this phase, I see PostgreSQL code (as Knuth algorithm) marks elements belonging to runs in otder to know which run they belong to and to know when the current heap has finished building the current run. I don't memorize this kind of info. I just output from heap to run all of the elements going into the current run. The elements supposed to go into the next run (I call them dead records) are still stored into main memory, but as leaves of the heap. This implies reducing the heap size and so heapifying a smaller number of elements each time I get a dead record (it's not necessary to sort dead records). When the heap size is zero a new run is created heapifying all the dead records currently present into main memory. I haven't seen something similar into tuplesort.c, apparently no heapify is called no new run created and stuff like this. Do you see any parallelism between PostgreSQL code with what I said in the previous points? Thanks for your attention. -- From: Heikki Linnakangas [EMAIL PROTECTED] Sent: Monday, November 26, 2007 5:42 PM To: [EMAIL PROTECTED] Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Replacement Selection [EMAIL PROTECTED] wrote: Unfortunately I'm lost into the code... any good soul helping me to understand what should be the precise part to be modified? You haven't given any details on what you're trying to do. What are you trying to do? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Replacement Selection
[EMAIL PROTECTED] writes: 3) Start run generation. As for this phase, I see PostgreSQL code (as Knuth algorithm) marks elements belonging to runs in otder to know which run they belong to and to know when the current heap has finished building the current run. I don't memorize this kind of info. I just output from heap to run all of the elements going into the current run. The elements supposed to go into the next run (I call them dead records) are still stored into main memory, but as leaves of the heap. This implies reducing the heap size and so heapifying a smaller number of elements each time I get a dead record (it's not necessary to sort dead records). When the heap size is zero a new run is created heapifying all the dead records currently present into main memory. Why would this be an improvement over Knuth? AFAICS you can't generate longer runs this way, and it's not saving any time --- in fact it's costing time, because re-heapifying adds a lot of new comparisons. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Replacement Selection
I must precise that it's not the improvement. Other more complex algorithms correspond to the refinements, but at the moment I just want to know which part of PostgreSQL code does what. I also implemented Replacement Selection (RS) so if I'm able to integrate my RS I hope I would be able to integrate the others too. Anyway, even in my RS implementation a longer run is created. The first M initialization elements will surely form part of the current run. M is the memory size so at least a run sized M will be created. After initialization, the elements are not suddenly output, but an element from heap is output into run as soon as I get an element from stream. In other words, for each element from stream, the root element of the heap is output, and the input element takes the root place into the heap. If that element is a good record I just heapify (since the element will be placed at the now free root place). If that input element is a dead record I swap it with the last leaf and reduce the heap size. -- From: Tom Lane [EMAIL PROTECTED] Sent: Monday, November 26, 2007 7:31 PM To: [EMAIL PROTECTED] Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Replacement Selection [EMAIL PROTECTED] writes: 3) Start run generation. As for this phase, I see PostgreSQL code (as Knuth algorithm) marks elements belonging to runs in otder to know which run they belong to and to know when the current heap has finished building the current run. I don't memorize this kind of info. I just output from heap to run all of the elements going into the current run. The elements supposed to go into the next run (I call them dead records) are still stored into main memory, but as leaves of the heap. This implies reducing the heap size and so heapifying a smaller number of elements each time I get a dead record (it's not necessary to sort dead records). When the heap size is zero a new run is created heapifying all the dead records currently present into main memory. Why would this be an improvement over Knuth? AFAICS you can't generate longer runs this way, and it's not saving any time --- in fact it's costing time, because re-heapifying adds a lot of new comparisons. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Replacement Selection
[EMAIL PROTECTED] wrote: I also implemented Replacement Selection (RS) so if I'm able to integrate my RS I hope I would be able to integrate the others too. The existing code implements RS. Tom asked you to describe what improvements you hope to make; I'm confident that he already understands how to implement RS. :-) ** Why don't you compile with TRACE_SORT enabled and watch the log output. The function in tuplesort.c that you should start with is puttuple_common(). in puttuple_common(), the transition from an internal to external sort is performed at the bottom of the TSS_INITIAL case in the main switch statement. The function dumptuples() heapifies the in-core tuples (divides the in-core tuples into initial runs and then advances the state to TSS_BUILDRUNS). All subsequent tuples will hit the TSS_BUILDRUNS case and will insert tuples into the heap; emitting tuples for the current run as it goes. I recommend you run the code in the debugger on a external-sorting query: watch two or three tuples go into the heap and you'll get the idea. The top of the heap is at state-memtuples[0] the heap goes down from there. New tuples are added there and the heap is adjusted (Using the tuplesort_heap_siftup() function). -Tim ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Replacement Selection
[EMAIL PROTECTED] writes: Anyway, even in my RS implementation a longer run is created. The first M initialization elements will surely form part of the current run. M is the memory size so at least a run sized M will be created. After initialization, the elements are not suddenly output, but an element from heap is output into run as soon as I get an element from stream. In other words, for each element from stream, the root element of the heap is output, and the input element takes the root place into the heap. If that element is a good record I just heapify (since the element will be placed at the now free root place). If that input element is a dead record I swap it with the last leaf and reduce the heap size. AFAICS that produces runs that are *exactly* the same length as Knuth's method --- you're just using a different technique for detecting when the run is over, to wit record is not in heap vs record is in heap but with a higher run number. I guess you would save some comparisons while the heap is shrinking, but it's not at all clear that you'd save more than what it will cost you to re-heapify all the dead records once the run is over. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Replacement Selection
Tom Lane [EMAIL PROTECTED] writes: AFAICS that produces runs that are *exactly* the same length as Knuth's method --- you're just using a different technique for detecting when the run is over, to wit record is not in heap vs record is in heap but with a higher run number. I guess you would save some comparisons while the heap is shrinking, but it's not at all clear that you'd save more than what it will cost you to re-heapify all the dead records once the run is over. This sounded familiar... It sounds a lot like what this CVS log message is describing as a mistaken idea: revision 1.2 date: 1999-10-30 18:27:15 +0100; author: tgl; state: Exp; lines: +423 -191; Further performance improvements in sorting: reduce number of comparisons during initial run formation by keeping both current run and next-run tuples in the same heap (yup, Knuth is smarter than I am). And, during merge passes, make use of available sort memory to load multiple tuples from any one input 'tape' at a time, thereby improving locality of access to the temp file. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Replacement Selection
Gregory Stark [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] writes: I guess you would save some comparisons while the heap is shrinking, but it's not at all clear that you'd save more than what it will cost you to re-heapify all the dead records once the run is over. This sounded familiar... It sounds a lot like what this CVS log message is describing as a mistaken idea: Wow, I had forgotten all about that; but yeah this sounds exactly like my first-cut rewrite of PG's sorting back in 1999. I have some vague memory of having dismissed Knuth's approach as being silly because of the extra space and (small number of) cycles needed to compare run numbers in the heap. I hadn't realized that there was an impact on total number of comparisons required :-( The discussion from that time period in pgsql-hackers makes it sound like you need a large test case to notice the problem, though. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly