Re: [HACKERS] [PATCH] Incremental sort
On Tue, Oct 3, 2017 at 2:52 PM, Robert Haas wrote: > On Mon, Oct 2, 2017 at 12:37 PM, Alexander Korotkov > wrote: > > I've applied patch on top of c12d570f and rerun the same benchmarks. > > CSV-file with results is attached. There is no dramatical changes. > There > > is still minority of performance regression cases while majority of cases > > has improvement. > > Yes, I think these results look pretty good. But are these times in > seconds? You might need to do some testing with bigger sorts. Good point. I'll rerun benchmarks with larger dataset size. -- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] [PATCH] Incremental sort
On Mon, Oct 2, 2017 at 12:37 PM, Alexander Korotkov wrote: > I've applied patch on top of c12d570f and rerun the same benchmarks. > CSV-file with results is attached. There is no dramatical changes. There > is still minority of performance regression cases while majority of cases > has improvement. Yes, I think these results look pretty good. But are these times in seconds? You might need to do some testing with bigger sorts. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Incremental sort
On Sat, Sep 16, 2017 at 2:46 AM, Alexander Korotkov < a.korot...@postgrespro.ru> wrote: > On Thu, Sep 14, 2017 at 2:48 AM, Alexander Korotkov < > a.korot...@postgrespro.ru> wrote: > >> Patch rebased to current master is attached. I'm going to improve my >> testing script and post new results. >> > > New benchmarking script and results are attached. There new dataset > parameter is introduced: skew factor. Skew factor defines skew in > distribution of groups sizes. > My idea of generating is just usage of power function where power is from > 0 to 1. Following formula is used to get group number for particular item > number i. > [((i / number_of_indexes) ^ power) * number_of_groups] > For example, power = 1/6 gives following distribution of groups sizes: > group numbergroup size > 0 2 > 1 63 > 2 665 > 3 3367 > 4 11529 > 5 31031 > 6 70993 > 7 144495 > 8 269297 > 9 468558 > > For convenience, instead of power itself, I use skew factor where power = > 1.0 / (1.0 + skew). Therefore, with skew = 0.0, distribution of groups > sizes is uniform. Larger skew gives more skewed distribution (and that > seems to be quite intuitive). For, negative skew, group sizes are mirrored > as for corresponding positive skew. For example, skew factor = -5.0 gives > following groups sizes distribution: > group numbergroup size > 0 468558 > 1 269297 > 2 144495 > 3 70993 > 4 31031 > 5 11529 > 6 3367 > 7 665 > 8 63 > 9 2 > > Results shows that between 2172 test cases, in 2113 incremental sort gives > speedup while in 59 it causes slowdown. The following 4 test cases show > most significant slowdown (>10% of time). > > Table GroupedCols GroupCount Skew PreorderedFrac > FullSortMedian IncSortMedian TimeChangePercent > int4|int4|numeric 1 100 -10 0 > 1.5688240528 2.0607631207 31.36 > text|int8|text|int4 110 0 > 1.7785198689 <(778)%20519-8689> 2.1816160679 22.66 > int8|int8|int4 1 10 -10 0 > 1.136412859 1.3166360855 15.86 > numeric|text|int4|int8 2 10 -10 1 > 0.4403841496 0.5070910454 15.15 > > As you can see, 3 of this 4 test cases have skewed distribution while one > of them is related to costly location-aware comparison of text. I've no > particular idea of how to cope these slowdowns. Probably, it's OK to have > slowdown in some cases while have speedup in majority of cases (assuming > there is an option to turn off new behavior). Probably, we should teach > optimizer more about skewed distributions of groups, but that doesn't seem > feasible for me. > > Any thoughts? > BTW, replacement selection sort was removed by 8b304b8b. I think it worth to rerun benchmarks after that, because results might be changed. Will do. -- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] [PATCH] Incremental sort
On Fri, May 5, 2017 at 11:13 AM, Alexander Korotkov wrote: > Incremental sort is faster in vast majority of cases. It appears to be > slower only when whose dataset is one sort group. In this case incremental > sort is useless, and it should be considered as misuse of incremental sort. > Slowdown is related to the fact that we anyway have to do extra comparisons, > unless we somehow push our comparison result into qsort itself and save some > cpu cycles (but that would be unreasonable break of encapsulation). Thus, > in such cases regression seems to be inevitable anyway. I think we could > evade this regression during query planning. If we see that there would be > only few groups, we should choose plain sort instead of incremental sort. I'm sorry that I don't have time to review this in detail right now, but it sounds like you are doing good work to file down cases where this might cause regressions, which is great. Regarding the point in the paragraph above, I'd say that it's OK for the planner to be responsible for picking between Sort and Incremental Sort in some way. It is, after all, the planner's job to decide between different strategies for executing the same query and, of course, sometimes it will be wrong, but that's OK as long as it's not wrong too often (or by too much, hopefully). It may be a little difficult to get this right, though, because I'm not sure that the information you need actually exists (or is reliable). For example, consider the case where we need to sort 100m rows and there are 2 groups. If 1 group contains 1 row and the other group contains all of the rest, there is really no point in an incremental sort. On the other hand, if each group contains 50m rows and we can get the data presorted by the grouping column, there might be a lot of point to an incremental sort, because two 50m-row sorts might be a lot cheaper than one 100m sort. More generally, it's quite easy to imagine situations where the individual groups can be quicksorted but sorting all of the rows requires I/O, even when the number of groups isn't that big. On the other hand, the real sweet spot for this is probably the case where the number of groups is very large, with many single-row groups or many groups with just a few rows each, so if we can at least get this to work in those cases that may be good enough. On the third hand, when costing aggregation, I think we often underestimate the number of groups and there might well be similar problems here. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Incremental sort
On Thu, Apr 27, 2017 at 5:06 PM, Robert Haas wrote: > On Wed, Apr 26, 2017 at 11:39 AM, Alexander Korotkov > wrote: > > But I'd like to make incremental sort not slower than quicksort in case > of > > presorted data. New idea about it comes to my mind. Since cause of > > incremental sort slowness in this case is too frequent reset of > tuplesort, > > then what if we would artificially put data in larger groups. Attached > > revision of patch implements this: it doesn't stop to accumulate tuples > to > > tuplesort until we have MIN_GROUP_SIZE tuples. > > > > Now, incremental sort is not slower than quicksort. And this seems to be > > cool. > > However, in the LIMIT case we will pay the price of fetching some extra > > tuples from outer node. But, that doesn't seem to hurt us too much. > > > > Any thoughts? > > Nice idea. Cool. Than I'm going to make a set of synthetic performance tests in order to ensure that there is no regression. -- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] [PATCH] Incremental sort
On Wed, Apr 26, 2017 at 11:39 AM, Alexander Korotkov wrote: > But I'd like to make incremental sort not slower than quicksort in case of > presorted data. New idea about it comes to my mind. Since cause of > incremental sort slowness in this case is too frequent reset of tuplesort, > then what if we would artificially put data in larger groups. Attached > revision of patch implements this: it doesn't stop to accumulate tuples to > tuplesort until we have MIN_GROUP_SIZE tuples. > > Now, incremental sort is not slower than quicksort. And this seems to be > cool. > However, in the LIMIT case we will pay the price of fetching some extra > tuples from outer node. But, that doesn't seem to hurt us too much. > > Any thoughts? Nice idea. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Incremental sort
On Wed, Apr 26, 2017 at 8:20 PM, Peter Geoghegan wrote: > On Wed, Apr 26, 2017 at 10:10 AM, Alexander Korotkov > wrote: > > OK, I get it. Our qsort is so fast not only on 100% presorted case. > > However, that doesn't change many things in context of incremental sort. > > The important point is to make any presorted test case only ~99% > presorted, so as to not give too much credit to the "high risk" > presort check optimization. > > The switch to insertion sort that we left in (not the bad one removed > by a3f0b3d -- the insertion sort that actually comes from the B&M > paper) does "legitimately" make sorting faster with presorted cases. I'm still focusing on making incremental sort not slower than qsort with presorted optimization. Independently on whether this is "high risk" optimization or not... However, adding more test cases is always good. -- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] [PATCH] Incremental sort
On Wed, Apr 26, 2017 at 10:10 AM, Alexander Korotkov wrote: > OK, I get it. Our qsort is so fast not only on 100% presorted case. > However, that doesn't change many things in context of incremental sort. The important point is to make any presorted test case only ~99% presorted, so as to not give too much credit to the "high risk" presort check optimization. The switch to insertion sort that we left in (not the bad one removed by a3f0b3d -- the insertion sort that actually comes from the B&M paper) does "legitimately" make sorting faster with presorted cases. -- Peter Geoghegan VMware vCenter Server https://www.vmware.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Incremental sort
On Wed, Apr 26, 2017 at 7:56 PM, Peter Geoghegan wrote: > On Wed, Apr 26, 2017 at 8:39 AM, Alexander Korotkov > wrote: > > That appears to be wrong. I intended to make cost_sort prefer plain sort > > over incremental sort for this dataset size. But, that appears to be not > > always right solution. Quick sort is so fast only on presorted data. > > As you may know, I've often said that the precheck for sorted input > added to our quicksort implementation by a3f0b3d is misguided. It > sometimes throws away a ton of work if the presorted input isn't > *perfectly* presorted. This happens when the first out of order tuple > is towards the end of the presorted input. > > I think that it isn't fair to credit our qsort with doing so well on a > 100% presorted case, because it doesn't do the necessary bookkeeping > to not throw that work away completely in certain important cases. > OK, I get it. Our qsort is so fast not only on 100% presorted case. However, that doesn't change many things in context of incremental sort. -- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] [PATCH] Incremental sort
On Wed, Apr 26, 2017 at 8:39 AM, Alexander Korotkov wrote: > That appears to be wrong. I intended to make cost_sort prefer plain sort > over incremental sort for this dataset size. But, that appears to be not > always right solution. Quick sort is so fast only on presorted data. As you may know, I've often said that the precheck for sorted input added to our quicksort implementation by a3f0b3d is misguided. It sometimes throws away a ton of work if the presorted input isn't *perfectly* presorted. This happens when the first out of order tuple is towards the end of the presorted input. I think that it isn't fair to credit our qsort with doing so well on a 100% presorted case, because it doesn't do the necessary bookkeeping to not throw that work away completely in certain important cases. -- Peter Geoghegan VMware vCenter Server https://www.vmware.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Incremental sort
On 2017-04-03 22:18:21 -0400, Robert Haas wrote: > On Mon, Apr 3, 2017 at 5:09 PM, Andres Freund wrote: > > To me this hasn't gotten even remotely enough performance evaluation. > > And I don't think it's fair to characterize it as pending since 2013, > > given it was essentially "waiting on author" for most of that. > > This is undeniably a patch which has been kicking around for a lot of > time without getting a lot of attention, and if it just keeps getting > punted down the road, it's never going to become committable. Indeed, it's old. And it hasn't gotten enough timely feedback. But I don't think the wait time can meaningfully be measured by subtracting two dates: The first version of the patch, as a PoC, has been posted 2013-12-14, which then got a good amount of feedback & revisions, and then stalled till 2014-07-12. There a few back-and forths yielded a new version. >From 2014-09-15 till 2015-10-16 the patch stalled, waiting on its author. That version had open todos ([1]), as had the version from 2016-03-13 [2], which weren't addressed 2016-03-30 - unfortunately that was pretty much when the tree was frozen. 2016-09-13 a rebased patch was sent, some minor points were raised 2016-10-02 (unaddressed), a larger review was done 2016-12-01 ([5]), unaddressed till 2017-02-18. At that point we're in this thread. There's obviously some long waiting-on-author periods in there. And some long needs-review periods. > Alexander's questions upthread about what decisions the committer who > took an interest (Heikki) would prefer never really got an answer, for > example. I don't deny that there may be some work left to do here, > but I think blaming the author for a week's delay when this has been > ignored so often for so long is unfair. I'm not trying to blame Alexander for a week's worth of delay, at all. It's just that, well, we're past the original code-freeze date, three days before the "final" code freeze. I don't think fairness is something we can achieve at this point :(. Given the risk of regressions - demonstrated in this thread although partially adressed - and the very limited amount of benchmarking done, it seems unlikely that this is going to be merged. Regards, Andres [1] http://archives.postgresql.org/message-id/CAPpHfdvhwMsG69exCRUGK3ms-ng0PSPcucH5FU6tAaM-qL-1%2Bw%40mail.gmail.com [2] http://archives.postgresql.org/message-id/CAPpHfdvzjYGLTyA-8ib8UYnKLPrewd9Z%3DT4YJNCRWiHWHHweWw%40mail.gmail.com [3] http://archives.postgresql.org/message-id/CAPpHfdtCcHZ-mLWzsFrRCvHpV1LPSaOGooMZ3sa40AkwR=7...@mail.gmail.com [4] http://archives.postgresql.org/message-id/capphfdvj1tdi2wa64zbbp5-yg-uzarxzk3k7j7zt-crx6ys...@mail.gmail.com [5] http://archives.postgresql.org/message-id/CA+TgmoZapyHRm7NVyuyZ+yAV=u1a070bogre7pkgyraegr4...@mail.gmail.com [6] http://archives.postgresql.org/message-id/CAPpHfds1waRZ=NOmueYq0sx1ZSCnt+5QJvizT8ndT2=etze...@mail.gmail.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Incremental sort
On Mon, Apr 3, 2017 at 5:09 PM, Andres Freund wrote: > To me this hasn't gotten even remotely enough performance evaluation. > And I don't think it's fair to characterize it as pending since 2013, > given it was essentially "waiting on author" for most of that. This is undeniably a patch which has been kicking around for a lot of time without getting a lot of attention, and if it just keeps getting punted down the road, it's never going to become committable. Alexander's questions upthread about what decisions the committer who took an interest (Heikki) would prefer never really got an answer, for example. I don't deny that there may be some work left to do here, but I think blaming the author for a week's delay when this has been ignored so often for so long is unfair. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Incremental sort
On Tue, Apr 4, 2017 at 12:09 AM, Andres Freund wrote: > On 2017-04-04 00:04:09 +0300, Alexander Korotkov wrote: > > > >Thank you! > > > >I already sent version of patch after David's reminder. > > > >Please find rebased patch in the attachment. > > > > > > Cool. I think that's still a bit late for v10? > > > > > > > I don't know. ISTM, that I addressed all the issues raised by reviewers. > > Also, this patch is pending since late 2013. It would be very nice to > > finally get it in... > > To me this hasn't gotten even remotely enough performance evaluation. > I'm ready to put my efforts on that. > And I don't think it's fair to characterize it as pending since 2013, > Probably, this duration isn't good characteristic at all. > given it was essentially "waiting on author" for most of that. > What makes you think so? Do you have some statistics? Or is it just random assumption? -- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] [PATCH] Incremental sort
Hi, On 2017-04-04 00:04:09 +0300, Alexander Korotkov wrote: > > >Thank you! > > >I already sent version of patch after David's reminder. > > >Please find rebased patch in the attachment. > > > > Cool. I think that's still a bit late for v10? > > > > I don't know. ISTM, that I addressed all the issues raised by reviewers. > Also, this patch is pending since late 2013. It would be very nice to > finally get it in... To me this hasn't gotten even remotely enough performance evaluation. And I don't think it's fair to characterize it as pending since 2013, given it was essentially "waiting on author" for most of that. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Incremental sort
On Mon, Apr 3, 2017 at 10:05 PM, Andres Freund wrote: > On April 3, 2017 12:03:56 PM PDT, Alexander Korotkov < > a.korot...@postgrespro.ru> wrote: > >On Mon, Apr 3, 2017 at 9:34 PM, Andres Freund > >wrote: > > > >> On 2017-03-29 00:17:02 +0300, Alexander Korotkov wrote: > >> > On Tue, Mar 28, 2017 at 5:27 PM, David Steele > >> wrote: > >> > > On 3/20/17 10:19 AM, Heikki Linnakangas wrote: > >> > > > >> > >> On 03/20/2017 11:33 AM, Alexander Korotkov wrote: > >> > >> > >> > >>> Please, find rebased patch in the attachment. > >> > >>> > >> > >> > >> > >> I had a quick look at this. > >> > >> > >> > > > >> > > <...> > >> > > > >> > > According to 'perf', 85% of the CPU time is spent in > >ExecCopySlot(). To > >> > >> alleviate that, it might be worthwhile to add a special case for > >when > >> > >> the group contains exactly one group, and not put the tuple to > >the > >> > >> tuplesort in that case. Or if we cannot ensure that the > >Incremental > >> Sort > >> > >> is actually faster, the cost model should probably be smarter, > >to > >> avoid > >> > >> picking an incremental sort when it's not a win. > >> > >> > >> > > > >> > > This thread has been idle for over a week. Please respond with a > >new > >> > > patch by 2017-03-30 00:00 AoE (UTC-12) or this submission will be > >> marked > >> > > "Returned with Feedback". > >> > >> > Thank you for reminder! > >> > >> I've just done so. Please resubmit once updated, it's a cool > >feature. > >> > > > >Thank you! > >I already sent version of patch after David's reminder. > >Please find rebased patch in the attachment. > > Cool. I think that's still a bit late for v10? > I don't know. ISTM, that I addressed all the issues raised by reviewers. Also, this patch is pending since late 2013. It would be very nice to finally get it in... -- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] [PATCH] Incremental sort
On April 3, 2017 12:03:56 PM PDT, Alexander Korotkov wrote: >On Mon, Apr 3, 2017 at 9:34 PM, Andres Freund >wrote: > >> On 2017-03-29 00:17:02 +0300, Alexander Korotkov wrote: >> > On Tue, Mar 28, 2017 at 5:27 PM, David Steele >> wrote: >> > > On 3/20/17 10:19 AM, Heikki Linnakangas wrote: >> > > >> > >> On 03/20/2017 11:33 AM, Alexander Korotkov wrote: >> > >> >> > >>> Please, find rebased patch in the attachment. >> > >>> >> > >> >> > >> I had a quick look at this. >> > >> >> > > >> > > <...> >> > > >> > > According to 'perf', 85% of the CPU time is spent in >ExecCopySlot(). To >> > >> alleviate that, it might be worthwhile to add a special case for >when >> > >> the group contains exactly one group, and not put the tuple to >the >> > >> tuplesort in that case. Or if we cannot ensure that the >Incremental >> Sort >> > >> is actually faster, the cost model should probably be smarter, >to >> avoid >> > >> picking an incremental sort when it's not a win. >> > >> >> > > >> > > This thread has been idle for over a week. Please respond with a >new >> > > patch by 2017-03-30 00:00 AoE (UTC-12) or this submission will be >> marked >> > > "Returned with Feedback". >> >> > Thank you for reminder! >> >> I've just done so. Please resubmit once updated, it's a cool >feature. >> > >Thank you! >I already sent version of patch after David's reminder. >Please find rebased patch in the attachment. Cool. I think that's still a bit late for v10? Andres -- Sent from my Android device with K-9 Mail. Please excuse my brevity. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Incremental sort
On 2017-03-29 00:17:02 +0300, Alexander Korotkov wrote: > On Tue, Mar 28, 2017 at 5:27 PM, David Steele wrote: > > > Hi Alexander, > > > > On 3/20/17 10:19 AM, Heikki Linnakangas wrote: > > > >> On 03/20/2017 11:33 AM, Alexander Korotkov wrote: > >> > >>> Please, find rebased patch in the attachment. > >>> > >> > >> I had a quick look at this. > >> > > > > <...> > > > > According to 'perf', 85% of the CPU time is spent in ExecCopySlot(). To > >> alleviate that, it might be worthwhile to add a special case for when > >> the group contains exactly one group, and not put the tuple to the > >> tuplesort in that case. Or if we cannot ensure that the Incremental Sort > >> is actually faster, the cost model should probably be smarter, to avoid > >> picking an incremental sort when it's not a win. > >> > > > > This thread has been idle for over a week. Please respond with a new > > patch by 2017-03-30 00:00 AoE (UTC-12) or this submission will be marked > > "Returned with Feedback". > Thank you for reminder! I've just done so. Please resubmit once updated, it's a cool feature. - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Incremental sort
On Tue, Mar 28, 2017 at 5:27 PM, David Steele wrote: > Hi Alexander, > > On 3/20/17 10:19 AM, Heikki Linnakangas wrote: > >> On 03/20/2017 11:33 AM, Alexander Korotkov wrote: >> >>> Please, find rebased patch in the attachment. >>> >> >> I had a quick look at this. >> > > <...> > > According to 'perf', 85% of the CPU time is spent in ExecCopySlot(). To >> alleviate that, it might be worthwhile to add a special case for when >> the group contains exactly one group, and not put the tuple to the >> tuplesort in that case. Or if we cannot ensure that the Incremental Sort >> is actually faster, the cost model should probably be smarter, to avoid >> picking an incremental sort when it's not a win. >> > > This thread has been idle for over a week. Please respond with a new > patch by 2017-03-30 00:00 AoE (UTC-12) or this submission will be marked > "Returned with Feedback". Thank you for reminder! -- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] [PATCH] Incremental sort
Hi Alexander, On 3/20/17 10:19 AM, Heikki Linnakangas wrote: On 03/20/2017 11:33 AM, Alexander Korotkov wrote: Please, find rebased patch in the attachment. I had a quick look at this. <...> According to 'perf', 85% of the CPU time is spent in ExecCopySlot(). To alleviate that, it might be worthwhile to add a special case for when the group contains exactly one group, and not put the tuple to the tuplesort in that case. Or if we cannot ensure that the Incremental Sort is actually faster, the cost model should probably be smarter, to avoid picking an incremental sort when it's not a win. This thread has been idle for over a week. Please respond with a new patch by 2017-03-30 00:00 AoE (UTC-12) or this submission will be marked "Returned with Feedback". -- -David da...@pgmasters.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Incremental sort
On 03/20/2017 11:33 AM, Alexander Korotkov wrote: Please, find rebased patch in the attachment. I had a quick look at this. * I'd love to have an explanation of what an Incremental Sort is, in the file header comment for nodeIncrementalSort.c. * I didn't understand the maxMem stuff in tuplesort.c. The comments there use the phrase "on-disk memory", which seems like an oxymoron. Also, "maximum status" seems weird, as it assumes that there's a natural order to the states. * In the below example, the incremental sort is significantly slower than the Seq Scan + Sort you get otherwise: create table foo (a int4, b int4, c int4); insert into sorttest select g, g, g from generate_series(1, 100) g; vacuum foo; create index i_sorttest on sorttest (a, b, c); set work_mem='100MB'; postgres=# explain select count(*) from (select * from sorttest order by a, c) as t; QUERY PLAN --- Aggregate (cost=138655.68..138655.69 rows=1 width=8) -> Incremental Sort (cost=610.99..124870.38 rows=1102824 width=12) Sort Key: sorttest.a, sorttest.c Presorted Key: sorttest.a -> Index Only Scan using i_sorttest on sorttest (cost=0.43..53578.79 rows=1102824 width=12) (5 rows) Time: 0.409 ms postgres=# select count(*) from (select * from sorttest order by a, c) as t; count - 100 (1 row) Time: 387.091 ms postgres=# explain select count(*) from (select * from sorttest order by a, c) as t; QUERY PLAN --- Aggregate (cost=130063.84..130063.85 rows=1 width=8) -> Sort (cost=115063.84..117563.84 rows=100 width=12) Sort Key: sorttest.a, sorttest.c -> Seq Scan on sorttest (cost=0.00..15406.00 rows=100 width=12) (4 rows) Time: 0.345 ms postgres=# select count(*) from (select * from sorttest order by a, c) as t; count - 100 (1 row) Time: 231.668 ms According to 'perf', 85% of the CPU time is spent in ExecCopySlot(). To alleviate that, it might be worthwhile to add a special case for when the group contains exactly one group, and not put the tuple to the tuplesort in that case. Or if we cannot ensure that the Incremental Sort is actually faster, the cost model should probably be smarter, to avoid picking an incremental sort when it's not a win. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Incremental sort (was: PoC: Partial sort)
On Mon, Feb 27, 2017 at 8:29 PM, Alexander Korotkov wrote: This patch needs to be rebased. 1. It fails while applying as below patching file src/test/regress/expected/sysviews.out Hunk #1 FAILED at 70. 1 out of 1 hunk FAILED -- saving rejects to file src/test/regress/expected/sysviews.out.rej patching file src/test/regress/sql/inherit.sql 2. Also, there are compilation errors due to new commits. -fwrapv -fexcess-precision=standard -O2 -I../../../../src/include -D_GNU_SOURCE -c -o createplan.o createplan.c createplan.c: In function ‘create_gather_merge_plan’: createplan.c:1510:11: warning: passing argument 3 of ‘make_sort’ makes integer from pointer without a cast [enabled by default] gm_plan->nullsFirst); ^ createplan.c:235:14: note: expected ‘int’ but argument is of type ‘AttrNumber *’ static Sort *make_sort(Plan *lefttree, int numCols, int skipCols, ^ createplan.c:1510:11: warning: passing argument 4 of ‘make_sort’ from incompatible pointer type [enabled by default] gm_plan->nullsFirst); -- Thanks and Regards Mithun C Y EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Incremental sort (was: PoC: Partial sort)
On Sat, Feb 18, 2017 at 4:01 PM, Alexander Korotkov wrote: > I decided to start new thread for this patch for following two reasons. > * It's renamed from "Partial sort" to "Incremental sort" per suggestion by > Robert Haas [1]. New name much better characterizes the essence of > algorithm. > * I think it's not PoC anymore. Patch received several rounds of review > and now it's in the pretty good shape. > > Attached revision of patch has following changes. > * According to review [1], two new path and plan nodes are responsible for > incremental sort: IncSortPath and IncSort which are inherited from SortPath > and Sort correspondingly. That allowed to get rid of set of hacks with > minimal code changes. > * According to review [1] and comment [2], previous tuple is stored in > standalone tuple slot of SortState rather than just HeapTuple. > * New GUC parameter enable_incsort is introduced to control planner ability > to choose incremental sort. > * Test of postgres_fdw with not pushed down cross join is corrected. It > appeared that with incremental sort such query is profitable to push down. > I changed ORDER BY columns so that index couldn't be used. I think this > solution is more elegant than setting enable_incsort = off. I usually advocate for spelling things out instead of abbreviating, so I guess I'll stay true to form here and suggest that abbreviating incremental to inc doesn't seem like a great idea. Is that sort incrementing, incremental, incredible, incautious, or incorporated? The first hunk in the patch, a change in the postgres_fdw regression test output, looks an awful lot like a bug: now the query that formerly returned various different numbers is returning all zeroes. It might not actually be a bug, because you've also changed the test query (not sure why), but anyway the new regression test output that is all zeroes seems less useful for catching bugs in, say, the ordering of the results than the old output where the different rows were different. I don't know of any existing cases where the same executor file is responsible for executing more than 1 different type of executor node. I was imagining a more-complete separation of the new executor node. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] [PATCH] Incremental sort (was: PoC: Partial sort)
Hi all! I decided to start new thread for this patch for following two reasons. * It's renamed from "Partial sort" to "Incremental sort" per suggestion by Robert Haas [1]. New name much better characterizes the essence of algorithm. * I think it's not PoC anymore. Patch received several rounds of review and now it's in the pretty good shape. Attached revision of patch has following changes. * According to review [1], two new path and plan nodes are responsible for incremental sort: IncSortPath and IncSort which are inherited from SortPath and Sort correspondingly. That allowed to get rid of set of hacks with minimal code changes. * According to review [1] and comment [2], previous tuple is stored in standalone tuple slot of SortState rather than just HeapTuple. * New GUC parameter enable_incsort is introduced to control planner ability to choose incremental sort. * Test of postgres_fdw with not pushed down cross join is corrected. It appeared that with incremental sort such query is profitable to push down. I changed ORDER BY columns so that index couldn't be used. I think this solution is more elegant than setting enable_incsort = off. Also patch has set of assorted code and comments improvements. Links 1. https://www.postgresql.org/message-id/CA+TgmoZapyHRm7NVyuyZ+yAV=u1a070bogre7pkgyraegr4...@mail.gmail.com 2. https://www.postgresql.org/message-id/cam3swzql4yd2sndhemcgl0q2b2otdkuvv_l6zg_fcgoluwm...@mail.gmail.com -- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company incremental-sort-1.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers