Re: [HACKERS] gistchoose vs. bloat
On 21.01.2013 15:19, Heikki Linnakangas wrote: On 21.01.2013 15:06, Tom Lane wrote: Jeff Davispg...@j-davis.com writes: On Mon, 2013-01-21 at 00:48 -0500, Tom Lane wrote: I looked at this patch. ISTM we should not have the option at all but just do it always. I cannot believe that always-go-left is ever a preferable strategy in the long run; the resulting imbalance in the index will surely kill any possible benefit. Even if there are some cases where it miraculously fails to lose, how many users are going to realize that applies to their case and make use of the option? Sounds good to me. If I remember correctly, there was also an argument that it may be useful for repeatable test results. That's a little questionable for performance (except in those cases where few penalties are identical anyway), but could plausibly be useful for a crash report or something. Meh. There's already a random decision, in the equivalent place and for a comparable reason, in btree (cf _bt_findinsertloc). Nobody's ever complained about that being indeterminate, so I'm unconvinced that there's a market for it with gist. I wonder if it would work for gist to do something similar to _bt_findinsertloc, and have a bias towards the left page, but sometimes descend to one of the pages to the right. You would get the cache locality of usually descending down the same subtree, but also fill the pages to the right. Jeff / Alexander, want to give that a shot? I did some experimenting with that. I used the same test case Alexander did, with geonames data, and compared unpatched version, the original patch, and the attached patch that biases the first best tuple found, but still sometimes chooses the other equally good ones. testname| initsize | finalsize | idx_blks_read | idx_blks_hit +--+---+---+-- patched-10-4mb | 75497472 | 90202112 | 5853604 | 10178331 unpatched-4mb | 75145216 | 94863360 | 5880676 | 10185647 unpatched-4mb | 75587584 | 97165312 | 5903107 | 10183759 patched-2-4mb | 74768384 | 81403904 | 5768124 | 10193738 origpatch-4mb | 74883072 | 82182144 | 5783412 | 10185373 All these tests were performed with shared_buffers=4MB, so that the index won't fit completely in shared buffers, and I could use the idx_blk_read/hit ratio as a measure of cache-friendliness. The origpath test was with a simplified version of Alexander's patch, see attached. It's the same as the original, but with the randomization-option and gist-build related code removed. The patched-10 and patched-2 tests are two variants with my patch, with 1/10 and 1/2 chance of moving to the next equal tuple, respectively. The differences in cache hit ratio might be just a result of different index sizes. I included two unpatched runs above to show that there's a fair amount of noise in these tests. That's because of the random nature of the test case; it picks rows to delete and insert at random. I think the conclusion is that all of these patches are effective. The 1/10 variant is less effective, as expected, as it's closer in behavior to the unpatched behavior than the others. The 1/2 variant seems as good as the original patch. I performed another test, by inserting 100 duplicate values on an empty table. testname | finalsize | idx_blks_read | idx_blks_hit +---+---+-- unpatched | 89030656 | 21350 | 2972033 patched-10 | 88973312 | 21450 | 2971920 patched-2 | 88481792 | 22947 | 2970260 origpatch | 61186048 |761817 | 2221500 The original patch produces a much smaller index in this test, but since the inserts are distributed randomly, the pages are accessed randomly which is bad for caching. A table full of duplicates isn't very realistic, but overall, I'm leaning towards my version of this patch (gistchoose-2.patch). It has less potential for causing a regression in existing applications, but is just as effective in the original scenario of repeated delete+insert. - Heikki duplicatetest.sh Description: Bourne shell script gistbloat.sh Description: Bourne shell script diff --git a/src/backend/access/gist/gistutil.c b/src/backend/access/gist/gistutil.c index 8b60774..75778f6 100644 --- a/src/backend/access/gist/gistutil.c +++ b/src/backend/access/gist/gistutil.c @@ -379,6 +379,7 @@ gistchoose(Relation r, Page p, IndexTuple it, /* it has compressed entry */ GISTENTRY entry, identry[INDEX_MAX_KEYS]; bool isnull[INDEX_MAX_KEYS]; + int look_further_on_equal = -1; Assert(!GistPageIsLeaf(p)); @@ -446,6 +447,8 @@ gistchoose(Relation r, Page p, IndexTuple it, /* it has compressed entry */ if (j r-rd_att-natts - 1) best_penalty[j + 1] = -1; + +look_further_on_equal = -1; } else if (best_penalty[j] == usize) { @@ -468,12 +471,46 @@
Re: [HACKERS] gistchoose vs. bloat
On 21.01.2013 15:19, Heikki Linnakangas wrote: On 21.01.2013 15:06, Tom Lane wrote: Jeff Davispg...@j-davis.com writes: On Mon, 2013-01-21 at 00:48 -0500, Tom Lane wrote: I looked at this patch. ISTM we should not have the option at all but just do it always. I cannot believe that always-go-left is ever a preferable strategy in the long run; the resulting imbalance in the index will surely kill any possible benefit. Even if there are some cases where it miraculously fails to lose, how many users are going to realize that applies to their case and make use of the option? Sounds good to me. If I remember correctly, there was also an argument that it may be useful for repeatable test results. That's a little questionable for performance (except in those cases where few penalties are identical anyway), but could plausibly be useful for a crash report or something. Meh. There's already a random decision, in the equivalent place and for a comparable reason, in btree (cf _bt_findinsertloc). Nobody's ever complained about that being indeterminate, so I'm unconvinced that there's a market for it with gist. I wonder if it would work for gist to do something similar to _bt_findinsertloc, and have a bias towards the left page, but sometimes descend to one of the pages to the right. You would get the cache locality of usually descending down the same subtree, but also fill the pages to the right. Jeff / Alexander, want to give that a shot? I did some experimenting with that. I used the same test case Alexander did, with geonames data, and compared unpatched version, the original patch, and the attached patch that biases the first best tuple found, but still sometimes chooses the other equally good ones. testname| initsize | finalsize | idx_blks_read | idx_blks_hit +--+---+---+-- patched-10-4mb | 75497472 | 90202112 | 5853604 | 10178331 unpatched-4mb | 75145216 | 94863360 | 5880676 | 10185647 unpatched-4mb | 75587584 | 97165312 | 5903107 | 10183759 patched-2-4mb | 74768384 | 81403904 | 5768124 | 10193738 origpatch-4mb | 74883072 | 82182144 | 5783412 | 10185373 All these tests were performed with shared_buffers=4MB, so that the index won't fit completely in shared buffers, and I could use the idx_blk_read/hit ratio as a measure of cache-friendliness. The origpath test was with a simplified version of Alexander's patch, see attached. It's the same as the original, but with the randomization-option and gist-build related code removed. The patched-10 and patched-2 tests are two variants with my patch, with 1/10 and 1/2 chance of moving to the next equal tuple, respectively. The differences in cache hit ratio might be just a result of different index sizes. I included two unpatched runs above to show that there's a fair amount of noise in these tests. That's because of the random nature of the test case; it picks rows to delete and insert at random. I think the conclusion is that all of these patches are effective. The 1/10 variant is less effective, as expected, as it's closer in behavior to the unpatched behavior than the others. The 1/2 variant seems as good as the original patch. I performed another test, by inserting 100 duplicate values on an empty table. testname | finalsize | idx_blks_read | idx_blks_hit +---+---+-- unpatched | 89030656 | 21350 | 2972033 patched-10 | 88973312 | 21450 | 2971920 patched-2 | 88481792 | 22947 | 2970260 origpatch | 61186048 |761817 | 2221500 The original patch produces a much smaller index in this test, but since the inserts are distributed randomly, the pages are accessed randomly which is bad for caching. A table full of duplicates isn't very realistic, but overall, I'm leaning towards my version of this patch (gistchoose-2.patch). It has less potential for causing a regression in existing applications, but is just as effective in the original scenario of repeated delete+insert. - Heikki duplicatetest.sh Description: Bourne shell script gistbloat.sh Description: Bourne shell script diff --git a/src/backend/access/gist/gistutil.c b/src/backend/access/gist/gistutil.c index 8b60774..75778f6 100644 --- a/src/backend/access/gist/gistutil.c +++ b/src/backend/access/gist/gistutil.c @@ -379,6 +379,7 @@ gistchoose(Relation r, Page p, IndexTuple it, /* it has compressed entry */ GISTENTRY entry, identry[INDEX_MAX_KEYS]; bool isnull[INDEX_MAX_KEYS]; + int look_further_on_equal = -1; Assert(!GistPageIsLeaf(p)); @@ -446,6 +447,8 @@ gistchoose(Relation r, Page p, IndexTuple it, /* it has compressed entry */ if (j r-rd_att-natts - 1) best_penalty[j + 1] = -1; + +look_further_on_equal = -1; } else if (best_penalty[j] == usize) { @@ -468,12 +471,46 @@
Re: [HACKERS] gistchoose vs. bloat
Heikki Linnakangas hlinnakan...@vmware.com writes: I did some experimenting with that. I used the same test case Alexander did, with geonames data, and compared unpatched version, the original patch, and the attached patch that biases the first best tuple found, but still sometimes chooses the other equally good ones. testname| initsize | finalsize | idx_blks_read | idx_blks_hit +--+---+---+-- patched-10-4mb | 75497472 | 90202112 | 5853604 | 10178331 unpatched-4mb | 75145216 | 94863360 | 5880676 | 10185647 unpatched-4mb | 75587584 | 97165312 | 5903107 | 10183759 patched-2-4mb | 74768384 | 81403904 | 5768124 | 10193738 origpatch-4mb | 74883072 | 82182144 | 5783412 | 10185373 I think the conclusion is that all of these patches are effective. The 1/10 variant is less effective, as expected, as it's closer in behavior to the unpatched behavior than the others. The 1/2 variant seems as good as the original patch. At least on this example, it seems a tad better, if you look at index size. A table full of duplicates isn't very realistic, but overall, I'm leaning towards my version of this patch (gistchoose-2.patch). It has less potential for causing a regression in existing applications, but is just as effective in the original scenario of repeated delete+insert. +1 for this patch, but I think the comments could use more work. I was convinced it was wrong on first examination, mainly because it's hard to follow the underdocumented look_further_on_equal logic. I propose the attached, which is the same logic with better comments (I also chose to rename and invert the sense of the state variable, because it seemed easier to follow this way ... YMMV on that though). regards, tom lane diff --git a/src/backend/access/gist/gistutil.c b/src/backend/access/gist/gistutil.c index 8b60774f50349222ee04fef13613e3592120973b..a127627334e80704edeed38d3522e4000a695abb 100644 *** a/src/backend/access/gist/gistutil.c --- b/src/backend/access/gist/gistutil.c *** gistchoose(Relation r, Page p, IndexTupl *** 379,384 --- 379,385 GISTENTRY entry, identry[INDEX_MAX_KEYS]; bool isnull[INDEX_MAX_KEYS]; + int keep_current_best; Assert(!GistPageIsLeaf(p)); *** gistchoose(Relation r, Page p, IndexTupl *** 402,407 --- 403,433 best_penalty[0] = -1; /* + * If we find a tuple that's exactly as good as the currently best one, we + * could use either one. When inserting a lot of tuples with the same or + * similar keys, it's preferable to descend down the same path when + * possible, as that's more cache-friendly. On the other hand, if all + * inserts land on the same leaf page after a split, we're never going to + * insert anything to the other half of the split, and will end up using + * only 50% of the available space. Distributing the inserts evenly would + * lead to better space usage, but that hurts cache-locality during + * insertion. To get the best of both worlds, when we find a tuple that's + * exactly as good as the previous best, choose randomly whether to stick + * to the old best, or use the new one. Once we decide to stick to the + * old best, we keep sticking to it for any subsequent equally good tuples + * we might find. This favors tuples with low offsets, but still allows + * some inserts to go to other equally-good subtrees. + * + * keep_current_best is -1 if we haven't yet had to make a random choice + * whether to keep the current best tuple. If we have done so, and + * decided to keep it, keep_current_best is 1; if we've decided to + * replace, keep_current_best is 0. (This state will be reset to -1 as + * soon as we've made the replacement, but sometimes we make the choice in + * advance of actually finding a replacement best tuple.) + */ + keep_current_best = -1; + + /* * Loop over tuples on page. */ maxoff = PageGetMaxOffsetNumber(p); *** gistchoose(Relation r, Page p, IndexTupl *** 446,451 --- 472,480 if (j r-rd_att-natts - 1) best_penalty[j + 1] = -1; + + /* we have new best, so reset keep-it decision */ + keep_current_best = -1; } else if (best_penalty[j] == usize) { *** gistchoose(Relation r, Page p, IndexTupl *** 468,479 } /* ! * If we find a tuple with zero penalty for all columns, there's no ! * need to examine remaining tuples; just break out of the loop and ! * return it. */ if (zero_penalty) ! break; } return result; --- 497,537 } /* ! * If we looped past the last column, and did not update result, ! * then this tuple is exactly as good as the prior best tuple. ! */ ! if (j == r-rd_att-natts result != i) ! { ! if
Re: [HACKERS] gistchoose vs. bloat
On Thu, Jan 24, 2013 at 11:26 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 21.01.2013 15:19, Heikki Linnakangas wrote: On 21.01.2013 15:06, Tom Lane wrote: Jeff Davispg...@j-davis.com writes: On Mon, 2013-01-21 at 00:48 -0500, Tom Lane wrote: I looked at this patch. ISTM we should not have the option at all but just do it always. I cannot believe that always-go-left is ever a preferable strategy in the long run; the resulting imbalance in the index will surely kill any possible benefit. Even if there are some cases where it miraculously fails to lose, how many users are going to realize that applies to their case and make use of the option? Sounds good to me. If I remember correctly, there was also an argument that it may be useful for repeatable test results. That's a little questionable for performance (except in those cases where few penalties are identical anyway), but could plausibly be useful for a crash report or something. Meh. There's already a random decision, in the equivalent place and for a comparable reason, in btree (cf _bt_findinsertloc). Nobody's ever complained about that being indeterminate, so I'm unconvinced that there's a market for it with gist. I wonder if it would work for gist to do something similar to _bt_findinsertloc, and have a bias towards the left page, but sometimes descend to one of the pages to the right. You would get the cache locality of usually descending down the same subtree, but also fill the pages to the right. Jeff / Alexander, want to give that a shot? I did some experimenting with that. I used the same test case Alexander did, with geonames data, and compared unpatched version, the original patch, and the attached patch that biases the first best tuple found, but still sometimes chooses the other equally good ones. testname| initsize | finalsize | idx_blks_read | idx_blks_hit +--+--**-+---+**-- patched-10-4mb | 75497472 | 90202112 | 5853604 | 10178331 unpatched-4mb | 75145216 | 94863360 | 5880676 | 10185647 unpatched-4mb | 75587584 | 97165312 | 5903107 | 10183759 patched-2-4mb | 74768384 | 81403904 | 5768124 | 10193738 origpatch-4mb | 74883072 | 82182144 | 5783412 | 10185373 All these tests were performed with shared_buffers=4MB, so that the index won't fit completely in shared buffers, and I could use the idx_blk_read/hit ratio as a measure of cache-friendliness. The origpath test was with a simplified version of Alexander's patch, see attached. It's the same as the original, but with the randomization-option and gist-build related code removed. The patched-10 and patched-2 tests are two variants with my patch, with 1/10 and 1/2 chance of moving to the next equal tuple, respectively. The differences in cache hit ratio might be just a result of different index sizes. I included two unpatched runs above to show that there's a fair amount of noise in these tests. That's because of the random nature of the test case; it picks rows to delete and insert at random. I think the conclusion is that all of these patches are effective. The 1/10 variant is less effective, as expected, as it's closer in behavior to the unpatched behavior than the others. The 1/2 variant seems as good as the original patch. Does two unpatched-4mb lines are different by coincidence? If so then variance is significant and we need more experiments to actually compare patched-2-4mb and origpatch-4mb lines. There is another cause of overhead when use randomization in gistchoose: extra penalty calls. It could be significant when index fits to cache. In order evade it I especially change behaviour of my patch from look sequentially and choose random to look in random order. I think we need to include comparison of CPU time. -- With best regards, Alexander Korotkov.
Re: [HACKERS] gistchoose vs. bloat
Alexander Korotkov aekorot...@gmail.com writes: There is another cause of overhead when use randomization in gistchoose: extra penalty calls. It could be significant when index fits to cache. In order evade it I especially change behaviour of my patch from look sequentially and choose random to look in random order. I think we need to include comparison of CPU time. Hmm ... actually, isn't that an argument in favor of Heikki's method? The way he's doing it, we can exit without making additional penalty calls once we've found a zero-penalty tuple and decided not to look further (which is something with a pretty high probability). regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] gistchoose vs. bloat
On 24.01.2013 22:35, Tom Lane wrote: Alexander Korotkovaekorot...@gmail.com writes: There is another cause of overhead when use randomization in gistchoose: extra penalty calls. It could be significant when index fits to cache. In order evade it I especially change behaviour of my patch from look sequentially and choose random to look in random order. I think we need to include comparison of CPU time. Hmm ... actually, isn't that an argument in favor of Heikki's method? The way he's doing it, we can exit without making additional penalty calls once we've found a zero-penalty tuple and decided not to look further (which is something with a pretty high probability). No, I think Alexander is right, although I believe the difference is minimal in practice. If we assume that the there are no zero-penalty tuples on the page, with both approaches you have to call penalty on every tuple to know which is best. If there are zero-penalty tuples, then there is a small difference. With Alexander's method, you can stop looking as soon as you find a zero-penalty tuple (the order you check the tuples is random). With my method, you can stop looking as soon as you find a zero-penalty tuple, *and* you decide to not look further. With the 1/2 probability to stop looking further, you give up on average after 2 tuples. So if I'm doing my math right, my patch does on average between 1x - 2x as many penalty calls as Alexander's, depending on how many zero-penalty tuples there are. The 2x case is when the page is full of zero-penalty tuples, in which case the difference isn't big in absolute terms; 2 penalty calls per page versus 1. BTW, one thing that I wondered about this: How expensive is random()? I'm assuming not very, but I don't really know. Alexander's patch called random() for every tuple on the page, while I call it only once for each equal-penalty tuple. If it's really cheap, I think my/Tom's patch could be slightly simplified by always initializing keep_current_best with random() at top of the function, and eliminating the -1 haven't decided yet state. OTOH, if random() is expensive, I note that we only need one bit of randomness, so we could avoid calling random() so often if we utilized all the bits from each random() call. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] gistchoose vs. bloat
Heikki Linnakangas hlinnakan...@vmware.com writes: BTW, one thing that I wondered about this: How expensive is random()? I'm assuming not very, but I don't really know. Alexander's patch called random() for every tuple on the page, while I call it only once for each equal-penalty tuple. If it's really cheap, I think my/Tom's patch could be slightly simplified by always initializing keep_current_best with random() at top of the function, and eliminating the -1 haven't decided yet state. I thought about that too, and concluded that random() is probably expensive enough that we don't want to call it unnecessarily. OTOH, if random() is expensive, I note that we only need one bit of randomness, so we could avoid calling random() so often if we utilized all the bits from each random() call. Meh. That would hard-wire the decision that the probability of keeping a best tuple is exactly 0.5. I'd rather keep the flexibility to tune it later. The way your patch is set up, it seems unlikely that it will call random() very many times per page, so I'm not that concerned about minimizing the number of calls further. (Also, in the probably-common case that there are no exactly equally good alternatives, this would actually be a net loss since it would add one unnecessary call.) So basically, Alexander's method requires more random() calls and fewer penalty() calls than yours, at least in typical cases. It's hard to say which is faster without benchmarking, and it would matter which opclass you were testing on. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] gistchoose vs. bloat
Jeff Davis pg...@j-davis.com writes: On Mon, 2013-01-21 at 00:48 -0500, Tom Lane wrote: I looked at this patch. ISTM we should not have the option at all but just do it always. I cannot believe that always-go-left is ever a preferable strategy in the long run; the resulting imbalance in the index will surely kill any possible benefit. Even if there are some cases where it miraculously fails to lose, how many users are going to realize that applies to their case and make use of the option? Sounds good to me. If I remember correctly, there was also an argument that it may be useful for repeatable test results. That's a little questionable for performance (except in those cases where few penalties are identical anyway), but could plausibly be useful for a crash report or something. Meh. There's already a random decision, in the equivalent place and for a comparable reason, in btree (cf _bt_findinsertloc). Nobody's ever complained about that being indeterminate, so I'm unconvinced that there's a market for it with gist. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] gistchoose vs. bloat
On 21.01.2013 15:06, Tom Lane wrote: Jeff Davispg...@j-davis.com writes: On Mon, 2013-01-21 at 00:48 -0500, Tom Lane wrote: I looked at this patch. ISTM we should not have the option at all but just do it always. I cannot believe that always-go-left is ever a preferable strategy in the long run; the resulting imbalance in the index will surely kill any possible benefit. Even if there are some cases where it miraculously fails to lose, how many users are going to realize that applies to their case and make use of the option? Sounds good to me. If I remember correctly, there was also an argument that it may be useful for repeatable test results. That's a little questionable for performance (except in those cases where few penalties are identical anyway), but could plausibly be useful for a crash report or something. Meh. There's already a random decision, in the equivalent place and for a comparable reason, in btree (cf _bt_findinsertloc). Nobody's ever complained about that being indeterminate, so I'm unconvinced that there's a market for it with gist. I wonder if it would work for gist to do something similar to _bt_findinsertloc, and have a bias towards the left page, but sometimes descend to one of the pages to the right. You would get the cache locality of usually descending down the same subtree, but also fill the pages to the right. Jeff / Alexander, want to give that a shot? When building an index from scratch, using the new buffered index build, you could do a lot better than fill each page like with regular inserts and split when one fills up. You could e.g buffer 10 pages worth of tuples, and perform a 10-way split of all the buffered tuples together, distributing them equally to 10 pages (or 10+something, to leave some room for updates). But that's clearly a separate and much larger patch. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] gistchoose vs. bloat
Jeff Davis pg...@j-davis.com writes: On Fri, 2012-12-14 at 18:36 +0200, Heikki Linnakangas wrote: BTW, I don't much like the option name randomization. It's not clear what's been randomized. I'd prefer something like distribute_on_equal_penalty, although that's really long. Better ideas? I agree that randomization is vague, but I can't think of anything better. I looked at this patch. ISTM we should not have the option at all but just do it always. I cannot believe that always-go-left is ever a preferable strategy in the long run; the resulting imbalance in the index will surely kill any possible benefit. Even if there are some cases where it miraculously fails to lose, how many users are going to realize that applies to their case and make use of the option? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] gistchoose vs. bloat
On Mon, 2013-01-21 at 00:48 -0500, Tom Lane wrote: I looked at this patch. ISTM we should not have the option at all but just do it always. I cannot believe that always-go-left is ever a preferable strategy in the long run; the resulting imbalance in the index will surely kill any possible benefit. Even if there are some cases where it miraculously fails to lose, how many users are going to realize that applies to their case and make use of the option? Sounds good to me. If I remember correctly, there was also an argument that it may be useful for repeatable test results. That's a little questionable for performance (except in those cases where few penalties are identical anyway), but could plausibly be useful for a crash report or something. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] gistchoose vs. bloat
On Fri, 2012-12-14 at 01:03 +0400, Alexander Korotkov wrote: Hi! On Sat, Dec 8, 2012 at 7:05 PM, Andres Freund and...@2ndquadrant.com wrote: I notice there's no documentation about the new reloption at all? Thanks for notice! I've added small description to docs in the attached patch. Here is an edited version of the documentation note. Please review to see if you like my version. Also, I fixed a compiler warning. My tests showed a significant reduction in the size of a gist index with many of the same penalty values. The run times showed mixed results, however, and I didn't dig in much further because you've already done significant testing. Marking this one ready again. Regards, Jeff Davis gist_choose_bloat-0.5A.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] gistchoose vs. bloat
On Fri, Dec 14, 2012 at 12:46 PM, Jeff Davis pg...@j-davis.com wrote: Thanks for notice! I've added small description to docs in the attached patch. Here is an edited version of the documentation note. Please review to see if you like my version. Edited version looks good for me. Also, I fixed a compiler warning. Thanks! -- With best regards, Alexander Korotkov.
Re: [HACKERS] gistchoose vs. bloat
One question: does the randomization ever help when building a new index? In the original test case, you repeatedly delete and insert tuples, and I can see how the index can get bloated in that case. But I don't see how bloat would occur when building the index from scratch. BTW, I don't much like the option name randomization. It's not clear what's been randomized. I'd prefer something like distribute_on_equal_penalty, although that's really long. Better ideas? - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] gistchoose vs. bloat
On Fri, 2012-12-14 at 18:36 +0200, Heikki Linnakangas wrote: One question: does the randomization ever help when building a new index? In the original test case, you repeatedly delete and insert tuples, and I can see how the index can get bloated in that case. But I don't see how bloat would occur when building the index from scratch. When building an index on a bunch of identical int4range values (in my test, [1,10) ), the resulting index was about 17% smaller. If the current algorithm always chooses to insert on the left-most page, then it seems like there would be a half-filled right page for every split that occurs. Is that reasoning correct? However, I'm having some second thoughts about the run time for index builds. Maybe we should have a few more tests to determine if this should really be the default or just an option? BTW, I don't much like the option name randomization. It's not clear what's been randomized. I'd prefer something like distribute_on_equal_penalty, although that's really long. Better ideas? I agree that randomization is vague, but I can't think of anything better. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] gistchoose vs. bloat
Hi! On Sat, Dec 8, 2012 at 7:05 PM, Andres Freund and...@2ndquadrant.comwrote: I notice there's no documentation about the new reloption at all? Thanks for notice! I've added small description to docs in the attached patch. -- With best regards, Alexander Korotkov. gist_choose_bloat-0.5.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] gistchoose vs. bloat
Hi, On 2012-11-02 12:54:33 +0400, Alexander Korotkov wrote: On Sun, Oct 21, 2012 at 11:03 AM, Jeff Davis pg...@j-davis.com wrote: On Thu, 2012-10-18 at 15:09 -0300, Alvaro Herrera wrote: Jeff, do you think we need more review of this patch? In the patch, it refers to rd_options without checking for NULL first, which needs to be fixed. There's actually still one place where it says id rather than is. Just a nitpick. Regarding my point 4 from the previous email, I mildly disagree with the style, but I don't see a correctness problem there. If the first two items are fixed, then the patch is fine with me. First two items are fixed in attached version of the patch. So the patch is ready for committer now? I notice there's no documentation about the new reloption at all? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] gistchoose vs. bloat
On Thu, 2012-10-18 at 15:09 -0300, Alvaro Herrera wrote: Jeff, do you think we need more review of this patch? In the patch, it refers to rd_options without checking for NULL first, which needs to be fixed. There's actually still one place where it says id rather than is. Just a nitpick. Regarding my point 4 from the previous email, I mildly disagree with the style, but I don't see a correctness problem there. If the first two items are fixed, then the patch is fine with me. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] gistchoose vs. bloat
Alexander Korotkov escribió: 4. It looks like the randomization is happening while trying to compare the penalties. I think it may be more readable to separate those two steps; e.g. /* create a mapping whether randomization is on or not */ for (i = FirstOffsetNumber; i = maxoff; i = OffsetNumberNext(i)) offsets[i - FirstOffsetNumber] = i; if (randomization) /* randomize offsets array */ for (i = 0; i maxoff; i++) { offset = offsets[i]; ... } That's just an idea; if you think it's more readable as-is (or if I am misunderstanding) then let me know. Actually, current implementation comes from idea of creating possible less overhead when randomization is off. I'll try to measure overhead in worst case. If it is low enough then you proposal looks reasonable to me. Were you able to do these measurements? If not, I'll defer to your and Jeff's judgement on what's the best next step here. Jeff, do you think we need more review of this patch? -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] gistchoose vs. bloat
On Mon, Oct 1, 2012 at 5:15 AM, Jeff Davis pg...@j-davis.com wrote: On Tue, 2012-09-04 at 19:21 +0400, Alexander Korotkov wrote: New version of patch is attached. Parameter randomization was introduced. It controls whether to randomize choose. Choose algorithm was rewritten. Review comments: 1. Comment above while loop in gistRelocateBuildBuffersOnSplit needs to be updated. Actually, I didn't realize what exact comment you expect. Check if added commend meets you expectations. 2. Typo in two places: if randomization id required. 3. In gistRelocateBuildBuffersOnSplit, shouldn't that be: splitPageInfo = relocationBuffersInfos[bufferIndex]; not: splitPageInfo = relocationBuffersInfos[i]; Fixed. 4. It looks like the randomization is happening while trying to compare the penalties. I think it may be more readable to separate those two steps; e.g. /* create a mapping whether randomization is on or not */ for (i = FirstOffsetNumber; i = maxoff; i = OffsetNumberNext(i)) offsets[i - FirstOffsetNumber] = i; if (randomization) /* randomize offsets array */ for (i = 0; i maxoff; i++) { offset = offsets[i]; ... } That's just an idea; if you think it's more readable as-is (or if I am misunderstanding) then let me know. Actually, current implementation comes from idea of creating possible less overhead when randomization is off. I'll try to measure overhead in worst case. If it is low enough then you proposal looks reasonable to me. -- With best regards, Alexander Korotkov. gist_choose_bloat-0.3.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] gistchoose vs. bloat
On Tue, 2012-09-04 at 19:21 +0400, Alexander Korotkov wrote: New version of patch is attached. Parameter randomization was introduced. It controls whether to randomize choose. Choose algorithm was rewritten. Review comments: 1. Comment above while loop in gistRelocateBuildBuffersOnSplit needs to be updated. 2. Typo in two places: if randomization id required. 3. In gistRelocateBuildBuffersOnSplit, shouldn't that be: splitPageInfo = relocationBuffersInfos[bufferIndex]; not: splitPageInfo = relocationBuffersInfos[i]; 4. It looks like the randomization is happening while trying to compare the penalties. I think it may be more readable to separate those two steps; e.g. /* create a mapping whether randomization is on or not */ for (i = FirstOffsetNumber; i = maxoff; i = OffsetNumberNext(i)) offsets[i - FirstOffsetNumber] = i; if (randomization) /* randomize offsets array */ for (i = 0; i maxoff; i++) { offset = offsets[i]; ... } That's just an idea; if you think it's more readable as-is (or if I am misunderstanding) then let me know. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] gistchoose vs. bloat
On Tue, 2012-09-04 at 19:21 +0400, Alexander Korotkov wrote: New version of patch is attached. Parameter randomization was introduced. It controls whether to randomize choose. Choose algorithm was rewritten. Do you expect it to be bad in any reasonable situations? I'm inclined to just make it always randomize if it's better. I think it would be hard for a user to guess when it's better and when not. Maybe it's useful to turn randomization off for testing purposes, e.g. to ensure determinism? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] gistchoose vs. bloat
On Tue, Sep 11, 2012 at 10:35 AM, Jeff Davis pg...@j-davis.com wrote: On Tue, 2012-09-04 at 19:21 +0400, Alexander Korotkov wrote: New version of patch is attached. Parameter randomization was introduced. It controls whether to randomize choose. Choose algorithm was rewritten. Do you expect it to be bad in any reasonable situations? I'm inclined to just make it always randomize if it's better. I think it would be hard for a user to guess when it's better and when not. Randomization should increase IO when index doesn't entirely fit to cache. Without randomization only fraction of the tree would be used for actual insertions. While with randomization whole tree would be potentially used for insertions. Maybe it's useful to turn randomization off for testing purposes, e.g. to ensure determinism? Yes, that's another good point. For example, randomization impede reproducing of bugs. -- With best regards, Alexander Korotkov.
Re: [HACKERS] gistchoose vs. bloat
On Mon, Aug 20, 2012 at 9:13 PM, Alexander Korotkov aekorot...@gmail.comwrote: Current gistchoose code has a bug. I've started separate thread about it. http://archives.postgresql.org/pgsql-hackers/2012-08/msg00544.php Also, it obviously needs more comments. Current state of patch is more proof of concept than something ready. I'm going to change it in following ways: 1) We don't know how expensive user penalty function is. So, I'm going to change randomization algorithm so that it doesn't increase number of penalty calls in average. 2) Since, randomization could produce additional IO, there are probably no optimal solution for all the cases. We could introduce user-visible option which enables or disables randomization. However, default value of this option is another question. Also, I think you should use random() rather than rand(). Thanks, will fix. New version of patch is attached. Parameter randomization was introduced. It controls whether to randomize choose. Choose algorithm was rewritten. -- With best regards, Alexander Korotkov. gist_choose_bloat-0.2.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] gistchoose vs. bloat
On Mon, Aug 20, 2012 at 7:13 AM, Jeff Davis pg...@j-davis.com wrote: I took a look at this patch. The surrounding code is pretty messy (not necessarily because of your patch). A few comments would go a long way. The 'which_grow' array is initialized as it goes, first using pointer notations (*which_grows = -1.0) and then using subscript notation. As far as I can tell, the first r-rd_att-natts of the array (the only elements that matter) need to be written the first time through anyway. Why not just replace which_grow[j] 0 with i == FirstOffsetNumber and add a comment that we're initializing the penalties with the first index tuple? The 'sum_grow' didn't make any sense, thank you for getting rid of that. Also, we should document that the earlier attributes always take precedence, which is why we break out of the inner loop as soon as we encounter an attribute with a higher penalty. Please add a comment indicating why you are randomly choosing among the equal penalties. I think that there might be a problem with the logic, as well. Let's say you have two attributes and there are two index tuples, it1 and it2; with penalties [10,10] and [10,100] respectively. The second time through the outer loop, with i = 2, you might (P=0.5) assign 2 to the 'which' variable in the first iteration of the inner loop, before it realizes that it2 actually has a higher penalty. I think you need to finish out the inner loop and have a flag that indicates that all attributes are equal before you do the probabilistic replacement. Current gistchoose code has a bug. I've started separate thread about it. http://archives.postgresql.org/pgsql-hackers/2012-08/msg00544.php Also, it obviously needs more comments. Current state of patch is more proof of concept than something ready. I'm going to change it in following ways: 1) We don't know how expensive user penalty function is. So, I'm going to change randomization algorithm so that it doesn't increase number of penalty calls in average. 2) Since, randomization could produce additional IO, there are probably no optimal solution for all the cases. We could introduce user-visible option which enables or disables randomization. However, default value of this option is another question. Also, I think you should use random() rather than rand(). Thanks, will fix. -- With best regards, Alexander Korotkov.
Re: [HACKERS] gistchoose vs. bloat
On Mon, 2012-06-18 at 15:12 +0400, Alexander Korotkov wrote: Hackers, While experimenting with gistchoose I achieve interesting results about relation of gistchoose behaviour and gist index bloat. ... Current implementation of gistchoose select first index tuple which have minimal penalty. It is possible for several tuples to have same minimal penalty. I've created simple patch which selects random from them. I then I've following results for same testcase. I took a look at this patch. The surrounding code is pretty messy (not necessarily because of your patch). A few comments would go a long way. The 'which_grow' array is initialized as it goes, first using pointer notations (*which_grows = -1.0) and then using subscript notation. As far as I can tell, the first r-rd_att-natts of the array (the only elements that matter) need to be written the first time through anyway. Why not just replace which_grow[j] 0 with i == FirstOffsetNumber and add a comment that we're initializing the penalties with the first index tuple? The 'sum_grow' didn't make any sense, thank you for getting rid of that. Also, we should document that the earlier attributes always take precedence, which is why we break out of the inner loop as soon as we encounter an attribute with a higher penalty. Please add a comment indicating why you are randomly choosing among the equal penalties. I think that there might be a problem with the logic, as well. Let's say you have two attributes and there are two index tuples, it1 and it2; with penalties [10,10] and [10,100] respectively. The second time through the outer loop, with i = 2, you might (P=0.5) assign 2 to the 'which' variable in the first iteration of the inner loop, before it realizes that it2 actually has a higher penalty. I think you need to finish out the inner loop and have a flag that indicates that all attributes are equal before you do the probabilistic replacement. Also, I think you should use random() rather than rand(). Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers