Re: [HACKERS] gistchoose vs. bloat

2013-01-24 Thread Heikki Linnakangas
On 21.01.2013 15:19, Heikki Linnakangas wrote: On 21.01.2013 15:06, Tom Lane wrote: Jeff Davis pg...@j-davis.com writes: On Mon, 2013-01-21 at 00:48 -0500, Tom Lane wrote: I looked at this patch. ISTM we should not have the option at all but just do it always. I cannot believe that

Re: [HACKERS] gistchoose vs. bloat

2013-01-24 Thread Tom Lane
Heikki Linnakangas hlinnakan...@vmware.com writes: I did some experimenting with that. I used the same test case Alexander did, with geonames data, and compared unpatched version, the original patch, and the attached patch that biases the first best tuple found, but still sometimes chooses

Re: [HACKERS] gistchoose vs. bloat

2013-01-24 Thread Alexander Korotkov
On Thu, Jan 24, 2013 at 11:26 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 21.01.2013 15:19, Heikki Linnakangas wrote: On 21.01.2013 15:06, Tom Lane wrote: Jeff Davis pg...@j-davis.com writes: On Mon, 2013-01-21 at 00:48 -0500, Tom Lane wrote: I looked at this patch. ISTM we

Re: [HACKERS] gistchoose vs. bloat

2013-01-24 Thread Tom Lane
Alexander Korotkov aekorot...@gmail.com writes: There is another cause of overhead when use randomization in gistchoose: extra penalty calls. It could be significant when index fits to cache. In order evade it I especially change behaviour of my patch from look sequentially and choose random

Re: [HACKERS] gistchoose vs. bloat

2013-01-24 Thread Heikki Linnakangas
On 24.01.2013 22:35, Tom Lane wrote: Alexander Korotkov aekorot...@gmail.com writes: There is another cause of overhead when use randomization in gistchoose: extra penalty calls. It could be significant when index fits to cache. In order evade it I especially change behaviour of my patch from

Re: [HACKERS] gistchoose vs. bloat

2013-01-24 Thread Tom Lane
Heikki Linnakangas hlinnakan...@vmware.com writes: BTW, one thing that I wondered about this: How expensive is random()? I'm assuming not very, but I don't really know. Alexander's patch called random() for every tuple on the page, while I call it only once for each equal-penalty tuple. If
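
For readers following the random() cost question, here is a minimal sketch of one way to draw a random number only for equal-penalty tuples: reservoir sampling over the ties. This is an illustration, not the code of any patch in this thread. The names `choose_random_min` and `next_random` are hypothetical, the penalties are plain doubles rather than GiST penalty values, and a small deterministic generator stands in for random() so the example is reproducible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for random(): a small linear congruential generator. */
uint32_t next_random(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state >> 16;            /* use the better-mixed high bits */
}

/*
 * Return the index of an entry with minimal penalty. Among entries that
 * tie for the minimum, pick one roughly uniformly (reservoir sampling;
 * the modulo introduces a negligible bias). The generator is consulted
 * only when a tie is seen, never for entries with a worse penalty.
 */
size_t choose_random_min(const double *penalty, size_t n, uint32_t *rng_state)
{
    size_t   best = 0;
    uint32_t ties = 1;              /* entries seen so far with the best penalty */

    assert(n > 0);
    for (size_t i = 1; i < n; i++)
    {
        if (penalty[i] < penalty[best])
        {
            best = i;               /* strictly better: restart the tie count */
            ties = 1;
        }
        else if (penalty[i] == penalty[best])
        {
            ties++;
            /* replace the current choice with probability 1/ties */
            if (next_random(rng_state) % ties == 0)
                best = i;
        }
    }
    return best;
}
```

With penalties {3, 1, 2, 1, 1} the function returns index 1, 3, or 4 and calls the generator twice, once per extra tie. When the minimum is unique it never calls the generator at all.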

Re: [HACKERS] gistchoose vs. bloat

2013-01-21 Thread Tom Lane
Jeff Davis pg...@j-davis.com writes: On Mon, 2013-01-21 at 00:48 -0500, Tom Lane wrote: I looked at this patch. ISTM we should not have the option at all but just do it always. I cannot believe that always-go-left is ever a preferable strategy in the long run; the resulting imbalance in the

Re: [HACKERS] gistchoose vs. bloat

2013-01-21 Thread Heikki Linnakangas
On 21.01.2013 15:06, Tom Lane wrote: Jeff Davis pg...@j-davis.com writes: On Mon, 2013-01-21 at 00:48 -0500, Tom Lane wrote: I looked at this patch. ISTM we should not have the option at all but just do it always. I cannot believe that always-go-left is ever a preferable strategy in the long

Re: [HACKERS] gistchoose vs. bloat

2013-01-20 Thread Tom Lane
Jeff Davis pg...@j-davis.com writes: On Fri, 2012-12-14 at 18:36 +0200, Heikki Linnakangas wrote: BTW, I don't much like the option name randomization. It's not clear what's been randomized. I'd prefer something like distribute_on_equal_penalty, although that's really long. Better ideas? I

Re: [HACKERS] gistchoose vs. bloat

2013-01-20 Thread Jeff Davis
On Mon, 2013-01-21 at 00:48 -0500, Tom Lane wrote: I looked at this patch. ISTM we should not have the option at all but just do it always. I cannot believe that always-go-left is ever a preferable strategy in the long run; the resulting imbalance in the index will surely kill any possible

Re: [HACKERS] gistchoose vs. bloat

2012-12-14 Thread Jeff Davis
On Fri, 2012-12-14 at 01:03 +0400, Alexander Korotkov wrote: Hi! On Sat, Dec 8, 2012 at 7:05 PM, Andres Freund and...@2ndquadrant.com wrote: I notice there's no documentation about the new reloption at all? Thanks for notice! I've added small description to docs in the

Re: [HACKERS] gistchoose vs. bloat

2012-12-14 Thread Alexander Korotkov
On Fri, Dec 14, 2012 at 12:46 PM, Jeff Davis pg...@j-davis.com wrote: Thanks for notice! I've added small description to docs in the attached patch. Here is an edited version of the documentation note. Please review to see if you like my version. Edited version looks good for me.

Re: [HACKERS] gistchoose vs. bloat

2012-12-14 Thread Heikki Linnakangas
One question: does the randomization ever help when building a new index? In the original test case, you repeatedly delete and insert tuples, and I can see how the index can get bloated in that case. But I don't see how bloat would occur when building the index from scratch. BTW, I don't much

Re: [HACKERS] gistchoose vs. bloat

2012-12-14 Thread Jeff Davis
On Fri, 2012-12-14 at 18:36 +0200, Heikki Linnakangas wrote: One question: does the randomization ever help when building a new index? In the original test case, you repeatedly delete and insert tuples, and I can see how the index can get bloated in that case. But I don't see how bloat

Re: [HACKERS] gistchoose vs. bloat

2012-12-13 Thread Alexander Korotkov
Hi! On Sat, Dec 8, 2012 at 7:05 PM, Andres Freund and...@2ndquadrant.com wrote: I notice there's no documentation about the new reloption at all? Thanks for notice! I've added small description to docs in the attached patch. -- With best regards, Alexander Korotkov.

Re: [HACKERS] gistchoose vs. bloat

2012-12-08 Thread Andres Freund
Hi, On 2012-11-02 12:54:33 +0400, Alexander Korotkov wrote: On Sun, Oct 21, 2012 at 11:03 AM, Jeff Davis pg...@j-davis.com wrote: On Thu, 2012-10-18 at 15:09 -0300, Alvaro Herrera wrote: Jeff, do you think we need more review of this patch? In the patch, it refers to rd_options without

Re: [HACKERS] gistchoose vs. bloat

2012-10-21 Thread Jeff Davis
On Thu, 2012-10-18 at 15:09 -0300, Alvaro Herrera wrote: Jeff, do you think we need more review of this patch? In the patch, it refers to rd_options without checking for NULL first, which needs to be fixed. There's actually still one place where it says id rather than is. Just a nitpick.

Re: [HACKERS] gistchoose vs. bloat

2012-10-18 Thread Alvaro Herrera
Alexander Korotkov wrote: 4. It looks like the randomization is happening while trying to compare the penalties. I think it may be more readable to separate those two steps; e.g. /* create a mapping whether randomization is on or not */ for (i = FirstOffsetNumber; i <= maxoff; i

Re: [HACKERS] gistchoose vs. bloat

2012-10-03 Thread Alexander Korotkov
On Mon, Oct 1, 2012 at 5:15 AM, Jeff Davis pg...@j-davis.com wrote: On Tue, 2012-09-04 at 19:21 +0400, Alexander Korotkov wrote: New version of patch is attached. Parameter randomization was introduced. It controls whether to randomize choose. Choose algorithm was rewritten. Review

Re: [HACKERS] gistchoose vs. bloat

2012-09-30 Thread Jeff Davis
On Tue, 2012-09-04 at 19:21 +0400, Alexander Korotkov wrote: New version of patch is attached. Parameter randomization was introduced. It controls whether to randomize choose. Choose algorithm was rewritten. Review comments: 1. Comment above while loop in gistRelocateBuildBuffersOnSplit

Re: [HACKERS] gistchoose vs. bloat

2012-09-11 Thread Jeff Davis
On Tue, 2012-09-04 at 19:21 +0400, Alexander Korotkov wrote: New version of patch is attached. Parameter randomization was introduced. It controls whether to randomize choose. Choose algorithm was rewritten. Do you expect it to be bad in any reasonable situations? I'm inclined to just make

Re: [HACKERS] gistchoose vs. bloat

2012-09-11 Thread Alexander Korotkov
On Tue, Sep 11, 2012 at 10:35 AM, Jeff Davis pg...@j-davis.com wrote: On Tue, 2012-09-04 at 19:21 +0400, Alexander Korotkov wrote: New version of patch is attached. Parameter randomization was introduced. It controls whether to randomize choose. Choose algorithm was rewritten. Do you

Re: [HACKERS] gistchoose vs. bloat

2012-09-04 Thread Alexander Korotkov
On Mon, Aug 20, 2012 at 9:13 PM, Alexander Korotkov aekorot...@gmail.com wrote: Current gistchoose code has a bug. I've started separate thread about it. http://archives.postgresql.org/pgsql-hackers/2012-08/msg00544.php Also, it obviously needs more comments. Current state of patch is more

Re: [HACKERS] gistchoose vs. bloat

2012-08-20 Thread Alexander Korotkov
On Mon, Aug 20, 2012 at 7:13 AM, Jeff Davis pg...@j-davis.com wrote: I took a look at this patch. The surrounding code is pretty messy (not necessarily because of your patch). A few comments would go a long way. The 'which_grow' array is initialized as it goes, first using pointer notations

Re: [HACKERS] gistchoose vs. bloat

2012-08-19 Thread Jeff Davis
On Mon, 2012-06-18 at 15:12 +0400, Alexander Korotkov wrote: Hackers, While experimenting with gistchoose I achieve interesting results about relation of gistchoose behaviour and gist index bloat. ... Current implementation of gistchoose select first index tuple which have minimal
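
The "first tuple with minimal penalty" behaviour quoted above (called always-go-left elsewhere in the thread) can be sketched as follows. This is a simplified illustration, assuming the truncated sentence refers to the insertion penalty. `choose_first_min` is a hypothetical name, and plain doubles stand in for the penalties that the real gistchoose computes per index tuple.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the index of the first entry with minimal penalty.
 * The strict '<' means a later entry with an equal penalty never
 * replaces an earlier one, so ties always resolve to the leftmost entry.
 */
size_t choose_first_min(const double *penalty, size_t n)
{
    size_t best = 0;

    assert(n > 0);
    for (size_t i = 1; i < n; i++)
    {
        if (penalty[i] < penalty[best])
            best = i;
    }
    return best;
}
```

For penalties {2, 1, 1, 1} this always returns index 1, so repeated inserts with equal penalties keep landing in the same place. That skew is what the thread proposes to break up.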

[HACKERS] gistchoose vs. bloat

2012-06-18 Thread Alexander Korotkov
Hackers, While experimenting with gistchoose I achieve interesting results about relation of gistchoose behaviour and gist index bloat. I've created following testcase for reproducing gist index bloating: 1) Create test table with 100 points from geonames # create table geotest (id serial,