Re: [HACKERS] Double sorting split patch

2011-10-06 Thread Heikki Linnakangas
On 05.10.2011 15:59, Alexander Korotkov wrote: Path without allocating extra bytes is attached. I run some more detailed tests on geonames and two smaller datasets from rtreeportal.org. Ok, thanks. Looks good to me now, so committed. -- Heikki Linnakangas EnterpriseDB

Re: [HACKERS] Double sorting split patch

2011-10-06 Thread Alexander Korotkov
On Thu, Oct 6, 2011 at 11:06 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 05.10.2011 15:59, Alexander Korotkov wrote: Path without allocating extra bytes is attached. I run some more detailed tests on geonames and two smaller datasets from rtreeportal.org. Ok,

Re: [HACKERS] Double sorting split patch

2011-10-06 Thread Heikki Linnakangas
On 06.10.2011 11:22, Alexander Korotkov wrote: Thanks. I'm going to continue work on application of this split method in following areas: 1) range types 2) seg contrib module 3) cube contrib module (there situation is not so easy, probably some heuristic of split method selection would be

Re: [HACKERS] Double sorting split patch

2011-10-05 Thread Heikki Linnakangas
On 04.10.2011 15:10, Alexander Korotkov wrote: On Tue, Oct 4, 2011 at 1:46 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Ok. Could you phrase that as a code comment? Here's a version of the patch I've been working on. There's no functional changes, just a lot of moving

Re: [HACKERS] Double sorting split patch

2011-10-05 Thread Alexander Korotkov
On Wed, Oct 5, 2011 at 11:37 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 04.10.2011 15:10, Alexander Korotkov wrote: On Tue, Oct 4, 2011 at 1:46 PM, Heikki Linnakangas heikki.linnakangas@**enterprisedb.comheikki.linnakan...@enterprisedb.com wrote: Ok. Could you

Re: [HACKERS] Double sorting split patch

2011-10-05 Thread Alexander Korotkov
Path without allocating extra bytes is attached. I run some more detailed tests on geonames and two smaller datasets from rtreeportal.org. Test tables are following: 1) test1 - geonames 2) test2 - California Roads, containing the MBRs of 2,249,727 streets 3) test3 - Tiger Streams, containing the

Re: [HACKERS] Double sorting split patch

2011-10-04 Thread Heikki Linnakangas
On 22.09.2011 22:12, Alexander Korotkov wrote: Patch without that dead code is attached. Thanks. Can you elaborate the consider-split algorithm? The criteria to select the new split over the previously selected one is this: ! /* !* If ratio is acceptable, we

Re: [HACKERS] Double sorting split patch

2011-10-04 Thread Alexander Korotkov
On Tue, Oct 4, 2011 at 12:12 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Can you elaborate the consider-split algorithm? The criteria to select the new split over the previously selected one is this: ! /* !* If ratio is acceptable, we

Re: [HACKERS] Double sorting split patch

2011-10-04 Thread Heikki Linnakangas
On 04.10.2011 11:51, Alexander Korotkov wrote: On Tue, Oct 4, 2011 at 12:12 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Can you elaborate the consider-split algorithm? The criteria to select the new split over the previously selected one is this: ! /* !

Re: [HACKERS] Double sorting split patch

2011-10-04 Thread Alexander Korotkov
On Tue, Oct 4, 2011 at 1:46 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Ok. Could you phrase that as a code comment? Here's a version of the patch I've been working on. There's no functional changes, just a lot of moving things around, comment changes, etc. to

Re: [HACKERS] Double sorting split patch

2011-09-22 Thread Heikki Linnakangas
! /* !* Calculate delta between penalties of join common entries to !* different groups. !*/ ! for (i = 0; i commonEntriesCount; i++) { ! double lower, !

Re: [HACKERS] Double sorting split patch

2011-09-22 Thread Alexander Korotkov
On Thu, Sep 22, 2011 at 3:22 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: ! /* !* Calculate delta between penalties of join common entries to !* different groups. !*/ ! for (i = 0; i

Re: [HACKERS] Double sorting split patch

2011-09-22 Thread Alexander Korotkov
On Thu, Sep 22, 2011 at 3:31 PM, Alexander Korotkov aekorot...@gmail.comwrote: On Thu, Sep 22, 2011 at 3:22 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: 'lower' and 'upper' are not used for anything in the above. Is that just dead code that can be removed, or is there

Re: [HACKERS] Double sorting split patch

2011-09-21 Thread Alexander Korotkov
I've processed the results of the tests with double sorting split which I've sheduled for buffering build. I've updated wiki page with them: http://wiki.postgresql.org/wiki/Fast_GiST_index_build_GSoC_2011#Testing_results Raw results of query speed measues are in the attachment. There number of

Re: [HACKERS] Double sorting split patch

2011-09-18 Thread Peter Geoghegan
On 18 September 2011 01:51, Greg Stark st...@mit.edu wrote: I think we provided the qsort implementation for the benefit of platforms that didn't have a decent one and then ended up switching to use it always for some reason I don't recall.  It wasn't because we thought we were better at

Re: [HACKERS] Double sorting split patch

2011-09-17 Thread Alexander Korotkov
On Fri, Sep 16, 2011 at 3:07 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: That looks awfully complicated. I don't understand how that works. I wonder if two passes would be simpler? I doubt it becomes much simpler, but here it is. -- With best regards, Alexander

Re: [HACKERS] Double sorting split patch

2011-09-17 Thread Heikki Linnakangas
On 17.09.2011 17:36, Alexander Korotkov wrote: On Fri, Sep 16, 2011 at 3:07 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: That looks awfully complicated. I don't understand how that works. I wonder if two passes would be simpler? I doubt it becomes much simpler, but here

Re: [HACKERS] Double sorting split patch

2011-09-17 Thread Alexander Korotkov
On Sat, Sep 17, 2011 at 9:00 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Thanks! It does look a lot simpler that way, IMHO. I presume this didn't change the performance characteristics? On the tests I provided in the first letter difference seems to be insignificant.

Re: [HACKERS] Double sorting split patch

2011-09-17 Thread Peter Geoghegan
On 15 September 2011 19:46, Alexander Korotkov aekorot...@gmail.com wrote: Proposed algorithm finds following splits by single pass on two arrays: one sorted by lower bound of interval and another sorted by upper bound of interval. Have you considered if further performance improvements could

Re: [HACKERS] Double sorting split patch

2011-09-17 Thread Greg Stark
On Sat, Sep 17, 2011 at 9:09 PM, Peter Geoghegan pe...@2ndquadrant.com wrote: I find it curious that we go to the trouble of providing a custom qsort implementation in qsort.c, pg_qsort, but haven't gone one step further and considered inlining effects. I think we provided the qsort

Re: [HACKERS] Double sorting split patch

2011-09-16 Thread Heikki Linnakangas
On 15.09.2011 21:46, Alexander Korotkov wrote: On Thu, Sep 15, 2011 at 7:27 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: I've looked at the patch, and took a brief look at the paper - but I still don't understand the algorithm. I just can't get my head around the concepts

Re: [HACKERS] Double sorting split patch

2011-09-15 Thread Heikki Linnakangas
On 11.09.2011 22:30, Alexander Korotkov wrote: Hackers, I've got my patch with double sorting picksplit impementation for GiST into more acceptable form. A little of testing is below. Index creation time is slightly higher, but search is much faster. The testing datasets were following: 1)

Re: [HACKERS] Double sorting split patch

2011-09-15 Thread Alexander Korotkov
On Thu, Sep 15, 2011 at 7:27 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: I've looked at the patch, and took a brief look at the paper - but I still don't understand the algorithm. I just can't get my head around the concepts of split pairs and left/right groups. Can you

[HACKERS] Double sorting split patch

2011-09-11 Thread Alexander Korotkov
Hackers, I've got my patch with double sorting picksplit impementation for GiST into more acceptable form. A little of testing is below. Index creation time is slightly higher, but search is much faster. The testing datasets were following: 1) uniform dataset - 10M rows 2) geonames points - 7.6M