Re: [HACKERS] Statistics and selectivity estimation for ranges

2013-03-14 Thread Heikki Linnakangas
On 01.03.2013 16:22, Alexander Korotkov wrote: On Wed, Mar 13, 2013 at 11:10 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 01.03.2013 16:22, Alexander Korotkov wrote: frac = area / (length2 - length1); you can get NaN result. I've especially adjusted the code to get more of

Re: [HACKERS] Statistics and selectivity estimation for ranges

2013-03-13 Thread Alexander Korotkov
Hi! Thanks for your work on this patch! On Tue, Mar 12, 2013 at 8:03 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: So, after some hacking, I ended up with this version. I find it more readable, I hope I didn't miss anything. This seems to produce results that are close, but not

Re: [HACKERS] Statistics and selectivity estimation for ranges

2013-03-13 Thread Heikki Linnakangas
On 01.03.2013 16:22, Alexander Korotkov wrote: On Tue, Mar 12, 2013 at 8:03 PM, Heikki Linnakangashlinnakan...@vmware.com wrote: So, after some hacking, I ended up with this version. I find it more readable, I hope I didn't miss anything. This seems to produce results that are close, but not

Re: [HACKERS] Statistics and selectivity estimation for ranges

2013-03-13 Thread Alexander Korotkov
On Wed, Mar 13, 2013 at 11:10 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 01.03.2013 16:22, Alexander Korotkov wrote: On Tue, Mar 12, 2013 at 8:03 PM, Heikki Linnakangashlinnakangas@** vmware.com hlinnakan...@vmware.com wrote: So, after some hacking, I ended up with this

Re: [HACKERS] Statistics and selectivity estimation for ranges

2013-03-12 Thread Heikki Linnakangas
On 01.03.2013 16:22, Alexander Korotkov wrote: These changes were made in attached patch. Thanks. I've been staring at this code for a very long time now, trying to understand how the math in calc_hist_selectivity_contained works. I think I understand it now, but it probably needs a lot

Re: [HACKERS] Statistics and selectivity estimation for ranges

2013-03-12 Thread Heikki Linnakangas
On 01.03.2013 16:22, Alexander Korotkov wrote: I've been staring at this code for a very long time now, trying to understand how the math in calc_hist_selectivity_contained works. I think I understand it now, but it probably needs a lot more comments and perhaps some refactoring, so that the

Re: [HACKERS] Statistics and selectivity estimation for ranges

2013-03-01 Thread Alexander Korotkov
On Wed, Feb 13, 2013 at 5:55 PM, Alexander Korotkov aekorot...@gmail.comwrote: On Wed, Feb 13, 2013 at 5:28 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 04.01.2013 10:42, Alexander Korotkov wrote: /* * Calculate selectivity of operator using histograms of range lower bounds

Re: [HACKERS] Statistics and selectivity estimation for ranges

2013-02-13 Thread Heikki Linnakangas
On 04.01.2013 10:42, Alexander Korotkov wrote: /* * Calculate selectivity of operator using histograms of range lower bounds * and histogram of range lengths. */ static double calc_hist_selectivity_overlap(TypeCacheEntry *typcache, RangeBound *lower,

Re: [HACKERS] Statistics and selectivity estimation for ranges

2013-02-13 Thread Alexander Korotkov
On Wed, Feb 13, 2013 at 5:28 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 04.01.2013 10:42, Alexander Korotkov wrote: /* * Calculate selectivity of operator using histograms of range lower bounds * and histogram of range lengths. */ static double

Re: [HACKERS] Statistics and selectivity estimation for ranges

2013-01-04 Thread Alexander Korotkov
On Mon, Dec 10, 2012 at 11:21 PM, Jeff Davis pg...@j-davis.com wrote: And I have a few other questions/comments: * Why is summ spelled with two ms? Is it short for summation? If so, might be good to use summation of instead of integrate in the comment. Fixed. * Why does

Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-12-13 Thread Alexander Korotkov
Hi, Jeff! Thanks a lot for review! On Mon, Dec 10, 2012 at 11:21 PM, Jeff Davis pg...@j-davis.com wrote: It looks like there are still some problems with this patch. CREATE TABLE foo(ir int4range); insert into foo select 'empty' from generate_series(1,1); insert into foo select

Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-12-10 Thread Jeff Davis
It looks like there are still some problems with this patch. CREATE TABLE foo(ir int4range); insert into foo select 'empty' from generate_series(1,1); insert into foo select int4range(NULL, g, '(]') from generate_series(1,100) g; insert into foo select int4range(g, NULL, '[)')

Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-12-08 Thread Andres Freund
On 2012-11-05 22:10:50 -0800, Jeff Davis wrote: On Mon, 2012-11-05 at 11:12 -0300, Alvaro Herrera wrote: What's going on with this patch? I haven't seen any activity in a while. Should I just move this to the next commitfest? Sorry, I dropped the ball here. I will still review it, whether

Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-11-05 Thread Alvaro Herrera
What's going on with this patch? I haven't seen any activity in a while. Should I just move this to the next commitfest? -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list

Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-11-05 Thread Jeff Davis
On Mon, 2012-11-05 at 11:12 -0300, Alvaro Herrera wrote: What's going on with this patch? I haven't seen any activity in a while. Should I just move this to the next commitfest? Sorry, I dropped the ball here. I will still review it, whether it makes this commitfest or not. Regards,

Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-10-18 Thread Alvaro Herrera
Heikki, would you be able to give this patch a look and perhaps commit it? -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your

Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-09-30 Thread Jeff Davis
On Tue, 2012-09-04 at 17:27 +0400, Alexander Korotkov wrote: Addon patch is attached. Actually, I don't get your intention of introducing STATISTIC_KIND_RANGE_EMPTY_FRAC stakind. Did you plan to leave it as empty frac in distinct stakind or replace this stakind with

Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-09-04 Thread Alexander Korotkov
On Mon, Aug 27, 2012 at 5:00 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 24.08.2012 18:51, Heikki Linnakangas wrote: On 20.08.2012 00:31, Alexander Korotkov wrote: New version of patch. * Collect new stakind STATISTIC_KIND_BOUNDS_**HISTOGRAM, which is lower and

Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-08-27 Thread Heikki Linnakangas
On 24.08.2012 18:51, Heikki Linnakangas wrote: On 20.08.2012 00:31, Alexander Korotkov wrote: New version of patch. * Collect new stakind STATISTIC_KIND_BOUNDS_HISTOGRAM, which is lower and upper bounds histograms combined into single ranges array, instead of STATISTIC_KIND_HISTOGRAM. One

Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-08-27 Thread Alexander Korotkov
On Mon, Aug 27, 2012 at 5:00 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 24.08.2012 18:51, Heikki Linnakangas wrote: On 20.08.2012 00:31, Alexander Korotkov wrote: New version of patch. * Collect new stakind STATISTIC_KIND_BOUNDS_**HISTOGRAM, which is lower and

Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-08-24 Thread Heikki Linnakangas
On 20.08.2012 00:31, Alexander Korotkov wrote: On Thu, Aug 16, 2012 at 4:40 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 15.08.2012 11:34, Alexander Korotkov wrote: Ok, we've to decide if we need standard histogram. In some cases it can be used for more accurate

Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-08-21 Thread Heikki Linnakangas
On 20.08.2012 00:31, Alexander Korotkov wrote: On Thu, Aug 16, 2012 at 4:40 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 15.08.2012 11:34, Alexander Korotkov wrote: Ok, we've to decide if we need standard histogram. In some cases it can be used for more accurate

Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-08-19 Thread Alexander Korotkov
On Thu, Aug 16, 2012 at 4:40 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 15.08.2012 11:34, Alexander Korotkov wrote: Ok, we've to decide if we need standard histogram. In some cases it can be used for more accurate estimation of and operators. But I think it is

Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-08-16 Thread Heikki Linnakangas
On 15.08.2012 11:34, Alexander Korotkov wrote: On Wed, Aug 15, 2012 at 12:14 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Histogram of upper bounds would be both more accurate and natural for some operators. However, it requires collecting additional statistics while

Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-08-16 Thread Heikki Linnakangas
On 15.08.2012 11:34, Alexander Korotkov wrote: On Wed, Aug 15, 2012 at 12:14 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Histogram of upper bounds would be both more accurate and natural for some operators. However, it requires collecting additional statistics while

Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-08-15 Thread Alexander Korotkov
On Tue, Aug 14, 2012 at 7:46 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 14.08.2012 09:45, Alexander Korotkov wrote: After fixing few more bugs, I've a version with much more reasonable accuracy. Great! One little thing just occurred to me: You're relying on the

Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-08-15 Thread Heikki Linnakangas
On 15.08.2012 10:38, Alexander Korotkov wrote: On Tue, Aug 14, 2012 at 7:46 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: It would be quite easy to provide reasonable estimates for those operators, if we had a separate histogram of upper bounds. I also note that the

Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-08-15 Thread Alexander Korotkov
On Wed, Aug 15, 2012 at 12:14 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Histogram of upper bounds would be both more accurate and natural for some operators. However, it requires collecting additional statistics while AFAICS it doesn't liberate us from having

Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-08-15 Thread Tom Lane
Alexander Korotkov aekorot...@gmail.com writes: Ok, we've to decide if we need standard histogram. In some cases it can be used for more accurate estimation of and operators. But I think it is not so important. So, we can replace standard histogram with histograms of lower and upper bounds?

Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-08-14 Thread Alexander Korotkov
On Mon, Aug 13, 2012 at 1:11 AM, Alexander Korotkov aekorot...@gmail.comwrote: On Thu, Aug 9, 2012 at 12:44 AM, Alexander Korotkov aekorot...@gmail.comwrote: My conclusion is so, that current errors are probably ok for selectivity estimation. But taking into attention that generated

Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-08-14 Thread Heikki Linnakangas
On 14.08.2012 09:45, Alexander Korotkov wrote: After fixing few more bugs, I've a version with much more reasonable accuracy. Great! One little thing just occurred to me: You're relying on the regular scalar selectivity estimators for the , , and operators. That seems bogus, in particular

Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-08-12 Thread Alexander Korotkov
On Thu, Aug 9, 2012 at 12:44 AM, Alexander Korotkov aekorot...@gmail.comwrote: My conclusion is so, that current errors are probably ok for selectivity estimation. But taking into attention that generated datasets ideally fits assumptions of estimation, there could be room for improvement.

Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-08-09 Thread Alexander Korotkov
New revision of patch with two fixes: 1) Check if histogram bin width is zero in get_position. 2) Check statsTuple is valid tuple in rangecontsel. -- With best regards, Alexander Korotkov. range_stat-0.3.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list

Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-08-08 Thread Matthias
Having statistics on ranges was really missing! The planner was doing some really, really bad choices on bigger tables regarding seq/random scans, nested loop/other joins etc. Is there any chance this makes it into 9.2 final? It would really round-off the introduction of range types and maybe

Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-08-06 Thread Heikki Linnakangas
On 04.08.2012 12:31, Alexander Korotkov wrote: Hackers, attached patch is for collecting statistics and selectivity estimation for ranges. In order to make our estimations accurate for every distribution of ranges, we would collect 2d-distribution of lower and upper bounds of range into some

Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-08-06 Thread Alexander Korotkov
On Mon, Aug 6, 2012 at 6:09 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 04.08.2012 12:31, Alexander Korotkov wrote: Hackers, attached patch is for collecting statistics and selectivity estimation for ranges. In order to make our estimations accurate for every