Re: [HACKERS] Maximum statistics target

2008-03-20 Thread Decibel!
On Mar 10, 2008, at 1:26 PM, Peter Eisentraut wrote: On Monday, 10 March 2008, Gregory Stark wrote: It's not possible to believe that you'd not notice O(N^2) behavior for N approaching 80 ;-). Perhaps your join columns were unique keys, and thus didn't have any most-common-values?
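
For readers following the O(N^2) remark, here is a minimal sketch of why matching two most-common-value (MCV) lists can cost quadratic time. It is an illustration under stated assumptions, not PostgreSQL's actual code: the function name match_mcv_lists and the (value, frequency) list layout are hypothetical, and the real planner logic differs in detail.

```python
def match_mcv_lists(mcv_left, mcv_right):
    """Naively match two MCV lists of (value, frequency) pairs.

    Hypothetical illustration only. Returns a tuple of
    (summed frequency product of matching values, comparisons made).
    """
    matched_right = [False] * len(mcv_right)
    selectivity = 0.0
    comparisons = 0
    for lval, lfreq in mcv_left:
        for j, (rval, rfreq) in enumerate(mcv_right):
            if matched_right[j]:
                continue  # already paired with an earlier left-hand value
            comparisons += 1
            if lval == rval:
                selectivity += lfreq * rfreq
                matched_right[j] = True
                break  # each value appears at most once per MCV list
    return selectivity, comparisons


if __name__ == "__main__":
    n = 1000
    left = [(i, 1.0 / (2 * n)) for i in range(n)]
    # Disjoint values: no pair ever matches, so every pair gets compared.
    right = [(i + n, 1.0 / (2 * n)) for i in range(n)]
    sel, count = match_mcv_lists(left, right)
    print(f"selectivity={sel}, comparisons={count}")
```

When the two lists share no values, every left entry is compared against every right entry, so two lists of N entries need N * N comparisons. A column with no MCV list at all (as suggested for unique keys) skips this loop entirely, which would be consistent with not noticing any slowdown.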

Re: [HACKERS] Maximum statistics target

2008-03-20 Thread Kenneth Marshall
On Thu, Mar 20, 2008 at 11:17:10AM -0500, Decibel! wrote: On Mar 10, 2008, at 1:26 PM, Peter Eisentraut wrote: At some point I think it makes a lot more sense to just have VACUUM gather stats as it goes, rather than have ANALYZE generate a bunch of random IO. BTW, when it comes to the case

Re: [HACKERS] Maximum statistics target

2008-03-10 Thread Peter Eisentraut
On Friday, 7 March 2008, Tom Lane wrote: I'm not wedded to the number 1000 in particular --- obviously that's just a round number. But it would be good to see some performance tests with larger settings before deciding that we don't need a limit. Well, I'm not saying we should raise the

Re: [HACKERS] Maximum statistics target

2008-03-10 Thread Cédric Villemain
On Monday, 10 March 2008, Peter Eisentraut wrote: On Friday, 7 March 2008, Tom Lane wrote: I'm not wedded to the number 1000 in particular --- obviously that's just a round number. But it would be good to see some performance tests with larger settings before deciding that we don't

Re: [HACKERS] Maximum statistics target

2008-03-10 Thread Guillaume Smet
On Mon, Mar 10, 2008 at 11:36 AM, Peter Eisentraut wrote: The time to analyze is also quite constant, just before you run out of memory. :) The MaxAllocSize is the limiting factor in all this. In my example, statistics targets larger than about 80 created pg_statistic

Re: [HACKERS] Maximum statistics target

2008-03-10 Thread Tom Lane
Peter Eisentraut writes: On Friday, 7 March 2008, Tom Lane wrote: IIRC, eqjoinsel is one of the weak spots, so tests involving planning of joins between two tables with large MCV lists would be a good place to start. I have run tests with joining two and three tables with

Re: [HACKERS] Maximum statistics target

2008-03-10 Thread Gregory Stark
Tom Lane writes: Peter Eisentraut writes: On Friday, 7 March 2008, Tom Lane wrote: IIRC, eqjoinsel is one of the weak spots, so tests involving planning of joins between two tables with large MCV lists would be a good place to start. I have run tests

Re: [HACKERS] Maximum statistics target

2008-03-10 Thread Peter Eisentraut
On Monday, 10 March 2008, Gregory Stark wrote: It's not possible to believe that you'd not notice O(N^2) behavior for N approaching 80 ;-). Perhaps your join columns were unique keys, and thus didn't have any most-common-values? We could remove the hard limit on statistics target and

Re: [HACKERS] Maximum statistics target

2008-03-10 Thread Stephen Denne
We could remove the hard limit on statistics target and impose the limit instead on the actual size of the arrays. Ie, allow people to specify larger sample sizes and discard unreasonably large excess data (possibly warning them when that happens). That would remove the screw case the

Re: [HACKERS] Maximum statistics target

2008-03-09 Thread Stephen Denne
Tom Lane wrote: Martijn van Oosterhout writes: On Fri, Mar 07, 2008 at 07:25:25PM +0100, Peter Eisentraut wrote: What's the problem with setting it to ten million if I have ten million values in the table and I am prepared to spend the resources to maintain those

[HACKERS] Maximum statistics target

2008-03-07 Thread Peter Eisentraut
Related to the concurrent discussion about selectivity estimations ... What is the reason the statistics target is limited to 1000? I've seen more than one case where increasing the statistics target to 1000 improved results and one would have wanted to increase it further. What's the problem

Re: [HACKERS] Maximum statistics target

2008-03-07 Thread Martijn van Oosterhout
On Fri, Mar 07, 2008 at 07:25:25PM +0100, Peter Eisentraut wrote: What's the problem with setting it to ten million if I have ten million values in the table and I am prepared to spend the resources to maintain those statistics? That it'll probably take 10 million seconds to calculate the

Re: [HACKERS] Maximum statistics target

2008-03-07 Thread Tom Lane
Martijn van Oosterhout writes: On Fri, Mar 07, 2008 at 07:25:25PM +0100, Peter Eisentraut wrote: What's the problem with setting it to ten million if I have ten million values in the table and I am prepared to spend the resources to maintain those statistics? That it'll