Re: [HACKERS] Group-count estimation statistics

2005-02-01 Thread Manfred Koizar
On Mon, 31 Jan 2005 14:40:08 -0500, Tom Lane [EMAIL PROTECTED] wrote: Manfred Koizar [EMAIL PROTECTED] writes: That's not what I meant. I tried to say that if we have a GROUP BY several columns and one of these columns alone has more than N/10 distinct values, there's no way to get less than

Re: [HACKERS] Group-count estimation statistics

2005-02-01 Thread Tom Lane
Manfred Koizar [EMAIL PROTECTED] writes: On Mon, 31 Jan 2005 14:40:08 -0500, Tom Lane [EMAIL PROTECTED] wrote: Oh, I see, you want a max calculation in there too. Seems reasonable. Any objections? Yes. :-( What I said is only true in the absence of any WHERE clause (or join). Otherwise

Re: [HACKERS] Group-count estimation statistics

2005-01-31 Thread Manfred Koizar
On Fri, 28 Jan 2005 10:53:33 -0500, Tom Lane [EMAIL PROTECTED] wrote: we should consider something like clamp to size of table / 10 instead. ... unless a *single* grouping column is estimated to have more than N/10 distinct values, which should be easy to check. Servus Manfred

Re: [HACKERS] Group-count estimation statistics

2005-01-31 Thread Tom Lane
Manfred Koizar [EMAIL PROTECTED] writes: On Fri, 28 Jan 2005 10:53:33 -0500, Tom Lane [EMAIL PROTECTED] wrote: we should consider something like clamp to size of table / 10 instead. ... unless a *single* grouping column is estimated to have more than N/10 distinct values, which should be

Re: [HACKERS] Group-count estimation statistics

2005-01-31 Thread Manfred Koizar
On Mon, 31 Jan 2005 11:20:31 -0500, Tom Lane [EMAIL PROTECTED] wrote: Already done that way. if (relvarcount 1) clamp *= 0.1; That's not what I meant. I tried to say that if we have a GROUP BY several columns and one of these columns alone has more than N/10 distinct

Re: [HACKERS] Group-count estimation statistics

2005-01-31 Thread Tom Lane
Manfred Koizar [EMAIL PROTECTED] writes: That's not what I meant. I tried to say that if we have a GROUP BY several columns and one of these columns alone has more than N/10 distinct values, there's no way to get less than that many groups. Oh, I see, you want a max calculation in there too.

Re: [HACKERS] Group-count estimation statistics

2005-01-29 Thread Mischa
From: Sailesh Krishnamurthy [EMAIL PROTECTED] Tom == Tom Lane [EMAIL PROTECTED] writes: Tom The only real solution, of course, is to acquire cross-column Tom statistics, but I don't see that happening in the near Tom future. Another approach is a hybrid hashing scheme where

Re: [HACKERS] Group-count estimation statistics

2005-01-28 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes: So why is it any more reasonable for Postgres to assume 0 correlation than any other value. Perhaps Postgres should calculate these cases assuming some arbitrary level of correlation. [ shrug... ] Sure, if you want to do the legwork to develop something

Re: [HACKERS] Group-count estimation statistics

2005-01-28 Thread Sailesh Krishnamurthy
Tom == Tom Lane [EMAIL PROTECTED] writes: Tom The only real solution, of course, is to acquire cross-column Tom statistics, but I don't see that happening in the near Tom future. Another approach is a hybrid hashing scheme where we use a hash table until we run out of memory at

Re: [HACKERS] Group-count estimation statistics

2005-01-28 Thread Greg Stark
Tom Lane [EMAIL PROTECTED] writes: Greg Stark's thought about a power correction seemed interesting too, though again far too optimistic to trust without some good math to back it up. Fwiw, I'm pretty sure good math is not going to back up my off-the-cuff algorithm. But I did like the answer