Re: [HACKERS] [PERFORM] Bad n_distinct estimation; hacks suggested?

Josh Berkus Tue, 03 May 2005 14:48:31 -0700

Mischa,

> Okay, although given the track record of page-based sampling for
> n-distinct, it's a bit like looking for your keys under the streetlight,
> rather than in the alley where you dropped them :-)


Bad analogy, but funny.

The issue with page-based vs. pure random sampling is that sampling, for
example, 10% of rows purely at random would actually mean loading 50% of
pages.  With 20% of rows, you might as well scan the whole table.
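
A rough sketch of the arithmetic behind those figures, assuming each row is
picked independently and rows are spread evenly across pages.  The function
name and the rows-per-page values are illustrative assumptions, not anything
taken from the ANALYZE code: a page has to be read if at least one of its
rows is picked, so the expected share of pages read is 1 - (1 - f)^k for row
fraction f and k rows per page.

```python
def expected_page_fraction(row_fraction: float, rows_per_page: int) -> float:
    """Expected fraction of pages read when every row is sampled
    independently with probability row_fraction.

    A page is skipped only if none of its rows_per_page rows is chosen,
    which happens with probability (1 - row_fraction) ** rows_per_page.
    """
    return 1.0 - (1.0 - row_fraction) ** rows_per_page


if __name__ == "__main__":
    # Hypothetical page densities; real tables vary with row width.
    for rows_per_page in (7, 20, 50):
        for row_fraction in (0.10, 0.20):
            pages = expected_page_fraction(row_fraction, rows_per_page)
            print(f"{rows_per_page:3d} rows/page, "
                  f"{row_fraction:.0%} of rows -> {pages:.1%} of pages")
```

Under these assumptions, about 7 rows per page is where a 10% row sample
touches roughly half the pages; denser pages push the figure toward a full
scan even faster.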

Unless, of course, we use indexes for sampling, which seems like a *really 
good* idea to me ....

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

