Hi. I have a table called "event" with a field event_date_time that is indexed. There are 1,700,000 rows in the table and 92,000 distinct values of event_date_time, with anywhere from 1 to 400 rows sharing the same value. (I did a count grouped by event_date_time and scanned it to get this info.)
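For reference, the grouped count described above can be summarized in one query like this (table and column names as in this post; the subquery alias is just for illustration):

```sql
-- Count rows per distinct event_date_time, then summarize:
-- how many distinct values, and the smallest/largest group size.
SELECT count(*) AS distinct_values,
       min(cnt) AS min_rows_per_value,
       max(cnt) AS max_rows_per_value
FROM (SELECT event_date_time, count(*) AS cnt
      FROM event
      GROUP BY event_date_time) AS per_value;
```

On the data described here this should report 92,000 distinct values with group sizes ranging from 1 to 400.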
When I look at pg_stats on this table, I always see 15,000 or lower in the n_distinct column for event_date_time. (I re-ran analyze several times and then checked pg_stats to see whether the numbers varied significantly.) Since this is off by about a factor of 6, I think the planner is missing the chance to use this table as the "driver" in a complex query plan that I'm trying to optimize.

So the question is: how can I get a better estimate of n_distinct from analyze? If I raise the stats target as high as it will go, I get closer, but it still shows the index to be about half as selective as it actually is:

  alpha=# alter table event alter column event_date_time set statistics 1000;
  ALTER TABLE
  alpha=# analyze event;
  ANALYZE
  alpha=# select n_distinct from pg_stats
  alpha-#   where tablename='event' and attname='event_date_time';
   n_distinct
  ------------
        51741
  (1 row)

This number is consistently around 51,000 if I re-run analyze a few times.

I guess my question is two-part:

(1) Is there any tweak to make this estimate work better?
(2) Since I'm getting numbers that are consistent but way off, is there a bug here?
(2-1/2) Or alternately, am I totally missing what n_distinct is supposed to denote?

Thanks!

-Nick

---------------------------------------------------------------------
Nick Fankhauser  [EMAIL PROTECTED]
Phone 1.765.965.7363  Fax 1.765.962.9788
doxpop - Court records at your fingertips - http://www.doxpop.com/