Jeff Davis wrote: > On Mon, 2008-01-28 at 23:13 +0000, Heikki Linnakangas wrote: > >> "clusteredness" didn't get screwed up by a table that looks like this: >> "5 6 7 8 9 1 2 3 4" >> > ...test table with a similar > distribution to your example, and it shows a correlation of about -0.5, > but it should be as good as something near -1 or +1. > > I am not a statistics expert, but it seems like a better measurement > would be: "what is the chance that, if the tuples are close together in > index order, the corresponding heap tuples are close together?". > Same applies for data clustered by zip-code.
All rows for any State or City or County or SchoolZone are "close together" on the same pages; yet postgres's stats think they're totally unclustered. > The answer to that question in your example is "very likely", so there > would be no problem. > Is there a reason we don't do this? > I've been tempted to do things like update pg_statistic set stanumbers3='{1.0}' where starelid=2617 and staattnum=7; after every analyze when I have data like this from tables clustered by zip. Seems it'd help more plans than it hurts, but haven't been brave enough to try in production. ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster