Jim Cox wrote:
On Mon, Oct 13, 2008 at 8:30 AM, Tom Lane <[EMAIL PROTECTED]> wrote:
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
No, I was thinking of something along the lines of:
INFO:  clustering "public.my_c"
INFO:  complete, was 33%, now 100% clustered
The only such measure that we have is the correlation, which isn't very
good anyway, so I'm not sure if that's worthwhile.
It'd be possible to count the number of order reversals during the
indexscan, ie the number of tuples with CTID lower than the previous
one's. But I'm not sure how useful that number really is.

It will look bad for patterns like:
2
1
4
3
6
5
..

which for all practical purposes is just as good as a perfectly sorted table. So no, I don't think that's a very useful metric either without somehow taking caching effects into account.
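To make the objection concrete, here is a minimal sketch of the order-reversal count applied to that pattern. It assumes the heap positions seen during the index scan can be modelled as a plain list of integers; the function name `count_order_reversals` is hypothetical and nothing here is taken from the patch.

```python
def count_order_reversals(positions):
    """Count entries that are lower than the entry immediately before them.

    `positions` stands in for the heap positions (CTIDs) visited in
    index order; modelling them as plain integers is an assumption.
    """
    return sum(1 for prev, cur in zip(positions, positions[1:]) if cur < prev)


swapped_pairs = [2, 1, 4, 3, 6, 5]
perfectly_sorted = [1, 2, 3, 4, 5, 6]

print(count_order_reversals(swapped_pairs))     # 3 reversals in 5 steps
print(count_order_reversals(perfectly_sorted))  # 0
```

More than half of the steps in the swapped-pairs list register as reversals, even though every tuple sits right next to where it belongs.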

Another version of the patch should be attached, this time counting the
number of "inversions" (pairs of tuples in the table that are in the wrong
order) as a measure of the "sortedness" of the original data (scanned/live
numbers still reported as an indication of the extent to which the table was
vacuumed).
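For reference, a rough sketch of what counting inversions means, using the definition given above (pairs that are in the wrong order). This is only an illustration with a merge-sort based count; how the attached patch actually computes the number is not shown here, and the name `count_inversions` is hypothetical.

```python
def count_inversions(values):
    """Return the number of pairs (i, j) with i < j and values[i] > values[j]."""

    def sort_and_count(seq):
        # Returns (sorted copy of seq, number of inversions inside seq).
        if len(seq) <= 1:
            return list(seq), 0
        mid = len(seq) // 2
        left, left_inv = sort_and_count(seq[:mid])
        right, right_inv = sort_and_count(seq[mid:])

        merged = []
        cross_inv = 0
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                # right[j] is smaller than every remaining element of left,
                # so it forms one inversion with each of them.
                merged.append(right[j])
                j += 1
                cross_inv += len(left) - i
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged, left_inv + right_inv + cross_inv

    return sort_and_count(values)[1]


print(count_inversions([2, 1, 4, 3, 6, 5]))  # 3 of 15 possible pairs
print(count_inversions([6, 5, 4, 3, 2, 1]))  # 15, fully reversed
```

Like the reversal count, this says nothing about how far apart the out-of-order tuples are in the heap, so caching effects are still not captured.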

Until we have a better metric for "sortedness", my earlier suggestion to print it was probably a bad idea. If anything, we should probably print the same correlation metric that ANALYZE calculates, so that it would at least match what the planner uses for decision-making.
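As a rough illustration of a correlation-style measure, the sketch below computes a plain Pearson correlation between each tuple's physical position and its position in sort order. It is not ANALYZE's code: ANALYZE works on a sample of rows, and the function name `order_correlation` and the list-of-integers input are assumptions made for the example.

```python
import math


def order_correlation(positions):
    """Pearson correlation between index order (1..n) and physical position.

    `positions[k]` is the physical position of the tuple that comes
    (k+1)-th in sort order. Returns a value in [-1, 1]; 1 means the
    physical order matches the sort order exactly.
    """
    n = len(positions)
    if n < 2:
        raise ValueError("need at least two tuples")
    ranks = range(1, n + 1)
    mean_rank = sum(ranks) / n
    mean_pos = sum(positions) / n
    cov = sum((r - mean_rank) * (p - mean_pos) for r, p in zip(ranks, positions))
    var_rank = sum((r - mean_rank) ** 2 for r in ranks)
    var_pos = sum((p - mean_pos) ** 2 for p in positions)
    if var_rank == 0 or var_pos == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_rank * var_pos)


print(order_correlation([1, 2, 3, 4, 5, 6]))  # 1.0
print(order_correlation([2, 1, 4, 3, 6, 5]))  # about 0.83
```

For the swapped-pairs pattern this gives a high value, which is closer to the intuition that the table is nearly as good as sorted, though it still ignores caching.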

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
