After much code reading, testing, and using the extremely handy pageinspect
contrib module to look at pages, here's what I believe is happening. I am not
attempting to describe every possible scenario, only this one test path.
Following my short test scenario above:

- Inserted rows get line pointers with status 1 (LP_NORMAL)

- When I do the 100-row delete, those rows stay at status 1, but get their
t_xmax set, indicating they were deleted.

- When I do the next 200-row delete, those rows also get their t_xmax set.

- As a side effect of the deletion, the "hot prune" feature kicks in. The 2
pages in question were processed by a call to heap_page_prune, which set the
line pointers of the previously deleted 100 rows to 3 (LP_DEAD). Now I have
100 LP_DEAD and 200 LP_NORMAL with xmax set.

- When I do the vacuum, all 300 are "vacuumed" -- line pointer set to status
0 (LP_UNUSED).

- Because my table has no indexes, lazy_scan_heap calls lazy_vacuum_page
directly for each block, and reports the variable tups_vacuumed ("removed
200 row versions in 2 pages"). However, tups_vacuumed is computed without
counting the 100 LP_DEAD tuples because, per the code comment, counting them
would be double-counting. Perhaps the output should say something
like:

        removed 300 row versions (200 were recently deleted and 100 were
        previously deleted).

Whatever the phrasing, I don't know why 200 is the most significant number
here, and 300 isn't mentioned at all.

- If my table did have indexes, as in the original test case of this thread,
then the "removed xxx row versions in yyy pages" message comes from
lazy_vacuum_heap instead. However, instead of using tups_vacuumed, this
code reports the number of tuples actually set to status 0 (LP_UNUSED).
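
To make the two counts concrete, here is a toy Python model of the sequence
described in the list above. It is only a sketch, not PostgreSQL code: the
Item class, the helper functions, the transaction ids, and the assumption
that exactly 300 rows are involved are all made up for illustration, and the
pruning step ignores the real visibility checks. Only the line pointer
status values (0, 1, 3) and the name tups_vacuumed come from the discussion.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List, Optional, Tuple


class LpFlag(IntEnum):
    LP_UNUSED = 0
    LP_NORMAL = 1
    LP_REDIRECT = 2
    LP_DEAD = 3


@dataclass
class Item:
    """Hypothetical stand-in for one line pointer plus its tuple header."""
    flag: LpFlag = LpFlag.LP_NORMAL
    xmax: Optional[int] = None  # set when a delete marks the row


def delete(items: List[Item], xid: int) -> None:
    # A delete leaves the line pointer LP_NORMAL and only sets xmax.
    for it in items:
        it.xmax = xid


def prune(items: List[Item]) -> None:
    # Toy pruning: every already-deleted LP_NORMAL item becomes LP_DEAD.
    # (Real pruning also checks that the deleter is visible to everyone.)
    for it in items:
        if it.flag == LpFlag.LP_NORMAL and it.xmax is not None:
            it.flag = LpFlag.LP_DEAD


def vacuum(items: List[Item]) -> Tuple[int, int]:
    """Return (tups_vacuumed, pointers actually set to LP_UNUSED)."""
    tups_vacuumed = 0  # skips items that were already LP_DEAD
    set_unused = 0     # counts every pointer that ends up LP_UNUSED
    for it in items:
        if it.flag == LpFlag.LP_DEAD:
            it.flag = LpFlag.LP_UNUSED
            set_unused += 1
        elif it.flag == LpFlag.LP_NORMAL and it.xmax is not None:
            it.flag = LpFlag.LP_UNUSED
            tups_vacuumed += 1
            set_unused += 1
    return tups_vacuumed, set_unused


def run_scenario() -> Tuple[int, int]:
    rows = [Item() for _ in range(300)]  # assumed: only the 300 deleted rows
    delete(rows[:100], xid=1001)         # first delete: 100 rows
    prune(rows)                          # pruning triggered by the next delete
    delete(rows[100:], xid=1002)         # second delete: 200 rows
    return vacuum(rows)


if __name__ == "__main__":
    print(run_scenario())
```

Under these assumptions run_scenario() returns (200, 300): the first number
matches what the no-index path reports, the second matches the number of
line pointers actually set to status 0.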

I would like to respectfully suggest that the vacuum output line "removed xx
row versions in yy pages" should show the same counts regardless of whether
or not there's an index on the table. I would suggest that the value
reported by lazy_vacuum_heap is correct, and is what I would expect to see. 
I think it would be fine if it also reports the breakdown of LP_DEAD vs
LP_NORMAL tuples vacuumed, if that is deemed useful.
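
As a rough illustration of that suggestion, the sketch below builds one
message from both counts, so the total would be the same with or without an
index. The function name and the exact wording of the breakdown are
hypothetical; this is not what VACUUM prints today.

```python
def removed_message(recently_deleted: int, previously_dead: int, pages: int) -> str:
    """Format a hypothetical unified 'removed ...' line with a breakdown."""
    total = recently_deleted + previously_dead  # every pointer set to LP_UNUSED
    return (
        f"removed {total} row versions in {pages} pages "
        f"({recently_deleted} were recently deleted and "
        f"{previously_dead} were previously deleted)"
    )


if __name__ == "__main__":
    print(removed_message(200, 100, 2))
```

For the test above, removed_message(200, 100, 2) would lead with the 300
total and still show the 200/100 split.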

Regarding the output of pgstattuple: it calls HeapTupleSatisfiesVisibility,
which appears to simply return true for these hot-pruned LP_DEAD tuples,
skewing the counts accordingly.

Does that make sense?

-- 
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Surprising-dead-tuple-count-from-pgstattuple-tp2266955p2471232.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
