On 6/26/07, ITAGAKI Takahiro <[EMAIL PROTECTED]> wrote:
> Hi, I'm testing the HOT patches, applied to CVS HEAD.
Thanks a lot for your tests. I am posting a revised patch on -patches. Please use that for further testing. In the last few days, many people have reviewed the patch including Simon, Heikki, Greg and Korry. I shall post a separate mail summarizing the changes since the last revision.
> - MVCC-safe CLUSTER
>
> When I clustered a table with HOT-updated tuples, I saw the following
> error message. The most recently posted HOT patch does not support
> MVCC-safe CLUSTER.
>
> | ERROR: unexpected HeapTupleSatisfiesVacuum result
Yes, this is a known issue. Heikki had posted a patch to resolve this conflict.

> - Number of unremovable tuples reported by VACUUM VERBOSE
> HOT-updated tuples (HEAPTUPLE_DEAD_CHAIN) are counted as "kept" and
> VACUUM VERBOSE prints them as "cannot be removed yet". However, we can
> actually remove them. We can reuse the data space of HOT-updated tuples,
> but need to keep their item pointers. We had better show them as two
> different messages, for example, unremovable tuples and unreusable item
> pointers.
We cannot remove a HEAPTUPLE_DEAD_CHAIN tuple because, even though it is dead, it might be the only way to reach the live tuple at the end of the chain. Chain pruning logic would ensure that we remove most such tuples before running vacuum on the page, but a few might still be left. We cannot reuse the data space just yet because we would then lose the xmax/xmin check. Also, with several redirecting line pointers, the HOT chain becomes very complex and unmanageable.

There are in fact quite a few scenarios here:

1. A dead tuple which is part of a HOT chain cannot be removed.
2. A dead tuple which is marked LP_DELETE is removed and reported as "removable".
3. A redirect-dead line pointer is removed and reported as "removable".

In case 3, no real tuple is being removed. The tuple might have already been reused or vacuumed, so the report could be slightly misleading.

Another problem with the current reporting is that if the original dead tuple is tracked with a separate lp-deleted line pointer and the original root offset is redirect-dead, then it might be reported twice as "removable": once for the lp-deleted tuple and again for the redirect-dead line pointer.

Maybe we should report the redirect-dead offsets as "removable redirected offsets" and not count them in "removable" tuples?

> - ANALYZE and statistics of dead rows
> Since redirected or redirect-dead item pointers are counted as "dead
> rows", we overestimate the number of dead rows. This confuses the
> statistics and adversely affects autovacuum: if autovacuum runs ANALYZE,
> the number of dead tuples appears to jump suddenly, which triggers an
> unnecessary VACUUM on the next autovacuum run.
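
To put rough numbers on the concern above: autovacuum vacuums a table once its dead-row count exceeds the base threshold plus the scale factor times the number of tuples. The following is only a sketch of that arithmetic, not the actual autovacuum code. The struct, the function names and the idea of tracking redirect-dead line pointers as a separate counter are all hypothetical, and the figures in the note below are made up for illustration.

```c
#include <stdbool.h>

/* Hypothetical per-table counters; not the real pgstat structures. */
typedef struct TableStats
{
    double reltuples;      /* estimated number of tuples in the table */
    double dead_tuples;    /* dead heap tuples still holding data space */
    double redirect_dead;  /* redirect-dead line pointers (assumed to be
                            * tracked separately, which is not the case today) */
} TableStats;

/* Documented trigger level: base threshold + scale factor * tuples. */
double
vacuum_threshold(double base_thresh, double scale_factor, double reltuples)
{
    return base_thresh + scale_factor * reltuples;
}

/* Line pointers lumped in with dead rows, as described above. */
bool
needs_vacuum_lumped(const TableStats *s, double base_thresh,
                    double scale_factor)
{
    double dead = s->dead_tuples + s->redirect_dead;

    return dead > vacuum_threshold(base_thresh, scale_factor, s->reltuples);
}

/* Only real dead tuples counted, for comparison. */
bool
needs_vacuum_tuples_only(const TableStats *s, double base_thresh,
                         double scale_factor)
{
    return s->dead_tuples >
        vacuum_threshold(base_thresh, scale_factor, s->reltuples);
}
```

With, say, 1000 tuples, 100 dead tuples and 200 redirect-dead line pointers, a base threshold of 50 and a scale factor of 0.2, the trigger level is 250. The lumped count of 300 crosses it, while the tuple-only count of 100 does not.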
A redirect-dead line pointer consumes 4 bytes of dead space in a page. If a table is full of redirect-dead line pointers, we should trigger vacuum on the table. Maybe we can maintain separate stats about redirect-dead line pointers and give them lower significance when deciding whether to vacuum or not.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com