On 11/10/14, 7:52 PM, Tom Lane wrote:
> On the whole, I'm +1 for just logging the events and seeing what we learn
> that way.  That seems like an appropriate amount of effort for finding out
> whether there is really an issue.

Attached is a patch that does this.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
From a8e824900d7c68e2c242b28c9c06c854f01b770a Mon Sep 17 00:00:00 2001
From: Jim Nasby <jim.na...@bluetreble.com>
Date: Sun, 30 Nov 2014 20:43:47 -0600
Subject: [PATCH] Log cleanup lock acquisition failures in vacuum

---

Notes:
    Count how many times we fail to grab the page cleanup lock on the first try,
    logging it with different wording depending on whether scan_all is true.

 doc/src/sgml/ref/vacuum.sgml      | 1 +
 src/backend/commands/vacuumlazy.c | 8 ++++++++
 2 files changed, 9 insertions(+)

diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 450c94f..1272c1c 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -252,6 +252,7 @@ DETAIL:  CPU 0.01s/0.06u sec elapsed 0.07 sec.
 INFO:  "onek": found 3000 removable, 1000 nonremovable tuples in 143 pages
 DETAIL:  0 dead tuples cannot be removed yet.
 There were 0 unused item pointers.
+Could not acquire cleanup lock on 0 pages.
 0 pages are entirely empty.
 CPU 0.07s/0.39u sec elapsed 1.56 sec.
 INFO:  analyzing "public.onek"
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 6db6c5c..8f22ed2 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -105,6 +105,8 @@ typedef struct LVRelStats
        BlockNumber old_rel_pages;      /* previous value of pg_class.relpages */
        BlockNumber rel_pages;          /* total number of pages */
        BlockNumber scanned_pages;      /* number of pages we examined */
+       /* number of pages we could not initially get lock on */
+       BlockNumber     nolock;
        double          scanned_tuples; /* counts only tuples on scanned pages */
        double          old_rel_tuples; /* previous value of pg_class.reltuples */
        double          new_rel_tuples; /* new estimated total # of tuples */
@@ -346,6 +348,7 @@ lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt,
                        ereport(LOG,
                                        (errmsg("automatic vacuum of table \"%s.%s.%s\": index scans: %d\n"
                                                        "pages: %d removed, %d remain\n"
+                                                       "%s cleanup lock on %u pages.\n"
                                                        "tuples: %.0f removed, %.0f remain, %.0f are dead but not yet removable\n"
                                                        "buffer usage: %d hits, %d misses, %d dirtied\n"
                                          "avg read rate: %.3f MB/s, avg write rate: %.3f MB/s\n"
@@ -356,6 +359,7 @@ lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt,
                                                        vacrelstats->num_index_scans,
                                                        vacrelstats->pages_removed,
                                                        vacrelstats->rel_pages,
+                                                       scan_all ? "Waited for" : "Could not acquire", vacrelstats->nolock,
                                                        vacrelstats->tuples_deleted,
                                                        vacrelstats->new_rel_tuples,
                                                        vacrelstats->new_dead_tuples,
@@ -611,6 +615,8 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
                /* We need buffer cleanup lock so that we can prune HOT chains. */
                if (!ConditionalLockBufferForCleanup(buf))
                {
+                       vacrelstats->nolock++;
+
                        /*
                         * If we're not scanning the whole relation to guard against XID
                         * wraparound, it's OK to skip vacuuming a page.  The next vacuum
@@ -1101,10 +1107,12 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
                                        vacrelstats->scanned_pages, nblocks),
                         errdetail("%.0f dead row versions cannot be removed yet.\n"
                                           "There were %.0f unused item pointers.\n"
+                                          "%s cleanup lock on %u pages.\n"
                                           "%u pages are entirely empty.\n"
                                           "%s.",
                                           nkeep,
                                           nunused,
+                                          scan_all ? "Waited for" : "Could not acquire", vacrelstats->nolock,
                                           empty_pages,
                                           pg_rusage_show(&ru0))));
 }
-- 
2.1.2
