Good morning,

for monitoring purposes we use check_pgactivity
(https://github.com/OPMDG/check_pgactivity) to check a number of
different metrics. One of those checks the last analyze timestamp of
either kind, another the last vacuum timestamp of either kind, and
alerts when either of those becomes older than a certain age that can
be specified as a parameter to these checks.

This is the SQL that runs to find these values:

$PG_VERSION_91 => qq{
            SELECT
                pg_is_in_recovery()::int AS is_in_recovery,
                current_database(),
                schemaname,
                a.relname,
                CASE WHEN NOT pg_is_in_recovery() THEN
                  coalesce(max(
                       coalesce(extract(epoch FROM
                         current_timestamp -
                                 greatest(last_${type},
last_auto${type})),
                        'infinity'::float)),
                   'NaN'::float)
                  ELSE 'NaN'::float END,
                CASE WHEN NOT pg_is_in_recovery() THEN
                  coalesce(sum(${type}_count), 0)
                  ELSE 0 END
                  AS ${type}_count,
                CASE WHEN NOT pg_is_in_recovery() THEN
                  coalesce(sum(auto${type}_count), 0)
                  ELSE 0 END
                  AS auto${type}_count,
                CASE WHEN NOT pg_is_in_recovery() THEN
                  (
                    SELECT
md5(tup_inserted::text||tup_updated::text||tup_deleted::text||'$c_limit
$w_limit'
                    )
                    FROM pg_catalog.pg_stat_database
                    WHERE datname = current_database()
                  )
                  ELSE '' END
            FROM pg_stat_all_tables a
            JOIN pg_class b on a.relid = b.oid
            WHERE schemaname NOT LIKE 'pg_temp_%'
              AND schemaname NOT LIKE 'pg_toast_temp_%'
              AND pg_relation_size(relid) >= $table_min_size
              AND schemaname NOT LIKE 'information_schema'
              AND (('${type}' = 'analyze' AND schemaname NOT LIKE
'pg_toast') -- TOAST tables aren't ANALYZEd
                OR  ('${type}' = 'vacuum'))
              AND (('${type}' = 'analyze' AND a.relname NOT LIKE
'pg_statistic') -- pg_statistic never gets ANALYZEd
                OR  ('${type}' = 'vacuum'))
              AND relkind <> 'p' -- partitioned table do not have
last_* information
            GROUP BY schemaname, a.relname
        }

{type} gets replaced by either analyze or vacuum.

What happens *sometimes* on servers is that the age of the oldest
maintenance increases to infinity, as the fields of last_analyze and
last_vacuum are emptied (NULL value).

However, looking into the monitoring system I can see that there must
have been values in these fields before. I can see the values coming
ouf of that check 2 hours ago, the values were not infinity.

Are there any circumstances under which these fields get emptied? From
reading the documentation any value should never be replaced by a NULL
value:

last_vacuum timestamp with time zone: Last time at which this table was
manually vacuumed (not counting VACUUM FULL)

last_autovacuum timestamp with time zone: Last time at which this table
was vacuumed by the autovacuum daemon

last_analyze timestamp with time zone: Last time at which this table
was manually analyzed

last_autoanalyze timestamp with time zone: Last time at which this
table was analyzed by the autovacuum daemon

I can probably work around that by using either vacuum against the
database or vacuumdb on the system, but I'd like to understand if there
are other factors that influence these fields, and maybe the monitoring
agent needs to change in that regard.


Reply via email to