[PERFORM] postgresql transaction id monitoring with nagios

2006-05-02 Thread Tony Wasson

Ever since I started working with PostgreSQL I've heard the need to
watch transaction IDs. The phrase transaction ID wraparound still
gives me a shiver. Attached it a short script that works with the
monitoring system Nagios to keep an eye on transaction IDs. It should
be easy to adapt to any other monitoring system.

It runs the textbook query below and reports how close you are to wraparound.
 SELECT datname, age(datfrozenxid) FROM pg_database;

The script detects a wrap at 2 billion. It starts warning once one or
more databases show an age over 1 billion transactions. It reports
critical at 1.5B transactions. I hope everyone out there is vacuuming
*all* databases often.

Hope some of you can use this script!
Tony Wasson


check_pg_transactionids.pl
Description: Perl program

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] postgresql transaction id monitoring with nagios

2006-05-02 Thread Vivek Khera


On May 2, 2006, at 2:26 PM, Tony Wasson wrote:


The script detects a wrap at 2 billion. It starts warning once one or
more databases show an age over 1 billion transactions. It reports
critical at 1.5B transactions. I hope everyone out there is vacuuming
*all* databases often.


Something seems wrong... I just ran your script against my  
development database server which is vacuumed daily and it said I was  
53% of the way to 2B.  Seemed strange to me, so I re-ran vacuum -a - 
z to vacuum all databases (as superuser), reran the script and got  
the same answer.





smime.p7s
Description: S/MIME cryptographic signature


Re: [PERFORM] postgresql transaction id monitoring with nagios

2006-05-02 Thread Alvaro Herrera
Vivek Khera wrote:
 
 On May 2, 2006, at 2:26 PM, Tony Wasson wrote:
 
 The script detects a wrap at 2 billion. It starts warning once one or
 more databases show an age over 1 billion transactions. It reports
 critical at 1.5B transactions. I hope everyone out there is vacuuming
 *all* databases often.
 
 Something seems wrong... I just ran your script against my  
 development database server which is vacuumed daily and it said I was  
 53% of the way to 2B.  Seemed strange to me, so I re-ran vacuum -a - 
 z to vacuum all databases (as superuser), reran the script and got  
 the same answer.

That's right, because a database's age is only decremented in
database-wide vacuums.  (Wow, who wouldn't want a person-wide vacuum if
it did the same thing ...)

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] postgresql transaction id monitoring with nagios

2006-05-02 Thread Tony Wasson

On 5/2/06, Vivek Khera [EMAIL PROTECTED] wrote:


On May 2, 2006, at 2:26 PM, Tony Wasson wrote:

 The script detects a wrap at 2 billion. It starts warning once one or
 more databases show an age over 1 billion transactions. It reports
 critical at 1.5B transactions. I hope everyone out there is vacuuming
 *all* databases often.

Something seems wrong... I just ran your script against my
development database server which is vacuumed daily and it said I was
53% of the way to 2B.  Seemed strange to me, so I re-ran vacuum -a -
z to vacuum all databases (as superuser), reran the script and got
the same answer.


Ah thanks, it's a bug in my understanding of the thresholds.

With the standard freezing policy, the age column will start at one
billion for a freshly-vacuumed database.

So essentially, 1B is normal, 2B is the max. The logic is now..

The script detects a wrap at 2 billion. It starts warning once one or
more databases show an age over 1.5 billion transactions. It reports
critical at 1.75B transactions.

If anyone else understands differently, hit me with a clue bat.


check_pg_transactionids.pl
Description: Perl program

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] postgresql transaction id monitoring with nagios

2006-05-02 Thread Bruno Wolff III
On Tue, May 02, 2006 at 12:06:30 -0700,
  Tony Wasson [EMAIL PROTECTED] wrote:
 
 Ah thanks, it's a bug in my understanding of the thresholds.
 
 With the standard freezing policy, the age column will start at one
 billion for a freshly-vacuumed database.
 
 So essentially, 1B is normal, 2B is the max. The logic is now..
 
 The script detects a wrap at 2 billion. It starts warning once one or
 more databases show an age over 1.5 billion transactions. It reports
 critical at 1.75B transactions.
 
 If anyone else understands differently, hit me with a clue bat.

Isn't this obsolete now anyway? I am pretty sure 8.1 has safeguards against
wrap around.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] postgresql transaction id monitoring with nagios

2006-05-02 Thread Tony Wasson

On 5/2/06, Bruno Wolff III [EMAIL PROTECTED] wrote:

On Tue, May 02, 2006 at 12:06:30 -0700,
  Tony Wasson [EMAIL PROTECTED] wrote:

 Ah thanks, it's a bug in my understanding of the thresholds.

 With the standard freezing policy, the age column will start at one
 billion for a freshly-vacuumed database.

 So essentially, 1B is normal, 2B is the max. The logic is now..

 The script detects a wrap at 2 billion. It starts warning once one or
 more databases show an age over 1.5 billion transactions. It reports
 critical at 1.75B transactions.

 If anyone else understands differently, hit me with a clue bat.

Isn't this obsolete now anyway? I am pretty sure 8.1 has safeguards against
wrap around.


My motivation was primarily to monitor some existing PostgreSQL 8.0
servers. I'm not convinced it is safe to stop worrying about
transaction ids even on an 8.1 box.

It is comforting that 8.1 does safeguard against wraparound in at
least 2 ways. First, it emits a warnings during the last 10 million
transactions. If you manage to ignore all those, posgresql will shut
down before a wraparound. I think PostgreSQL does everything correctly
there, but I suspect someone will run into the shut down daemon
problem.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] postgresql transaction id monitoring with nagios

2006-05-02 Thread Jim C. Nasby
On Tue, May 02, 2006 at 03:03:40PM -0400, Alvaro Herrera wrote:
 That's right, because a database's age is only decremented in
 database-wide vacuums.  (Wow, who wouldn't want a person-wide vacuum if
 it did the same thing ...)

The heck with age, I'd take a person-wide vacuum if it just got rid of
all my 'dead rows'...
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] postgresql transaction id monitoring with nagios

2006-05-02 Thread Jim C. Nasby
On Tue, May 02, 2006 at 12:06:30PM -0700, Tony Wasson wrote:
 Ah thanks, it's a bug in my understanding of the thresholds.
 
 With the standard freezing policy, the age column will start at one
 billion for a freshly-vacuumed database.
 
 So essentially, 1B is normal, 2B is the max. The logic is now..
 
 The script detects a wrap at 2 billion. It starts warning once one or
 more databases show an age over 1.5 billion transactions. It reports
 critical at 1.75B transactions.
 
 If anyone else understands differently, hit me with a clue bat.

You should take a look at the code in -HEAD that triggers autovacuum to
do a XID-wrap-prevention vacuum, as well as the code that warns that
we're approaching wrap. From memory, the limit for the later is

max_transactions  3

Where max_transactions should be 4B on most platforms.

I'm intending to submit a patch to clean some of that code up (put all
the thresholds in one .h file rather than how they're spread through
source code right now); if you drop me an email off-list I'll send you
info once I do that.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org