Ah, that makes sense. I think I'll add some logic to the script so that it fetches fresh data points whenever it comes up with a negative value.
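
Roughly what I have in mind, as a minimal sketch rather than a patch to the actual gist. The names `xlog_to_bytes`, `replication_lag`, `sample_primary` and `sample_standby` are hypothetical; the two samplers stand in for whatever runs pg_current_xlog_location() on the primary and pg_last_xlog_replay_location() on the standby. The byte conversion assumes the 64-bit location layout used from PostgreSQL 9.3 onward; older releases lay out log files differently, so the multiplier would need adjusting there.

```python
def xlog_to_bytes(location):
    """Convert an xlog location such as '0/3000148' to a byte position.

    Assumption: 64-bit layout (high word / low word), as in PostgreSQL 9.3+.
    """
    hi, lo = location.split("/")
    return (int(hi, 16) << 32) + int(lo, 16)


def replication_lag(sample_primary, sample_standby, max_attempts=3):
    """Return the replication lag in bytes, re-sampling on a negative result.

    sample_primary and sample_standby are hypothetical callables that each
    return an xlog location string from the respective server.
    """
    for _ in range(max_attempts):
        primary = xlog_to_bytes(sample_primary())
        standby = xlog_to_bytes(sample_standby())
        lag = primary - standby
        if lag >= 0:
            return lag
    # Still negative after every attempt: the standby replayed past the
    # primary position we sampled each time, so report it as caught up.
    return 0
```

Falling back to zero after the last attempt is a judgment call: a negative value only means the standby had already replayed past the position sampled on the primary, so it seems safe to report that as no lag instead of alarming.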

Thanks for the insight.

QH

On Mon, Apr 22, 2013 at 5:11 PM, Andres Freund <and...@2ndquadrant.com> wrote:

> On 2013-04-22 16:36:38 -0600, Quentin Hartman wrote:
> > I'm using this script to check my replication lag on my streaming
> > replication pairs with Nagios:
> >
> > https://gist.github.com/jacobian/743942
> >
> > It generally works fine, but will occasionally return a negative lag value
> > (-37kb for example) which of course causes it to throw an alarm, but is
> > total nonsense. I've been working on the assumption that it is some sort of
> > bug in the script, but in taking a quick look at it nothing jumps out at me.
> >
> > Is there something in Postgres itself that could cause this to happen once
> > in a while? Is it something to be concerned about? Is there a better way to
> > monitor this state?
>
> Well, between the time pg_current_xlog_location() is run on the primary
> and pg_last_xlog_replay_location() on the standby some time passes, so
> it's not all that unlikely that WAL has been generated, streamed *and*
> applied in that time. Given the short timeframe it only happens every
> now and then.
>
> Did you check the pg_stat_replication view on the primary?
>
> Greetings,
>
> Andres Freund
>
> --
> Andres Freund http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services