I'm using Zabbix for monitoring and wrote an extension, 'httpspeed', that
returns a website's response time in milliseconds. If the website returns an
unexpected status code, it returns a negative value instead (for example the
negated status code).
I also monitor pound from Zabbix using poundctl, which checks whether there
are backends that are not responding.
I noticed that poundctl didn't report anything when the webserver became
unavailable. This was because nobody was requesting anything from that
webserver. With the "httpspeed" test I get an immediate response.
You could even have Zabbix restart your server. I'm not a fan of automatic
responses, though, and prefer intensive monitoring where Zabbix tells me
what's working and what isn't, so I can draw my own conclusions.
If I do have a watchdog that automatically restarts servers, I monitor its
logfile and report when a service keeps restarting.
# cat /usr/local/sbin/httpspeed
#!/bin/bash
export PATH=${PATH}:/usr/local/sbin:/sbin:/usr/sbin:/bin:/usr/bin
TIMEOUT=3
# If called by zabbix, handle some things differently
if echo "${BASH_SOURCE}" | grep -q "zabbix" ; then
    # get rid of 1st parameter (only on Zabbix 1.8x)
    # shift 1
    # Change TimeOut value to the one in /etc/zabbix/zabbix_server.conf
    ZABBIX_TIMEOUT=$(grep -i '^Timeout' /etc/zabbix/zabbix_server.conf 2>/dev/null |
        awk -F= '{print $2}' | tr -cd '0-9')
    if [ -z "${ZABBIX_TIMEOUT}" ] ; then
        TIMEOUT=3
    else
        # Let's take 1 second less than the one in
        # /etc/zabbix/zabbix_server.conf and just hope to be in time
        TIMEOUT=$(( ZABBIX_TIMEOUT - 1 ))
    fi
fi
OPTIONS=
URL="$*"
RETVAL=-1
COUNTS=1
SCRATCH=$(mktemp)
if [ -n "${URL}" ] ; then
    # Check if it's https and modify OPTIONS
    echo "${URL}" | grep -q '^https' && OPTIONS="${OPTIONS} -l"
    # Somehow httping doesn't properly obey TimeOuts.. I'm running it in
    # background and use my own timeloop
    # suppress terminal output
    exec 2>/dev/null
    # Start httping in background
    httping ${OPTIONS} -s -c${COUNTS} "${URL}" 2>/dev/null | grep -o 'time=.*' |
        awk -F= '{print $2}' | sort -n | head -n1 >"${SCRATCH}" &
    # Wait for httping, polling roughly twice per second
    let TIMEOUT*=2
    n=1
    while [ ! -s "${SCRATCH}" ] ; do
        sleep .48
        [ $n -ge ${TIMEOUT} ] && break
        let n++
    done
    if grep -q ' ms' "${SCRATCH}" ; then
        RETVAL=-2
        SPEED=$(grep -o '.* ms ' "${SCRATCH}" | tr -cd '0-9.' | awk -F. '{print $1}')
        STATCODE=$(awk '{print $3}' "${SCRATCH}" | tr -cd '0-9')
        if [ -n "${STATCODE}" ] ; then
            RETVAL="-${STATCODE}"
            if echo "${STATCODE}" | grep -Eq '^(200|201|202|301|302|303|304|403|401)$' ; then
                RETVAL=${SPEED}
            fi
        fi
    else
        # Too late you lazy bastard, I might as well kill you...
        kill -9 %1 2>/dev/null
    fi
fi
rm -f "${SCRATCH}" 2>/dev/null
echo "${RETVAL}"
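The return-value rules of the script above can be hard to follow inline, so
here is a minimal sketch of just the parsing step as a standalone function.
The function name parse_httping_line is made up for illustration, and the
sample line shape in the comment is an assumption inferred from what the
script greps for, not taken from the httping documentation.

```shell
#!/bin/bash
# Sketch: map one "httping -s" result line to the value httpspeed prints.
# Assumed input shape (hypothetical sample):
#   connected to example.org:80 (220 bytes), seq=0 time=12.34 ms 200 OK
parse_httping_line() {
    local line="$1" rest speed code
    # Keep everything after "time=", e.g. "12.34 ms 200 OK"
    rest="${line#*time=}"
    if [ "${rest}" = "${line}" ] ; then
        echo "-1"            # no timing field at all (timeout, no answer)
        return
    fi
    case "${rest}" in
        *" ms"*) ;;
        *) echo "-1" ; return ;;
    esac
    speed="${rest%% *}"      # "12.34"
    speed="${speed%%.*}"     # whole milliseconds: "12"
    code=$(echo "${rest}" | awk '{print $3}' | tr -cd '0-9')
    if [ -z "${code}" ] ; then
        echo "-2"            # timing present, but no status code
    elif echo "${code}" | grep -Eq '^(200|201|202|301|302|303|304|401|403)$' ; then
        echo "${speed}"      # accepted status: report the response time
    else
        echo "-${code}"      # anything else: report the negated status
    fi
}
```

In Zabbix this makes a simple trigger possible: any value below zero means
the site answered badly or not at all.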
# grep pound /etc/zabbix/zabbix_agentd.conf
UserParameter=pound.backends.active, sudo poundctl -c /var/run/pound/poundctl.socket 2>/dev/null | grep -ci 'Backend .* active'
UserParameter=pound.backends.alive, sudo poundctl -c /var/run/pound/poundctl.socket 2>/dev/null | grep -ci 'Backend .* active .* alive'
UserParameter=pound.backends.dead, sudo poundctl -c /var/run/pound/poundctl.socket 2>/dev/null | grep -ci 'Backend .* active .* DEAD'
UserParameter=pound.services, sudo poundctl -c /var/run/pound/poundctl.socket 2>/dev/null | grep -ci 'Service .* active'
UserParameter=pound.listeners, sudo poundctl -c /var/run/pound/poundctl.socket 2>/dev/null | grep -ci ' Listener '
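To try the backend patterns without a running pound, the same matching can
be sketched as a small function that reads poundctl output on stdin and
prints all three backend counts at once. The name pound_summary is
hypothetical. The "alive" sample line follows the poundctl output quoted
further down in this thread; the shape of a DEAD line is an assumption based
on the grep pattern above.

```shell
#!/bin/bash
# Sketch: summarise backend states from poundctl output on stdin.
# Matching is case-insensitive, like the grep -ci calls above.
pound_summary() {
    awk '
        { l = tolower($0) }
        l ~ /backend .* active/          { active++ }
        l ~ /backend .* active .* alive/ { alive++ }
        l ~ /backend .* active .* dead/  { dead++ }
        END { printf "active=%d alive=%d dead=%d\n", active, alive, dead }
    '
}

# Possible use on the pound host (not run here):
#   sudo poundctl -c /var/run/pound/poundctl.socket 2>/dev/null | pound_summary
```

This is only a debugging aid; the UserParameter lines still need one numeric
value per key.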
-----Original message-----
From: Tim J. Rice <[email protected]>
Sent: Saturday 22 March 2014 12:23
To: [email protected]
Subject: Re: [Pound Mailing List] Pound not disabling dead backends
Landy, that is exactly what I do. We have a set of servers in two different
datacenters. We also have a monitoring server in each datacenter that monitors
the backends in the different DCs.
I simply use something like:
    curl -s http://ipaddyofbackend | grep "Expected output" | wc -l
Then I create an if statement that restarts tomcat on the bad server if the
number is less than what I expect to see, sleeps x minutes, then tries again.
If it fails the second time, it bounces the machine and alerts on the problem.
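The check, restart, recheck, reboot sequence described above could be
sketched roughly as follows. This is not Tim's actual script: the function
names, BACKEND_URL, EXPECTED_TEXT, restart_app and the 300-second sleep are
all hypothetical placeholders, and the wiring is left commented out because
the restart and reboot commands depend on the site.

```shell
#!/bin/bash
# Sketch of a two-stage health check. All names here are placeholders.

# Count the lines on stdin that contain the expected text.
count_expected() {
    grep -c -- "$1" || true      # grep -c still prints 0 when nothing matches
}

# Decide what to do from the match count and the attempt number (1 or 2).
next_action() {
    local count="$1" expected="$2" attempt="$3"
    if [ "${count}" -ge "${expected}" ] ; then
        echo "ok"
    elif [ "${attempt}" -eq 1 ] ; then
        echo "restart"           # first failure: restart the application
    else
        echo "reboot"            # second failure: bounce the machine and alert
    fi
}

# Example wiring (commented out; commands are placeholders):
# count=$(curl -s --max-time 10 "${BACKEND_URL}" | count_expected "${EXPECTED_TEXT}")
# case "$(next_action "${count}" 1 1)" in
#     restart) restart_app ; sleep 300 ;;   # then check again as attempt 2
# esac
```

Keeping the decision in its own function makes it easy to test without
touching a real backend.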
My question for you: if the backend server is completely down, does pound
still try to send traffic to that server? I use Zenloadbalancer, which uses
pound, and it works very well. You may want to take a look at that. BTW it is
free. I virtualize the zenloadbalancer using kvm and it really is rock solid.
Best of luck to you.
Tim R.
--------------------------------
From: "Landy Bible" <[email protected]>
To: [email protected]
Sent: Friday, March 21, 2014 8:50:59 PM
Subject: [Pound Mailing List] Pound not disabling dead backends
Hey all,
I'm having some trouble with Pound. I have a set of application servers behind
Pound that aren't terribly stable. However, Pound doesn't seem to kick bad
backends out of rotation like it should. I get log messages like this:
Mar 21 20:33:02 poundcat pound: (b58ffb40) e500 response error read from xxx.xxx.xxx.203:8080/GET /DmpMediaProvider/displaySlide?id=2809 HTTP/1.1: Connection timed out (15.017 secs)
This happens over and over again, yet my poundctl output claims the backend is
still alive:
2. Backend xxx.xxx.xxx.203:8080 active (5 0.000 sec) alive
Any ideas why this is happening? Earlier this week I had 3/4 of the backend
servers go down and pound was still happily forwarding requests to the dead
servers. At this point I'm considering writing a script to watch for those log
messages and use poundctl to kick the dead backends out.
I'm running Pound 2.5, installed via apt-get on Ubuntu Server 12.04 LTS.
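A first step toward the log-watching script mentioned above might look like
the sketch below. It only lists backends that show repeated "Connection
timed out" errors in pound's log; the function name timed_out_backends and
the default threshold of 3 are assumptions. It expects each log entry on a
single line. Mapping an address to the listener, service and backend numbers
that poundctl needs in order to disable a backend is left out; check
poundctl(8) for the option on your version.

```shell
#!/bin/bash
# Sketch: read pound syslog lines on stdin and print every backend address
# that has at least THRESHOLD timeout errors (default 3, a made-up value).
timed_out_backends() {
    local threshold="${1:-3}"
    grep 'e500 response error read from' \
        | grep 'Connection timed out' \
        | sed -n 's|.*response error read from \([^/ ]*\)/.*|\1|p' \
        | sort | uniq -c \
        | awk -v t="${threshold}" '$1 >= t { print $2 }'
}

# Possible use (the log path is an assumption, adjust to your syslog setup):
#   tail -n 500 /var/log/syslog | timed_out_backends 5
```

Counting over a recent window rather than reacting to a single error avoids
kicking out a backend because of one slow request.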
Thanks,
Landy
