AW: [Pound Mailing List] Pound not disabling dead backends

Jean-Pierre van Melis Thu, 03 Apr 2014 02:41:36 -0700

I'm using Zabbix to monitor my stuff and created an extension, 'httpspeed', 
that returns the response time in milliseconds.
If the website returns an unexpected status code, it returns a negative value 
instead (the status code with a minus sign)...


I also monitor Pound from Zabbix using poundctl, which checks whether there 
are nodes that are not responding...

I noticed that poundctl didn't report anything when the webserver became 
unavailable. This was because no one was requesting anything from that 
webserver.
With the "httpspeed" test I get an immediate response...

You could even have Zabbix restart your server.... I'm not a fan of 
automatic responses and prefer intensive monitoring, where Zabbix tells me 
what's working and what isn't so I can draw my own conclusions...
If I have a watchdog that automatically restarts servers then I will monitor 
its logfile and report if a service constantly restarts....


# cat /usr/local/sbin/httpspeed
#!/bin/bash
export PATH=${PATH}:/usr/local/sbin:/sbin:/usr/sbin:/bin:/usr/bin

TIMEOUT=3

# If called by zabbix, handle some things different
if echo "${BASH_SOURCE}" | grep -q "zabbix" ; then
  # get rid of 1st parameter (only on Zabbix 1.8x)
  # shift 1

  # Change TimeOut value to the one in /etc/zabbix/zabbix_server.conf
  ZABBIX_TIMEOUT=`grep -i '^Timeout' /etc/zabbix/zabbix_server.conf 2>/dev/null | awk -F= '{print $2}' | tr -cd '0-9'`
  if [ -z "${ZABBIX_TIMEOUT}" ] ; then
    TIMEOUT=3
  else
    # Let's take 1 second less than the one in /etc/zabbix/zabbix_server.conf
    # and just hope to be in time
    TIMEOUT=$(( ${ZABBIX_TIMEOUT} - 1 ))
  fi
fi

OPTIONS=
URL="$*"

RETVAL=-1
COUNTS=1
SCRATCH=`mktemp`

if [ ! -z "${URL}" ] ; then
 # Check if it's https and modify OPTIONS
 echo "${URL}" | grep -q 'https' && OPTIONS="${OPTIONS} -l"


 # Somehow httping doesn't properly obey timeouts, so I run it in the
 # background and use my own timing loop

 # suppress terminal output
 exec 2>/dev/null
 # Start httping in background
 httping ${OPTIONS} -s -c${COUNTS} ${URL} 2>/dev/null | grep -o 'time=.*' | awk -F= '{print $2}' | sort -n | head -n1 >${SCRATCH} &

 # Wait for httping: poll about twice per second, so double the loop count
 let TIMEOUT*=2
 n=1
 while [ ! -s ${SCRATCH} ] ; do
  sleep .48
  [ $n -ge ${TIMEOUT} ] && break
  let n++
 done

 if grep -q ' ms' ${SCRATCH} ; then

  RETVAL=-2
  SPEED=`grep -o '.* ms ' ${SCRATCH} | tr -cd '0-9.' | awk -F. '{print $1}'`
  STATCODE=`cat ${SCRATCH} | awk '{print $3}' | tr -cd '0-9'`

  if [ ! -z "${STATCODE}" ] ; then
   RETVAL="-${STATCODE}"
   if echo "${STATCODE}" | egrep -q '^(200|201|202|301|302|303|304|403|401)$' ; then
    RETVAL=${SPEED}
   fi
  fi
 else
  # Too late you lazy bastard, I might as well kill you...
  kill -9 %1 2>/dev/null
 fi
fi

rm -f ${SCRATCH} 2>/dev/null
echo "${RETVAL}"


# grep pound /etc/zabbix/zabbix_agentd.conf
UserParameter=pound.backends.active, sudo poundctl -c /var/run/pound/poundctl.socket 2>/dev/null | grep -ci 'Backend .* active'
UserParameter=pound.backends.alive, sudo poundctl -c /var/run/pound/poundctl.socket 2>/dev/null | grep -ci 'Backend .* active .* alive'
UserParameter=pound.backends.dead, sudo poundctl -c /var/run/pound/poundctl.socket 2>/dev/null | grep -ci 'Backend .* active .* DEAD'
UserParameter=pound.services, sudo poundctl -c /var/run/pound/poundctl.socket 2>/dev/null | grep -ci 'Service .* active'
UserParameter=pound.listeners, sudo poundctl -c /var/run/pound/poundctl.socket 2>/dev/null | grep -ci ' Listener '
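
The backend counters above can be tried without a running Pound by feeding the 
same grep patterns some canned text. This is only a sketch: count_lines and 
sample_output are made-up helper names, and the sample lines are modeled on the 
single backend line quoted further down in this thread (addresses invented, the 
DEAD line assumed from the pattern above), not captured from a real poundctl 
run.

```shell
#!/bin/bash

# count_lines PATTERN
# Count the stdin lines matching PATTERN, case-insensitively,
# the same way the UserParameter items do with grep -ci.
count_lines() {
  grep -ci -- "$1"
}

# sample_output
# Illustrative stand-in for poundctl output (NOT real captured output).
sample_output() {
  cat <<'EOF'
  0. Backend 10.0.0.201:8080 active (5 0.000 sec) alive
  1. Backend 10.0.0.202:8080 active (5 0.000 sec) alive
  2. Backend 10.0.0.203:8080 active (5 0.000 sec) DEAD
EOF
}
```

For example, "sample_output | count_lines 'Backend .* active .* DEAD'" counts 
the one backend marked DEAD in the sample, which is the value Zabbix would 
receive for pound.backends.dead.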





-----Original message-----
From: Tim J. Rice <[email protected]>
Sent: Saturday, 22 March 2014 12:23
To: [email protected]
Subject: Re: [Pound Mailing List] Pound not disabling dead backends

Landy, that is exactly what I do.  We have a set of servers in two different 
datacenters.  We also have a monitoring server in each datacenter that 
monitors the backends in the different DCs.

I simply use something like

  curl -s http://ipaddyofbackend | grep "Expected output" | wc -l

and then wrap it in an if statement that restarts tomcat on the bad server if 
the number is less than what I expect to see, sleeps x minutes, then tries 
again.  If it fails the second time, it bounces the machine and alerts on the 
problem.
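
The check, restart, wait, retry, escalate sequence described above could be 
sketched roughly as follows. All names here (count_matches, backend_ok, 
check_with_retry) are hypothetical, and the restart and escalation commands are 
placeholders supplied by the caller; this is not the actual script in use.

```shell
#!/bin/bash

# count_matches URL PATTERN
# Print how many lines of the response contain PATTERN (0 if unreachable).
count_matches() {
  curl -s --max-time 10 "$1" | grep -c -- "$2"
}

# backend_ok COUNT MIN
# Succeed when COUNT is a number and at least MIN.
backend_ok() {
  [ "${1:-0}" -ge "$2" ] 2>/dev/null
}

# check_with_retry PROBE RESTART ESCALATE MIN WAIT_SECONDS
# PROBE prints a match count; RESTART and ESCALATE are commands to run.
# Prints "ok", "recovered" or "escalated".
check_with_retry() {
  local probe="$1" restart="$2" escalate="$3" min="$4" wait="$5"
  local count

  count=$($probe)
  if backend_ok "$count" "$min"; then
    echo ok
    return 0
  fi

  # First failure: restart the service, wait, then probe again.
  $restart >/dev/null 2>&1
  sleep "$wait"

  count=$($probe)
  if backend_ok "$count" "$min"; then
    echo recovered
    return 0
  fi

  # Second failure: escalate (bounce the machine, send the alert).
  $escalate >/dev/null 2>&1
  echo escalated
  return 1
}

# Example wiring (restart_tomcat and bounce_machine are placeholders):
#   probe() { count_matches "http://ipaddyofbackend" "Expected output"; }
#   check_with_retry probe restart_tomcat bounce_machine 1 300
```

Passing the probe and the two actions in as commands keeps the decision logic 
separate from the side effects, so the flow can be exercised with harmless 
stubs before it is allowed to restart anything.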

Question for you is, if the backend server is completely down does pound not 
try to send traffic to that server?  I use Zenloadbalancer which uses pound and 
it works very well.  You may want to take a look at that.  BTW it is free.  I 
virtualize the zenloadbalancer using kvm and it really is rock solid.

Best of luck to you.
Tim R.




--------------------------------
From: "Landy Bible" <[email protected]>
To: [email protected]
Sent: Friday, March 21, 2014 8:50:59 PM
Subject: [Pound Mailing List] Pound not disabling dead backends

Hey all,

I'm having some trouble with Pound.  I have a set of application servers behind 
Pound that aren't terribly stable.  However, pound doesn't seem to kick bad 
backends out of rotation like it should.  I get log messages like this:

Mar 21 20:33:02 poundcat pound: (b58ffb40) e500 response error read from xxx.xxx.xxx.203:8080/GET /DmpMediaProvider/displaySlide?id=2809 HTTP/1.1: Connection timed out (15.017 secs)

This happens over and over again, yet my poundctl output claims it's still 
alive.

2. Backend xxx.xxx.xxx.203:8080 active (5 0.000 sec) alive

Any ideas why this is happening?  Earlier this week I had 3/4 of the backend 
servers go down and pound was still happily forwarding requests to the dead 
servers.  At this point I'm considering writing a script to watch for those log 
messages and use poundctl to kick the dead backends out.
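
A log-watching script along those lines might start from something like the 
sketch below, which only does the counting half: it reads pound log lines on 
stdin and lists the backends that produced at least a given number of "e500 
response error read from" messages. The function name failing_backends and the 
threshold are made up for illustration, and the parsing assumes every error 
line has the same shape as the one quoted above.

```shell
#!/bin/bash

# failing_backends THRESHOLD
# Read pound log lines on stdin and print "backend count" for every
# backend address with at least THRESHOLD e500 read errors.
failing_backends() {
  local threshold="$1"
  # Keep only the host:port that follows "read from", up to the first "/".
  sed -n 's|.* e500 response error read from \([^/ ]*\)/.*|\1|p' \
    | sort \
    | uniq -c \
    | awk -v min="$threshold" '$1 >= min { print $2, $1 }'
}

# Example (log path is an assumption, adjust to your syslog setup):
#   tail -n 500 /var/log/syslog | failing_backends 5
```

Each address reported would still have to be mapped to the listener, service 
and backend numbers that poundctl expects before it can be disabled; as far as 
I recall that is the -b option, but check poundctl(8) for your version before 
relying on it.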

I'm running pound 2.5 installed from apt-get on ubuntu server 12.04 LTS.

Thanks,
Landy
