The 500 error itself probably wasn't the problem. More likely the HUP failed 
because it couldn't find the right PID.

I'm seeing a very similar problem a lot right now.  I just took over running 
smokeping when a co-worker left, so I started with no experience other than 
looking at the graphs. The first thing I had to do was move from a RHEL 4 
server to a RHEL 6 server. I have another post to make asking for help tuning 
fcgi, as I had to go from a perperl version (2.4.3) to an fcgi version (2.6.8) 
and it's not very stable (the system load is either 0.5 or 150, with not much 
in between).  

The last person running smokeping had left some config file updates undone, and 
the load on the system was very high, so he'd removed some slaves. As I put 
them back a few at a time, I would update the config file, and the slaves 
would all get this error and stop sending updates.  What I traced it down to is 
that the slaves are not running in daemon mode, but the software only writes 
out a PID file in daemon mode. Since my startup script removes old PID files, 
I get a "PID file missing" error when the HUP happens.  You might check that 
the PID file on your slaves matches the current process, and isn't an old one 
that never got deleted. 
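
If it helps, here is a rough way to do that check. It's only a sketch in 
Python, not anything from smokeping itself: the function name is made up, and 
you have to pass in whatever PID file path your slave is actually configured 
to use. It reads the PID from the file and asks the kernel whether a process 
with that PID exists (signal 0 delivers nothing). Note that "running" only 
means some process has that PID, not that it is necessarily smokeping.

```python
import os
import sys


def pid_file_status(pid_path):
    """Classify a PID file as 'missing', 'invalid', 'stale', or 'running'.

    pid_path is whatever path your setup uses; nothing here is
    smokeping-specific.
    """
    try:
        with open(pid_path) as fh:
            text = fh.read().strip()
    except FileNotFoundError:
        return "missing"

    try:
        pid = int(text)
    except ValueError:
        return "invalid"
    if pid <= 0:
        # 0 and negative values address process groups, not one process.
        return "invalid"

    try:
        os.kill(pid, 0)  # signal 0: existence check only, nothing is sent
    except ProcessLookupError:
        return "stale"  # no process with that PID: leftover file
    except PermissionError:
        return "running"  # process exists but belongs to another user
    return "running"


if __name__ == "__main__":
    # Usage: python check_pid.py /path/to/your/smokeping.pid
    for path in sys.argv[1:]:
        print(path, pid_file_status(path))
```

A "stale" or "missing" result on a slave whose probes are still running would 
match what I'm seeing here.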

I've looked through the docs and tried to search the mailing list archives, but 
haven't figured this one out yet. So can anyone tell me whether slaves have to 
be run with "--nodaemon" for some reason? It would be handy to have the PID 
file there. Things would be a lot nicer, since the slave tries to do the HUP 
in nodaemon mode even though there is no PID file.
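
One workaround I can imagine, sketched below in Python, is a small wrapper 
that starts the foreground process and writes the PID file itself. This is 
untested against smokeping: I'm assuming the HUP code reads the PID from a 
file at a path you can point the wrapper at, and I have not confirmed where a 
slave actually looks for it. The function name and both arguments are 
placeholders.

```python
import os
import subprocess


def run_with_pid_file(cmd, pid_path):
    """Run cmd in the foreground, keeping its PID in pid_path while it runs.

    cmd is the full argument list for the process to start (hypothetical
    here); pid_path is wherever the PID is expected to be found.
    Returns the exit code of the process.
    """
    proc = subprocess.Popen(cmd)  # no shell, so proc.pid is the process itself
    try:
        with open(pid_path, "w") as fh:
            fh.write(f"{proc.pid}\n")
        return proc.wait()
    finally:
        # Remove the file on the way out so no stale PID is left behind.
        try:
            os.remove(pid_path)
        except FileNotFoundError:
            pass
```

The file is only cleaned up when the wrapper exits normally or via a Python 
exception. If the wrapper itself is killed by a signal, the file stays behind, 
so a startup script that removes old PID files is still worth keeping.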

On Jul 17, 2012, at 5:22, Jason Yates wrote:

> Hi,
> 
> There was a config error on my master smokeping box (as a result of somebody 
> trying to add a menu option without specifying a menu or title param), the 
> error itself was resolved within minutes however during the “outage” the 
> slaves must have polled and as a result logged the following (logs are in 
> reverse time order).
>  
> Mon Jul 16 15:30:52 2012 - ERROR: no instance of SmokePing running (pid 30738)?
> Mon Jul 16 15:30:52 2012 - server has new config for me ... HUPing the parent
> Mon Jul 16 15:30:52 2012 - Sent data to Server and got new config in response.
>  
> <FPING is still running here. Logs cut.>
>  
> Mon Jul 16 15:30:38 2012 - ERROR: we did not get config from the master. Maybe we are not configured as a slave for any of the targets on the master ?
> Mon Jul 16 15:30:38 2012 - WARNING Master said 500 Internal Server Error
> Mon Jul 16 15:30:34 2012 - Got HUP signal.
> Mon Jul 16 15:30:34 2012 - server has new config for me ... HUPing the parent
> Mon Jul 16 15:30:34 2012 - Sent data to Server and got new config in response.
>  
> Is it possible to stop the slaves from killing smokeping on the first 500 
> error? Or have it automatically restart? I’d prefer not to have to go around 
> and restart the slave processes each time a config error is made.
>  
> Thanks all.
>  
> Jason Yates
> Network Engineer
> 
> Office: +44 208 834 8493
> Mobile: +44 7590 534249
> IS Networks : +44 208 834 8573
> 
> _______________________________________________
> smokeping-users mailing list
> [email protected]
> https://lists.oetiker.ch/cgi-bin/listinfo/smokeping-users

-- 
-debbie
Debbie Fligor, n9dn       Lead Network Engineer for CITES @ Univ. of Il
email: [email protected]          <http://www.uiuc.edu/ph/www/fligor>
"Every keystroke can be monitored. And the computers never forget."






