On Thursday, August 22, 2013 07:26:33 AM [email protected] wrote:
> > * When one of the two www servers is unreachable
> > * When www-0 (the one hosting mailman) soaks up all its CPU
> > * When postfix goes down
> > * When space.synhak.org isn't reachable from the 'net
> > * When the webcam isn't reachable from the 'net
> > * When our monthly AWS expenditures go above $15
> >
> > It's not Nagios, but it's a start. I (and whoever else wants to opt in)
> > get a text message when one of those alarms is triggered.
> 
> Here's a funny thought: send the alarms to Phong and have the bot fix 'em :)
> 
> My friend had some scripts/batch that would check to see if a service is
> running and if not, it would restart that service. Would something like
> that help reduce alarms like these?

That's systemd. It's just a matter of configuring it. However, I don't want it 
to restart automatically. When it dies, it is because the system runs out of 
RAM. If the system is starving for RAM to the point where it kills programs to 
free up some RAM, immediately restarting that program will only worsen the 
problem.
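
To make that concern concrete, here is a rough sketch of a restart guard
that refuses to restart a dead service while memory is tight. It is only
an illustration, not something we run: the function names and the 200 MB
threshold are made up, and it assumes a kernel new enough to report
MemAvailable in /proc/meminfo.

```python
def mem_available_kb(meminfo_text):
    """Return the MemAvailable value (kB) from /proc/meminfo-style text, or None."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            # Lines look like "MemAvailable:   123456 kB"
            return int(line.split()[1])
    return None


def should_restart(service_active, available_kb, min_free_kb=200_000):
    """Restart only if the service is down AND there is memory headroom.

    min_free_kb is a hypothetical threshold chosen for illustration.
    """
    if service_active:
        return False  # nothing to do
    if available_kb is None:
        return False  # unknown memory state: leave it to a human
    return available_kb >= min_free_kb
```

In practice the inputs would come from reading /proc/meminfo and from
checking the exit status of "systemctl is-active --quiet postfix". If the
guard says no, the alarm still fires and a person looks at what ate the RAM.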

> 
> Were you able to try Nagios? I saw a talk about it at ALUG, and it sounded
> really nice.

Eventually we'll get to Nagios. I'm not really sure how we can efficiently fit 
it into our infrastructure yet, since I imagine we'd actually want a third 
server on our AWS cluster to support mail, mailman, running Phong's various 
scripts, and other random things that need a server with a reliable 
connection. I understand that Chris has a machine out there somewhere on the 
'net that we can use.

> 
> _______________________________________________
> Discuss mailing list
> [email protected]
> http://synhak.org/mailman/listinfo/discuss
