Hi all,

Thought I would share a couple of tidbits on how we monitor Matterhorn. We use Nagios to run several checks, including:
1. whether the load average on any of the worker, admin, or engage servers is too high
2. whether agents or servers are unreachable on the network (ping)
3. whether each capture agent is capturing, and whether that agrees with the schedule

The first two are pretty easy to set up, but the last one required a bit of custom scripting. We're using this Python script from our msub; you're welcome to help yourself to it (eventually it'll be merged back into trunk, I imagine):

https://opencast.jira.com/svn/MH/msub/usask.ca/trunk/1.2.x/docs/scripts/ubuntu_capture_agent/checkstatus.py

Happy casting,

Chris

--
Christopher Brooks, BSc, MSc
ARIES Laboratory, University of Saskatchewan
Web: http://www.cs.usask.ca/~cab938
Phone: 1.306.966.1442
Mail: Advanced Research in Intelligent Educational Systems Laboratory
      Department of Computer Science
      University of Saskatchewan
      176 Thorvaldson Building
      110 Science Place
      Saskatoon, SK S7N 5C9
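
For anyone curious what the third check involves, here is a minimal sketch of the comparison logic. It is not the contents of checkstatus.py. The function names, the state strings ("capturing", "idle"), and the shape of the schedule (a list of start/end datetimes) are all assumptions made for illustration; how you actually fetch the agent state and the schedule depends on your install. The only fixed convention used is the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
from datetime import datetime

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def should_be_capturing(schedule, now):
    """Return True if `now` falls inside any (start, end) window.

    `schedule` is a hypothetical list of (start, end) datetime pairs.
    """
    return any(start <= now < end for start, end in schedule)


def check_capture(agent_state, schedule, now):
    """Compare the reported agent state with the schedule.

    `agent_state` is an assumed string such as "capturing" or "idle",
    or None if the state could not be retrieved.
    Returns a (nagios_exit_code, message) tuple.
    """
    if agent_state is None:
        return UNKNOWN, "UNKNOWN - could not determine agent state"

    expected = should_be_capturing(schedule, now)
    capturing = agent_state == "capturing"

    if expected and not capturing:
        # A scheduled recording is being missed: the serious case.
        return CRITICAL, "CRITICAL - recording scheduled but agent is %s" % agent_state
    if capturing and not expected:
        # Could be an ad hoc recording, so only warn.
        return WARNING, "WARNING - agent is capturing with nothing scheduled"
    return OK, "OK - agent is %s, as scheduled" % agent_state
```

A real plugin would fetch the state and schedule, print the message on one line, and call sys.exit() with the returned code so that Nagios picks up the status.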
