Hello all,

  At the Helics II cluster (IWR, Heidelberg, Germany) we are using the
OpenIPMI Python library to monitor the compute nodes. The master host, on
which our Python daemon runs, uses Debian Etch with OpenIPMI v. 2.0.14. The
daemon runs fine for about a week on average (anywhere from 3 to 10 days) and
then suddenly crashes. This has happened repeatedly for the past half-year or
so. Since this is a purely Python daemon, we suspect that there might be a
problem with OpenIPMI.

  There is never a traceback; the daemon either crashes, freezes, or
segfaults. We use screen heavily, but this happens even when the daemon is
run directly from bash. Most interestingly, the problem persisted for a couple
of months, then after an upgrade of the operating system the daemon ran well
for a couple of months. However, it has been crashing again since the
beginning of April. An upgrade to 2.0.16 didn't help. We haven't modified
the daemon's code for at least a year.

  Does anybody have any idea why this might happen? Thanks in advance.

Greetings,

Venelin Petkov

Student Computer Administrator at IWR,
Heidelberg, Germany
_______________________________________________
Openipmi-developer mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/openipmi-developer
