Hello,

> Is this a known problem? How can we fix this?

I'm experiencing similar problems with maui. Our current 'fix' for
the problem is to use supervisord to restart the process.
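
For anyone without supervisord, the restart-on-crash idea can be sketched
in a few lines of Python. This is only an illustration of the approach,
not our actual setup: the function name run_with_restart and its
parameters are made up for this example, and it assumes the supervised
command stays in the foreground rather than daemonizing itself.

```python
import subprocess
import sys
import time


def run_with_restart(cmd, max_restarts=3, delay=1.0):
    """Run cmd and restart it each time it exits with a non-zero status.

    Hypothetical helper for illustration only. Returns the number of
    restarts performed, either when the command finally exits cleanly
    or when max_restarts has been reached.
    """
    restarts = 0
    while True:
        # Blocks until the child exits; a crash by signal shows up as
        # a negative return code, which also counts as a failure here.
        result = subprocess.run(cmd)
        if result.returncode == 0:
            return restarts
        if restarts >= max_restarts:
            return restarts
        restarts += 1
        time.sleep(delay)  # brief pause so a crash loop does not spin


# Example call (the path is a placeholder, adjust to your installation):
# run_with_restart(["/path/to/maui"], max_restarts=10, delay=5.0)
```

The restart cap is there so that a scheduler which dies immediately on
every start does not get relaunched forever.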

IMHO some of those crashes are correlated with memory corruption
within torque. From time to time it begins to report data copied from
random addresses in its heap in the 'jobs' field of 'pbsnodes -a', which
should be the same information maui acquires in its ClusterQuery call.
I'm not sure this is the only reason for these crashes.
Also, we are using Torque 2.5.8/2.5.9, so this might not be the same
problem.

Regards,

Jörg Blank


_______________________________________________
mauiusers mailing list
[email protected]
http://www.supercluster.org/mailman/listinfo/mauiusers
