I found myself with a little treat this morning: tracing was running on the entire cluster of 3500 nodes. I couldn't find any logs indicating *why* the tracing had started, but it was clear it had been initiated by the cluster manager.

Some sleuthing (thanks, collectl!) allowed me to figure out that the tracing was started by the command:

/usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmcommon notifyOverload _asmgr

I thought that running "mmchconfig deadlockOverloadThreshold=0 -i" would stop this from happening again, but lo and behold, tracing kicked off *again* (with the same caller) some time later, even after setting that parameter.

What's odd is that there are no log events to indicate an overload occurred.

Has anyone seen similar behavior?

We're on 4.2.3.6 efix17.

-Aaron

--
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
