A winbindd process taking up 100% CPU could be caused by the problem documented in https://bugzilla.samba.org/show_bug.cgi?id=12105
Capturing a brief strace of the affected process and reporting it through a PMR would help to debug this problem and provide a fix.

To answer the wider question: Log files are kept in /var/adm/ras/. In case more detailed traces are required, use the mmprotocoltrace command.

Regards,

Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ
[email protected] || +1-520-799-2469 (T/L: 321-2469)

From: "Sobey, Richard A" <[email protected]>
To: gpfsug main discussion list <[email protected]>
Date: 01/11/2017 07:00 AM
Subject: Re: [gpfsug-discuss] CES log files
Sent by: [email protected]

Thanks. Some of the nodes would just say “failed” or “degraded” with the DCs offline. Those that thought they were happy to host a CES IP address did not respond, and the winbindd process would take up 100% CPU, as seen through top, with no users on it.

Interesting that even though all CES nodes had the same configuration, three of them never had a problem at all.

JF – I’ll look at the protocol tracing next time this happens. It’s a rare thing that three DCs go offline at once but even so there should have been enough resiliency to cope.

Thanks
Richard

From: [email protected] [mailto:[email protected]] On Behalf Of Andrew Beattie
Sent: 11 January 2017 09:55
To: [email protected]
Cc: [email protected]
Subject: Re: [gpfsug-discuss] CES log files

mmhealth might be a good place to start

CES should probably throw a message along the lines of the following:

mmhealth shows something is wrong with AD server:
...
CES DEGRADED ads_down
...

Andrew Beattie
Software Defined Storage - IT Specialist
Phone: 614-2133-7927
E-mail: [email protected]

----- Original message -----
From: "Sobey, Richard A" <[email protected]>
Sent by: [email protected]
To: "'[email protected]'" <[email protected]>
Cc:
Subject: [gpfsug-discuss] CES log files
Date: Wed, Jan 11, 2017 7:27 PM

Which files do I need to look in to determine what’s happening with CES… supposing for example a load of domain controllers were shut down and CES had no clue how to handle this and stopped working until the DCs were switched back on again.

mmfs.log.latest said everything was fine btw.

Thanks
Richard

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
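
The strace suggestion at the top of this thread could be scripted roughly as follows. This is only a sketch, assuming a Linux CES node with procps `ps`, coreutils `timeout`, and `strace` installed, and root access. The function names, the 30-second window, and the output path are illustrative choices and are not part of Spectrum Scale or Samba.

```python
import subprocess


def busiest_pid(ps_output, name="winbindd"):
    """Return the PID of the process called `name` with the highest %CPU.

    Expects text in the shape printed by `ps -eo pid,pcpu,comm --no-headers`
    (one "PID %CPU COMMAND" row per line). Returns None if nothing matches.
    """
    best = None
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue
        pid, pcpu, comm = parts
        if comm.strip() != name:
            continue
        try:
            candidate = (float(pcpu), int(pid))
        except ValueError:
            continue  # skip rows that do not parse as numbers
        if best is None or candidate > best:
            best = candidate
    return None if best is None else best[1]


def strace_command(pid, seconds=30, outfile="/tmp/winbindd.strace"):
    """Build an argv list that attaches strace to `pid` for a limited time.

    timeout(1) ends the capture after `seconds`; -f follows threads and
    children, -tt adds timestamps, -o writes the trace to `outfile`.
    """
    return ["timeout", str(seconds), "strace", "-f", "-tt",
            "-p", str(pid), "-o", outfile]


def capture(seconds=30):
    """Find the busiest winbindd and trace it briefly (needs root)."""
    ps = subprocess.run(["ps", "-eo", "pid,pcpu,comm", "--no-headers"],
                        capture_output=True, text=True, check=True)
    pid = busiest_pid(ps.stdout)
    if pid is None:
        return None
    cmd = strace_command(pid, seconds)
    subprocess.run(cmd)  # timeout(1) exits non-zero when it stops strace
    return cmd[-1]       # path of the trace file
```

Calling `capture()` as root on the affected node would leave a trace file that can be attached to the PMR. Nothing is run on import, so the command can be inspected with `strace_command(pid)` before it is executed.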
