I have opened a PMR, and the official response reflects what you just posted.
In addition, it seems there are some performance issues with Python 2 that will
be improved with eventual migration to Python 3.  I was unaware of the mmhealth
functions that the mmsysmon daemon provides. The impact we were seeing 
was some variation in MPI benchmark results when the nodes were fully loaded.
I suppose it would be possible to turn off mmsysmon during the benchmarking,
but I appreciate the effort at streamlining the monitor service.  Cutting back
on fork/exec, better Python, less polling, more notifications…  all good.
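
A rough way to put a number on the monitor's overhead, before and after any tuning, is to sample the CPU time a process has accumulated on a node. The sketch below assumes Linux and the /proc/<pid>/stat layout; the function names are illustrative only and are not part of Spectrum Scale.

```python
import os
import time


def cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from one /proc/<pid>/stat line."""
    # The command name sits in parentheses and may contain spaces,
    # so split on the last ')' before splitting on whitespace.
    fields = stat_line.rsplit(")", 1)[1].split()
    # fields[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(fields[11]) + int(fields[12])


def cpu_percent(ticks_before, ticks_after, seconds, ticks_per_second):
    """Share of one core used between two samples, as a percentage."""
    return 100.0 * (ticks_after - ticks_before) / (ticks_per_second * seconds)


def sample_cpu_percent(pid, seconds=10.0):
    """Sample a live process twice and report its CPU use over the interval."""
    ticks_per_second = os.sysconf("SC_CLK_TCK")
    with open(f"/proc/{pid}/stat") as f:
        before = cpu_ticks(f.read())
    time.sleep(seconds)
    with open(f"/proc/{pid}/stat") as f:
        after = cpu_ticks(f.read())
    return cpu_percent(before, after, seconds, ticks_per_second)
```

Calling sample_cpu_percent(pid) with the PID of the monitor process (found, for example, with pgrep -f mmsysmon.py) on an otherwise idle node gives a baseline to compare against the same node under benchmark load.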

Thanks for the details,

 — ddj

> On Jul 19, 2017, at 9:05 AM, Mathias Dietz <[email protected]> wrote:
> 
> thanks for the feedback. 
> 
> Let me clarify what mmsysmon is doing.
> Since IBM Spectrum Scale 4.2.1 the mmsysmon process is used for the overall 
> health monitoring and CES failover handling.
> Even without CES it is an essential part of the system because it monitors 
> the individual components and provides health state information and error 
> events. 
> This information is needed by other Spectrum Scale components (mmhealth 
> command, the IBM Spectrum Scale GUI, Support tools, Install Toolkit,..) and 
> therefore disabling mmsysmon will impact them. 
> 
> > It’s a huge problem. I don’t understand why it hasn’t been given 
> > much credit by dev or support.
> 
> Over the last couple of months, the development team has put a strong focus on 
> this topic. 
> In order to monitor the health of the individual components, mmsysmon listens 
> for notifications/callbacks but also has to do some polling.
> We are trying to reduce the polling overhead constantly and replace polling 
> with notifications when possible. 
> 
> Several improvements have been added to 4.2.3, including the ability to 
> configure the polling frequency (mmhealth config interval) to reduce the 
> overhead. 
> See 
> https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/bl1adm_mmhealth.htm
> In addition, a new option has been introduced to clock-align the monitoring 
> threads in order to reduce CPU jitter. 
> 
> Nevertheless, we don't see significant CPU consumption by mmsysmon on our 
> test systems. 
> It might be a problem specific to your system environment or a wrong 
> configuration; therefore, please contact IBM support to analyze the 
> root cause of the high usage.
> 
> Kind regards
> 
> Mathias Dietz
> 
> IBM Spectrum Scale - Release Lead Architect and RAS Architect 
> 
> 
> [email protected] wrote on 07/18/2017 07:51:21 PM:
> 
> > From: Jonathon A Anderson <[email protected]>
> > To: gpfsug main discussion list <[email protected]>
> > Date: 07/18/2017 07:51 PM
> > Subject: Re: [gpfsug-discuss] mmsysmon.py revisited
> > Sent by: [email protected]
> > 
> > There’s no official way to cleanly disable it yet, so far as I know; 
> > but you can de facto disable it by deleting 
> > /var/mmfs/mmsysmon/mmsysmonitor.conf.
> > 
> > It’s a huge problem. I don’t understand why it hasn’t been given 
> > much credit by dev or support.
> > 
> > ~jonathon
> > 
> > 
> > On 7/18/17, 11:21 AM, "[email protected] on 
> > behalf of David Johnson" <[email protected] 
> > on behalf of [email protected]> wrote:
> > 
> >     We also noticed a fair amount of CPU time accumulated by mmsysmon.py on
> >     our diskless compute nodes. I read the earlier query, where it was
> >     answered:
> >     
> >     ces == Cluster Export Services,  mmsysmon.py comes from mmcesmon.
> >     It is used for managing export services of GPFS. If it is killed,
> >     your nfs/smb etc will be out of work.
> >     Their overhead is small and they are very important. Don't attempt
> >     to kill them.
> >     
> >     Our question is this: we don’t run the latest “protocols”; our NFS is
> >     CNFS, and our CIFS is clustered CIFS.
> >     I can understand it might be needed with Ganesha, but on every node? 
> >     
> >     
> >     Why in the world would I be getting this daemon running on all client
> >     nodes, when I didn’t install the “protocols” version of the
> >     distribution?   We have release 4.2.2 at the moment.  How can we
> >     disable this?
> >     
> >     
> >     Thanks,
> >      — ddj
> >     
> > 
> > _______________________________________________
> > gpfsug-discuss mailing list
> > gpfsug-discuss at spectrumscale.org
> > http://gpfsug.org/mailman/listinfo/gpfsug-discuss 
> > <http://gpfsug.org/mailman/listinfo/gpfsug-discuss>
