Hi All,

I’m pretty new to using the GPFS GUI for health and performance monitoring, but 
am finding it very useful.  I’ve got an issue that I can’t figure out.  In my 
events I see:

Event name: pool-data_high_error
Component: File System
Entity type: Pool
Entity name: <redacted>
Event time: 3/26/18 4:44:10 PM
Message: The pool <redacted> of file system <redacted> reached a nearly exhausted data level. DataPool_capUtil
Description: The pool reached a nearly exhausted level.
Cause: The pool reached a nearly exhausted level.
User action: Add more capacity to pool or move data to different pool or delete data and/or snapshots.
Reporting node: <redacted>
Event type: Active health state of an entity which is monitored by the system.

Now this is for a “capacity” pool … i.e. one that mmapplypolicy is going to 
fill up to 97% full.  Therefore, I’ve modified the thresholds:

### Threshold Rules ###
rule_name             metric                error  warn    direction  filterBy  groupBy                                            sensitivity
--------------------------------------------------------------------------------------------------------------------------------------------------
InodeCapUtil_Rule     Fileset_inode         90.0   80.0    high                 gpfs_cluster_name,gpfs_fs_name,gpfs_fset_name      300
MemFree_Rule          mem_memfree           50000  100000  low                  node                                               300
MetaDataCapUtil_Rule  MetaDataPool_capUtil  90.0   80.0    high                 gpfs_cluster_name,gpfs_fs_name,gpfs_diskpool_name  300
DataCapUtil_Rule      DataPool_capUtil      99.0   90.0    high                 gpfs_cluster_name,gpfs_fs_name,gpfs_diskpool_name  300
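
To spell out what I expect the modified DataCapUtil_Rule to do, here is a minimal Python sketch of how a warn/error threshold pair could be evaluated. It only illustrates the expectation and is not how mmhealth is implemented. The classify function, the ">=" / "<=" comparisons, and the 90.0/80.0 "previous" values are all assumptions made for the example.

```python
def classify(value, warn, error, direction="high"):
    """Classify one metric sample against a warn/error threshold pair.

    Hypothetical helper: the inclusive comparisons are an assumption,
    not documented mmhealth behavior.
    """
    if direction == "high":
        # "high" rules fire when the metric rises to or above a level.
        if value >= error:
            return "ERROR"
        if value >= warn:
            return "WARNING"
    else:
        # "low" rules (e.g. free memory) fire when the metric drops.
        if value <= error:
            return "ERROR"
        if value <= warn:
            return "WARNING"
    return "HEALTHY"


# A capacity pool that mmapplypolicy fills to 97%:
print(classify(97.0, warn=90.0, error=99.0))  # modified DataCapUtil_Rule
print(classify(97.0, warn=80.0, error=90.0))  # assumed previous values
```

With the modified rule, a pool at 97% should be at most a warning. It would only be an error under a 90.0 error level.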

But it’s still in an “Error” state.  I see that the time of the event is March 
26th at 4:44 PM, so I’m thinking this is something that’s just stale, but I 
can’t figure out how to clear it.  The mmhealth command shows the error, too, 
and from that message it appears as if the event was triggered prior to my 
adjusting the thresholds:

Event                     Parameter     Severity    Active Since             Event Message
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
pool-data_high_error      redacted      ERROR       2018-03-26 16:44:10      The pool redacted of file system redacted reached a nearly exhausted data level. 90.0

What do I need to do to get the GUI / mmhealth to recognize the new thresholds 
and clear this error?  I’ve searched and searched in the GUI for a way to clear 
it.  I’ve read the “Monitoring and Managing IBM Spectrum Scale Using the GUI” 
Redbook pretty much cover to cover and haven’t found anything there about how to 
clear this.  Thanks...

Kevin
—
Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and Education
kevin.buterba...@vanderbilt.edu - (615)875-9633



_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
