Hi Fred,

We sometimes find that a node shows GPFS as active when running mmgetstate, 
but one of our GPFS filesystems (such as our home or projects filesystem) is 
inaccessible to users, while the other GPFS-mounted filesystems behave as 
expected. Our current node health checks don’t always detect this, especially 
when it’s for a resource-based mount that doesn’t impact the node but would 
impact jobs trying to run on the node.
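
As a rough illustration of a per-filesystem check that does not rely on mmgetstate, the sketch below compares a list of expected mount points against the contents of /proc/mounts. It assumes that GPFS filesystems appear there with filesystem type "gpfs"; the mount points /home and /projects and the function name are hypothetical. It only catches a filesystem that is missing from the mount table; a filesystem that is listed but stale would still pass.

```python
def missing_gpfs_mounts(expected, mounts_text):
    """Return the expected mount points that are not mounted as type gpfs.

    mounts_text is the content of /proc/mounts, where each line reads:
    device mountpoint fstype options dump pass
    """
    mounted = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        # Assumption: GPFS mounts are reported with fstype "gpfs".
        if len(fields) >= 3 and fields[2] == "gpfs":
            mounted.add(fields[1])
    return [m for m in expected if m not in mounted]


def check_node(expected=("/home", "/projects")):
    """Hypothetical node check: read the live mount table on Linux."""
    with open("/proc/mounts") as f:
        return missing_gpfs_mounts(expected, f.read())
```

Because this only reads a local kernel table, it puts no load on the filesystem itself, so it could run on every node at once.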

If there is something native to GPFS that can detect this, all the better, but 
I’m simply unaware of how to do so.

Thanks,

Alex

Senior Systems Administrator

Research Computing Infrastructure
Northwestern University Information Technology (NUIT)

2020 Ridge Ave
Evanston, IL 60208-4311

O: (847) 491-2219
M: (312) 887-1881
www.it.northwestern.edu

________________________________
From: [email protected] 
<[email protected]> on behalf of Frederick Stock 
<[email protected]>
Sent: Friday, August 9, 2019 1:03:09 PM
To: [email protected] <[email protected]>
Cc: [email protected] <[email protected]>
Subject: Re: [gpfsug-discuss] Checking for Stale File Handles

Are you able to explain why you want to check for stale file handles?  Are you 
attempting to detect failures of some sort, and why do the existing mechanisms 
in GPFS not provide the functionality you require?

Fred
__________________________________________________
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
[email protected]


----- Original message -----
From: Alexander John Mamach <[email protected]>
Sent by: [email protected]
To: "[email protected]" <[email protected]>
Cc:
Subject: [EXTERNAL] [gpfsug-discuss] Checking for Stale File Handles
Date: Fri, Aug 9, 2019 1:46 PM


Hi folks,



We’re currently investigating a way to check for stale file handles on the 
nodes across our cluster in a way that minimizes impact to the filesystem and 
performance.



Has anyone found a direct way of doing so? We considered a few methods, 
including simply attempting to ls a GPFS filesystem from each node, but that 
might produce false positives (detecting slowdowns as stale file handles) and 
could negatively impact performance with hundreds of nodes doing this 
simultaneously.
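
For reference, here is a minimal sketch of a lighter variant of the ls idea. It is not a GPFS-native mechanism. It runs a single stat() on a path in a child process with a timeout, and reports a slow response ("timeout") separately from an actual ESTALE error ("stale"), which is one way to avoid counting slowdowns as stale handles. The function name, the default timeout, and the result labels are all illustrative assumptions.

```python
import errno
import subprocess
import sys

# Child program: stat the path and exit with the errno on failure.
_CHILD = (
    "import os, sys\n"
    "try:\n"
    "    os.stat(sys.argv[1])\n"
    "except OSError as e:\n"
    "    sys.exit(e.errno if e.errno and 0 < e.errno < 256 else 255)\n"
)


def probe_path(path, timeout=5.0):
    """Classify a path as 'ok', 'stale', 'timeout', or 'error'."""
    proc = subprocess.Popen(
        [sys.executable, "-c", _CHILD, path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        rc = proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Best effort only: a child blocked inside the filesystem may not
        # die right away, so we deliberately do not wait for it here.
        proc.kill()
        return "timeout"
    if rc == 0:
        return "ok"
    if rc == errno.ESTALE:
        return "stale"
    return "error"
```

Running the stat in a separate process means a hung mount blocks only the child, not the health check itself. A single stat() is one metadata operation rather than a directory listing, so it should be cheaper than ls, though the cost with hundreds of nodes probing at once would still need to be measured.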



Thanks,



Alex






_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


