Hi again all,

I received a direct response and am not sure whether that means the sender did 
not want to be identified, but they asked good questions that I wanted to 
answer on list…

No, we do not use snapshots on this filesystem.

No, we’re not using HSM … our tape backup system is a traditional backup system 
not named TSM.  We’ve created a top level directory in the filesystem called 
“RESTORE” and are restoring everything under that … then doing our moves / 
deletes of what we’ve restored … so I *think* that means all of that should be 
written to the gpfs23data pool?!?
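
(For what it’s worth, here’s the spot check I’m planning to run on a couple of freshly restored files to confirm which pool they actually landed in - the path below is just an example, and I’m going from memory on the mmlsattr output wording:

  # pick any file that was just restored under the RESTORE directory (example path)
  # mmlsattr -L prints the file's attributes, including the storage pool it was assigned to
  mmlsattr -L /gpfs23/RESTORE/path/to/some/restored/file | grep -i 'storage pool'

If those all say gpfs23data, then the restores aren’t what’s eating the capacity pool.)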

On the “plus” side, I may figure this out myself soon when someone / something 
starts getting I/O errors!  :-O

In the meantime, other ideas are much appreciated!

Kevin


Do you have a job that’s creating snapshots?  That’s an easy one to overlook.
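
(Something like this would show any that exist, assuming the device name matches the mount point - and I believe -d also reports the space each snapshot is using:

  # list snapshots of the filesystem; -d shows the storage each one occupies
  mmlssnapshot gpfs23 -d
)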

Not sure if you are using an HSM. Any new file that gets created should 
follow the default rule in the ILM policy unless it meets a placement condition. 
It would only be if you’re using an HSM that files would land in a pool other 
than the placement pool, and that is purely because the file’s location has 
already been updated to the capacity pool.
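
(For reference, the default placement rule in a policy file is usually just a 
one-liner along these lines - the rule name here is made up, the pool name is 
from your setup:

  /* hypothetical default placement rule: new files go to gpfs23data unless an earlier placement rule matches */
  RULE 'default' SET POOL 'gpfs23data'

A restore creates new files like any other write, so it should hit that rule.)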




On Thu, Jun 7, 2018 at 8:17 AM -0600, "Buterbaugh, Kevin L" 
<kevin.buterba...@vanderbilt.edu> wrote:

Hi All,

First off, I’m on day 8 of dealing with two different mini-catastrophes at work 
and am therefore very sleep deprived and possibly missing something obvious … 
with that disclaimer out of the way…

We have a filesystem with 3 pools:  1) system (metadata only), 2) gpfs23data 
(the default pool if I run mmlspolicy), and 3) gpfs23capacity (where files with 
an atime - yes, atime - of more than 90 days get migrated by a script that 
runs out of cron each weekend).
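
(For context, the rule our weekend script feeds to mmapplypolicy is roughly of 
this shape - I’m paraphrasing from memory, so the exact WHERE clause may differ 
from what’s actually in the policy file:

  /* rough sketch of the weekend migration rule: push files not accessed in 90+ days to the capacity pool */
  RULE 'age_out' MIGRATE FROM POOL 'gpfs23data'
       TO POOL 'gpfs23capacity'
       WHERE (CURRENT_TIMESTAMP - ACCESS_TIME) > INTERVAL '90' DAYS
)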

However … this morning the free space in the gpfs23capacity pool is dropping … 
I’m down to 0.5 TB free in a 582 TB pool … and I cannot figure out why.  The 
migration script is NOT running … in fact, it’s currently disabled.  So I can 
only think of two possible explanations for this:

1.  There are one or more files already in the gpfs23capacity pool that someone 
has started updating.  Is there a way to check for that … i.e. a way to run 
something like “find /gpfs23 -mtime -7 -ls” but restricted to only files in the 
gpfs23capacity pool?  Marc Kaplan - can mmfind do that??  ;-)  (I’ve also 
pasted a rough policy-based attempt below these two items.)

2.  We are doing a large volume of restores right now because one of the 
mini-catastrophes I’m dealing with is one NSD (gpfs23data pool) down due to an 
issue with the storage array.  We’re working with the vendor to try to resolve 
that but are not optimistic, so we have started doing restores in case they come 
back and tell us it’s not recoverable.  We did run “mmfileid” to identify the 
files that have one or more blocks on the down NSD, but there are so many that 
what we’re doing is actually restoring all the files to an alternate path 
(easier for our tape system), then replacing the corrupted files, then deleting 
any restores we don’t need.  But shouldn’t all of that be going to the 
gpfs23data pool?  I.e. even if we’re restoring files that are in the 
gpfs23capacity pool, shouldn’t the fact that we’re restoring to an alternate 
path (i.e. not overwriting files with the tape restores) and that the default 
pool is gpfs23data mean that nothing is being restored to the gpfs23capacity 
pool???  (I’ve pasted the mmdf spot check I’m running further below.)
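
(Re: item 1 above - while I wait to hear whether mmfind can do it, I’ve been 
sketching a plain policy LIST run that I think approximates “find -mtime -7 
restricted to a pool”.  The rule names and output prefix are made up and I may 
not have the flags exactly right, so treat this as a rough draft:

  /* external list with an empty EXEC: with -I defer, mmapplypolicy just writes the matching file list to disk */
  RULE EXTERNAL LIST 'recent' EXEC ''
  /* files that live in the capacity pool and were modified within the last 7 days */
  RULE 'recent_in_cap' LIST 'recent'
       FROM POOL 'gpfs23capacity'
       WHERE (CURRENT_TIMESTAMP - MODIFICATION_TIME) < INTERVAL '7' DAYS

and then something like:

  # scan the filesystem and write the matching files under the /tmp/recent prefix
  mmapplypolicy /gpfs23 -P recent.pol -I defer -f /tmp/recent
)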

Is there a third explanation I’m not thinking of?
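
(Re: item 2 - the other thing I’m doing while the restores run is watching the 
per-pool numbers with mmdf to see where the new data is actually landing.  I 
believe -P limits the report to a single pool, but someone please correct me if 
I’m misremembering the flag:

  # free / used space broken out for each pool of interest
  mmdf gpfs23 -P gpfs23data
  mmdf gpfs23 -P gpfs23capacity
)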

Thanks...

—
Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and Education
kevin.buterba...@vanderbilt.edu - (615)875-9633




_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
