Re: [gpfsug-discuss] Capacity pool filling

2018-06-07 Thread Marc A Kaplan
If your restore software uses the gpfs_fputattrs() or 
gpfs_fputattrswithpathname() methods, note that there are options to 
control the pool placement. 

AND there is also the possibility of using the little-known "RESTORE" 
policy rule to control the pool selection algorithmically, by different 
criteria than the SET POOL rule.  When all else fails ... Read The Fine 
Manual ;-)
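
For example, a minimal sketch (untested, and the pool name is simply the one 
discussed in this thread) could be as small as:

  /* route files coming back through a policy-aware restore to the data pool,
     regardless of the pool recorded in their saved attributes */
  RULE 'restoreToData' RESTORE TO POOL 'gpfs23data'

installed with mmchpolicy alongside the existing SET POOL rules.  Whether it 
takes effect depends on the restore application actually going through the 
gpfs_fputattrs()/gpfs_fputattrswithpathname() interfaces with the option that 
honors restore policy rules, so check the backup product first.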





From:   "Buterbaugh, Kevin L" 
To: gpfsug main discussion list 
Date:   06/07/2018 03:37 PM
Subject:    Re: [gpfsug-discuss] Capacity pool filling
Sent by:    gpfsug-discuss-boun...@spectrumscale.org



Hi Uwe,

Thanks for your response.

So our restore software lays down the metadata first, then the data. While 
it has no specific knowledge of the extended attributes, it does back them 
up and restore them.  So the only explanation that makes sense to me is 
that since the inode for the file says that the file should be in the 
gpfs23capacity pool, the data gets written there.

Right now I don’t have time to do an analysis of the “live” version of a 
fileset and the “restored” version of that same fileset to see if the 
placement of the files matches up.  My quick and dirty checks seem to show 
files getting written to all 3 pools.  Unfortunately, we have no way to 
tell our tape software to ignore files from the gpfs23capacity pool (and 
we’re aware that we won’t need those files).  We’ve also determined that 
it is actually quicker to tell our tape system to restore all files from a 
fileset than to take the time to tell it to selectively restore only 
certain files … and the same amount of tape would have to be read in 
either case.

Our SysAdmin who is primary on tape backup and restore was going on 
vacation the latter part of the week, so he decided to be helpful and just 
queue up all the restores to run one right after the other.  We didn’t 
realize that, so we are solving our disk space issues by slowing down the 
restores until we can run more instances of the script that replaces the 
corrupted files and deletes the unneeded restored files.

Thanks again…

Kevin

> On Jun 7, 2018, at 1:34 PM, Uwe Falke  wrote:
> 
>> However, I took a look in one of the restore directories under 
>> /gpfs23/RESTORE using mmlsattr and I see files in all 3 pools! 
> 
> 
>> So … I don’t think GPFS is doing this but the next thing I am 
>> going to do is follow up with our tape software vendor … I bet 
>> they preserve the pool attribute on files and - like Jaime said - 
>> old stuff is therefore hitting the gpfs23capacity pool.
> 
> Hm, then the backup/restore must be doing very funny things. Usually, GPFS 
> should rule the placement of new files, and I assume that a restore of a 
> file, in particular under a different name, creates a new file. So, if your 
> backup tool does override that GPFS placement, it must be very intimate 
> with Scale :-). 
> I'd do some list scans of the capacity pool just to see what the files 
> appearing there from tape have in common. 
> If it's really that these files' data were on the capacity pool at the 
> last backup, they should not be affected by your dead NSD and a restore is 
> in vain anyway.
> 
> If that doesn't help or give a clue, then, if the data pool has some more 
> free space, you might try to run an upward/backward migration from 
> capacity to data. 
> 
> And, yeah, as GPFS tends to stripe over all NSDs, all files in data large 
> enough plus some smaller ones would have data on your broken NSD. That's 
> the drawback of parallelization.
> Maybe you'd ask the storage vendor whether they supply some more storage 
> for the fault of their (redundant?) device to alleviate your current 
> storage shortage?
> 
> Mit freundlichen Grüßen / Kind regards
> 
> 
> Dr. Uwe Falke
> 
> IT Specialist
> High Performance Computing Services / Integrated Technology Services / 
> Data Center Services
> ---
> IBM Deutschland
> Rathausstr. 7
> 09111 Chemnitz
> Phone: +49 371 6978 2165
> Mobile: +49 175 575 2877
> E-Mail: uwefa...@de.ibm.com
> ---
> IBM Deutschland Business & Technology Services GmbH / Geschäftsführung: 
> Thomas Wolter, Sven Schooß
> Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, 
> HRB 17122 
> 
> 
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> 
https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7Cacad30699025407bc67b08d5cca54bca%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C1%7C636639932669887596=vywTFbG4O0lquAIAVfa0csdC0HtpvfhY8%2FOjqm98fxI%3D=0


___

Re: [gpfsug-discuss] Capacity pool filling

2018-06-07 Thread Buterbaugh, Kevin L
Hi Uwe,

Thanks for your response.

So our restore software lays down the metadata first, then the data.  While it 
has no specific knowledge of the extended attributes, it does back them up and 
restore them.  So the only explanation that makes sense to me is that since the 
inode for the file says that the file should be in the gpfs23capacity pool, the 
data gets written there.

Right now I don’t have time to do an analysis of the “live” version of a 
fileset and the “restored” version of that same fileset to see if the placement 
of the files matches up.  My quick and dirty checks seem to show files getting 
written to all 3 pools.  Unfortunately, we have no way to tell our tape 
software to ignore files from the gpfs23capacity pool (and we’re aware that we 
won’t need those files).  We’ve also determined that it is actually quicker to 
tell our tape system to restore all files from a fileset than to take the time 
to tell it to selectively restore only certain files … and the same amount of 
tape would have to be read in either case.

Our SysAdmin who is primary on tape backup and restore was going on vacation 
the latter part of the week, so he decided to be helpful and just queue up all 
the restores to run one right after the other.  We didn’t realize that, so we 
are solving our disk space issues by slowing down the restores until we can run 
more instances of the script that replaces the corrupted files and deletes the 
unneeded restored files.

Thanks again…

Kevin

> On Jun 7, 2018, at 1:34 PM, Uwe Falke  wrote:
> 
>> However, I took a look in one of the restore directories under 
>> /gpfs23/RESTORE using mmlsattr and I see files in all 3 pools! 
> 
> 
>> So … I don’t think GPFS is doing this but the next thing I am 
>> going to do is follow up with our tape software vendor … I bet 
>> they preserve the pool attribute on files and - like Jaime said - 
>> old stuff is therefore hitting the gpfs23capacity pool.
> 
> Hm, then the backup/restore must be doing very funny things. Usually, GPFS 
> should rule the 
> placement of new files, and I assume that a restore of a file, in 
> particular under a different name, 
> creates a new file. So, if your backup tool does override that GPFS 
> placement, it must be very 
> intimate with Scale :-). 
> I'd do some list scans of the capacity pool just to see what the files 
> appearing there from tape have in common. 
> If it's really that these files' data were on the capacity pool at the 
> last backup, they should not be affected by your dead NSD and a restore is 
> in vain anyway.
> 
> If that doesn't help or give a clue, then, if the data pool has some more 
> free space, you might try to run an upward/backward migration from 
> capacity to data. 
> 
> And, yeah, as GPFS tends to stripe over all NSDs, all files in data large 
> enough plus some smaller ones would have data on your broken NSD. That's 
> the drawback of parallelization.
> Maybe you'd ask the storage vendor whether they supply some more storage 
> for the fault of their (redundant?) device to alleviate your current 
> storage shortage?
> 
> Mit freundlichen Grüßen / Kind regards
> 
> 
> Dr. Uwe Falke
> 
> IT Specialist
> High Performance Computing Services / Integrated Technology Services / 
> Data Center Services
> ---
> IBM Deutschland
> Rathausstr. 7
> 09111 Chemnitz
> Phone: +49 371 6978 2165
> Mobile: +49 175 575 2877
> E-Mail: uwefa...@de.ibm.com
> ---
> IBM Deutschland Business & Technology Services GmbH / Geschäftsführung: 
> Thomas Wolter, Sven Schooß
> Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, 
> HRB 17122 
> 
> 
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7Cacad30699025407bc67b08d5cca54bca%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C1%7C636639932669887596=vywTFbG4O0lquAIAVfa0csdC0HtpvfhY8%2FOjqm98fxI%3D=0

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Capacity pool filling

2018-06-07 Thread Uwe Falke
> However, I took a look in one of the restore directories under 
> /gpfs23/RESTORE using mmlsattr and I see files in all 3 pools! 


> So … I don’t think GPFS is doing this but the next thing I am 
> going to do is follow up with our tape software vendor … I bet 
> they preserve the pool attribute on files and - like Jaime said - 
> old stuff is therefore hitting the gpfs23capacity pool.

Hm, then the backup/restore must be doing very funny things. Usually, GPFS 
should rule the 
placement of new files, and I assume that a restore of a file, in 
particular under a different name, 
creates a new file. So, if your backup tool does override that GPFS 
placement, it must be very 
intimate with Scale :-). 
I'd do some list scans of the capacity pool just to see what the files 
appearing there from tape have in common. 
If it's really that these files' data were on the capacity pool at the 
last backup, they should not be affected by your dead NSD and a restore is 
in vain anyway.
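
Such a list scan is easy to drive with the policy engine, roughly like the 
following (a sketch only: rule and file names are placeholders, so check the 
ILM documentation before running it):

  /* one line per file in the capacity pool, with owner and age in days,
     to look for a common pattern; EXEC '' together with -I defer just
     writes the list to a file instead of invoking a script */
  RULE EXTERNAL LIST 'capfiles' EXEC ''
  RULE 'scanCap' LIST 'capfiles'
       FROM POOL 'gpfs23capacity'
       SHOW(VARCHAR(USER_ID) || ' ' ||
            VARCHAR(DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) || ' ' ||
            VARCHAR(DAYS(CURRENT_TIMESTAMP) - DAYS(MODIFICATION_TIME)))

  mmapplypolicy gpfs23 -P capscan.pol -I defer -f /tmp/capscan

The list file written under the /tmp/capscan prefix should show quickly 
whether the new arrivals share an owner, an age profile, or a path.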

If that doesn't help or give a clue, then, if the data pool has some more 
free space, you might try to run an upward/backward migration from 
capacity to data. 
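
If you try that, a minimal policy could look roughly like this (again only a 
sketch with the pool names from this thread; the WHERE clause assumes that 
anything with a recent atime does not belong in the capacity pool, and the 
LIMIT keeps the data pool from overfilling):

  RULE 'backToData' MIGRATE
       FROM POOL 'gpfs23capacity'
       TO POOL 'gpfs23data' LIMIT(95)
       WHERE (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) <= 90

  mmapplypolicy gpfs23 -P moveback.pol -I test   # preview the choices first
  mmapplypolicy gpfs23 -P moveback.pol -I yes    # then do the migration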

And, yeah, as GPFS tends to stripe over all NSDs, all files in data large 
enough plus some smaller ones would have data on your broken NSD. That's 
the drawback of parallelization.
Maybe you'd ask the storage vendor whether they supply some more storage 
for the fault of their (redundant?) device to alleviate your current 
storage shortage?
 
Mit freundlichen Grüßen / Kind regards

 
Dr. Uwe Falke
 
IT Specialist
High Performance Computing Services / Integrated Technology Services / 
Data Center Services
---
IBM Deutschland
Rathausstr. 7
09111 Chemnitz
Phone: +49 371 6978 2165
Mobile: +49 175 575 2877
E-Mail: uwefa...@de.ibm.com
---
IBM Deutschland Business & Technology Services GmbH / Geschäftsführung: 
Thomas Wolter, Sven Schooß
Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, 
HRB 17122 


___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Capacity pool filling

2018-06-07 Thread Buterbaugh, Kevin L
Hi again all,

I received a direct response and am not sure whether that means the sender did 
not want to be identified, but they asked good questions that I wanted to 
answer on list…

No, we do not use snapshots on this filesystem.

No, we’re not using HSM … our tape backup system is a traditional backup system 
not named TSM.  We’ve created a top level directory in the filesystem called 
“RESTORE” and are restoring everything under that … then doing our moves / 
deletes of what we’ve restored … so I *think* that means all of that should be 
written to the gpfs23data pool?!?
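
(For a quick spot check, something like the following should show which pool a 
given restored file actually landed in; the path here is only an example, not 
a real one from our tree:

  mmlsattr -L /gpfs23/RESTORE/some/dir/some_restored_file

The "storage pool name" line in the output is the one to look at.)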

On the “plus” side, I may figure this out myself soon when someone / something 
starts getting I/O errors!  :-O

In the meantime, other ideas are much appreciated!

Kevin


Do you have a job that’s creating snapshots?  That’s an easy one to overlook.

Not sure if you are using an HSM. Any new file that gets generated should 
follow the default rule in ILM unless it meets a placement condition. It would 
only be if you’re using an HSM that files would be placed in a pool other than 
the placement pool, and that is purely because the file location has already 
been updated to the capacity pool.




On Thu, Jun 7, 2018 at 8:17 AM -0600, "Buterbaugh, Kevin L" 
<kevin.buterba...@vanderbilt.edu> wrote:

Hi All,

First off, I’m on day 8 of dealing with two different mini-catastrophes at work 
and am therefore very sleep deprived and possibly missing something obvious … 
with that disclaimer out of the way…

We have a filesystem with 3 pools:  1) system (metadata only), 2) gpfs23data 
(the default pool if I run mmlspolicy), and 3) gpfs23capacity (where files with 
an atime - yes atime - of more than 90 days get migrated to by a script that 
runs out of cron each weekend).

However … this morning the free space in the gpfs23capacity pool is dropping … 
I’m down to 0.5 TB free in a 582 TB pool … and I cannot figure out why.  The 
migration script is NOT running … in fact, it’s currently disabled.  So I can 
only think of two possible explanations for this:

1.  There are one or more files already in the gpfs23capacity pool that someone 
has started updating.  Is there a way to check for that … i.e. a way to run 
something like “find /gpfs23 -mtime -7 -ls” but restricted to only files in the 
gpfs23capacity pool.  Marc Kaplan - can mmfind do that??  ;-)

2.  We are doing a large volume of restores right now because one of the 
mini-catastrophes I’m dealing with is one NSD (gpfs23data pool) down due to an 
issue with the storage array.  We’re working with the vendor to try to resolve 
that but are not optimistic so we have started doing restores in case they come 
back and tell us it’s not recoverable.  We did run “mmfileid” to identify the 
files that have one or more blocks on the down NSD, but there are so many that 
what we’re doing is actually restoring all the files to an alternate path 
(easier for our tape system), then replacing the corrupted files, then deleting 
any restores we don’t need.  But shouldn’t all of that be going to the 
gpfs23data pool?  I.e. even if we’re restoring files that are in the 
gpfs23capacity pool shouldn’t the fact that we’re restoring to an alternate 
path (i.e. not overwriting files with the tape restores) and the default pool 
is the gpfs23data pool mean that nothing is being restored to the 
gpfs23capacity pool???

Is there a third explanation I’m not thinking of?

Thanks...

—
Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and Education
kevin.buterba...@vanderbilt.edu - 
(615)875-9633




___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Capacity pool filling

2018-06-07 Thread Jaime Pinto
I think the restore is bringing back a lot of material with atime > 
90, so it is passing through gpfs23data and going directly to 
gpfs23capacity.


I also think you may not have stopped the crontab script as you  
believe you did.


Jaime





Quoting "Buterbaugh, Kevin L" :


Hi All,

First off, I’m on day 8 of dealing with two different mini-catastrophes 
at work and am therefore very sleep deprived and possibly missing 
something obvious … with that disclaimer out of the way…

We have a filesystem with 3 pools:  1) system (metadata only), 2) 
gpfs23data (the default pool if I run mmlspolicy), and 3) gpfs23capacity 
(where files with an atime - yes atime - of more than 90 days get 
migrated to by a script that runs out of cron each weekend).

However … this morning the free space in the gpfs23capacity pool is 
dropping … I’m down to 0.5 TB free in a 582 TB pool … and I cannot 
figure out why.  The migration script is NOT running … in fact, it’s 
currently disabled.  So I can only think of two possible explanations 
for this:

1.  There are one or more files already in the gpfs23capacity pool that 
someone has started updating.  Is there a way to check for that … i.e. 
a way to run something like “find /gpfs23 -mtime -7 -ls” but restricted 
to only files in the gpfs23capacity pool.  Marc Kaplan - can mmfind do 
that??  ;-)

2.  We are doing a large volume of restores right now because one of 
the mini-catastrophes I’m dealing with is one NSD (gpfs23data pool) 
down due to an issue with the storage array.  We’re working with the 
vendor to try to resolve that but are not optimistic so we have started 
doing restores in case they come back and tell us it’s not recoverable.  
We did run “mmfileid” to identify the files that have one or more 
blocks on the down NSD, but there are so many that what we’re doing is 
actually restoring all the files to an alternate path (easier for our 
tape system), then replacing the corrupted files, then deleting any 
restores we don’t need.  But shouldn’t all of that be going to the 
gpfs23data pool?  I.e. even if we’re restoring files that are in the 
gpfs23capacity pool shouldn’t the fact that we’re restoring to an 
alternate path (i.e. not overwriting files with the tape restores) and 
the default pool is the gpfs23data pool mean that nothing is being 
restored to the gpfs23capacity pool???

Is there a third explanation I’m not thinking of?

Thanks...

—
Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and Education
kevin.buterba...@vanderbilt.edu -   
(615)875-9633

 
  TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials
 
---
Jaime Pinto - Storage Analyst
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of Toronto.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Capacity pool filling

2018-06-07 Thread Buterbaugh, Kevin L
Hi All,

So in trying to prove Jaime wrong I proved him half right … the cron job is 
stopped:

#13 22 * * 5 /root/bin/gpfs_migration.sh

However, I took a look in one of the restore directories under /gpfs23/RESTORE 
using mmlsattr and I see files in all 3 pools!  So that explains why the 
capacity pool is filling, but mmlspolicy says:

Policy for file system '/dev/gpfs23':
   Installed by root@gpfsmgr on Wed Jan 25 10:17:01 2017.
   First line of policy 'gpfs23.policy' is:
RULE 'DEFAULT' SET POOL 'gpfs23data'

So … I don’t think GPFS is doing this but the next thing I am going to do is 
follow up with our tape software vendor … I bet they preserve the pool 
attribute on files and - like Jaime said - old stuff is therefore hitting the 
gpfs23capacity pool.

Thanks Jaime and everyone else who has responded so far…

Kevin

> On Jun 7, 2018, at 9:53 AM, Jaime Pinto  wrote:
> 
> I think the restore is bringing back a lot of material with atime > 90, so 
> it is passing through gpfs23data and going directly to gpfs23capacity.
> 
> I also think you may not have stopped the crontab script as you believe you 
> did.
> 
> Jaime
> 
> Quoting "Buterbaugh, Kevin L" :
> 
>> Hi All,
>> 
>> First off, I’m on day 8 of dealing with two different mini-catastrophes at 
>> work and am therefore very sleep deprived and possibly missing something 
>> obvious … with that disclaimer out of the way…
>> 
>> We have a filesystem with 3 pools:  1) system (metadata only), 2) gpfs23data 
>> (the default pool if I run mmlspolicy), and 3) gpfs23capacity (where files 
>> with an atime - yes atime - of more than 90 days get migrated to by a script 
>> that runs out of cron each weekend).
>> 
>> However … this morning the free space in the gpfs23capacity pool is dropping 
>> … I’m down to 0.5 TB free in a 582 TB pool … and I cannot figure out why.  
>> The migration script is NOT running … in fact, it’s currently disabled.  So 
>> I can only think of two possible explanations for this:
>> 
>> 1.  There are one or more files already in the gpfs23capacity pool that 
>> someone has started updating.  Is there a way to check for that … i.e. a way 
>> to run something like “find /gpfs23 -mtime -7 -ls” but restricted to only 
>> files in the gpfs23capacity pool.  Marc Kaplan - can mmfind do that??  ;-)
>> 
>> 2.  We are doing a large volume of restores right now because one of the 
>> mini-catastrophes I’m dealing with is one NSD (gpfs23data pool) down due to 
>> an issue with the storage array.  We’re working with the vendor to try to 
>> resolve that but are not optimistic so we have started doing restores in 
>> case they come back and tell us it’s not recoverable.  We did run “mmfileid” 
>> to identify the files that have one or more blocks on the down NSD, but 
>> there are so many that what we’re doing is actually restoring all the files 
>> to an alternate path (easier for our tape system), then replacing the 
>> corrupted files, then deleting any restores we don’t need.  But shouldn’t 
>> all of that be going to the gpfs23data pool?  I.e. even if we’re restoring 
>> files that are in the gpfs23capacity pool shouldn’t the fact that we’re 
>> restoring to an alternate path (i.e. not overwriting files with the tape 
>> restores) and the default pool is the gpfs23data pool mean that nothing is 
>> being restored to the gpfs23capacity pool???
>> 
>> Is there a third explanation I’m not thinking of?
>> 
>> Thanks...
>> 
>> —
>> Kevin Buterbaugh - Senior System Administrator
>> Vanderbilt University - Advanced Computing Center for Research and Education
>> kevin.buterba...@vanderbilt.edu -  
>> (615)875-9633
>> 
>> 
> 
> 
> 
> 
> 
> 
> 
>  TELL US ABOUT YOUR SUCCESS STORIES
> 
> https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.scinethpc.ca%2Ftestimonials=02%7C01%7CKevin.Buterbaugh%40Vanderbilt.Edu%7C9154807425ab4316f58f08d5cc866774%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C63663970107084=VUOqjEJ%2FWt8VI%2BWolWbpa1snbLx85XFJvc0sZPuI86Q%3D=0
> 
> ---
> Jaime Pinto - Storage Analyst
> SciNet HPC Consortium - Compute/Calcul Canada
> https://na01.safelinks.protection.outlook.com/?url=www.scinet.utoronto.ca=02%7C01%7CKevin.Buterbaugh%40Vanderbilt.Edu%7C9154807425ab4316f58f08d5cc866774%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C63663970107084=3PxI2hAdhUOJZp5d%2BjxOu1N0BoQr8X5K8xZG%2BcONjEU%3D=0
>  - 
> https://na01.safelinks.protection.outlook.com/?url=www.computecanada.ca=02%7C01%7CKevin.Buterbaugh%40Vanderbilt.Edu%7C9154807425ab4316f58f08d5cc866774%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C63663970107084=JxtEYIN5%2FYiDf3GKa5ZBP3JiC27%2F%2FGiDaRbX5PnWEGU%3D=0
> University of Toronto
> 661 University Ave. (MaRS), Suite 1140
> Toronto, ON, M5G1M1
> P: 416-978-2755
> C: 416-505-1477
> 
> 

Re: [gpfsug-discuss] Capacity pool filling

2018-06-07 Thread Stephen Ulmer

> On Jun 7, 2018, at 10:16 AM, Buterbaugh, Kevin L 
>  wrote:
> 
> Hi All,
> 
> First off, I’m on day 8 of dealing with two different mini-catastrophes at 
> work and am therefore very sleep deprived and possibly missing something 
> obvious … with that disclaimer out of the way…
> 
> We have a filesystem with 3 pools:  1) system (metadata only), 2) gpfs23data 
> (the default pool if I run mmlspolicy), and 3) gpfs23capacity (where files 
> with an atime - yes atime - of more than 90 days get migrated to by a script 
> that runs out of cron each weekend).
> 
> However … this morning the free space in the gpfs23capacity pool is dropping 
> … I’m down to 0.5 TB free in a 582 TB pool … and I cannot figure out why.  
> The migration script is NOT running … in fact, it’s currently disabled.  So I 
> can only think of two possible explanations for this:
> 
> 1.  There are one or more files already in the gpfs23capacity pool that 
> someone has started updating.  Is there a way to check for that … i.e. a way 
> to run something like “find /gpfs23 -mtime -7 -ls” but restricted to only 
> files in the gpfs23capacity pool.  Marc Kaplan - can mmfind do that??  ;-)

Any files that have been opened in that pool will have a recent atime (you’re 
moving them there because they have a not-recent atime, so this should be an 
anomaly). Further, they should have an mtime that is older than 90 days, too. 
You could ask the policy engine which ones have been open/written in the last 
day-ish and maybe see a pattern?
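
Something along these lines should answer that (a sketch; rule and file names 
are placeholders):

  /* files in the capacity pool written or read within the last 7 days */
  RULE EXTERNAL LIST 'recentCap' EXEC ''
  RULE 'recent' LIST 'recentCap'
       FROM POOL 'gpfs23capacity'
       WHERE (DAYS(CURRENT_TIMESTAMP) - DAYS(MODIFICATION_TIME)) <= 7
          OR (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) <= 7

  mmapplypolicy gpfs23 -P recent.pol -I defer -f /tmp/recentcap

That is essentially the "find /gpfs23 -mtime -7" from the original question, 
restricted to one pool.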

> 2.  We are doing a large volume of restores right now because one of the 
> mini-catastrophes I’m dealing with is one NSD (gpfs23data pool) down due to an 
> issue with the storage array.  We’re working with the vendor to try to 
> resolve that but are not optimistic so we have started doing restores in case 
> they come back and tell us it’s not recoverable.  We did run “mmfileid” to 
> identify the files that have one or more blocks on the down NSD, but there 
> are so many that what we’re doing is actually restoring all the files to an 
> alternate path (easier for our tape system), then replacing the corrupted 
> files, then deleting any restores we don’t need.  But shouldn’t all of that 
> be going to the gpfs23data pool?  I.e. even if we’re restoring files that are 
> in the gpfs23capacity pool shouldn’t the fact that we’re restoring to an 
> alternate path (i.e. not overwriting files with the tape restores) and the 
> default pool is the gpfs23data pool mean that nothing is being restored to 
> the gpfs23capacity pool???
> 

If you are restoring them (as opposed to recalling them), they are different 
files that happen to have similar contents to some other files.


> —
> Kevin Buterbaugh - Senior System Administrator
> Vanderbilt University - Advanced Computing Center for Research and Education
> kevin.buterba...@vanderbilt.edu  - 
> (615)875-9633
> 
> 
> 
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


[gpfsug-discuss] Capacity pool filling

2018-06-07 Thread Buterbaugh, Kevin L
Hi All,

First off, I’m on day 8 of dealing with two different mini-catastrophes at work 
and am therefore very sleep deprived and possibly missing something obvious … 
with that disclaimer out of the way…

We have a filesystem with 3 pools:  1) system (metadata only), 2) gpfs23data 
(the default pool if I run mmlspolicy), and 3) gpfs23capacity (where files with 
an atime - yes atime - of more than 90 days get migrated to by a script that 
runs out of cron each weekend).
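
(A migration of that kind is normally expressed as a policy rule of roughly 
this shape; the following is a reconstruction from the description above, not 
the actual contents of the cron script:

  RULE 'ageOut' MIGRATE
       FROM POOL 'gpfs23data'
       TO POOL 'gpfs23capacity'
       WHERE (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) > 90

and typically driven by mmapplypolicy.)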

However … this morning the free space in the gpfs23capacity pool is dropping … 
I’m down to 0.5 TB free in a 582 TB pool … and I cannot figure out why.  The 
migration script is NOT running … in fact, it’s currently disabled.  So I can 
only think of two possible explanations for this:

1.  There are one or more files already in the gpfs23capacity pool that someone 
has started updating.  Is there a way to check for that … i.e. a way to run 
something like “find /gpfs23 -mtime -7 -ls” but restricted to only files in the 
gpfs23capacity pool.  Marc Kaplan - can mmfind do that??  ;-)

2.  We are doing a large volume of restores right now because one of the 
mini-catastrophes I’m dealing with is one NSD (gpfs23data pool) down due to an 
issue with the storage array.  We’re working with the vendor to try to resolve 
that but are not optimistic so we have started doing restores in case they come 
back and tell us it’s not recoverable.  We did run “mmfileid” to identify the 
files that have one or more blocks on the down NSD, but there are so many that 
what we’re doing is actually restoring all the files to an alternate path 
(easier for our tape system), then replacing the corrupted files, then deleting 
any restores we don’t need.  But shouldn’t all of that be going to the 
gpfs23data pool?  I.e. even if we’re restoring files that are in the 
gpfs23capacity pool shouldn’t the fact that we’re restoring to an alternate 
path (i.e. not overwriting files with the tape restores) and the default pool 
is the gpfs23data pool mean that nothing is being restored to the 
gpfs23capacity pool???

Is there a third explanation I’m not thinking of?

Thanks...

—
Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and Education
kevin.buterba...@vanderbilt.edu - 
(615)875-9633



___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss