Hi All,

So in trying to prove Jaime wrong, I proved him half right … the cron job is 
stopped:

#13 22 * * 5 /root/bin/gpfs_migration.sh

However, I took a look in one of the restore directories under /gpfs23/RESTORE 
using mmlsattr and I see files in all 3 pools!  So that explains why the 
capacity pool is filling, but mmlspolicy says:

Policy for file system '/dev/gpfs23':
   Installed by root@gpfsmgr on Wed Jan 25 10:17:01 2017.
   First line of policy 'gpfs23.policy' is:
RULE 'DEFAULT' SET POOL 'gpfs23data'
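
For anyone who wants to repeat the pool check, something along these lines 
should tally files per pool under the restore tree (rough sketch; the exact 
"storage pool name:" label in mmlsattr -L output may vary by release):

# Rough sketch: count files per storage pool under the restore tree
find /gpfs23/RESTORE -type f -print0 | \
  xargs -0 -r mmlsattr -L 2>/dev/null | \
  awk '/storage pool name/ {pool[$NF]++} END {for (p in pool) print p, pool[p]}'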

So … I don’t think GPFS itself is doing this, but the next thing I am going to 
do is follow up with our tape software vendor … I bet they preserve the pool 
attribute on files and - like Jaime said - old stuff is therefore hitting the 
gpfs23capacity pool.
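
If the vendor confirms that, I'm guessing a one-off mmapplypolicy run could 
push the restored files back into gpfs23data - untested sketch, pool names and 
restore path as above, and the policy file name is just a placeholder:

# Untested sketch - dry run with -I test first, then rerun with -I yes
cat > /tmp/restore_fixup.pol <<'EOF'
RULE 'restore_fixup' MIGRATE
  FROM POOL 'gpfs23capacity'
  TO POOL 'gpfs23data'
  WHERE PATH_NAME LIKE '/gpfs23/RESTORE/%'
EOF
mmapplypolicy /gpfs23/RESTORE -P /tmp/restore_fixup.pol -I test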

Thanks Jaime and everyone else who has responded so far…

Kevin

> On Jun 7, 2018, at 9:53 AM, Jaime Pinto <pi...@scinet.utoronto.ca> wrote:
> 
> I think the restore is bringing back a lot of material with atime > 90 days, so 
> it is passing through gpfs23data and going directly to gpfs23capacity.
> 
> I also think you may not have stopped the crontab script as you believe you 
> did.
> 
> Jaime
> 
> Quoting "Buterbaugh, Kevin L" <kevin.buterba...@vanderbilt.edu>:
> 
>> Hi All,
>> 
>> First off, I'm on day 8 of dealing with two different mini-catastrophes at 
>> work and am therefore very sleep deprived and possibly missing something 
>> obvious … with that disclaimer out of the way…
>> 
>> We have a filesystem with 3 pools:  1) system (metadata only), 2) 
>> gpfs23data (the default pool, per mmlspolicy), and 3) gpfs23capacity 
>> (where files with an atime - yes, atime - of more than 90 days get migrated 
>> by a script that runs out of cron each weekend).
>> 
>> However … this morning the free space in the gpfs23capacity pool is 
>> dropping … I'm down to 0.5 TB free in a 582 TB pool … and I cannot figure 
>> out why.  The migration script is NOT running … in fact, it's currently 
>> disabled.  So I can only think of two possible explanations for this:
>> 
>> 1.  There are one or more files already in the gpfs23capacity pool that 
>> someone has started updating.  Is there a way to check for that … i.e. a 
>> way to run something like "find /gpfs23 -mtime -7 -ls" but restricted to 
>> only files in the gpfs23capacity pool?  Marc Kaplan - can mmfind do that??  
>> ;-)
>> 
>> 2.  We are doing a large volume of restores right now because one of the 
>> mini-catastrophes I'm dealing with is one NSD (gpfs23data pool) down due to 
>> an issue with the storage array.  We're working with the vendor to try to 
>> resolve that but are not optimistic, so we have started doing restores in 
>> case they come back and tell us it's not recoverable.  We did run 
>> mmfileid to identify the files that have one or more blocks on the down 
>> NSD, but there are so many that what we're doing is actually restoring all 
>> the files to an alternate path (easier for our tape system), then replacing 
>> the corrupted files, then deleting any restores we don't need.  But 
>> shouldn't all of that be going to the gpfs23data pool?  I.e. even if we're 
>> restoring files that are in the gpfs23capacity pool, shouldn't the fact that 
>> we're restoring to an alternate path (i.e. not overwriting files with the 
>> tape restores) and that the default pool is gpfs23data mean that 
>> nothing is being restored to the gpfs23capacity pool???
>> 
>> Is there a third explanation I'm not thinking of?
>> 
>> Thanks...
>> 
>> --
>> Kevin Buterbaugh - Senior System Administrator
>> Vanderbilt University - Advanced Computing Center for Research and Education
>> kevin.buterba...@vanderbilt.edu - (615)875-9633
>> 
> 
>         ************************************
>          TELL US ABOUT YOUR SUCCESS STORIES
>         
>         http://www.scinethpc.ca/testimonials
>         ************************************
> ---
> Jaime Pinto - Storage Analyst
> SciNet HPC Consortium - Compute/Calcul Canada
> www.scinet.utoronto.ca - www.computecanada.ca
> University of Toronto
> 661 University Ave. (MaRS), Suite 1140
> Toronto, ON, M5G1M1
> P: 416-978-2755
> C: 416-505-1477
> 
> ----------------------------------------------------------------
> This message was sent using IMP at SciNet Consortium, University of Toronto.
> 

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
