I hadn’t seen a response yet, but here’s one thing that might make your decision 
on this question easier:
“But since ALL of the files in the capacity pool haven’t even been looked at in 
at least 90 days already, does it really matter?  I.e. should I just add the 
NSDs to the capacity pool and be done with it?”

Does the performance matter for accessing files in this capacity pool?

If not, then just add it in.

If it does, then you’ll need to consider the performance you’ll get from the 
NSDs that still have free space for new data once the smaller NSDs fill up.  If 
that’s sufficient, then just add it in.  Old data will still be spread across 
the existing storage in the capacity pool, so reads of that data will see the 
same performance you get today.
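
If you do decide to just add it in, here’s a minimal sketch of what that could 
look like (the device, NSD, server, and failure group values below are 
placeholders, not anything from your environment; adjust usage, failure group, 
and the filesystem device to suit):

  # hypothetical stanza file (new_nsds.stanza) -- all names are placeholders
  %nsd:
    device=/dev/mapper/jbod_lun1
    nsd=jbod_nsd1
    servers=nsdserver1,nsdserver2
    usage=dataOnly
    failureGroup=10
    pool=capacity

  # create the NSDs, then add them to the existing filesystem's capacity pool
  mmcrnsd -F new_nsds.stanza
  mmadddisk <fs-device> -F new_nsds.stanza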

By creating a new pool, oc, and then migrating data that hasn’t been accessed 
in over a year from the capacity pool to it, you’re freeing up space in the 
capacity pool for newly arriving data.  That really only seems to be a benefit 
if the capacity pool is significantly faster than the oc pool and your users 
need that performance to satisfy their application workloads.

Of course, moving data around on a regular basis has an impact on overall 
performance while those operations are running, but maybe there are times when 
the system is idle and they won’t cause any real performance heartburn.
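
If you do end up shuffling data on a regular schedule, one way to keep that out 
of peak hours (just a sketch; the paths, filesystem device, node list, and 
schedule below are all assumptions) is to drive the policy run from cron:

  # crontab entry: run the migration policy early Sunday morning (placeholder script path)
  0 2 * * 0  /root/bin/run_migration.sh

  # run_migration.sh: restrict the policy work to the NSD servers
  # (filesystem device, policy file, node names, and work directory are placeholders)
  /usr/lpp/mmfs/bin/mmapplypolicy <fs-device> -P /path/to/migrate.pol -N nsdserver1,nsdserver2 -g /gpfs/tmp -I yes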

I think Marc will have to answer your other question… ;o)

Hope that helps!
-Bryan

From: [email protected] 
<[email protected]> On Behalf Of Buterbaugh, Kevin L
Sent: Monday, December 17, 2018 4:02 PM
To: gpfsug main discussion list <[email protected]>
Subject: [gpfsug-discuss] Couple of questions related to storage pools and 
mmapplypolicy

Hi All,

As those of you who suffered thru my talk at SC18 already know, we’re really 
short on space on one of our GPFS filesystems as the output of mmdf piped to 
grep pool shows:

Disks in storage pool: system (Maximum disk size allowed is 24 TB)
(pool total)           4.318T                                1.078T ( 25%)        79.47G ( 2%)
Disks in storage pool: data (Maximum disk size allowed is 262 TB)
(pool total)           494.7T                                38.15T (  8%)        4.136T ( 1%)
Disks in storage pool: capacity (Maximum disk size allowed is 519 TB)
(pool total)           640.2T                                14.56T (  2%)        716.4G ( 0%)

The system pool is metadata only.  The data pool is the default pool.  The 
capacity pool is where files with an atime (yes, atime) > 90 days get migrated.  
The capacity pool consists of NSDs that are 8+2P RAID 6 LUNs of 8 TB drives, so 
roughly 58.2 TB of usable space per NSD.
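
In case anyone wants to check my math, that figure is just the decimal-to-binary 
conversion (mmdf reports binary units):

  echo "8 * 8 * 10^12 / 2^40" | bc -l    # 8 data drives x 8 TB ~= 58.2 TiB per LUN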

We have the new storage we purchased, but that’s still being tested and held in 
reserve for after the first of the year when we create a new GPFS 5 formatted 
filesystem and start migrating everything to the new filesystem.

In the meantime, we have also purchased a 60-bay JBOD and 30 x 12 TB drives and 
will be hooking it up to one of our existing storage arrays on Wednesday.  My 
plan is to create another three 8+2P RAID 6 LUNs and present those to GPFS as 
NSDs.  They will be about 88 TB of usable space each (because … beginning rant 
… a 12 TB drive is < 11 TB in size … and don’t get me started on so-called 
“4K” TVs … end rant).
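
Same math for the new LUNs, which is also where the rant comes from:

  echo "8 * 12 * 10^12 / 2^40" | bc -l    # ~= 87.3 TiB usable per new LUN
  echo "12 * 10^12 / 2^40" | bc -l        # a "12 TB" drive ~= 10.9 TiB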

A very wise man who used to work at IBM but now hangs out with people in red 
polos (<grin>) once told me that it’s OK to mix NSDs of slightly different 
sizes in the same pool, but you don’t want to put NSDs of vastly different 
sizes in the same pool because the smaller ones will fill first and then the 
larger ones will have to take all the I/O.  I consider 58 TB and 88 TB to be 
pretty significantly different and am therefore planning on creating yet 
another pool called “oc” (over capacity if a user asks, old crap internally!) 
and migrating files with an atime greater than, say, 1 year to that pool.  But 
since ALL of the files in the capacity pool haven’t even been looked at in at 
least 90 days already, does it really matter?  I.e. should I just add the NSDs 
to the capacity pool and be done with it?

If it’s a good idea to create another pool, then I have a question about 
mmapplypolicy and migrations.  I believe I understand how things work, but 
after spending over an hour looking at the documentation I cannot find anything 
that explicitly confirms my understanding … so if I have another pool called oc 
that’s ~264 TB in size and I write a policy file that looks like:

define(access_age,(DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)))

RULE 'ReallyOldStuff'
  MIGRATE FROM POOL 'capacity'
  TO POOL 'oc'
  LIMIT(98)
  SIZE(KB_ALLOCATED/NLINK)
  WHERE ((access_age > 365) AND (KB_ALLOCATED > 3584))

RULE 'OldStuff'
  MIGRATE FROM POOL 'data'
  TO POOL 'capacity'
  LIMIT(98)
  SIZE(KB_ALLOCATED/NLINK)
  WHERE ((access_age > 90) AND (KB_ALLOCATED > 3584))

Keeping in mind that my capacity pool is already 98% full, is mmapplypolicy 
smart enough to calculate how much space the “ReallyOldStuff” rule is going to 
free up in the capacity pool, and therefore potentially also move a ton of 
stuff from the data pool to the capacity pool via the second rule in just one 
invocation of mmapplypolicy?  That’s what I expect it will do.  I’m hoping I 
don’t have to run mmapplypolicy twice … first to move stuff from capacity to 
oc, and then a second time for it to realize, oh, I’ve got a bunch of space 
free in the capacity pool now.
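
I suppose one way I could check before committing to anything would be a dry 
run (the filesystem device and policy file names below are just placeholders), 
which evaluates the rules and reports what it would migrate per pool without 
actually moving any data:

  # dry run: evaluate the rules and show what would be chosen, but move nothing
  /usr/lpp/mmfs/bin/mmapplypolicy <fs-device> -P policy.pol -I test -L 2

If that test pass only plans the capacity -> oc moves, then presumably I’d need 
a second invocation for the data -> capacity rule.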

Thanks in advance...

Kevin

P.S.  In case you’re scratching your head over the fact that we have files out 
there that people haven’t even looked at for months and months (more than a 
year in some cases) … we sell quota in 1 TB increments, and once they’ve bought 
the quota, it’s theirs.  As long as they’re paying us the monthly fee, if they 
want to keep files out there relating to research they did during the George 
Bush Presidency (and I mean Bush 41, not Bush 43), then that’s their choice.  
We do not purge files.

—
Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and Education
[email protected]<mailto:[email protected]> - 
(615)875-9633




_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
