Dear Marc,

Well, since I cannot simply "move" dependent filesets between independent
ones, and our customers must have the opportunity to change the data
protection policy for their Containers at any given time, I cannot map them
to a "backed up" or "not backed up" independent fileset.


So what is the performance impact of, let's say, 1-10 exclude.dir directives
per independent fileset?
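(For reference, I mean directives like the following in the Spectrum Protect client's include-exclude options file; the fileset junction paths are made up for illustration:)

```
* Each EXCLUDE.DIR prunes one dependent fileset's directory tree
* from the backup; paths below are hypothetical.
EXCLUDE.DIR /gpfs/fs1/indep01/project_a
EXCLUDE.DIR /gpfs/fs1/indep01/project_b
```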


Many thanks in advance.

Best Regards,

Stephan Peinkofer


________________________________
From: [email protected] 
<[email protected]> on behalf of Marc A Kaplan 
<[email protected]>
Sent: Tuesday, August 14, 2018 5:31 PM
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] GPFS Independent Fileset Limit vs Quotas?

True, mmbackup is designed to work best backing up either a single independent 
fileset or the entire file system.  So if you know some filesets do not need to 
be backed up, map them to one or more independent filesets that will not be 
backed up.
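A minimal sketch of what that looks like, assuming a hypothetical independent fileset with its junction at /gpfs/fs1/backup01:

```shell
# Back up one independent fileset by pointing mmbackup at its junction
# path and limiting the scan to that fileset's inode space.
mmbackup /gpfs/fs1/backup01 --scope inodespace -t incremental
```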

mmapplypolicy is happy to scan a single dependent fileset: use the option 
--scope fileset and make the primary argument the path to the root of the 
fileset you wish to scan.  The overhead is not simply described.  The 
directory scan phase will explore or walk the (sub)tree in parallel with 
multiple threads on multiple nodes, reading just the directory blocks that 
need to be read.

The inodescan phase will read blocks of inodes from the given inodespace.  
Since the inodes of a dependent fileset may be "mixed" into the same blocks 
as those of other dependent filesets in the same independent fileset, 
mmapplypolicy will incur what you might consider "extra" overhead.




From:        "Peinkofer, Stephan" <[email protected]>
To:        gpfsug main discussion list <[email protected]>
Date:        08/14/2018 12:50 AM
Subject:        Re: [gpfsug-discuss] GPFS Independent Fileset Limit vs Quotas?
Sent by:        [email protected]
________________________________



Dear Marc,


If you "must" exceed 1000 filesets because you are assigning each project to 
its own fileset, my suggestion is this:

Yes, there are scaling/performance/manageability benefits to using mmbackup 
over independent filesets.

But maybe you don't need 10,000 independent filesets --
maybe you can hash or otherwise randomly assign projects that each have their 
own (dependent) fileset name to a lesser number of independent filesets that 
will serve as management groups for (mm)backup...
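A sketch of such a hash-based mapping, assuming (hypothetically) eight independent "backup group" filesets named indep_backup_0 through indep_backup_7:

```shell
# Deterministically map a project (dependent fileset) name to one of 8
# independent filesets that serve as backup management groups.
project="project_a"
group=$(( 0x$(printf '%s' "$project" | md5sum | cut -c1-8) % 8 ))
echo "indep_backup_$group"   # the same project name always lands in the same group
```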

OK, if that might be doable, what's then the performance impact of having to 
specify include/exclude lists for each independent fileset in order to specify 
which dependent filesets should be backed up and which not?
I don’t remember exactly, but I think I’ve heard at some point that 
include/exclude and mmbackup have to be used with caution. And the same 
question holds for running mmapplypolicy for a “job” on a single dependent 
fileset: is the scan runtime linear in the size of the underlying independent 
fileset, or are there optimisations when I just want to scan a 
subfolder/dependent fileset of an independent one?

Like many things in life, sometimes compromises are necessary!

Hmm, can I reference this next time, when we negotiate Scale License pricing 
with the ISS sales people? ;)

Best Regards,
Stephan Peinkofer
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


