Dear Marc,

Well, the primary reasons for us are:

- Per-fileset quotas (as far as I know, these also work for dependent filesets)

- Per-user-per-fileset quotas (these seem to work only for independent filesets; see the sketch after this list)

- The dedicated inode space, which speeds up mmapplypolicy runs that only need to scan a specific part of the file system

- Scaling mmbackup economically by backing up different filesets to different TSM servers (sketched further below, in the DSS description)
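
To make the first two points concrete, here is a minimal sketch of how such a project fileset could be provisioned (Python around the Scale CLI; the file system, fileset, and user names are made-up placeholders, not our real values):

    #!/usr/bin/env python3
    # Sketch: provision an independent fileset with a fileset quota and a
    # per-user quota inside it. All names below are placeholders.
    import subprocess

    def run(cmd):
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    fs, fset, user = "gpfs0", "proj_abc", "di29abc"

    # "--inode-space new" makes the fileset independent, i.e. it gets its
    # own inode space; this is what per-user quotas and scoped
    # mmapplypolicy/mmbackup runs depend on.
    run(["mmcrfileset", fs, fset, "--inode-space", "new"])
    run(["mmlinkfileset", fs, fset, "-J", "/gpfs/%s/%s" % (fs, fset)])

    # Per-fileset block quota (soft:hard); this also works for dependent filesets.
    run(["mmsetquota", "%s:%s" % (fs, fset), "--block", "10T:11T"])

    # Per-user quota scoped to this fileset; independent filesets only.
    run(["mmsetquota", "%s:%s" % (fs, fset), "--user", user, "--block", "500G:550G"])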


We currently have more than 1000 projects on our HPC machines and several 
existing and planned file systems (use cases):


HPC WORK: Every project gets - for the lifetime of the project - a 
dedicated storage area with a fileset quota attached to it; no further 
per-user or per-group quotas are applied. No backup is taken.


Data Science Storage: This is for long-term online and collaborative storage. 
Here, projects can get so-called "DSS Containers" to which they can grant 
arbitrary users access via a self-service interface (a little bit like 
Dropbox). Each of these DSS Containers is implemented as an independent 
fileset, so that projects can also specify a per-user quota for invited 
users, we can back up each container efficiently into a different TSM node 
via mmbackup, and we can run different actions on a DSS Container using 
mmapplypolicy (see the sketch below). We also plan to offer users the option 
of enabling snapshots on their containers. We currently run a 2PB file 
system for this and are in the process of bringing up two additional 10PB 
file systems, but we already have requests asking what it would mean if we 
had to scale this to 50PB.
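
This is roughly what the per-container operations look like (again a sketch; the TSM server name, policy file path, and snapshot name are invented for the example):

    #!/usr/bin/env python3
    # Sketch: per-container backup, policy run and snapshot for one DSS
    # Container. Server/policy/snapshot names are invented for illustration.
    import subprocess

    def run(cmd):
        subprocess.run(cmd, check=True)

    fs, container = "gpfs0", "dss_proj42"
    junction = "/gpfs/%s/%s" % (fs, container)

    # Back up only this container's inode space to "its" TSM server.
    run(["mmbackup", junction, "-t", "incremental",
         "--scope", "inodespace", "--tsm-servers", "TSM_SERVER_07"])

    # Run a policy over this container only, not the whole file system.
    run(["mmapplypolicy", junction, "-P", "/etc/dss/policies/proj42.pol",
         "--scope", "inodespace"])

    # Optional fileset-level snapshot ("Fileset:Snapshot" syntax).
    run(["mmcrsnapshot", fs, "%s:snap_daily" % container])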


Data Science Archive (planned): This is for long-term archive storage. The 
usage model will be similar to DSS, but underneath we plan to use TSM/HSM.


Another point where people might hit the limit - though I don't remember 
the details off the top of my head - is the OpenStack Manila integration. 
As far as I know, your Manila driver creates an independent fileset for 
each network share in order to provide the per-share snapshot feature. So 
if someone tries to use ISS as Manila storage in a bigger OpenStack cloud, 
the 1000 fileset limit might hit them as well.
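
By the way, if someone wants to check how close a file system already is to the limit, counting distinct inode spaces should do it. A sketch (it assumes mmlsfileset supports the machine-readable -Y output on your version and that the field is named "inodeSpace" - check the HEADER line):

    #!/usr/bin/env python3
    # Sketch: count independent filesets (= distinct inode spaces) in a
    # file system. The field name "inodeSpace" is an assumption; verify it
    # against the HEADER record of your Scale version.
    import subprocess

    LIMIT = 1000  # the independent fileset limit discussed in this thread

    out = subprocess.run(["mmlsfileset", "gpfs0", "-L", "-Y"],
                         check=True, capture_output=True, text=True).stdout

    rows = [line.split(":") for line in out.splitlines() if line]
    header = next(r for r in rows if "HEADER" in r)
    col = header.index("inodeSpace")

    spaces = {r[col] for r in rows if "HEADER" not in r and len(r) > col}
    print("%d of %d independent filesets in use" % (len(spaces), LIMIT))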


Best Regards,

Stephan Peinkofer


________________________________
From: [email protected] 
<[email protected]> on behalf of Marc A Kaplan 
<[email protected]>
Sent: Friday, August 10, 2018 3:02 PM
To: gpfsug main discussion list
Cc: [email protected]; Doris Franke; Uwe Tron; Dorian 
Krause
Subject: Re: [gpfsug-discuss] GPFS Independent Fileset Limit

Questions:  How/why was the decision made to use a large number (~1000) of 
independent filesets ?
What functions/features/commands are being used that work with independent 
filesets, that do not also work with "dependent" filesets?

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
