Can you please share the entire command line you are using? The GPFS version and mmlsconfig output would help as well, as would knowing whether this is a shared-storage filesystem or a system using local disks.
thx. Sven

On Wed, Sep 13, 2017 at 5:19 PM <[email protected]> wrote:

> So we have a number of very similar policy files that get applied for file
> migration etc. And they vary drastically in the runtime to process,
> apparently due to different selections on whether to do the work in parallel.
>
> Running a set of rules with 'mmapplypolicy -I defer' that look like this:
>
> RULE 'VBI_FILES_RULE' MIGRATE FROM POOL 'system'
>      THRESHOLD(0,100,0)
>      WEIGHT(FILE_SIZE)
>      TO POOL 'VBI_FILES'
>      FOR FILESET('vbi')
>      WHERE (mb_allocated >= 8)
>
> for 10 filesets can scan 325M directory entries in 6 minutes, and sort and
> evaluate the policy in 3 more minutes.
>
> However, this takes a bit over 30 minutes for the scan and another 20 for
> sorting and policy evaluation over the same set of filesets:
>
> RULE 'VBI_FILES_RULE' LIST 'pruned_files'
>      THRESHOLD(90,80)
>      WEIGHT(FILE_SIZE)
>      FOR FILESET('vbi')
>      WHERE (mb_allocated >= 8)
>
> even though the output is essentially identical. Why is LIST so much more
> expensive than 'MIGRATE' with '-I defer'? I could understand if I had an
> expensive SHOW clause, but there isn't one here (and a different policy that
> I run that *does* have a big SHOW clause takes almost the same amount of
> time as the minimal LIST)....
>
> I'm thinking that it has *something* to do with the MIGRATE job outputting:
>
> [I] 2017-09-12@21:20:44.155 Parallel-piped sort and policy evaluation. 0 files scanned.
> (...)
> [I] 2017-09-12@21:24:14.672 Piped sorting and candidate file choosing. 0 records scanned.
>
> while the LIST job says:
>
> [I] 2017-09-12@13:58:06.926 Sorting 327627521 file list records.
> (...)
> [I] 2017-09-12@14:02:04.223 Policy evaluation. 0 files scanned.
>
> (Both output the same message during the 'Directory entries scanned: 0.'
> phase, but I suspect MIGRATE is multi-threading that part as well, as it
> completes much faster).
>
> What's the controlling factor in mmapplypolicy's decision whether or
> not to parallelize the policy?
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
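When comparing two runs like the ones quoted above, it can help to turn the `[I]` timestamps into elapsed times rather than eyeballing them. The following is only a minimal sketch: it assumes the log lines have the `[I] YYYY-MM-DD@HH:MM:SS.mmm message` shape seen in the quoted output, and the function names (`parse_phases`, `phase_gaps`) are hypothetical helpers, not part of GPFS. Because the quoted logs elide lines with `(...)`, the gaps it reports are just the time between the messages shown, not full phase durations.

```python
import re
from datetime import datetime

# Assumed line shape, taken from the quoted output:
#   [I] 2017-09-12@21:20:44.155 Some message text.
LOG_LINE = re.compile(
    r"^\[I\] (\d{4}-\d{2}-\d{2}@\d{2}:\d{2}:\d{2}\.\d+) (.*)$"
)

# Sample lines copied from the quoted message (rejoined onto single lines).
MIGRATE_LOG = """\
[I] 2017-09-12@21:20:44.155 Parallel-piped sort and policy evaluation. 0 files scanned.
[I] 2017-09-12@21:24:14.672 Piped sorting and candidate file choosing. 0 records scanned.
"""

LIST_LOG = """\
[I] 2017-09-12@13:58:06.926 Sorting 327627521 file list records.
[I] 2017-09-12@14:02:04.223 Policy evaluation. 0 files scanned.
"""


def parse_phases(log_text):
    """Return (timestamp, message) pairs for every '[I]' line that matches."""
    phases = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d@%H:%M:%S.%f")
            phases.append((stamp, match.group(2)))
    return phases


def phase_gaps(phases):
    """Pair each message with the time elapsed until the next logged message."""
    return [
        (message, next_stamp - stamp)
        for (stamp, message), (next_stamp, _) in zip(phases, phases[1:])
    ]


if __name__ == "__main__":
    for name, text in (("MIGRATE", MIGRATE_LOG), ("LIST", LIST_LOG)):
        for message, gap in phase_gaps(parse_phases(text)):
            print(f"{name}: {gap} after '{message}'")
```

Feeding the complete mmapplypolicy output of both jobs through the same helpers would show where the two runs actually diverge in time.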
