OK, that is good to know.
I'll give it a try with snapshot then. We already have 3.5 almost
everywhere, and are planning for the 4.2 upgrade (reading the posts with
interest).
Thanks
Jaime
Quoting Yuri L Volobuev <[email protected]>:
Under both 3.2 and 3.3 mmbackup would always lock up our cluster when
using snapshot. I never understood the behavior without snapshot, and
the lockup was intermittent in the small carved-out test cluster, so
I never felt confident enough to deploy on the larger cluster of
4000+ clients.
Back then, GPFS code had a deficiency: migrating very large files didn't
work well with snapshots (and some operation mm commands). In order to
create a snapshot, we have to have the file system in a consistent state
for a moment, and we get there by performing a "quiesce" operation. This
is done by flushing all dirty buffers to disk, stopping any new incoming
file system operations at the gates, and waiting for all in-flight
operations to finish. This works well when all in-flight operations
actually finish reasonably quickly. That assumption was broken if an
external utility, e.g. mmapplypolicy, used gpfs_restripe_file API on a very
large file, e.g. to migrate the file's blocks to a different storage pool.
The quiesce operation would need to wait for that API call to finish, as
it's an in-flight operation, but migrating a multi-TB file could take a
while, and during this time all new file system ops would be blocked. This
was solved several years ago by changing the API and its callers to do the
migration one block range at a time, thus making each individual syscall
short and allowing quiesce to barge in and do its thing. All currently
supported levels of GPFS have this fix. I believe mmbackup was affected by
the same GPFS deficiency and benefited from the same fix.
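
To make the effect of that change concrete, here is a rough toy model (not GPFS code, and not how quiesce is actually implemented). It only estimates how long a quiesce request has to wait for the in-flight migration call to return. The function name, the block counts, and the per-block time are all made-up assumptions for illustration.

```python
def quiesce_wait_ms(total_blocks, blocks_per_call, ms_per_block, quiesce_at_ms):
    """Toy estimate of how long a quiesce waits for the in-flight call.

    Hypothetical model: the migration is a back-to-back series of calls,
    each moving `blocks_per_call` blocks, and quiesce can only proceed
    once the call that is running at `quiesce_at_ms` has returned.
    """
    if blocks_per_call <= 0:
        raise ValueError("blocks_per_call must be positive")
    call_ms = blocks_per_call * ms_per_block
    total_ms = total_blocks * ms_per_block
    if quiesce_at_ms >= total_ms:
        return 0  # migration already finished, nothing in flight
    # Index of the call running at quiesce time (a request that lands
    # exactly on a boundary is treated as hitting the next call).
    call_index = quiesce_at_ms // call_ms
    call_end_ms = min((call_index + 1) * call_ms, total_ms)
    return call_end_ms - quiesce_at_ms


# Assumed numbers: 1,000,000 blocks at 10 ms each, quiesce requested at 5.25 s.
BLOCKS, MS_PER_BLOCK, QUIESCE_AT = 1_000_000, 10, 5_250

# Old behavior: one call migrates the whole file.
whole_file = quiesce_wait_ms(BLOCKS, BLOCKS, MS_PER_BLOCK, QUIESCE_AT)

# Fixed behavior: one call per range of 100 blocks.
per_range = quiesce_wait_ms(BLOCKS, 100, MS_PER_BLOCK, QUIESCE_AT)

print(f"whole-file call: quiesce waits {whole_file} ms")
print(f"block-range calls: quiesce waits {per_range} ms")
```

In this model the wait with a single whole-file call grows with the file size, while with block-range calls it is bounded by the duration of one call, which is the property that lets quiesce get in between calls.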
yuri
---
Jaime Pinto
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.org
University of Toronto
256 McCaul Street, Room 235
Toronto, ON, M5T1W5
P: 416-978-2755
C: 416-505-1477
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss