Hello

I have a short question about AFM prefetch and some further remarks on AFM 
and its use for data migration. I understand that many of you have done this 
for very large amounts of data and large numbers of files. I would welcome any 
input, comments or remarks. Sorry if this is a bit too long for a mailing list.

Short:
How can I tell an AFM cache to update a directory when I run prefetch? I know 
about ‘find .’ or ‘ls -lsR’, but these are not an option for us because they 
take too long. Mainly I want to update the directories so that the AFM cache 
becomes aware of file deletions on home. On home I can use a policy run to find 
all directories that changed since the last update and pass them to prefetch on 
the AFM cache.

I know that I could build a workaround from the directory list, such as an 
‘ls -lsa’ on just those directories, but this doesn’t sound very efficient. And 
depending on cache effects and timeout settings it may or may not work (o.k., 
it will work most of the time).
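
For illustration, the per-directory workaround could be sketched as below. This 
is only a minimal sketch under assumptions: the input is a list with one 
directory path per line (for example, produced by a policy run on home and 
mapped to cache paths), the function name refresh_directories is made up, and 
whether a plain listing really triggers revalidation against home still depends 
on the cache effects and timeout settings mentioned above.

```python
import os
from typing import Dict, Iterable


def refresh_directories(directories: Iterable[str]) -> Dict[str, int]:
    """List each directory once, non-recursively, and lstat every entry.

    This mimics an 'ls -lsa' restricted to the given directories.
    Returns {directory: number of entries}; -1 marks a directory that
    could not be read completely (for example, it was removed meanwhile).
    """
    seen: Dict[str, int] = {}
    for directory in directories:
        directory = directory.strip()
        if not directory:
            continue  # skip blank lines in the input list
        try:
            count = 0
            with os.scandir(directory) as entries:
                for entry in entries:
                    # lstat the entry itself, like 'ls -l' does
                    entry.stat(follow_symlinks=False)
                    count += 1
            seen[directory] = count
        except OSError:
            # directory is gone, unreadable, or an entry vanished mid-scan
            seen[directory] = -1
    return seen
```

It could be called as refresh_directories(open('changed_dirs.txt')), where 
changed_dirs.txt is a hypothetical file name. It only avoids the recursive 
walk; it is still a POSIX metadata operation on every listed directory.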

We do regular file deletions and will accumulate millions of deleted files on 
the cache over time if we don’t update the directories to make the AFM cache 
aware of the deletions.

Background:
We will use AFM to migrate data in filesets to another cluster. We have to do 
this several times in the next few months, hence I want a reliable and 
easy-to-use procedure. The old system is home, the new system is a read-only 
AFM cache. We want to use ‘mmafmctl prefetch’ to move the data. Home will be in 
use while we run the migration. Once almost all data has been moved we take a 
(short) break for a last sync and turn the read-only AFM cache into a ‘normal’ 
fileset. During the break I want to use only policy runs and prefetch, and no 
time-consuming ‘ls -lsR’ or ‘find .’. I don’t want to use these 
metadata-intensive POSIX operations during normal operation, either.

More general:
AFM can be used for data migration, but I don’t see how to use it efficiently: 
how to do incremental transfers, how to ensure that we really have identical 
copies before we switch, and how to keep the switch time short, i.e. the time 
during which neither old nor new is accessible to clients.

Wish – maybe an RFE?
I can use policy runs to collect all items that changed on home since the last 
update. I wish I could pass this list to AFM prefetch to do all updates on the 
AFM cache as well, the same way backup tools use such a list for incremental 
backups.

A tool that creates policy lists of home and cache and compares them would be 
nice, too. As it would be run during the break/switch, it should be fast and 
reliable and leave no doubts.
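
To make the comparison idea concrete, here is a rough sketch of what such a 
check might look like. It rests on assumptions that are not confirmed anywhere 
above: each list line has the form ‘... SIZE MTIME -- /full/path’, i.e. the 
last two fields before the ‘ -- ’ separator are a file size and a modification 
time written by the policy rule, and the names parse_list and compare_lists are 
made up for this example. Paths are compared relative to the fileset root, 
because inode numbers and mount points differ between home and cache.

```python
import os
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, str]  # (size, mtime), kept as strings


def parse_list(lines: Iterable[str], root: str) -> Dict[str, Record]:
    """Map each path (relative to root) to its (size, mtime) fields.

    Assumed line format: '<other fields> SIZE MTIME -- /full/path'.
    Lines that do not match are skipped.
    """
    records: Dict[str, Record] = {}
    for line in lines:
        line = line.rstrip("\n")
        if " -- " not in line:
            continue
        # split on the first separator only, so paths may contain ' -- '
        attrs, path = line.split(" -- ", 1)
        fields = attrs.split()
        if len(fields) < 2:
            continue
        records[os.path.relpath(path, root)] = (fields[-2], fields[-1])
    return records


def compare_lists(
    home: Dict[str, Record], cache: Dict[str, Record]
) -> Tuple[List[str], List[str], List[str]]:
    """Return (missing on cache, only on cache, differing size or mtime)."""
    home_paths, cache_paths = set(home), set(cache)
    missing = sorted(home_paths - cache_paths)
    only_on_cache = sorted(cache_paths - home_paths)
    differing = sorted(
        p for p in home_paths & cache_paths if home[p] != cache[p]
    )
    return missing, only_on_cache, differing
```

Three empty result lists would indicate that both sides agree on names, sizes 
and modification times; this is not a content checksum, so it is a weaker 
guarantee than a byte-for-byte comparison.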

Kind regards,

Heiner

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
