Re: [gpfsug-discuss] migrating data from GPFS3.5 to ESS appliance (GPFS4.1)

2016-02-01 Thread Wahl, Edward
Along the same vein, I've patched rsync to maintain source atimes in Linux for
large transitions such as this. Along with the standard "patches" mod for
destination atimes it is quite useful. It works with 3.0.8 and 3.0.9; I've not
yet ported it to 3.1.x.
https://www.osc.edu/sites/osc.edu/files/staff_files/ewahl/onoatime.diff
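
Applying it is the usual patch-and-rebuild dance; something along these lines,
assuming a stock 3.0.9 source tree and that the diff applies at -p1:

$ wget https://www.osc.edu/sites/osc.edu/files/staff_files/ewahl/onoatime.diff
$ tar xzf rsync-3.0.9.tar.gz && cd rsync-3.0.9
$ patch -p1 < ../onoatime.diff     # add the source-atime preservation patch
$ ./configure && make              # produces a patched rsync binary in ./rsync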

Ed Wahl
OSC


From: gpfsug-discuss-boun...@spectrumscale.org 
[gpfsug-discuss-boun...@spectrumscale.org] on behalf of Orlando Richards 
[orlando.richa...@ed.ac.uk]
Sent: Monday, February 01, 2016 4:25 AM
To: gpfsug-discuss@spectrumscale.org
Subject: Re: [gpfsug-discuss] migrating data from GPFS3.5 to ESS appliance 
(GPFS4.1)

For what it's worth - there's a patch for rsync which IBM provided a
while back that will copy NFSv4 ACLs (maybe other stuff?). I put it up
on the gpfsug github here:

   https://github.com/gpfsug/gpfsug-tools/tree/master/bin/rsync



On 29/01/16 22:36, Sven Oehme wrote:
> Doug,
>
> This won't really work if you make use of ACLs, special GPFS extended
> attributes, quotas, filesets, etc., so unfortunately the answer is that
> you need to use a combination of things. There is work going on to make
> some of this simpler (e.g. for ACLs), but it's a longer road to get
> there, so until then you need to think about multiple aspects.
>
> 1. You need to get the data across, and there are various ways to do this.
>
> a) AFM is the simplest of all: because it understands the GPFS internals
> it takes care of ACLs, extended attributes, and the like, and it also
> operates in parallel and can prefetch data, so it is an efficient way to
> do this. As already pointed out, though, it doesn't transfer quota or
> fileset information.
>
> b) You can use rsync or any other pipe-based copy program. The downside
> is that they are typically single threaded and work file by file, which
> is very metadata intensive on both the source and the target side and
> causes a lot of I/O on both sides.
>
> c) You can use the policy engine to create a list of files to transfer,
> which at least addresses the single-threaded scan part, then partition
> the data and run multiple instances of cp or rsync in parallel. This
> still doesn't fix the ACL/EA issues, but the data gets there faster (a
> rough sketch follows below).
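>
> A very rough sketch of (c), assuming GPFS 4.1 command syntax; the policy
> file name, paths, and split factor are made up:
>
> # /tmp/list.pol (hypothetical) -- match every file and emit it to a list
> #   RULE EXTERNAL LIST 'migrate' EXEC ''
> #   RULE 'all' LIST 'migrate'
>
> # one parallel metadata scan; defer execution and just write the candidate list
> mmapplypolicy /gpfs/projects -P /tmp/list.pol -f /tmp/mig -I defer
>
> # partition the list and run several rsyncs side by side (no ACLs/EAs carried)
> split -n l/8 /tmp/mig.list.migrate /tmp/mig.part.
> for p in /tmp/mig.part.*; do
>   # strip the policy bookkeeping columns and the source prefix, leaving
>   # paths relative to the source tree (adjust to the exact list format)
>   sed 's|^.* -- /gpfs/projects/||' "$p" > "$p.rel"
>   rsync -a --files-from="$p.rel" /gpfs/projects/ /gpfs/ess/projects/ &
> done
> wait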
>
> 2. You need to get the ACL/EA information over too. There are command
> line options to dump and restore it, but they suffer from the same
> problem as the data transfers, which is why AFM is the best way of doing
> this if you rely on ACL/EA information.
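>
> For illustration, both routes side by side (file and fileset names are
> made up; check the mmafmctl syntax against your code level):
>
> # manual route: dump and re-apply one file's NFSv4 ACL with mmgetacl/mmputacl;
> # doing this per file is exactly the metadata-scaling problem described above
> mmgetacl -o /tmp/file.acl /gpfs/projects/data/file
> mmputacl -i /tmp/file.acl /gpfs/ess/projects/data/file
>
> # AFM route: ACLs and EAs ride along automatically; queue a prefetch on the
> # ESS-side cache fileset from a list of paths (e.g. from the policy engine)
> mmafmctl essfs prefetch -j projects --list-file /tmp/mig.list.migrate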
>
> 3. You need to transfer the quota and fileset information. There are
> several ways to do this, but all require some level of scripting.
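>
> The capture side of (3) is easy; re-creating things on the new filesystem
> is where the scripting comes in. Names are illustrative and the mmsetquota
> form should be checked against the target level:
>
> # capture fileset definitions and fileset quotas from the old filesystem
> mmlsfileset oldfs -L > /tmp/oldfs.filesets
> mmrepquota -j oldfs > /tmp/oldfs.fileset-quotas
>
> # re-create on the new filesystem, one block per fileset, driven by the files above
> mmcrfileset essfs projects
> mmlinkfileset essfs projects -J /gpfs/ess/projects
> mmsetquota essfs:projects --block 10T:12T --files 5M:6M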
>
> If you have TSM/HSM you could also transfer the data using SOBAR; it's
> described in the Advanced Administration book.
>
> sven
>
>
> On Fri, Jan 29, 2016 at 11:35 AM, Hughes, Doug
> <douglas.hug...@deshawresearch.com
> <mailto:douglas.hug...@deshawresearch.com>> wrote:
>
> I have found that a tar pipe is much faster than rsync for this sort
> of thing. The fastest of these is ‘star’ (schily tar). On average it
> is about 2x-5x faster than rsync for doing this. After one pass with
> this, you can use rsync for a subsequent or last pass synch.
>
> e.g.
>
> $ cd /export/gpfs1/foo
>
> $ star -c H=xtar | (cd /export/gpfs2/foo; star -xp)
>
> This also will not preserve filesets and quotas, though. You should
> be able to automate that with a little bit of awk, perl, or whatnot.
>
> From: gpfsug-discuss-boun...@spectrumscale.org
> [mailto:gpfsug-discuss-boun...@spectrumscale.org] On Behalf Of Damir Krstic
> Sent: Friday, January 29, 2016 2:32 PM
> To: gpfsug main discussion list
> Subject: [gpfsug-discuss] migrating data from GPFS3.5 to ESS
> appliance (GPFS4.1)
>
> We have recently purchased an ESS appliance from IBM (GL6) with 1.5PB
> of storage. We are in the planning stages of implementation. We would
> like to migrate data from our existing GPFS installation (around
> 300TB) to the new solution.
>
> We were planning on adding the ESS to our existing GPFS cluster, adding
> its disks, and then deleting our old disks and having the data migrated
> that way. However, the block size on our existing projects filesystem is
> 1M, and in order to extract as much performance out of the ESS we would
> like it

Re: [gpfsug-discuss] migrating data from GPFS3.5 to ESS appliance(GPFS4.1)

2016-01-30 Thread Marc A Kaplan
You may also want to use and/or adapt samples/ilm/tspcp, which uses
mmapplypolicy to drive parallel cp commands.

The script was written by my colleague and manager, but I'm willing to
entertain questions and suggestions...

Here are some of the comments:

# Run "cp" in parallel over a list of files/directories
#
# This is a sample script showing how to use the GPFS ILM policy
# to copy a list of files or directories. It takes advantage of the GPFS 3.4
# input file list argument to mmapplypolicy to limit the directory scan
# to only the files and directories in the input file list. It also uses
# the GPFS 3.5 DIRECTORY_HASH feature to sort the candidate list by directories
# and execute them in top-down order.
#
# This script converts the list of files from the argument to a file list file
# as input to mmapplypolicy. It then generates a simple ILM policy file to
# match all of the files in the list and, if -r is specified, to match all files
# and directories below any directories that are specified. The files are
# then sorted by policy into directory order and dispatched in top-down order.
# Each work unit is assigned to a node and executed by a call to this script,
# which simply reads its input file list and calls "cp" on each. The
# original script scans stdout from each of the workers looking for
# messages from "cp" or any possible errors.
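
The generated policy ends up along these lines (paraphrased from the comments
above rather than copied from the script, so treat the names and the path as
illustrative):

# the EXTERNAL LIST rule names the list and the helper that consumes it;
# WEIGHT(DIRECTORY_HASH) groups candidates from the same directory so the
# work units dispatch in top-down directory order
RULE EXTERNAL LIST 'cp' EXEC '/usr/lpp/mmfs/samples/ilm/tspcp'
RULE 'match' LIST 'cp' WEIGHT(DIRECTORY_HASH)
     WHERE PATH_NAME LIKE '/gpfs/projects/src/%'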




Re: [gpfsug-discuss] migrating data from GPFS3.5 to ESS appliance(GPFS4.1)

2016-01-29 Thread Stijn De Weirdt
we wrapped something based on zookeeper around rsync to be able to run
rsync in parallel, by splitting the path into subdirectories and
distributing those:
https://github.com/hpcugent/vsc-zk

it works really well if the number of files per directory is somewhat
balanced. we use it to rsync some gpfs filesystems (200TB, 100M inodes ;)

stijn

On 01/29/2016 09:38 PM, Marc A Kaplan wrote:
> mmbackupconfig may be of some help.  The output is eyeball-able, so one 
> could tweak and then feed into mmrestoreconfig on the new system.
> Even if you don't use mmrestoreconfig, you might like to have the info 
> collected by mmbackupconfig.
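>
> Something along these lines (flags as in the admin guide; verify on 4.1):
>
> # on the old cluster: dump the filesystem configuration (filesets, quotas,
> # policy, pool layout) into a flat, human-readable file
> mmbackupconfig oldfs -o /tmp/oldfs.config
>
> # on the ESS side: replay (or just consult) it against the new filesystem
> mmrestoreconfig essfs -i /tmp/oldfs.config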
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss