In the case where tar -C doesn’t work, you can always use a subshell (I do this
regularly):
tar -cf - . | ssh someguy@otherhost "(cd targetdir; tar -xvf - )"
Only use -v on one end. :)
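For comparison, when the remote tar does support -C, the subshell isn't
needed; something like this should be equivalent:

tar -cf - . | ssh someguy@otherhost "tar -C targetdir -xvf -"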
Also, for parallelizing work that wasn’t designed that way, don't
underestimate the -P option to GNU and BSD xargs! With this much data to
copy, it matters that a new subjob starts the moment one finishes; otherwise
a subjob that ends right after you go home leaves a slot idle for several
hours.
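For instance, something along these lines (the host name and paths are made
up for illustration, and it assumes no whitespace in the top-level names):

cd /gpfs/src && ls -d * | xargs -n 1 -P 8 sh -c \
    'tar -cf - "$0" | ssh otherhost "(cd /gpfs/dst && tar -xf -)"'

xargs keeps eight tar pipes running and launches the next one the moment a
slot frees up, so nothing sits idle overnight.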
In Bob’s case, however, treating it like a DR exercise where users "restore"
their own files simply by accessing them (using AFM instead of HSM) is
probably the most convenient approach.
--
Stephen
> On Mar 6, 2019, at 8:13 AM, Uwe Falke <[email protected]> wrote:
>
> Hi, in that case I'd open several tar pipes in parallel, maybe using
> directories carefully selected, like
>
> tar -cf - <source_dir> | ssh <target_host> "tar -xf -"
>
> I am not quite sure whether "-C /" for tar works here ("tar -C / -xf -"),
> but something along these lines should be a good, efficient method. The
> target hosts should all have the target file system mounted, and you should
> start the pipes on the nodes that have the source file system mounted.
> It is best to start with the largest directories, and to use some master
> script that launches the tar pipes under a semaphore so that nothing gets
> overloaded.
>
>
>
> Mit freundlichen Grüßen / Kind regards
>
>
> Dr. Uwe Falke
>
> IT Specialist
> High Performance Computing Services / Integrated Technology Services /
> Data Center Services
> -------------------------------------------------------------------------------------------------------------------------------------------
> IBM Deutschland
> Rathausstr. 7
> 09111 Chemnitz
> Phone: +49 371 6978 2165
> Mobile: +49 175 575 2877
> E-Mail: [email protected]
> -------------------------------------------------------------------------------------------------------------------------------------------
> IBM Deutschland Business & Technology Services GmbH / Geschäftsführung:
> Thomas Wolter, Sven Schooß
> Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart,
> HRB 17122
>
>
>
>
> From: "Oesterlin, Robert" <[email protected]
> <mailto:[email protected]>>
> To: gpfsug main discussion list <[email protected]
> <mailto:[email protected]>>
> Date: 06/03/2019 13:44
> Subject: [gpfsug-discuss] Follow-up: migrating billions of files
> Sent by: [email protected]
>
>
>
> Some of you had questions to my original post. More information:
>
> Source:
> - Files are straight GPFS/Posix - no extended NFSV4 ACLs
> - A solution that requires $'s to be spent on software (i.e., Aspera) isn't
> a very viable option
> - Both source and target clusters are in the same DC
> - Source is stand-alone NSD servers (bonded 10GbE) and 8 Gb FC SAN storage
> - Approx 40 file systems, a few large ones with 300M-400M files each,
> others smaller
> - no independent file sets
> - migration must pose minimal disruption to existing users
>
> Target architecture is a small number of file systems (2-3) on ESS with
> independent filesets
> - Target (ESS) will have multiple 40GbE links on each NSD server (GS4)
>
> My current thinking is AFM with a pre-populate of the file space, then
> switch the clients over to have them pull the data they need (most of the
> data is older and less active), and then let AFM populate the rest in the
> background.
>
>
> Bob Oesterlin
> Sr Principal Storage Engineer, Nuance
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss