To tell you the truth, I don't. It's on my radar, but I haven't done it
yet. I *have* run Hadoop on GPFS without magpie, though, and on only a
couple of nodes I was able to push 1 GB/s out to GPFS with the TeraSort
benchmark. I know our GPFS filesystem can go much faster than that, but
Java was CPU-bound, as it often seems to be.
-Aaron
On 8/23/16 7:56 AM, Brian Marshall wrote:
Aaron,
Do you have experience running this on native GPFS? The docs mention
only Lustre and any NFS filesystem.
Thanks,
Brian
On Aug 22, 2016 10:37 PM, "Aaron Knister" <[email protected]> wrote:
Yes, indeed. Note that these are my personal opinions.
It seems to work quite well, and it's not terribly hard to set up or
get running. That said, if you've got a traditional HPC cluster with
reasonably good bandwidth (and especially if your data is already on
the HPC cluster), I wouldn't bother with FPO and would just use
something like magpie (https://github.com/LLNL/magpie) to run your
Hadoop-style workload on GPFS on your traditional HPC cluster. I
believe FPO (and, by extension, data locality) is important when the
available bandwidth between your clients and servers/disks (in a
traditional GPFS environment) is less than the bandwidth available
within a node (e.g. between your local disks and the host CPU).
-Aaron
On 8/22/16 10:23 PM, Brian Marshall wrote:
Does anyone have any experiences to share (good or bad) about setting
up and utilizing FPO for Hadoop compute on top of GPFS?
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
--
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776