Yes, indeed. Note that these are my personal opinions.

FPO seems to work quite well, and it's not terribly hard to set up or get running. That said, if you've got a traditional HPC cluster with reasonably good bandwidth (and especially if your data is already on the HPC cluster), I wouldn't bother with FPO; I'd just use something like magpie (https://github.com/LLNL/magpie) to run your Hadoop-style workload on GPFS on that cluster. I believe FPO (and, by extension, data locality) is important when the available bandwidth between your clients and your servers/disks (in a traditional GPFS environment) is less than the bandwidth available within a node (e.g. between your local disks and the host CPU).
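
The rule of thumb above can be written down as a rough sketch. The function name, parameter names, and example numbers below are all hypothetical and only illustrate the comparison; they are not measurements from any real cluster, and a real decision would also depend on contention, replication, and workload shape.

```python
def locality_likely_helps(network_gb_per_s: float, local_disk_gb_per_s: float) -> bool:
    """Return True when a node can read from its local disks faster than it
    can pull data from the shared GPFS servers over the network.

    Both arguments are per-node bandwidths in the same unit (GB/s here).
    This is an illustrative heuristic only, not a sizing tool.
    """
    return network_gb_per_s < local_disk_gb_per_s


# Hypothetical node with a slow link to storage and fast local disks:
# data locality (FPO) is probably worth considering.
print(locality_likely_helps(network_gb_per_s=1.0, local_disk_gb_per_s=4.0))

# Hypothetical HPC node with a fast interconnect and modest local disks:
# running Hadoop against the existing GPFS file system is probably fine.
print(locality_likely_helps(network_gb_per_s=10.0, local_disk_gb_per_s=2.0))
```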

-Aaron

On 8/22/16 10:23 PM, Brian Marshall wrote:
Does anyone have any experiences to share (good or bad) about setting up
and utilizing FPO for hadoop compute on top of GPFS?


_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


--
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776
