Yes, indeed. Note that these are my personal opinions.
It seems to work quite well, and it isn't terribly hard to set up or get running. That said, if you've got a traditional HPC cluster with reasonably good bandwidth (and especially if your data is already on that cluster), I wouldn't bother with FPO; I'd just use something like magpie (https://github.com/LLNL/magpie) to run your Hadoop-style workload on GPFS on the existing cluster. I believe FPO (and, by extension, data locality) matters when the bandwidth available between your clients and the servers/disks in a traditional GPFS environment is less than the bandwidth available within a node (e.g. between the local disks and the host CPU).
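To make that rule of thumb concrete, here is a rough sketch in Python. It is only an illustration of the comparison I described: the function name and the sample numbers are hypothetical, not measurements from any real cluster, and a real decision would also weigh things like where the data already lives and how much administration you want to take on.

```python
def fpo_likely_worthwhile(network_gb_per_s: float, local_disk_gb_per_s: float) -> bool:
    """Hypothetical helper: compare per-node bandwidth to the shared
    GPFS servers/disks against per-node local disk bandwidth.

    Returns True when the network path is the slower of the two, which is
    the case where data locality (and hence FPO) is likely to help.
    Both arguments must use the same unit (here GB/s per node).
    """
    if network_gb_per_s <= 0 or local_disk_gb_per_s <= 0:
        raise ValueError("bandwidth figures must be positive")
    return network_gb_per_s < local_disk_gb_per_s


# Made-up example figures, purely for illustration.
print(fpo_likely_worthwhile(network_gb_per_s=1.0, local_disk_gb_per_s=2.0))
print(fpo_likely_worthwhile(network_gb_per_s=5.0, local_disk_gb_per_s=2.0))
```

In the first call the network is the bottleneck, so locality should pay off; in the second the interconnect is faster than the local disks, which is the case where I'd just run on the existing GPFS cluster.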
-Aaron

On 8/22/16 10:23 PM, Brian Marshall wrote:
Does anyone have any experiences to share (good or bad) about setting up and utilizing FPO for hadoop compute on top of GPFS?
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
-- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776
