To tell you the truth, I don't. It's on my radar but I haven't done it yet. I *have* run Hadoop on GPFS without magpie, though, and on only a couple of nodes I was able to push 1 GB/s out to GPFS with the terasort benchmark. I know our GPFS filesystem can go much faster than that, but Java was CPU-bound, as it often seems to be.

-Aaron

On 8/23/16 7:56 AM, Brian Marshall wrote:
Aaron,

Do you have experience running this on native GPFS?  The docs say Lustre
and any NFS filesystem.

Thanks,
Brian


On Aug 22, 2016 10:37 PM, "Aaron Knister" wrote:

    Yes, indeed. Note that these are my personal opinions.

    It seems to work quite well and it's not terribly hard to set up or
    get running. That said, if you've got a traditional HPC cluster with
    reasonably good bandwidth (and especially if your data is already on
    the HPC cluster) I wouldn't bother with FPO and just use something
    like magpie (https://github.com/LLNL/magpie) to run your hadoopy
    workload on
    GPFS on your traditional HPC cluster. I believe FPO (and by
    extension data locality) is important when the available bandwidth
    between your clients and servers/disks (in a traditional GPFS
    environment) is less than the bandwidth available within a node
    (e.g. between your local disks and the host CPU).

    -Aaron

    On 8/22/16 10:23 PM, Brian Marshall wrote:

        Does anyone have any experiences to share (good or bad) about
        setting up and utilizing FPO for Hadoop compute on top of GPFS?


        _______________________________________________
        gpfsug-discuss mailing list
        gpfsug-discuss at spectrumscale.org
        http://gpfsug.org/mailman/listinfo/gpfsug-discuss


    --
    Aaron Knister
    NASA Center for Climate Simulation (Code 606.2)
    Goddard Space Flight Center
    (301) 286-2776

