Thanks Zach, I was about to echo similar sentiments and you saved me a ton of typing :)

Bob, I know this doesn't help you today since I'm pretty sure it's not yet available, but if one scours the interwebs they can find mention of something called Mestor.

There's very very limited information here:

- https://indico.cern.ch/event/531810/contributions/2306222/attachments/1357265/2053960/Spectrum_Scale-HEPIX_V1a.pdf
- https://www.yumpu.com/en/document/view/5544551/ibm-system-x-gpfs-storage-server-stfc (slide 20)

Sounds like if it were available it would fit this use case very well.

I also had preliminary success using sheepdog (https://sheepdog.github.io/sheepdog/) as a backing store for GPFS in a similar situation. At a very high conceptual level it's perhaps similar to Mestor. You erasure code your data across the nodes w/ the SAS disks and then present those block devices to your NSD servers. I proved it could work but never tried to do much with it because the requirements changed.
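
To make the "erasure code your data across the nodes" idea concrete, here is a minimal sketch of the simplest possible erasure code: k data blocks plus one XOR parity block. This is only an illustration of the principle. It is not sheepdog's (or Mestor's) actual scheme, which may use different codes that tolerate more than one failure, and the function names and sample data are made up for the example.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte strings together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def encode(data_blocks):
    """Return the data blocks plus one parity block (k data + 1 parity)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]

def recover(stripe, lost_index):
    """Rebuild the block at lost_index by XOR-ing all surviving blocks."""
    survivors = [b for i, b in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)

# Hypothetical layout: three nodes hold data, a fourth holds parity.
data = [b"node", b"disk", b"gpfs"]
stripe = encode(data)

# Pretend the node holding block 1 went offline and rebuild its contents.
rebuilt = recover(stripe, lost_index=1)
print(rebuilt)
```

With one parity block the stripe survives the loss of any single node, at a storage overhead of 1/k rather than the 100% that two-way replication costs.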

My money would be on your first option: creating local RAIDs and then replicating to give you availability in the event a node goes offline.
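
For a rough feel of what that costs in capacity, here is a back-of-the-envelope sketch. The 70 drives and 4 TB per drive come from Bob's message; the RAID 6 (8+2) layout, the two GPFS replicas, and the 100 MB/s rebuild rate are assumptions picked purely for illustration, so substitute your own numbers.

```python
DRIVES_PER_SERVER = 70   # from the original question
DRIVE_TB = 4             # from the original question

# Assumptions, not from the thread:
RAID_DATA, RAID_PARITY = 8, 2   # hypothetical RAID 6 (8+2) arrays
GPFS_REPLICAS = 2               # two copies of every block, on different servers
REBUILD_MB_PER_S = 100          # hypothetical sustained rebuild rate

def raid_plus_replication_tb():
    """Usable TB contributed per server: RAID 6 arrays, then GPFS replication."""
    arrays = DRIVES_PER_SERVER // (RAID_DATA + RAID_PARITY)
    return arrays * RAID_DATA * DRIVE_TB / GPFS_REPLICAS

def replication_only_tb():
    """Usable TB contributed per server: one NSD per drive, replication only."""
    return DRIVES_PER_SERVER * DRIVE_TB / GPFS_REPLICAS

def rebuild_hours():
    """Best-case hours to rewrite one whole drive at the assumed rate."""
    return DRIVE_TB * 1e12 / (REBUILD_MB_PER_S * 1e6) / 3600

print(f"RAID 6 + replication: {raid_plus_replication_tb():.0f} TB usable per server")
print(f"Replication only:     {replication_only_tb():.0f} TB usable per server")
print(f"Single-drive rebuild: at least {rebuild_hours():.1f} hours")
```

Under these assumptions RAID plus replication yields 112 TB per server versus 140 TB for replication alone, and a single 4 TB drive takes at least about 11 hours to rebuild; real rebuilds under load will be slower.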

-Aaron


On 11/30/16 10:59 PM, Zachary Giles wrote:
Just remember that replication protects data availability, not
integrity. GPFS still requires the underlying block device to return
good data.

If you're using it on plain disks (SAS or SSD) and a drive returns
corrupt data, GPFS won't know any better and will just deliver it to the
client. Further, if you do a partial read followed by a write, both
replicas could be destroyed. There's also no efficient way to force use
of the second replica if you realize the first is bad, short of taking
the first entirely offline. And while you restripe data off the faulty
drive, there's no good way to prevent a read-rewrite of other corrupt
data on the drive that holds the "good copy".

Ideally, RAID only returns data that has passed the RAID algorithm, so
it either isn't corrupt or has been made good by recreating it from
parity. As we all know, RAID controllers are definitely prone to
failures as well, for many reasons, but at least a drive can go bad in
various ways (bad sectors, slow, just dead, poor SSD cell wear, etc.)
without (hopefully) silent corruption.

Just something to think about while considering replication.
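
The availability-versus-integrity distinction can be shown with a toy model. This is a sketch only: it does not reflect how GPFS actually stores or reads replicas, and the function names, the use of SHA-256, and the sample data are all assumptions made for the example.

```python
import hashlib

def read_first_replica(replicas):
    """Return the first replica that answers, trusting the device blindly."""
    for block in replicas:
        if block is not None:          # None models an offline disk
            return block
    raise IOError("no replica available")

def read_verified(replicas, expected_sha256):
    """Return the first replica whose checksum matches the recorded one."""
    for block in replicas:
        if block is not None and hashlib.sha256(block).hexdigest() == expected_sha256:
            return block
    raise IOError("no intact replica available")

good = b"simulation output"
checksum = hashlib.sha256(good).hexdigest()
corrupt = b"simulation 0utput"   # silent corruption: the disk reports success

replicas = [corrupt, good]

# Plain replication survives an offline disk ...
print(read_first_replica([None, good]))
# ... but happily returns the corrupt copy when the disk lies.
print(read_first_replica(replicas))
# Only an end-to-end check can tell the two copies apart.
print(read_verified(replicas, checksum))
```

Replication alone handles the offline-disk case; picking the good copy when a disk silently returns bad data needs some form of checksum above the block device.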



On Wed, Nov 30, 2016 at 11:28 AM, Uwe Falke <[email protected]> wrote:

    I once set up a small system with just a few SSDs in two NSD servers,
    providing a scratch file system in a computing cluster.
    No RAID, two replicas.
    It works, as long as the admins do not do silly things (like rebooting
    servers in sequence without checking that the disks are up in between).
    Going for RAID without GPFS replication protects you against single
    disk failures, but you're lost if just one of your NSD servers goes off.

    FPO only makes sense, IMHO, if your NSD servers are also processing
    the data (and then you need to control that somehow).

    Other ideas? What else can you do with GPFS and local disks beyond
    what you considered? I suppose nothing reasonable ...


    Mit freundlichen Grüßen / Kind regards


    Dr. Uwe Falke

    IT Specialist
    High Performance Computing Services / Integrated Technology Services /
    Data Center Services
    
    IBM Deutschland
    Rathausstr. 7
    09111 Chemnitz
    Phone: +49 371 6978 2165
    Mobile: +49 175 575 2877
    E-Mail: [email protected]
    IBM Deutschland Business & Technology Services GmbH / Management:
    Frank Hammer, Thorsten Moehring
    Registered office: Ehningen / Registration court: Amtsgericht
    Stuttgart, HRB 17122




    From:   "Oesterlin, Robert" <[email protected]>
    To:     gpfsug main discussion list <[email protected]>
    Date:   11/30/2016 03:34 PM
    Subject:        [gpfsug-discuss] Strategies - servers with local SAS disks
    Sent by:        [email protected]



    Looking for feedback/strategies in setting up several GPFS servers with
    local SAS. They would all be part of the same file system. The systems
    are all similar in configuration - 70 4TB drives.

    Options I'm considering:

    - Create RAID arrays of the disks on each server (worried about the RAID
    rebuild time when a drive fails with 4, 6, 8TB drives)
    - No RAID with 2 replicas, single drive per NSD. When a drive fails,
    recreate the NSD - but then I need to fix up the data replication via
    restripe
    - FPO - with multiple failure groups - letting the system manage replica
    placement and then have GPFS do the restripe on disk failure
    automatically

    Comments or other ideas welcome.

    Bob Oesterlin
    Sr Principal Storage Engineer, Nuance
    507-269-0413

     _______________________________________________
    gpfsug-discuss mailing list
    gpfsug-discuss at spectrumscale.org <http://spectrumscale.org>
    http://gpfsug.org/mailman/listinfo/gpfsug-discuss
    <http://gpfsug.org/mailman/listinfo/gpfsug-discuss>








--
Zach Giles
[email protected]




--
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776
