All of Marc’s points are good. A few more things to be aware of with regard to replicated writes:
· Each client performs its own replication when it writes file data. So if you have several clients, each writing files concurrently, the “bandwidth burden” of the replication is distributed among them. It’s typical that your write throughput will be limited by disk in this case.
· Because clients perform their own write replication, the max write throughput of an NSD client is limited to <50% of its available network bandwidth for 2x replication, or <33% for 3x replication, since it must share the network interface (Ethernet, IB) to access the NSDs in each failure group.
· If your network topology is asymmetric (e.g. multiple datacenters with higher latency and limited bandwidth between them) you may also benefit from using “readReplicaPolicy=fastest” to keep read traffic “local” and avoid crossing congested or high-latency paths.

From: [email protected] [mailto:[email protected]] On Behalf Of Marc A Kaplan
Sent: Tuesday, December 01, 2015 12:02 PM
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] IO performance of replicated GPFS filesystem

Generally yes. When reading, more disks is always better than fewer disks, both for replication and with striping over several or many disks. When writing, more disks is good with striping. But yes, replication costs you extra writes. Those writes don't necessarily cost you loss of time, provided they can be done concurrently.

When I write "disks" I mean storage devices that can be accessed concurrently. Watch out for virtual LUNs. With conventional controllers and drives, it does GPFS little or no good when multiple LUNs map to the same real disk device, since multiple operations to different LUNs will ultimately be serialized at one real disk arm/head!

For high performance, you should not be thinking about "two NSDs" ... you should be thinking about many NSDs, so data and metadata can be striped, and written and read concurrently.
But yes, for replication purposes you have to consider defining and properly configuring at least two "failure groups".

From: "[email protected]" <[email protected]>
To: "[email protected]" <[email protected]>
Date: 11/30/2015 05:46 AM
Subject: [gpfsug-discuss] IO performance of replicated GPFS filesystem
Sent by: [email protected]
________________________________

Hi All,

I could use some help from the experts here ☺ Please correct me if I’m wrong: I suspect that GPFS filesystem READ performance is better when the filesystem is replicated across, e.g., two failure groups, where these failure groups are placed on separate RAID controllers. In this case WRITE performance should be worse, since the same data must go to two locations.

What about the situation where a GPFS filesystem has two metadataOnly NSDs which are also replicated? Does metadata READ performance increase in this way as well (and WRITE performance decrease)?

Best regards,
Tomasz Wolski

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
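As a rough illustration of the write-replication points in the first reply above (the <50% and <33% figures, and the remark that aggregate throughput is typically disk-limited), here is a back-of-the-envelope sketch in Python. The function names and the simple "divide by the replica count" disk model are assumptions made for illustration only; they are not part of GPFS, and real throughput will land below these ceilings because of protocol overhead and contention.

```python
def max_client_write_throughput(link_bandwidth: float, replicas: int) -> float:
    """Ceiling on application-visible write throughput for one NSD client.

    The client sends every block `replicas` times over the same network
    interface, so useful throughput cannot exceed link_bandwidth / replicas.
    Units are whatever link_bandwidth is given in (e.g. GB/s).
    """
    if replicas < 1:
        raise ValueError("replicas must be >= 1")
    return link_bandwidth / replicas


def aggregate_write_ceiling(clients: int, link_bandwidth: float,
                            replicas: int, disk_bandwidth: float) -> float:
    """Ceiling on aggregate write throughput across several clients.

    Assumption (illustrative model only): the back-end disks must absorb
    `replicas` copies of all data, so they cap useful throughput at
    disk_bandwidth / replicas. The result is the smaller of the network
    and disk bounds.
    """
    network_bound = clients * max_client_write_throughput(link_bandwidth, replicas)
    disk_bound = disk_bandwidth / replicas
    return min(network_bound, disk_bound)


if __name__ == "__main__":
    link = 1.25  # hypothetical 10 GbE link, roughly 1.25 GB/s
    for r in (1, 2, 3):
        print(f"{r}x replication: per-client ceiling "
              f"{max_client_write_throughput(link, r):.3f} GB/s")
    # Hypothetical cluster: 8 clients, disks sustaining 4.0 GB/s in total
    print(f"aggregate ceiling: {aggregate_write_ceiling(8, link, 2, 4.0):.3f} GB/s")
```

With several clients the network bound grows with the client count while the disk bound does not, which is consistent with the observation that concurrent replicated writes usually end up disk-limited.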
