One example: an environment with both OPA and IB fabrics. From the IBM Spectrum Scale FAQ
(https://www.ibm.com/support/knowledgecenter/en/STXKQY/gpfsclustersfaq.html):
RDMA is not supported on a node when both Mellanox HCAs and Intel
Omni-Path HFIs are enabled for RDMA.
The other case is a site with multiple IB fabrics that require different
OFED versions from one another (and, most likely, from the version shipped
with ESS) for support reasons; I'm speaking from experience here. That is
to say, if $VENDOR supports OFED version X on an IB fabric, and the ESS/GSS
ships with version Y, then when a problem crops up on the IB fabric $VENDOR
may point at the different OFED version on the ESS/GSS, decline to support
it, and leave you in a bad spot.
-Aaron
On 3/16/17 9:50 AM, Jan-Frode Myklebust wrote:
Why would you need an NSD protocol router when the NSD servers can have a
mix of InfiniBand and Ethernet adapters? For example, 4x EDR + 2x 100GbE per
I/O node in an ESS should give you lots of bandwidth for your common
Ethernet medium.
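Roughly, and only as an illustration (the node class names, ports, and
subnet below are made up), the split I have in mind looks like this:

    # ESS I/O nodes and IB-attached clients run the NSD protocol over verbs RDMA
    mmchconfig verbsRdma=enable -N essio,ibclients
    mmchconfig verbsPorts="mlx5_0/1 mlx5_1/1" -N essio
    # Ethernet-only clients just use the GPFS daemon network over the 100GbE
    # ports; 'subnets' can be used to prefer that network if needed, e.g.
    mmchconfig subnets="10.100.0.0"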
-jf
On Thu, Mar 16, 2017 at 1:52 AM, Aaron Knister <[email protected]> wrote:
*drags out soapbox*
Sorry in advance for the rant; this is one of my huge pet peeves :)
There are some serious blockers for GNR adoption in my environment.
It drives me up a wall that the only way to get end-to-end checksums
in GPFS is with vendor hardware lock-in. I find it infuriating.
Lustre can do this for free with ZFS. Historically it has also
offered various other features, like eating your data, so I guess
it's a tradeoff ;-) I believe that either GNR should be available
for any hardware that passes a validation suite, or GPFS should
support checksums on non-GNR NSDs, either by leveraging T10-PI
information or by checksumming blocks/subblocks and storing that
somewhere. I opened an RFE for this, but it was rejected and I was
effectively told to go use GNR/ESS... but, well, I can't do GNR.
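Just to be concrete about what "for free with ZFS" means: the
checksumming is simply a pool/dataset property (the pool name and
devices below are only an example):

    # block checksums (fletcher4) are on by default; sha256 is a stronger option
    zpool create ost0 raidz2 sda sdb sdc sdd sde sdf
    zfs set checksum=sha256 ost0
    # a scrub re-reads every block and verifies it against its stored checksum
    zpool scrub ost0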
But let's say I could run GNR on any hardware of my choosing, after
perhaps paying some modest licensing fee and passing a hardware
validation test; there's another blocker for me. Because GPFS doesn't
support anything like an LNet router, I'm fairly limited in the
number of high-speed verbs RDMA fabrics I can connect GNR to.
Furthermore, even if I had enough PCIe slots, the configuration may
not be supported (e.g. a site with an OPA and an IB fabric that
would like to use RDMA verbs on both). There could even be a
situation where a vendor of an HPC solution requires a specific OFED
version for support purposes that's not the version running on the
GNR nodes. If an NSD protocol router were available, I could perhaps
use Ethernet as a common medium to work around all of this.
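For anyone not familiar with the Lustre feature I'm comparing against,
LNet routing boils down to a small bit of module configuration; a rough
sketch (interface names and NIDs here are made up), e.g. in
/etc/modprobe.d/lustre.conf:

    # on the router node: one leg on each fabric, with forwarding turned on
    options lnet networks="o2ib0(ib0),tcp0(eth0)" forwarding="enabled"
    # on an IB-only client: reach the tcp0 network via the router's o2ib0 NID
    options lnet networks="o2ib0(ib0)" routes="tcp0 192.168.12.1@o2ib0"

Something equivalent for the NSD protocol would let a single set of
GNR/ESS building blocks serve clients on fabrics it isn't directly
attached to.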
I'd really like IBM to *do* something about this situation, but I've
not gotten any traction on it so far.
-Aaron
On Wed, Mar 15, 2017 at 8:26 PM, Steve Duersch <[email protected]> wrote:
>>For me it's the protection against bitrot and added protection
>>against silent data corruption
GNR has this functionality. Right now it is available only through
ESS, though, not yet as software-only.
Steve Duersch
Spectrum Scale
845-433-7902
IBM Poughkeepsie, New York
[email protected] wrote on 03/15/2017 10:25:59 AM:
>
> Message: 6
> Date: Wed, 15 Mar 2017 14:25:41 +0000
> From: "Buterbaugh, Kevin L" <[email protected]>
> To: gpfsug main discussion list <[email protected]>
> Subject: Re: [gpfsug-discuss] mmcrfs issue
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset="utf-8"
>
> Hi All,
>
> Since I started this thread I guess I should chime in, too. For us
> it was simply that we were testing a device that did not have
> hardware RAID controllers and we wanted to implement something
> roughly equivalent to RAID 6 LUNs.
>
> Kevin
>
> > On Mar 14, 2017, at 5:16 PM, Aaron Knister <[email protected]> wrote:
> >
> > For me it's the protection against bitrot and added protection
> > against silent data corruption, and in theory the write caching
> > offered by adding log devices that could help with small random
> > writes (although there are other problems with ZFS + synchronous
> > workloads that stop this from actually materializing).
> >
> > -Aaron
> >
--
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss