Well, I did not plan to test every possible kernel version. Improvements are 
surely on their way, which only confirms the assumption that this 'technology' 
is not mature yet.

With IPoIB, an NFS server can easily export up to 1.2GB/s, for instance (at 
least this is what I can measure), with the data in the page cache. No problem 
up to that point at least.
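
For anyone who wants to reproduce this kind of figure, here is a minimal sketch 
of a sequential read timing in Python. It is not the tool used for the numbers 
in this thread (those came from IOzone and similar tests); the function name 
and the 1MB block size are arbitrary choices for illustration. Run against a 
file on the NFS mount that is already in the server's page cache, it mostly 
measures the transport rather than the disks, provided the client has not 
cached the file as well.

```python
import time


def sequential_read_mb_per_s(path, block_size=1 << 20):
    """Read `path` front to back and return the observed rate in MB/s.

    Illustrative helper only: block_size is an assumed default, and the
    result includes whatever caching the client and server apply.
    """
    total_bytes = 0
    start = time.perf_counter()
    # buffering=0 avoids an extra Python-level buffer on top of the OS.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:  # empty read means end of file
                break
            total_bytes += len(chunk)
    elapsed = time.perf_counter() - start
    if elapsed <= 0:
        return float("inf")
    return (total_bytes / 1e6) / elapsed
```

To keep the comparison fair, the same file and the same block size should be 
used for every transport under test.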
I clearly understand the theoretical benefits of RDMA, and it is a clear 
improvement over TCP for MPI. However, for MPI the drastic change is even more 
on the latency side, though the peak message bandwidth is also improved, as one 
might expect for NFS.
Registration/deregistration issues are also well known to MPI developers, and 
all this is certainly not that easy to manage in other areas.

Still, NFS-RDMA remains NFS. If the bottleneck is not in the transport, nothing 
will be improved by RDMA from the performance point of view.
Even worse, what I saw with the 2.6.27 kernel + OFED1.4-rc3 is the inability of 
NFS-RDMA to match the performance of NFS-TCP for some IOzone patterns, with a 
filesystem that can itself sustain several hundred MB/s (using exactly the same 
hardware and software in both cases). We are far from a pure IB bandwidth issue 
here; we are probably facing an issue in how the requests are handled, perhaps 
when paging occurs, I can't tell.
I could not find any tuning to solve the most obvious problem, i.e. the low 
bandwidth for reading, except mounting with '-o rsize=4096'; probably not what 
people expect, as this will have other effects. In any case, this improves only 
the sequential read bandwidth.
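
To make the comparison concrete, here is a small Python sketch that builds the 
two mount command lines with identical options apart from the transport. The 
'rdma,port=20049' options follow the kernel's NFS/RDMA documentation and 
'proto=tcp' is the usual NFS mount option; the host name, export path and mount 
point are placeholders, not the ones used in my tests. The function only builds 
the argument list and does not run anything.

```python
def mount_command(server, export, mountpoint, transport, rsize=None):
    """Return the argv list for an NFS mount over the given transport.

    Hypothetical helper for illustration: it only assembles the command,
    so the caller decides whether and how to execute it (as root).
    """
    if transport == "rdma":
        # Options as given in the kernel NFS/RDMA documentation.
        options = ["rdma", "port=20049"]
    elif transport == "tcp":
        options = ["proto=tcp"]
    else:
        raise ValueError(f"unknown transport: {transport!r}")
    if rsize is not None:
        # Same rsize on both transports keeps the comparison fair.
        options.append(f"rsize={rsize}")
    return ["mount", "-t", "nfs", "-o", ",".join(options),
            f"{server}:{export}", mountpoint]


# Placeholder names; substitute the real server, export and mount point.
for transport in ("tcp", "rdma"):
    print(" ".join(mount_command("server-ib", "/export", "/mnt/test",
                                 transport, rsize=4096)))
```

Dropping the rsize argument gives the default mounts, which is the case where I 
saw the low read bandwidth.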
But of course I will repeat my tests with the latest release of everything when 
I have time, still making sure I compare apples to apples...
Again, I'm sure improvements are on their way!

Fred.


-----Original Message-----
From: Talpey, Thomas [mailto:[EMAIL PROTECTED]
Sent: Tuesday, 11 November, 2008 17:02
To: Ciesielski, Frederic (EMEA HPC&OSLO CC)
Cc: Jeff Becker; [email protected]
Subject: RE: [ofa-general] NFS-RDMA (OFED1.4) with standard distributions ?

At 11:27 AM 11/10/2008, Ciesielski, Frederic (EMEA HPC&OSLO CC) wrote:
>That's great, thanks.
>
>I ran some tests with the 2.6.27 kernel as server and client, and
>basically it works fine.
>
>I could not find yet any situation where NFS-RDMA would outperform
>NFS/IPoIB, at least when you compare apples to apples (same clients,
>same server, same protocol, and not just write to/read from the
>caches), and it even seems to have severe performance issues for
>reading with files larger than the memory size of the client and the server.
>Hopefully this will improve when more users will be able to give
>valuable feedback...

I have a couple of questions, and perhaps suggestions as well.
First the questions...

- Have you tried with a 2.6.28-rc4 client and server at all? There are a number 
of significant NFS/RDMA improvements queued in kernel.org, especially around 
RDMA memory registration as well as RDMA operation scheduling. We've seen some 
significant throughput improvement even for basic tunings.

- What type of storage are you using at the server, and have you attempted to 
tune the server at all? For example, if you are storage
(spindle) limited, no network tuning is likely to help and you should address 
that first. Also, there are tunings such as nfsd thread count, export options, 
and adapter choice that can make a large difference.

Bottom line, you should be able to reach multi-hundred-MB/sec of read/write 
throughput with NFS/RDMA, but there may be issues on specific systems, or 
perhaps with the OFED1.4 code, that need to be accounted for. If possible, you 
may want to set expectations based on mainline, then try to duplicate them in 
the OFED backport.
The current OFED NFS/RDMA support is still evolving, while we consider the 
mainline kernel.org version to be rather solid.

Tom.

>
>Fred.
>
>-----Original Message-----
>From: Jeff Becker [mailto:[EMAIL PROTECTED]
>Sent: Saturday, 08 November, 2008 22:35
>To: Ciesielski, Frederic (EMEA HPC&OSLO CC)
>Cc: [email protected]
>Subject: Re: [ofa-general] NFS-RDMA (OFED1.4) with standard distributions ?
>
>Ciesielski, Frederic (EMEA HPC&OSLO CC) wrote:
>> Is there any chance that the new NFS-RDMA features coming with OFED
>> 1.4 work with standard and current distributions, like RHEL5, SLES10 ?
>Not yet, but I'm working on it. I intend for NFSRDMA to work on 2.6.27
>and 2.6.26 for OFED 1.4. The RHEL5 and SLES10 backports will likely be
>done for OFED 1.4.1. Thanks.
>
>-jeff
>
>> Did anybody test this, or would pretend it is supposed to work ?
>>
>> I mean without building a 2.6.27 or equivalent kernel on top of it,
>> keeping almost full support from the vendors.
>>
>> Enhanced kernel modules may not be sufficient to work around the
>> limitations of old kernels...
>>
>>
>>

_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general