On Thu, Aug 15, 2013 at 11:08 AM, Wendy Cheng <[email protected]> wrote:
> On Thu, Aug 15, 2013 at 5:46 AM, Tom Talpey <[email protected]> wrote:
>> On 8/14/2013 8:14 PM, Wendy Cheng wrote:
>>>
>>> Longer version of the question:
>>> I'm trying to enable NFS-RDMA on an embedded system (based on 2.6.38
>>> kernel) as a client. The IB stacks are taken from OFED 1.5.4. NFS
>>> server is a RHEL 6.3 Xeon box. The connection uses mellox-4 driver.
>>> Memory registration is "RPCRDMA_ALLPHYSICAL". There are many issues so
>>> far but I do manage to get nfs mount working. Simple file operations
>>> (such as "ls", file read/write, "scp", etc) seem to work as well.
>>

Yay ... got this working, amazingly, on a uOS that lacks most of the
conventional kernel debug facilities.

The hang was caused by auto-disconnect, triggered by xprt->timer. The
work is done by xprt_init_autodisconnect(), which silently disconnects
the xprt without any sensible warning. The uOS runs on small-core
(slower) hardware. Instead of a hard-coded number, this timeout value
needs to be at least a "proc" tunable. I will check newer kernels to
see whether this has been improved and/or draft a patch later.
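
For illustration, the idle auto-disconnect behaviour described above can
be modelled in a few lines of userspace C. This is only a sketch under
stated assumptions, not the kernel code: the struct and the model_*
names are hypothetical, time is a plain tick counter, and idle_timeout
is a field precisely so that it can be tuned rather than hard-coded.

```c
#include <stdbool.h>

/* Hypothetical userspace model of an idle auto-disconnect timer.
 * None of these names are kernel symbols. */
struct model_xprt {
    unsigned long last_used;    /* tick of the last completed request */
    unsigned long idle_timeout; /* tunable: idle ticks allowed before disconnect */
    unsigned int  pending;      /* requests still waiting for a reply */
    bool          connected;
};

/* Called when a request completes: drop it from the pending count
 * and record the activity time. */
static void model_request_done(struct model_xprt *x, unsigned long now)
{
    if (x->pending > 0)
        x->pending--;
    x->last_used = now;
}

/* Called when the idle timer fires. Returns true only if this call
 * disconnected the transport. */
static bool model_idle_check(struct model_xprt *x, unsigned long now)
{
    if (!x->connected || x->pending > 0)
        return false;               /* busy or already down: nothing to do */
    if (now - x->last_used < x->idle_timeout)
        return false;               /* not idle long enough yet */
    x->connected = false;           /* the "silent" disconnect */
    return true;
}
```

On slow hardware the point is that idle_timeout has to be large enough
to cover the gaps between requests; with a fixed value the transport can
be dropped even though the workload is still running.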

One thing I'm still scratching my head over: looking at the raw IOPS,
I don't see a dramatic difference between NFS-RDMA and NFS over IPoIB
(TCP). However, the total run time differs greatly: NFS over RDMA
takes much longer to finish than NFS over IPoIB. I'm not sure why.
Maybe it is the constant connect/disconnect triggered by
reestablish_timeout? Connection re-establishment is known to be
expensive on this uOS. Why do we need two sets of timeouts, where
1. xprt->timer disconnects (without reconnecting)?
2. reestablish_timeout constantly disconnects/reconnects?
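
To make the second timeout concrete: a reconnect delay is commonly
implemented as a doubling backoff with a floor and a ceiling. The sketch
below shows that general pattern only. Whether reestablish_timeout in
this kernel's RDMA transport follows it is an assumption that has not
been checked against the source, and the function name and the 5 and 30
second bounds are hypothetical.

```c
/* Hypothetical bounds, in seconds, chosen only for illustration. */
#define MODEL_MIN_DELAY 5UL
#define MODEL_MAX_DELAY 30UL

/* Given the delay used for the previous reconnect attempt, return the
 * delay to use for the next one: double it, then clamp it into
 * [MODEL_MIN_DELAY, MODEL_MAX_DELAY]. */
static unsigned long model_next_reconnect_delay(unsigned long cur)
{
    unsigned long next = cur * 2;

    if (next < MODEL_MIN_DELAY)
        next = MODEL_MIN_DELAY;
    if (next > MODEL_MAX_DELAY)
        next = MODEL_MAX_DELAY;
    return next;
}
```

If the transport behaves like this, every reconnect waits at least the
floor value before it is even attempted, so repeated idle disconnects
could add seconds of dead time per cycle without lowering the IOPS
measured while the connection is up.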

-- Wendy