Hi,

I'm working with Celine to get NFS/RDMA working. We installed new firmware (2.6.636). We still have the problem, but we now have more information on the client side.

- With the workaround (memreg 6) we can mount without any problem. We can read a file, but if we try to create a file with dd, the application hangs and we then have to run 'umount -f'. There is no message on the server. Message on the client:

rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 6 slots 32 ird 16
rpcrdma: connection to 192.168.0.215:2050 closed (-103)


- With fast registration:

There is no message on the server. Client dmesg output with fast registration:


rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 5 slots 32 ird 16
rpcrdma: connection to 192.168.0.215:2050 closed (-103)
rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 5 slots 32 ird 16
------------[ cut here ]------------
WARNING: at kernel/softirq.c:136 local_bh_enable_ip+0x3c/0x92()
Modules linked in: xprtrdma autofs4 hidp nfs lockd nfs_acl rfcomm l2cap bluetooth sunrpc iptable_filter ip_tables ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables cpufreq_ondemand acpi_cpufreq freq_table rdma_ucm ib_sdp rdma_cm iw_cm ib_addr ib_ipoib ib_cm ib_sa ipv6 ib_uverbs ib_umad iw_nes ib_ipath ib_mthca dm_multipath scsi_dh raid0 sbs sbshc battery acpi_memhotplug ac parport_pc lp parport mlx4_ib ib_mad ib_core e1000e sr_mod joydev cdrom mlx4_core i5000_edac edac_core shpchp rtc_cmos sg pcspkr rtc_core rtc_lib i2c_i801 i2c_core serio_raw button dm_snapshot dm_zero dm_mirror dm_log dm_mod usb_storage ata_piix libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]
Pid: 0, comm: swapper Not tainted 2.6.27_ofa_compil #2

Call Trace:
 <IRQ>  [<ffffffff80235b8d>] warn_on_slowpath+0x51/0x77
 [<ffffffff80229b79>] __wake_up+0x38/0x4f
 [<ffffffff80246d57>] __wake_up_bit+0x28/0x2d
 [<ffffffffa05485af>] rpc_wake_up_task_queue_locked+0x223/0x24b [sunrpc]
 [<ffffffffa054861e>] rpc_wake_up_status+0x47/0x82 [sunrpc]
 [<ffffffff80239c49>] local_bh_enable_ip+0x3c/0x92
 [<ffffffffa0638fd1>] rpcrdma_conn_func+0x6d/0x7c [xprtrdma]
 [<ffffffffa063b316>] rpcrdma_qp_async_error_upcall+0x45/0x5a [xprtrdma]
 [<ffffffffa0294bb3>] mlx4_ib_qp_event+0xf9/0x100 [mlx4_ib]
 [<ffffffff802443da>] __queue_work+0x22/0x32
 [<ffffffffa01fc5d4>] mlx4_qp_event+0x8a/0xad [mlx4_core]
 [<ffffffffa01f50a5>] mlx4_eq_int+0x55/0x291 [mlx4_core]
 [<ffffffffa01f52f0>] mlx4_msi_x_interrupt+0xf/0x16 [mlx4_core]
 [<ffffffff802624f4>] handle_IRQ_event+0x25/0x53
 [<ffffffff80263c0a>] handle_edge_irq+0xe3/0x123
 [<ffffffff8020e907>] do_IRQ+0xf1/0x15e
 [<ffffffff8020c381>] ret_from_intr+0x0/0xa
 <EOI>  [<ffffffffa0549c3e>] nul_marshal+0x0/0x20 [sunrpc]
 [<ffffffff80212474>] mwait_idle+0x41/0x45
 [<ffffffff8020abdf>] cpu_idle+0x7e/0x9c

---[ end trace 5cc994fbe7e141af ]---
rpcrdma: connection to 192.168.0.215:2050 closed (-103)
rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 5 slots 32 ird 16
rpcrdma: connection to 192.168.0.215:2050 closed (-103)


Thanks,

Diego

Vu Pham wrote:
Celine Bourde wrote:
We still have the same problem, even after changing the registration method.

mount does not return, and this is the dmesg output on the client:

rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 6 slots 32 ird 16
rpcrdma: connection to 192.168.0.215:2050 closed (-103)
rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 6 slots 32 ird 16
rpcrdma: connection to 192.168.0.215:2050 closed (-103)
ib0: multicast join failed for ff12:401b:ffff:0000:0000:0000:0000:0001, status -22
rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 6 slots 32 ird 16
rpcrdma: connection to 192.168.0.215:2050 closed (-103)

I still have one more question: if the firmware is the problem, why does NFS RDMA work with kernel 2.6.27.10, without OFED 1.4, on these same cards?

On 2.6.27.10 nfsrdma does not use fast registration work requests; therefore, it works well with ConnectX.

From 2.6.28 onward, nfsrdma started implementing/using fast registration work requests, and the change was committed without being verified on ConnectX.

I'm looking into those glitches/issues and trying to resolve them now.

-vu


Thanks,

Céline Bourde.

Tom Talpey wrote:
At 06:56 AM 4/27/2009, Celine Bourde wrote:
Thanks for the explanation.
Let me know if you have additional information.

We have a contact at Mellanox. I will contact him.

Thanks,

Céline.

Vu Pham wrote:
Celine,

I'm seeing mlx4 in the log, so it is ConnectX.

nfsrdma does not work with the official ConnectX firmware release 2.6.0 because of fast registration work request problems between nfsrdma and the firmware.

There is a very simple workaround if you don't have the latest mlx4 firmware.

Just set the client to use the all-physical memory registration mode. This avoids issuing the fast registration requests that the firmware advertises but does not properly support.

Before mounting, enter (as root)

    sysctl -w sunrpc.rdma_memreg_strategy=6

The client should work properly after this.

If you do have access to the fixed firmware, I recommend using the default
setting (5) as it provides greater safety on the client.
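For anyone scripting the workaround, a minimal sketch of checking and persisting the setting (this assumes the sunrpc module is loaded so its sysctls exist):

```shell
# Read back the current memory registration strategy
sysctl sunrpc.rdma_memreg_strategy

# Switch to all-physical mode (6) for the running system
sysctl -w sunrpc.rdma_memreg_strategy=6

# To keep it across reboots, add this line to /etc/sysctl.conf:
#   sunrpc.rdma_memreg_strategy = 6
```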

Tom.

We are currently debugging/fixing those problems.

Do you have a direct contact with a Mellanox field application engineer? If so, please contact him/her.
If not, I can send you a contact on a private channel.

thanks,
-vu

Hi Celine,

What HCA do you have on your system? Is it ConnectX? If yes, what is its firmware version?

-vu

Hey Celine,

Thanks for gathering all this info! So the RDMA connections work fine with everything _but_ nfsrdma. And errno 103 (ECONNABORTED) indicates the connection was aborted, maybe by the server (since no failures are logged by the client).
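Side note: the status in those rpcrdma messages is a negative errno, and 103 maps to ECONNABORTED. A quick way to confirm the mapping (assumes python3 is on the box):

```shell
# Print the symbolic name and message text for errno 103
python3 -c 'import errno, os; print(errno.errorcode[103], "-", os.strerror(103))'
```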


More below:


Celine Bourde wrote:
Hi Steve,

This email summarizes the situation:

Standard mount -> OK
---------------------

[r...@twind ~]# mount -o rw 192.168.0.215:/vol0 /mnt/
Command works fine.

rdma mount -> FAILS
-------------------

[r...@twind ~]# mount -o rdma,port=2050 192.168.0.215:/vol0 /mnt/
Command blocks! I have to press Ctrl+C to kill the process.

or

[r...@twind ofa_kernel-1.4.1]# strace mount.nfs 192.168.0.215:/vol0 /mnt/ -o rdma,port=2050
[..]
fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK)    = 0
connect(3, {sa_family=AF_INET, sin_port=htons(610), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
fcntl(3, F_SETFL, O_RDWR)               = 0
sendto(3, "-3\245\357\0\0\0\0\0\0\0\2\0\1\206\270\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0"..., 40, 0, {sa_family=AF_INET, sin_port=htons(610), sin_addr=inet_addr("127.0.0.1")}, 16) = 40
poll([{fd=3, events=POLLIN}], 1, 3000) = 1 ([{fd=3, revents=POLLIN}])
recvfrom(3, "-3\245\357\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 8800, MSG_DONTWAIT, {sa_family=AF_INET, sin_port=htons(610), sin_addr=inet_addr("127.0.0.1")}, [16]) = 24
close(3)                                = 0
mount("192.168.0.215:/vol0", "/mnt", "nfs", 0, "rdma,port=2050,addr=192.168.0.215"
..same problem

[r...@twind tmp]# dmesg
rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 5 slots 32 ird 16
rpcrdma: connection to 192.168.0.215:2050 closed (-103)
rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 5 slots 32 ird 16
rpcrdma: connection to 192.168.0.215:2050 closed (-103)
rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 5 slots 32 ird 16
rpcrdma: connection to 192.168.0.215:2050 closed (-103)
rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 5 slots 32 ird 16
rpcrdma: connection to 192.168.0.215:2050 closed (-103)


Is there anything logged on the server side?

Also, can you try this again, but on both systems do this before attempting the mount:

echo 32768 > /proc/sys/sunrpc/rpc_debug

This will enable all the RPC trace points and add a bunch of logging to /var/log/messages. Maybe that will show us something. I think the server is aborting the connection for some reason.
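A sketch of the capture procedure, to be run on both machines (writing 0 turns the tracing back off afterwards):

```shell
# Enable the RPC trace points before the mount attempt (needs root)
echo 32768 > /proc/sys/sunrpc/rpc_debug

# ... reproduce the hang, collect /var/log/messages, then disable tracing
echo 0 > /proc/sys/sunrpc/rpc_debug
```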

Steve.




_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general