I'm having issues with nfs-rdma. The server has a Mellanox QDR card in it:

# ibv_devinfo
hca_id: mlx4_0
        transport:                      InfiniBand (0)
        fw_ver:                         2.7.626
        node_guid:                      0002:c903:0007:f352
        sys_image_guid:                 0002:c903:0007:f355
        vendor_id:                      0x02c9
        vendor_part_id:                 26428
        hw_ver:                         0xB0
        board_id:                       MT_0D90110009
        phys_port_cnt:                  1
                port:   1
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                2048 (4)
                        active_mtu:             2048 (4)
                        sm_lid:                 1
                        port_lid:               1
                        port_lmc:               0x00

The client is a Supermicro motherboard with the IB QDR built in:

# ibv_devinfo
hca_id: mlx4_0
        transport:                      InfiniBand (0)
        fw_ver:                         2.7.700
        node_guid:                      0025:90ff:ff16:0240
        sys_image_guid:                 0025:90ff:ff16:0243
        vendor_id:                      0x02c9
        vendor_part_id:                 26428
        hw_ver:                         0xB0
        board_id:                       SM_2121000001000
        phys_port_cnt:                  1
                port:   1
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                2048 (4)
                        active_mtu:             2048 (4)
                        sm_lid:                 1
                        port_lid:               6
                        port_lmc:               0x00

The 2.7.700 firmware is brand new from Supermicro. Prior to this it
was 2.7.0, and it showed the same behavior.
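
Both ports show ACTIVE, and for what it's worth, this is roughly how I
sanity-check plain RDMA connectivity between the two boxes (rping from
librdmacm-utils; server side first, then the client against the server's
IPoIB address):

# on the server
rping -s -v
# on the client
rping -c -a 192.168.0.100 -C 10 -v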

NFS is working for the most part, but the client always logs messages
like the ones below after a modestly sizable transfer (like 2 GB) with dd:

# dd if=/dev/zero of=node4.dat bs=1M count=2K oflag=direct
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB) copied, 11.7277 seconds, 183 MB/s
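
For completeness, the export is mounted over RDMA on the usual port,
roughly like this (/export and /mnt/export here are placeholders for the
real paths):

# on the server, after nfsd is up
echo rdma 20049 > /proc/fs/nfsd/portlist
# on the client
mount -o rdma,port=20049 192.168.0.100:/export /mnt/export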

messages.log:

Nov 15 09:39:00 node4 kernel: rpcrdma: connection to
192.168.0.100:20049 on mlx4_0, memreg 5 slots 32 ird 16

and then 5 minutes later (-103 looks like -ECONNABORTED):

Nov 15 09:44:00 node4 kernel: rpcrdma: connection to
192.168.0.100:20049 closed (-103)

The server has 16 GB of RAM and the client has 32 GB. Both are running
CentOS 5.5 with a 2.6.35.4 kernel.

I have tried changing memreg to 6, and it gives the corresponding
message (memreg 6 instead of 5) with the same result.
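
For reference, by memreg I mean the xprtrdma memreg_strategy option,
which I set at module load time and check in sysfs (assuming the
parameter name is the same on this kernel):

# on the client, before xprtrdma is loaded / the share is mounted
echo "options xprtrdma memreg_strategy=6" >> /etc/modprobe.conf
# verify after remounting
cat /sys/module/xprtrdma/parameters/memreg_strategy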

Thanks for any help.

Steve