Hi Pete,
Thanks for the help.
>
> Looks good. By the way, once this starts working, you may want to
> grab CVS head rather than 2.6.3 as it has the important configure
> option "--disable-tcp" that is required to get good IB performance.
>
Ok, I will give CVS a try. I am assuming I only need the "--disable-tcp"
option on the client side, right? I need to have both TCP and IB on the
server.
>
> You have three IB cards in the client? Wow. You might take a look
> at the output of "ibv_devices" and "ibv_devinfo" to see if the first
> NIC is the one that is connected to your server. Please send me the
> output too. If not, we will have to add code to let you specify an
> interface name to PVFS.
>
> The error comes from this line (simplified):
>
> mr = ibv_reg_mr(pd, buf, len, IBV_ACCESS_LOCAL_WRITE |
> IBV_ACCESS_REMOTE_WRITE | IBV_ACCESS_REMOTE_READ);
>
> where len is 20 * 8 kB == 160 kB. That's a pretty small amount
> of memory. This is the first memory registration call on the
> client.
>
> Maybe the output from ibv_devinfo will tell us that you have some
> interesting NIC that perhaps doesn't support the remote access flags,
> or some other clue.
>
Here is the output from ifconfig on the client:
ib0   Link encap:UNSPEC  HWaddr 80-00-04-04-FE-80-00-00-00-00-00-00-00-00-00-00
      inet6 addr: fe80::208:f104:398:2991/64 Scope:Link
      inet addr:10.1.1.2  Bcast:10.1.255.255  Mask:255.255.0.0
      UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
ib1   Link encap:UNSPEC  HWaddr 80-00-04-04-FE-80-00-00-00-00-00-00-00-00-00-00
      inet addr:10.2.1.2  Bcast:10.2.255.255  Mask:255.255.0.0
      inet6 addr: fe80::208:f104:398:2aed/64 Scope:Link
      UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
ib2   Link encap:UNSPEC  HWaddr 80-00-04-04-FE-80-00-00-00-00-00-00-00-00-00-00
      inet addr:10.3.1.2  Bcast:10.3.255.255  Mask:255.255.0.0
      inet6 addr: fe80::208:f104:398:dc5/64 Scope:Link
      UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
The PVFS server is at IPoIB address 10.1.1.106, so the client should be
connecting via ib0, i.e. mthca0, the first device, as expected. Here is
the output of ibv_devices and ibv_devinfo:
hpca4000:/test # ibv_devices
device node GUID
------ ----------------
mthca0 0008f10403982990
mthca1 0008f10403982aec
mthca2 0008f10403980dc4
hpca4000:/test # ibv_devinfo
hca_id: mthca0
    fw_ver:          1.0.800
    node_guid:       0008:f104:0398:2990
    sys_image_guid:  0008:f104:0398:2993
    vendor_id:       0x08f1
    vendor_part_id:  25204
    hw_ver:          0xA0
    board_id:        VLT0050010001
    phys_port_cnt:   1
        port: 1
            state:       PORT_ACTIVE (4)
            max_mtu:     2048 (4)
            active_mtu:  2048 (4)
            sm_lid:      1
            port_lid:    2
            port_lmc:    0x00

hca_id: mthca1
    fw_ver:          1.0.800
    node_guid:       0008:f104:0398:2aec
    sys_image_guid:  0008:f104:0398:2aef
    vendor_id:       0x08f1
    vendor_part_id:  25204
    hw_ver:          0xA0
    board_id:        VLT0050010001
    phys_port_cnt:   1
        port: 1
            state:       PORT_ACTIVE (4)
            max_mtu:     2048 (4)
            active_mtu:  2048 (4)
            sm_lid:      1
            port_lid:    13
            port_lmc:    0x00

hca_id: mthca2
    fw_ver:          1.0.800
    node_guid:       0008:f104:0398:0dc4
    sys_image_guid:  0008:f104:0398:0dc7
    vendor_id:       0x08f1
    vendor_part_id:  25204
    hw_ver:          0xA0
    board_id:        VLT0050010001
    phys_port_cnt:   1
        port: 1
            state:       PORT_ACTIVE (4)
            max_mtu:     2048 (4)
            active_mtu:  2048 (4)
            sm_lid:      1
            port_lid:    14
            port_lmc:    0x00
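
To double-check which interface and HCA should carry the traffic to the
server, here is a minimal Python sketch (not part of PVFS) that uses only
the addresses and GUIDs pasted in this message. The helper names
interface_for and device_for are made up for illustration. The matching
rule in device_for (port GUID equals node GUID plus one, ignoring the
universal/local bit of the IPv6 interface identifier) is an assumption
read off this output, not something confirmed for other adapters.

```python
import ipaddress

# Values copied from the ifconfig and ibv_devices output in this message:
# interface name -> (IPv4 address, netmask, IPv6 link-local address).
INTERFACES = {
    "ib0": ("10.1.1.2", "255.255.0.0", "fe80::208:f104:398:2991"),
    "ib1": ("10.2.1.2", "255.255.0.0", "fe80::208:f104:398:2aed"),
    "ib2": ("10.3.1.2", "255.255.0.0", "fe80::208:f104:398:dc5"),
}
NODE_GUIDS = {
    "mthca0": 0x0008F10403982990,
    "mthca1": 0x0008F10403982AEC,
    "mthca2": 0x0008F10403980DC4,
}
SERVER = "10.1.1.106"

# Universal/local bit of a 64-bit interface identifier. It is set in the
# link-local addresses above but clear in the node GUIDs, so it is masked
# out on both sides before comparing.
UL_BIT = 0x0200000000000000


def interface_for(server, interfaces):
    """Return the interface whose IPv4 subnet contains server, or None."""
    addr = ipaddress.ip_address(server)
    for name, (ip, mask, _link_local) in interfaces.items():
        network = ipaddress.ip_network(f"{ip}/{mask}", strict=False)
        if addr in network:
            return name
    return None


def device_for(link_local, node_guids):
    """Return the HCA whose node GUID + 1 matches link_local, or None.

    Assumption: the port 1 GUID is the node GUID plus one, as the output
    above suggests for these three cards.
    """
    low64 = int(ipaddress.IPv6Address(link_local)) & 0xFFFFFFFFFFFFFFFF
    for name, guid in node_guids.items():
        if (guid + 1) & ~UL_BIT == low64 & ~UL_BIT:
            return name
    return None


if __name__ == "__main__":
    iface = interface_for(SERVER, INTERFACES)
    print(iface, device_for(INTERFACES[iface][2], NODE_GUIDS))
```

Under those assumptions, the server address falls in the ib0 subnet, and
the ib0 link-local address lines up with the mthca0 node GUID.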
> > this is the message I see on the server as it segfaults:
> > ----------------------------------------------------------
> > [E 09/05 12:39] Warning: exchange_data: partial read, 1/12 bytes.
> > Segmentation fault
>
> It's just complaining that the client died. But it shouldn't SEGV.
> I'll take a look at that.
>
This only happens when I try to use pvfs2-cp. If I run an application,
cp, or even dd on the pvfs2 file system, things seem to behave as
expected.
Thanks
Rene