Hello list,
I apologize if I am intruding on a development-only mailing list with my
questions, but this is the only mailing list covering InfiniBand and
Linux that I was able to find.
Let me tell you why I am writing this email - we have one dual-GPU
server with an InfiniBand HCA in it. In the future we would like to test
GPU-to-GPU communication between two or more hosts through the IB HCA, but
for now we just want to measure how long a packet needs to travel from
system memory / GPU memory to the IB HCA.
I think this is achievable on a single host by using the loopback
capabilities of the InfiniBand HCA. The problem is that I was not able to
find a comprehensive description of how one sets up such a loopback
operation on the HCA chip.
The only thing I have found in this regard is a snippet from a rather old
SunVTS 6.2 Test Reference Manual for x86:
<CITE>
The HCA supports internal loopback for packets transmitted between QPs
that are assigned to the same HCA port. If a packet is being transmitted
to a DLID that is equivalent to the Port LID with the LMC bits masked out
or the packet DLID is a multicast LID, the packet goes on the loopback
path. In this latter case, the packet also is transmitted to the fabric.
In the inbound direction, the ICRC and VCRC checks are blindly passed for
looped back packets. Note that internal loopback is supported only for
packets that are transmitted and received on the same port. Packets that
are transmitted on one port and received on another port are transmitted
to the fabric. The fabric directs these packets to the destination port.
<ENDCITE>
I don't know whether this is still true (or true at all) for our HCA (a
Mellanox ConnectX dual-port QDR MT25408 chip). Can someone with
experience in setting up such a loopback shed some light on this?
Another question - must there be a subnet manager running on the box so
that the port(s) get configured properly, or does the loopback operation
of the HCA not require one?
I have dug through the examples in OFED-1.4/src/perftest-1.2 and with
their help have so far managed to create a single-threaded program which
successfully opens the HCA and sets up two different QPs. Unfortunately
the program crashes with a segmentation fault right at the beginning of
the data transfer between the two QPs, and I am wondering whether this is
due to the lack of a subnet manager, wrong (or missing) configuration, or
just my awesome programming skills (see end of mail).
You can find the source of my program here:
http://www.ifh.de/~boyanov/gpeIBloopback.cc
http://www.ifh.de/~boyanov/gpeIBloopback.h
Any ideas, comments or suggestions regarding the questions described above
are highly appreciated! Please let me know if anything does not make sense
or you need more information on the subject.
With best regards,
Konstantin Boyanov
# uname -a
Linux gpu1.ifh.de 2.6.18-194.26.1.el5 #1 SMP Tue Nov 9 12:46:16 EST 2010
x86_64 x86_64 x86_64 GNU/Linux
# ibv_devinfo:
hca_id: mlx4_0
transport: InfiniBand (0)
fw_ver: 2.7.626
node_guid: 0002:c903:000b:e242
sys_image_guid: 0002:c903:000b:e245
vendor_id: 0x02c9
vendor_part_id: 26428
hw_ver: 0xB0
board_id: MT_0D90110009
phys_port_cnt: 1
port: 1
state: PORT_DOWN (1)
max_mtu: 2048 (4)
active_mtu: 2048 (4)
sm_lid: 0
port_lid: 0
port_lmc: 0x00
Output from GDB:
################
Starting program: /user/b/boyanov/workspace/GPEubench/src/ibloop
--len-min=1024 --len-max=8192 --len-inc=1024 --nmeas=1 --npass=1 --conn=0
--txdpth=64 --port=1
[Thread debugging using libthread_db enabled]
optLenMin = 1024, optLenMax = 8192, optLenInc = 1024, optNmeas = 1,
optNpass = 1
# dev_name: uverbs0
# dev_path: /sys/class/infiniband_verbs/uverbs0
# ibdev_path: /sys/class/infiniband/mlx4_0
# name: mlx4_0
Data fields in ibv_device_attr:
atomic_cap = 1
device_cap_flags = 7117942
local_ca_ack_delay = 15
max_ah = 0
max_cq = 65408
max_cqe = 4194303
max_ee = 0
max_ee_init_rd_atom = 0
max_ee_rd_atom = 0
max_fmr = 0
max_map_per_fmr = 8191
max_mcast_grp = 8192
max_mcast_qp_attach = 56
max_mr = 524272
max_mr_size = 18446744073709551615
max_mw = 0
max_pd = 32764
max_pkeys = 128
max_qp = 261824
max_qp_init_rd_atom = 128
max_qp_rd_atom = 16
max_qp_wr = 16351
max_raw_ethy_qp = 1
max_raw_ipv6_qp = 0
max_rdd = 0
max_res_rd_atom = 4189184
max_sge = 32
max_sge_rd = 0
max_srq = 65472
max_srq_sge = 31
max_srq_wr = 16383
max_total_mcast_qp_attach = 458752
node_guid = 4819426645931262464
page_size_cap = 4294966784
phys_port_cnt = 1
sys_image_guid = 5035599428045046272
vendor_id = 713
vendor_part_id = 26428
qp_state = 1
path_mig_state = 0
qkey = 286331153
rq_psn = 0
sq_psn = 1441792
dest_qp_num = 0
qp_access_flags = 352
pkey_index = 0
alt_pkey_index = 0
en_sqd_async_notify = 55
sq_draining = 0
max_rd_atomic = 0
max_dest_rd_atomic = 0
min_rnr_timer = 0
port_num = 1
timeout = 0
retry_cnt = 0
rnr_retry = 0
alt_port_num = 0
alt_timeout = 0
qp_state = 1
path_mig_state = 0
qkey = 286331153
rq_psn = 0
sq_psn = 1441792
dest_qp_num = 0
qp_access_flags = 352
pkey_index = 0
alt_pkey_index = 0
en_sqd_async_notify = 170
sq_draining = 0
max_rd_atomic = 0
max_dest_rd_atomic = 0
min_rnr_timer = 0
port_num = 1
timeout = 0
retry_cnt = 0
rnr_retry = 0
alt_port_num = 0
alt_timeout = 0
QP number = 2097225
QP handle = 0
QP state = 1
QP type = 4
QP events completed = 0
QP number = 2097226
QP handle = 1
QP state = 1
QP type = 4
QP events completed = 0
set the send work request fields
set the receive work request fields
local address: LID 0000 QPN 0x200049 PSN 0x204a16 RKEY
0x000000b0041c24 VADDR 0x00000000606010
remote address: LID 0000 QPN 0x20004a PSN 0x442a26 RKEY 0x000000b0041c24
VADDR 0x00000000606010
PING
Program received signal SIGSEGV, Segmentation fault.
0x00002aaaab006037 in ibv_cmd_create_qp () from
/usr/lib64/libmlx4-rdmav2.so
(gdb) bt
#0 0x00002aaaab006037 in ibv_cmd_create_qp () from
/usr/lib64/libmlx4-rdmav2.so
#1 0x00000000004010ba in ibv_post_send (qp=0x605da0, wr=0x7fffffffdf10,
bad_wr=0x7fffffffe0b0) at /usr/include/infiniband/verbs.h:1000
#2 0x000000000040270b in main (argc=9, argv=0x7fffffffe1f8) at
gpeIBloopback.cc:557
Konstantin Boyanov
DESY Zeuthen, Platanenallee 6, 15738 Zeuthen
Tel.:+49(33762)77178
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html