hello,

2010/6/19 Dotan Barak <[email protected]>:
>
>> I call rdma_create_id to create an ib id, then do resolve remote addr,
>> resolve route work, then
>> setup qp and call rdma_connect to setup connection, before ack or
>> error replies, the thread will
>> wait on a wait queue. The listening ib id of remote node will catch
>> the connect request,
>> setup qp, allocate and map pages to construct the RDMA-WRITE space,
>> and call rdma_accept to reply
>> the request.
>>
>> Some other information which may be useful:
>> 1. All the "RETRY EXCEEDED" problems happened when there were two
>> connections which use RDMA-WRITE to transfer things.
>> And the latter connection had a high possibility to get into this problem.
>> 2. All the "RETRY EXCEEDED" problems happened when the RDMA-WRITE
>> space is 256MB each (that is, for two connections, consumes 512MB mem),
>> when the RDMA-WRITE space is 64MB, this problem never happened in our
>> test. Remote node's total memory is 2GB.
>>
>> Thanks a lot.
>>
>
> Some more questions:
> * Is the WR that "produces" the RETRY EXCEEDED the first one/last one/in
> the middle?
It's the first one.

> * Which values are you using in the QP context for retry exceeded counter +
> retry timeout?
> * Did you try to increase those values?

I haven't set these values (actually, I don't know where to set them); I
only set the max_send_wr and max_send_sge fields of struct ib_qp_cap when
creating the QP.

> * How many more QPs do you have between those nodes and which operations do
> they use
> (only RDMA-WRITEs?)
>

4096 QPs for each connection, and they only do RDMA-WRITEs.

> Thanks
> Dotan
>

-- 
Ding Dinghua
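
P.S. On where the retry values asked about above could be set: when the connection is made through the RDMA CM, the retry counters are the retry_count and rnr_retry_count fields of the struct rdma_conn_param passed to rdma_connect. As far as I know, the retry timeout is not one of that struct's fields. Below is a minimal sketch only, not our real code. The struct conn_param_sketch, SKETCH_MAX_RETRY, clamp_retry and fill_conn_param are hypothetical names that stand in for the real definition so the fragment compiles on its own, and the value 1 for responder_resources and initiator_depth is just a placeholder.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in mirroring only a few fields of struct rdma_conn_param.
 * Real code would use the definition from the rdma_cm headers instead. */
struct conn_param_sketch {
    uint8_t responder_resources;
    uint8_t initiator_depth;
    uint8_t retry_count;     /* transport retries, 3-bit value: 0..7 */
    uint8_t rnr_retry_count; /* RNR NAK retries, 3-bit value: 0..7 (7 = retry indefinitely) */
};

#define SKETCH_MAX_RETRY 7u  /* both counters are 3-bit values */

/* Clamp a requested retry value into the 3-bit range. */
static uint8_t clamp_retry(unsigned int v)
{
    return (uint8_t)(v > SKETCH_MAX_RETRY ? SKETCH_MAX_RETRY : v);
}

/* Fill the parameters that would be handed to rdma_connect(). */
static void fill_conn_param(struct conn_param_sketch *p,
                            unsigned int retry, unsigned int rnr_retry)
{
    memset(p, 0, sizeof(*p));
    p->responder_resources = 1;  /* placeholder value, an assumption */
    p->initiator_depth = 1;      /* placeholder value, an assumption */
    p->retry_count = clamp_retry(retry);
    p->rnr_retry_count = clamp_retry(rnr_retry);
}
```

With the real struct, the filled parameters would be passed as the second argument of rdma_connect in place of the defaults we use now; fill_conn_param(&p, 7, 7) would request the largest values.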
