lihaidong wrote:
Mr. Wise,
I'm actually rewriting your program in order to get familiar with the
Verbs+CMA API. To make the procedure clearer, I put nearly everything
into two long functions, one for the server and one for the client;
the cq/cma event handlers are the only exceptions.
I got the program running step by step. I used DMA mode first. As the
mlx4 driver doesn't support MR/MW mode, I turned to FMR.
Before adding the FMR code, I want to make the local_dma_lkey option
work, i.e.
mem_mode=dma,local_dma_lkey
I ran into a strange problem here.
I changed the sgl's lkey to local_dma_lkey when preparing the recv,
send, and rdma write wrs.
The problem is: the server posts the RDMA Read wr, gets the completion,
and prints the data read from the client. This is all normal. But after
it posts a send wr to tell the client to go ahead, instead of receiving
an IB_WC_SEND wc, the cq event handler gets a completion whose status
is not 0, so it prints the following:
cq completion failed with wr_id 0 opcode 2 status 4 vendor_err 52<3>
The opcode is 2, so it is a completion for the RDMA read. Isn't that
weird? Why does it come again, and with an error status?
Status 4 means IB_WC_LOC_PROT_ERR. Is it a base/bounds violation?
How could this happen? The remote_len sent by the client is equal to
cb->size.
Maybe the opcode is not valid for error CQEs with mlx4? I seem to
remember that was the case for mthca. You could make the wr_ids in the
WRs unique, then match the wr_id in the failed CQE against the WR you
posted to verify this.
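
As a rough illustration of the wr_id suggestion, here is a minimal sketch of one way to build unique wr_ids. The tag names and the helpers wr_id_make(), wr_id_tag(), wr_id_seq() and wr_tag_name() are hypothetical and not part of the verbs API; the sketch only assumes that wr_id is a 64-bit value that comes back unchanged in the completion.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tags, one per kind of work request the test posts. */
enum wr_tag {
	WR_TAG_RECV = 1,
	WR_TAG_SEND,
	WR_TAG_RDMA_READ,
	WR_TAG_RDMA_WRITE,
};

#define WR_SEQ_BITS 56
#define WR_SEQ_MASK ((1ULL << WR_SEQ_BITS) - 1)

/* Pack the tag into the top 8 bits and a per-post sequence number below it. */
uint64_t wr_id_make(enum wr_tag tag, uint64_t seq)
{
	assert(seq <= WR_SEQ_MASK);	/* sequence must fit in the low 56 bits */
	return ((uint64_t)tag << WR_SEQ_BITS) | seq;
}

/* Recover the tag from a wr_id reported in a completion. */
enum wr_tag wr_id_tag(uint64_t wr_id)
{
	return (enum wr_tag)(wr_id >> WR_SEQ_BITS);
}

/* Recover the sequence number from a wr_id reported in a completion. */
uint64_t wr_id_seq(uint64_t wr_id)
{
	return wr_id & WR_SEQ_MASK;
}

/* Printable name for log messages in the completion handler. */
const char *wr_tag_name(enum wr_tag tag)
{
	switch (tag) {
	case WR_TAG_RECV:	return "recv";
	case WR_TAG_SEND:	return "send";
	case WR_TAG_RDMA_READ:	return "rdma_read";
	case WR_TAG_RDMA_WRITE:	return "rdma_write";
	}
	return "unknown";
}
```

With something like this, each WR would get wr_id = wr_id_make(tag, seq++) before it is posted, and the cq handler would print wr_tag_name(wr_id_tag(wc.wr_id)) for a failed completion rather than relying on the opcode field.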
LOC_PROT_ERR usually means the MR doesn't have the appropriate access
rights.
PS: Why do recv_sgl and send_sgl use dma_mr->lkey while rdma_sgl uses
dma_mr->rkey?
Could recv_sgl use dma_mr->rkey, or rdma_sgl use dma_mr->lkey? Why?
For iWARP, the target or sink of a read must have remote write access.
So you must use the rkey if you want the code to run on both IB and
iWARP...
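
To make the access-rights point concrete, here is a small sketch of the rule as described above. The SK_* flags and the two helper functions are illustrative stand-ins invented for this example, not the kernel's definitions; real code would use the verbs access flags when registering the MR.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for the verbs access flags (hypothetical names). */
enum sketch_access {
	SK_LOCAL_WRITE  = 1 << 0,
	SK_REMOTE_WRITE = 1 << 1,
	SK_REMOTE_READ  = 1 << 2,
};

static_assert((SK_LOCAL_WRITE & SK_REMOTE_WRITE) == 0,
	      "access flags must be distinct bits");

/* Access needed on the buffer that receives RDMA Read data (the sink). */
unsigned int read_sink_access(bool iwarp)
{
	/* The adapter writes the read response into local memory. */
	unsigned int acc = SK_LOCAL_WRITE;

	/* On iWARP the sink must also allow remote write. */
	if (iwarp)
		acc |= SK_REMOTE_WRITE;
	return acc;
}

/* Superset of both cases, so one registration works on IB and iWARP. */
unsigned int read_sink_access_portable(void)
{
	return read_sink_access(false) | read_sink_access(true);
}
```

Under these assumptions, a sink MR registered with only local write would be enough on IB but would fail on iWARP, which is why the read sgl carries the rkey of an MR that also grants remote write.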
Steve.