On Tue, Oct 12, 2010 at 8:50 PM, Ralph Campbell
<[email protected]> wrote:
> On Tue, 2010-10-12 at 11:38 -0700, Bart Van Assche wrote:
>> Hello,
>>
>> Has anyone already tried to process the work completions generated by
>> a HCA after the state of a queue pair has been changed to IB_QPS_ERR ?
>> With the hardware/firmware/driver combination I have tested I have
>> observed the following:
>> * Multiple completions with the same wr_id and nonzero (error) status
>> were received by the application, while all work requests queued with
>> the flag IB_SEND_SIGNALED had a unique wr_id.
>> * Completions with non-zero (error) status and a wr_id / opcode
>> combination were received that were never queued by the application.
>> Note: some work requests were queued with and some without the flag
>> IB_SEND_SIGNALED. I'm not sure however whether that has anything to do
>> with the observed behavior.
>>
>> This behavior is easy to reproduce. If I interpret the InfiniBand
>> Architecture Specification correctly, this behavior is non-compliant.
>>
>> Has anyone been looking into this before ?
>
> I haven't seen it. It isn't supposed to happen.
>
> What hardware and software are you using and how do you
> reproduce it?

Hello Ralph and Or,

I reproduce that behavior by changing the state of a queue pair to
IB_QPS_ERR while RDMA transfers are in progress. The application, which
is multithreaded, performs RDMA by calling ib_post_recv() and
ib_post_send() (opcodes IB_WR_SEND, IB_WR_RDMA_READ and
IB_WR_RDMA_WRITE). I have observed this with the mlx4 driver, a
ConnectX HCA and firmware version 2.7.0.
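
For anyone who wants to check their own setup, here is a minimal sketch of
the kind of bookkeeping that can flag both symptoms (a completion for a
wr_id that was already completed, and a completion for a wr_id that was
never posted). It is not the code of the application described above. The
names wr_tracker, tracker_post() and tracker_complete() and the limit
MAX_TRACKED are made up for illustration. The sketch matches on wr_id only,
because the userspace ibv_poll_cq() documentation states that the opcode
field is not valid when the completion status is not success; I am assuming
the same caution is reasonable for the kernel API. It is also not
thread-safe, so a multithreaded caller would need to add a lock.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_TRACKED 64	/* hypothetical limit on tracked work requests */

enum slot_state { SLOT_FREE = 0, SLOT_POSTED, SLOT_COMPLETED };
enum wc_verdict { WC_EXPECTED, WC_DUPLICATE, WC_UNKNOWN };

/* Hypothetical bookkeeping structure, one slot per posted work request. */
struct wr_tracker {
	uint64_t wr_id[MAX_TRACKED];
	enum slot_state state[MAX_TRACKED];
};

static void tracker_init(struct wr_tracker *t)
{
	memset(t, 0, sizeof(*t));	/* SLOT_FREE is 0, so all slots are free */
}

/*
 * Record a work request just before it is posted. Every request is
 * recorded, whether or not it is signaled. Returns 0 on success and
 * -1 if the table is full.
 */
static int tracker_post(struct wr_tracker *t, uint64_t wr_id)
{
	for (int i = 0; i < MAX_TRACKED; i++) {
		if (t->state[i] == SLOT_FREE) {
			t->wr_id[i] = wr_id;
			t->state[i] = SLOT_POSTED;
			return 0;
		}
	}
	return -1;
}

/*
 * Classify a completion by its wr_id. Completed slots are kept rather
 * than freed so that a second completion for the same wr_id can be
 * recognized as a duplicate.
 */
static enum wc_verdict tracker_complete(struct wr_tracker *t, uint64_t wr_id)
{
	bool seen_completed = false;

	for (int i = 0; i < MAX_TRACKED; i++) {
		if (t->state[i] == SLOT_FREE || t->wr_id[i] != wr_id)
			continue;
		if (t->state[i] == SLOT_POSTED) {
			t->state[i] = SLOT_COMPLETED;
			return WC_EXPECTED;
		}
		seen_completed = true;
	}
	return seen_completed ? WC_DUPLICATE : WC_UNKNOWN;
}
```

The idea is to call tracker_post() with the wr_id of each work request
before handing it to ib_post_send() or ib_post_recv(), and to call
tracker_complete() with the wr_id of every polled completion. A result of
WC_DUPLICATE corresponds to the first symptom and WC_UNKNOWN to the second.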

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
