Hi,

I am developing in-kernel features that use RDMA for exchanging data.

I am currently testing the RDMA communication engine I wrote, and I am
seeing some strange behaviour.

Setup:
kernel: Linux 3.0
RDMA: Soft-iWARP (for dev/debug purposes)

Testing setup:
Establish communication between two machines, A and B.

A sends a request (IB_POST_SEND / IB_SEND_SIGNALED) to B, with a
message containing a status variable set to 1.
B receives the message, processes it and replies with an RDMA write and
a send (IB_WR_SEND + IB_SEND_SIGNALED). The RDMA write writes a page;
the send only notifies A that the page has arrived, and its
notification message contains a status variable set to 2.

Note that an IB_POST_SEND is only performed once the status change has
been confirmed, so I should never be sending a message whose status is
not 1 or 2.
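
A minimal sketch of the status check described above, written as plain
userspace C rather than kernel code. Every name in it (struct msg, the
MSG_STATUS_* values, msg_status_is_valid, msg_try_post) is hypothetical
and only illustrates the rule that nothing is posted unless the status
is 1 or 2; it is not the actual engine code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status values: 1 marks a request from A, 2 marks the
 * notification B sends after the RDMA write. 0 must never be sent. */
enum msg_status {
    MSG_STATUS_UNSET   = 0,
    MSG_STATUS_REQUEST = 1,
    MSG_STATUS_PAGE_OK = 2,
};

/* Hypothetical message layout; only the status field matters here. */
struct msg {
    uint32_t status;
};

/* True only for the two status values that are allowed on the wire. */
static inline bool msg_status_is_valid(const struct msg *m)
{
    return m->status == MSG_STATUS_REQUEST ||
           m->status == MSG_STATUS_PAGE_OK;
}

/* Stand-in for the real post: refuses (returns -1) any message whose
 * status is not 1 or 2, otherwise counts it as posted and returns 0. */
static inline int msg_try_post(const struct msg *m, unsigned int *posted)
{
    if (!msg_status_is_valid(m))
        return -1;
    (*posted)++;
    return 0;
}
```

With a guard like this on the posting path, a completion that shows a
status of 0 cannot come from a message that had status 0 at post time.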

I have flow control that prevents me from overflowing the CQ.
I also protect my IB_POST_SEND calls with a spinlock (irqsave/restore).
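
The flow control itself is not detailed here. One common way to bound
the number of outstanding signalled sends is a credit counter, sketched
below in userspace C with C11 atomics. The names (struct send_credits,
credits_init, credits_try_take, credits_give_back) are hypothetical, and
the assumption that one credit corresponds to one CQ entry is mine.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical credit pool: one credit per completion the CQ can hold. */
struct send_credits {
    atomic_int available;
};

static inline void credits_init(struct send_credits *c, int cq_depth)
{
    atomic_init(&c->available, cq_depth);
}

/* Take one credit before posting a signalled send. Returns false when
 * no credit is left, i.e. posting now could overflow the CQ. */
static inline bool credits_try_take(struct send_credits *c)
{
    int cur = atomic_load(&c->available);

    while (cur > 0) {
        /* On failure the CAS reloads cur, so the loop re-checks it. */
        if (atomic_compare_exchange_weak(&c->available, &cur, cur - 1))
            return true;
    }
    return false;
}

/* Return one credit from the completion handler, once the completion
 * for a signalled send has been polled from the CQ. */
static inline void credits_give_back(struct send_credits *c)
{
    atomic_fetch_add(&c->available, 1);
}
```

In this scheme a sender that fails to take a credit waits or queues the
message instead of posting it, and the completion handler is the only
place where credits are returned.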


Now here is the strange behaviour:

When I send 100 or 1,000 messages, there is no problem.

When I move toward 100,000 messages and more, I start seeing some
strange messages in my send CQ handler.
I get notifications that a message was sent, but when I check the
status of that message, it is 0.

Note that so far this doesn't cause any disturbance, and the exchange
of the 100k+ messages finishes OK.

However, sometimes it looks like the system is repeatedly trying to
send the message; then the value of the status variable changes and the
message is suddenly pushed over to B, which naturally receives it and
discards it because of the invalid status value.

That would be OK if it didn't also drop, at the same time, the message
it was supposed to send in the first place.

So from time to time a random message gets dropped (only when I push a
high number of messages as fast as possible).


My question is: have any of you experienced a similar issue, and if so,
how did you solve it?

Regards
Benoit

--
" The production of too many useful things results in too many useless people"