On Sat, Apr 19, 2014 at 7:25 PM, Christoph Lameter <[email protected]> wrote:
> On Fri, 18 Apr 2014, Nicolas Carlier wrote:
>
>> > If the receiving QP does not have buffers available then the HCA will
>> > silently drop UD packets. This is something that tripped us up initially. So
>> > it's lossless only from HCA to HCA, not QP to QP.
>> >
>> Christoph, I'm very interested in this. Do you have any documentation
>> related to this mechanism? I'm troubleshooting an InfiniBand fabric where
>> I rely heavily on IPoIB to collect multicast traffic. I see a lot of gaps in
>> the application, but I can't map any of these gaps to drop counters or
>> dropwatch output.
>
> There was a promise that MOFED 2.1 or 2.2 should support a counter for QP
> overrun but I have never seen that work. The counter is there but does not
> increment.
>
Interesting. I don't see this counter in the fw/sw release notes or on the
systems. What is the name of this counter? On which HCA do you see it?

>> I'm sure drops don't occur upstream because I'm able to dump traffic before
>> the Infiniband Fabric without observing them.
>> I'm also able to dump traffic on another server inside the Infiniband Fabric
>> connected to the same leaf switch without observing them.
>
> Well looks like silent drops to me. One way to check is to verify the
> sequence ids in each multicast packet.
>
That is how I discovered that I was losing packets, but it requires
dissecting each packet. I had hoped to use the PSN instead, but that turned
out to be a dead end.
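
For reference, the sequence check itself is simple once the ids are out of
the packets. A minimal sketch, assuming each multicast payload carries an
integer sequence id that increases by one per packet and that the ids have
already been extracted (the extraction is application specific and not
shown; the function name find_gaps is only illustrative):

```python
def find_gaps(seq_ids):
    """Return inclusive (first_missing, last_missing) ranges in a stream.

    Assumes ids increase by one per packet. Duplicates and late
    (reordered) packets are ignored rather than reported as loss.
    """
    gaps = []
    expected = None  # next id we expect to see
    for seq in seq_ids:
        if expected is not None and seq > expected:
            # The stream jumped ahead: ids expected..seq-1 never arrived.
            gaps.append((expected, seq - 1))
        if expected is None or seq >= expected:
            expected = seq + 1
    return gaps


if __name__ == "__main__":
    print(find_gaps([1, 2, 3, 7, 8, 10]))
```

This does not handle sequence id wrap-around, and a late packet that fills
an earlier gap is still reported as missing.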

> We finally ended up with looking at all the counters in the fabric and if
> we could not account for it we assumed it was QP overrun.
>

Yes. Do you have any idea how to interpret XmitWait, and how a port is
notified to wait when we use UD mode?
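
The accounting approach quoted above (compare the observed loss against
every drop counter and attribute the remainder to QP overrun) could be
sketched roughly as below. This is only an illustration: the counter
snapshots are plain dicts collected by some other means, the counter names
in the example are placeholders for whatever drop counters are gathered
across the fabric, and counter wrap is ignored.

```python
def unaccounted_loss(before, after, observed_missing, drop_counters):
    """Return how many lost packets no drop counter explains.

    before/after: dicts mapping counter name -> value, taken at the start
    and end of the measurement window.
    observed_missing: packets the application saw missing in that window.
    drop_counters: names of the counters that count dropped packets.
    """
    counted = sum(after[name] - before[name] for name in drop_counters)
    # Anything the counters do not cover is a candidate for silent drops.
    return max(observed_missing - counted, 0)


if __name__ == "__main__":
    # Hypothetical snapshots; the values are made up for illustration.
    before = {"port_rcv_errors": 10, "port_xmit_discards": 4}
    after = {"port_rcv_errors": 12, "port_xmit_discards": 4}
    print(unaccounted_loss(before, after, 50, list(before)))
```

A non-zero result does not prove QP overrun; it only says the counters that
were checked do not explain the gaps.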

