On Sat, Apr 19, 2014 at 7:25 PM, Christoph Lameter <[email protected]> wrote:
> On Fri, 18 Apr 2014, Nicolas Carlier wrote:
>
>> > If the receiving QP does not have buffers available then the HCA will
>> > silently drop UD packets. This is something that tripped us up initially. So
>> > it's lossless only from HCA to HCA, not QP to QP.
>>
>> Christoph, I'm very interested in this. Do you have any documentation
>> related to this mechanism? I'm troubleshooting an InfiniBand fabric where
>> I use IPoIB heavily to collect multicast traffic. I see a lot of gaps in the
>> application, but I can't map any of these gaps to drop counters or dropwatch
>> output.
>
> There was a promise that MOFED 2.1 or 2.2 should support a counter for QP
> overrun but I have never seen that work. The counter is there but does not
> increment.
>

Interesting. I don't see this counter in the fw/sw release notes or on the
systems. What is the name of this counter? On which HCA do you see it?

>> I'm sure the drops don't occur upstream, because I'm able to dump traffic
>> before the InfiniBand fabric without observing them.
>> I'm also able to dump traffic on another server inside the InfiniBand fabric,
>> connected to the same leaf switch, without observing them.
>
> Well, looks like silent drops to me. One way to check is to verify the
> sequence ids in each multicast packet.
>

That is how I discovered that I was losing packets, but it requires dissecting
the packets. I had hoped to use the PSN instead, but that turned out to be a
dead end.

> We finally ended up looking at all the counters in the fabric, and if
> we could not account for it we assumed it was QP overrun.
>

Yes. Do you have any clue how to interpret XmitWait, and how a port is notified
to wait when we use UD mode?
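For reference, the sequence-id check discussed above can be sketched roughly as follows. This is only a minimal illustration, not what either of us actually runs: it assumes the application-level sequence ids have already been extracted from the multicast payloads into a list, that they increase by one per packet, that they wrap at 2**32, and that packets arrive in order. The function name `find_gaps` and the `modulus` parameter are made up for this example.

```python
def find_gaps(seq_ids, modulus=2**32):
    """Return a list of (expected, received, missing) tuples.

    Assumes seq_ids are in arrival order and normally increase by one,
    wrapping around at `modulus` (an assumption, adjust to the feed).
    """
    gaps = []
    prev = None
    for seq in seq_ids:
        if prev is not None:
            expected = (prev + 1) % modulus
            if seq != expected:
                # Number of ids skipped between the previous packet and this one.
                missing = (seq - expected) % modulus
                gaps.append((expected, seq, missing))
        prev = seq
    return gaps


if __name__ == "__main__":
    # Ids 4, 5 and 6 never arrived in this made-up capture.
    for expected, received, missing in find_gaps([1, 2, 3, 7, 8]):
        print(f"expected {expected}, got {received}: {missing} packet(s) missing")
```

Note that a reordered or duplicated packet would show up here as a very large "missing" count, so this only gives a sensible answer on a feed that is delivered in order.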
