On Monday 20 October 2008, Roland Dreier wrote:
> > Some architectures support weak ordering in which case better
> > performance is possible. IB registered memory used for data can be
> > weakly ordered because the completion queues' buffers are
> > registered as strongly ordered. This will result in flushing all data
> > related outstanding DMA requests by the HCA when a completion is DMAed
> > to a completion queue buffer.
>
> This would break the Mellanox HW's guarantee of writing the last byte of
> an RDMA last, right? So on platforms where this has an effect (only
> Cell at the moment) some applications could be subtly broken?

Yes, that is true. In our testing with openmpi, we had to disable eager
RDMA. However, without this patch, RDMA InfiniBand performance on some of
our machines sucks so bad that we would not want to advertise support
for it. I would really love to see this patch make it into OFED-1.4 and
Linux-2.6.28.

We (IBM and Mellanox) have discussed adding a module parameter for
whether or not this should be enabled at runtime, and possibly extending
the ibverbs interface to allow the application to choose. The question
that remains is what the default should be.

AFAIU, the IB specification does not give any such guarantees about the
ordering within RDMA transfers, right? If that is so, applications
relying on the ordering would be broken to start with.

Also, the existing code evidently does not use DMA_ATTR_WRITE_BARRIER
for the allocation where we would use DMA_ATTR_WEAK_ORDERING. Because of
what I can only explain as lack of coordination, DMA_ATTR_WRITE_BARRIER
seems to be an almost exact opposite of DMA_ATTR_WEAK_ORDERING. This
means that on platforms that use the former (SGI Altix so far), not
passing any DMA attribute would break these applications in exactly the
same way that weak ordering on Cell is breaking them.

	Arnd <><