On Tue, Nov 12, 2013 at 04:59:19PM -0400, Anuj Kalia wrote:

> Thanks again. So we conclude there is nothing like an atomic cacheline
> read. Then my current design is a dud. But there should be 8 byte
> atomicity, right? I think I can leverage that to get what I want.

64-bit CPUs do have atomic 64-bit stores (for naturally aligned
values), so you can rely on a DMA read seeing only values you've
written and not some combination of old and new bits.
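
A minimal sketch of one way to lean on that 8-byte atomicity: pack the
value and its counter into a single naturally aligned 64-bit word, so
that one store publishes both. The names (slot_t, slot_publish, and so
on) and the 32/32 bit split are assumptions made for illustration;
nothing in this thread prescribes that layout.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assumption: 64-bit atomics are lock-free, i.e. real single stores. */
static_assert(ATOMIC_LLONG_LOCK_FREE == 2,
              "64-bit atomics must be lock-free for this sketch");

/* Hypothetical layout: counter in the high 32 bits, value in the low 32. */
typedef struct {
    _Alignas(8) _Atomic uint64_t word;  /* naturally aligned 8-byte slot */
} slot_t;

static inline uint64_t slot_pack(uint32_t counter, uint32_t value)
{
    return ((uint64_t)counter << 32) | value;
}

static inline uint32_t slot_counter(uint64_t w) { return (uint32_t)(w >> 32); }
static inline uint32_t slot_value(uint64_t w)   { return (uint32_t)w; }

/* Publish value and counter together with a single 64-bit store. */
static inline void slot_publish(slot_t *s, uint32_t counter, uint32_t value)
{
    atomic_store_explicit(&s->word, slot_pack(counter, value),
                          memory_order_release);
}

/* CPU-side read of the whole word; a remote reader would fetch the
 * same 8 bytes and unpack them the same way. */
static inline uint64_t slot_load(slot_t *s)
{
    return atomic_load_explicit(&s->word, memory_order_acquire);
}
```

With this layout there is no question of whether the counter or the
value was read first, because both live in the one word that is
written atomically.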

> This part is interesting (from Jason's reply):
> "If you burst read from the HCA value and counter then the result is
> undefined, you don't know if counter was read before value, or the
> other way around."
 
> Is there a way of knowing the order in which they are read - for
> example, I heard in a talk that there is a left-to-right ordering
> when

So, this I don't know. I don't think anyone has ever had a need to
look into that; it is certainly not defined. What you are asking is
how memory write ordering interacts with a burst read.

> a HCA reads a contiguous buffer. This could be totally architecture
> specific, for example, I just want the answer for Mellanox ConnectX-3
> cards. I think I can check this experimentally, but a definitive
> answer would be great.

The talk you heard about left-to-right ordering was probably in the
context of DMA burst writes and MPI polling.

In this case the DMA would write DDDDDP, and the MPI would poll on
P. Once P is written it assumes that D is visible.

This is undefined in general, but ensured in some cases on Intel and
Mellanox. I'm not sure if D and P have to be in the same cache line,
but you probably need a fence after reading P.
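
The polling side of that DDDDDP pattern could look roughly like the
sketch below: check P, issue an acquire fence, and only then read D.
The msg_t layout, the function names, and the payload length are
hypothetical, and msg_fake_dma_write is just a CPU stand-in for the
device so the reader can be exercised. C11 atomics say nothing formal
about writes made by a DMA engine, so treat this as an illustration of
where the fence goes, not as a guarantee.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define PAYLOAD_LEN 5   /* the "DDDDD" part; the length is arbitrary */

/* Hypothetical message layout: payload first, poll flag last. */
typedef struct {
    uint8_t d[PAYLOAD_LEN];
    _Atomic uint8_t p;
} msg_t;

/* Returns 1 and copies the payload out if P is set, otherwise 0. */
static int msg_try_read(msg_t *m, uint8_t out[PAYLOAD_LEN])
{
    assert(m != NULL && out != NULL);

    if (atomic_load_explicit(&m->p, memory_order_relaxed) == 0)
        return 0;                       /* P not written yet */

    /* Fence after reading P: keeps the loads of D below from being
     * satisfied before the load of P above. */
    atomic_thread_fence(memory_order_acquire);

    memcpy(out, m->d, PAYLOAD_LEN);
    return 1;
}

/* CPU stand-in for the DMA writer: writes D, then P. */
static void msg_fake_dma_write(msg_t *m, const uint8_t in[PAYLOAD_LEN])
{
    memcpy(m->d, in, PAYLOAD_LEN);
    atomic_store_explicit(&m->p, 1, memory_order_release);
}
```

A caller would spin on msg_try_read until it returns 1. Whether D is
really visible once P is depends on the platform behaviour described
above.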

Jason