One could usefully distinguish between a vanilla RFO and an "RFO prefetch".

An RFO prefetch would be when the core sends the RFO before it is ready to 
commit the store to cache (generally, before the store is at the head of 
the store buffer: i.e., next in line). The core might still send the RFO 
early in this case, because your scenario *usually* doesn't happen, and if 
it waited until each store was at the head of the queue before processing 
it, no memory level parallelism would be possible for stores. This RFO 
prefetch could be triggered when the store address (STA) part of the 
store executes: i.e., when its address is calculated, or it could also be 
triggered by some component that looks at the upcoming entries in the store 
buffer and issues RFO prefetches for them.
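
To put rough numbers on the memory level parallelism point, here is a toy 
timing model. It is only a sketch: the 100-cycle RFO latency, the 
one-commit-per-cycle rate and the function names are all made up for 
illustration, and every store is assumed to miss on a different line.

```python
# Toy timing model, not a description of any real microarchitecture.
# Assumptions (hypothetical): every store misses on a distinct line, an RFO
# takes RFO_LATENCY cycles, stores commit in order at one per cycle, and
# store i has its address computed (STA executed) at cycle i.
RFO_LATENCY = 100


def commit_time_without_prefetch(num_stores):
    """RFO is sent only once a store reaches the head of the store buffer."""
    now = 0
    for _ in range(num_stores):
        now += RFO_LATENCY  # wait for the line to arrive in exclusive state
        now += 1            # commit the store
    return now


def commit_time_with_prefetch(num_stores):
    """RFO prefetch is sent as soon as each store's address is known."""
    now = 0
    for i in range(num_stores):
        line_arrives = i + RFO_LATENCY    # RFO prefetch issued at cycle i
        now = max(now, line_arrives) + 1  # stores still commit in order
    return now


print(commit_time_without_prefetch(8))  # 808: the misses are serialized
print(commit_time_with_prefetch(8))     # 108: the misses overlap
```

In this model the eight misses cost about one RFO latency in total when 
they are prefetched, instead of eight.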

In the case of an RFO prefetch, the line could be lost before the core 
is ready to commit the store, as you have described. *Usually* this does not 
happen because *most* lines are not heavily contended (or contended at 
all), but it could. It only causes a performance problem, not a forward 
progress one, because the core can ask for the line again.
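
The "lost line" case can be sketched with a very small ownership model. 
This is an assumption-laden toy, not real MESI: `Line`, `Core`, `rfo` and 
`commit_store` are hypothetical names, there is a single cache line, and 
"ownership" stands in for the Exclusive/Modified states.

```python
class Line:
    """One cache line: who owns it exclusively, and its current value."""

    def __init__(self):
        self.owner = None
        self.value = 0


class Core:
    def __init__(self, name, line):
        self.name = name
        self.line = line
        self.rfo_count = 0  # how many RFOs this core had to send

    def rfo(self):
        """Request ownership; any previous owner loses the line."""
        self.rfo_count += 1
        self.line.owner = self.name

    def commit_store(self, value):
        """Commit the store at the head of the store buffer."""
        if self.line.owner != self.name:
            # The prefetched line was lost in the meantime: ask again.
            self.rfo()
        self.line.value = value


line = Line()
a, b = Core("A", line), Core("B", line)

a.rfo()            # A sends an RFO prefetch, well before it can commit
b.rfo()            # B takes the line away before A's store reaches the head
b.commit_store(2)
a.commit_store(1)  # A notices the line is gone and re-requests it

print(a.rfo_count, line.owner, line.value)
```

Core A pays for a second RFO (the performance problem), but its store 
still commits (no forward progress problem).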

The second type of RFO, what I call "vanilla", would occur when the store 
is at the head of the store queue. In this case, the store can be committed 
as soon as the line is received in the exclusive state, so there is "no 
time" for another core to interrupt the process (in practice, it may not be 
instantaneous, but the core can either temporarily ignore or NACK incoming 
requests for this line from other cores).
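
A similarly rough sketch of the vanilla case, using the NACK option. 
Everything here is hypothetical (the `locked` flag, `request_ownership`, 
`vanilla_store`); it only illustrates the idea that a competing request 
arriving between "line received" and "store committed" gets refused and 
has to be retried.

```python
class Line:
    """One cache line, with a flag marking an in-flight commit."""

    def __init__(self):
        self.owner = None
        self.value = 0
        self.locked = False  # True while the owner is committing a store


def request_ownership(line, requester):
    """Return True if ownership was granted, False if the request was NACKed."""
    if line.locked and line.owner != requester:
        return False
    line.owner = requester
    return True


def vanilla_store(line, core, value, contender=None):
    """RFO sent with the store at the head of the store buffer.

    Returns True if a contender's request was NACKed during the commit.
    """
    if not request_ownership(line, core):
        raise RuntimeError("line is busy; a real core would retry")
    line.locked = True  # line received: hold it until the store commits
    nacked = False
    if contender is not None:
        # A competing RFO arrives inside the commit window.
        nacked = not request_ownership(line, contender)
    line.value = value
    line.locked = False  # commit done: the line can be taken again
    return nacked


line = Line()
was_nacked = vanilla_store(line, "A", 1, contender="B")
retry_ok = request_ownership(line, "B")  # B's retry succeeds afterwards
print(was_nacked, retry_ok, line.owner, line.value)
```

The contender is only delayed until the commit finishes, so it still 
makes progress as well.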

On Monday, November 25, 2019 at 8:49:53 AM UTC-8 Peter Veentjer wrote:

> I have a question about MESI.
>
> My question isn't about atomic operations; but about an ordinary write to 
> the same cacheline done by 2 CPUs.
>
> If a CPU does a write, the write is placed on the store buffer.
>
> Then the CPU will send an invalidation request to the other cores (RFO) 
> for the given cacheline if the cacheline isn't in Exclusive or Modified 
> state, and once acknowledgements from the other CPUs have been received, the 
> write is allowed to move from the store buffer into the L1 cache.
>
> My confusion is about the 'atomic' behavior of requesting ownership till 
> writing the change on the cacheline in the L1 cache. What prevents another 
> CPU directly after the first CPU has requested ownership to do the same? So 
> what prevents another CPU getting lucky and stealing the cacheline after 
> the acknowledgements to the first CPU have been received, but before the 
> first CPU writes to the L1 cache.
>
> I guess that the first CPU will just ignore any competing bus transactions 
> as long as it has not completed the write. There is a ton of information 
> about MESI, but I could not find a lot of sensible information about this 
> behavior.
>
