Replacing with an actually meaningful example (sorry, it is Sunday after all):

Initial:
[a1] = [a2] = 0


Core1:
st.rel  [a1] = 0x1
ld.acq  r1 = [a2]
cmp     .eq     p1 = r1, zero
p1              <exclusive region>
st.rel  [a1] = zero


Core2:
st.rel  [a2] = 0x1
ld.acq  r1 = [a1]
cmp     .eq     p2 = r1, zero
p2              <exclusive region>
st.rel  [a2] = zero



This is the typical try-acquire exclusive region as per Peterson's algorithm,
compiled with an ISO/ECMA CLI C compiler.
p1 (on core 1) = 1 and p2 (on core 2) = 1 is a possible combination on ia64,
because each ld.acq may be performed before the preceding st.rel becomes
visible to the other core. Both cores can then execute the critical region
concurrently.
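
For illustration only, here is a rough C++11 sketch of the same try-acquire
pattern. The names try_enter_relacq, try_enter_fenced and leave are made up
for this example; a1 and a2 mirror [a1] and [a2] in the listing. The first
variant is what the st.rel / ld.acq sequence expresses. The second adds a
sequentially consistent fence between the store and the load, which I would
expect to correspond to an mf on ia64 (an assumption, I have not checked the
generated code).

```cpp
#include <atomic>

// a1 and a2 stand in for [a1] and [a2]; both start at 0.
std::atomic<int> a1{0}, a2{0};

// Broken variant: a release store followed by an acquire load.
// Nothing orders the store before the load, so two cores calling this
// with swapped arguments may both read 0 and both return true.
bool try_enter_relacq(std::atomic<int>& mine, std::atomic<int>& other) {
    mine.store(1, std::memory_order_release);
    return other.load(std::memory_order_acquire) == 0;
}

// Fenced variant: the seq_cst fence keeps the load from being performed
// before the store is visible, so at most one of two racing callers
// can read 0 from the other's flag.
bool try_enter_fenced(std::atomic<int>& mine, std::atomic<int>& other) {
    mine.store(1, std::memory_order_release);
    std::atomic_thread_fence(std::memory_order_seq_cst);
    return other.load(std::memory_order_acquire) == 0;
}

// As in the listing, the flag is cleared afterwards whether or not
// the attempt succeeded.
void leave(std::atomic<int>& mine) {
    mine.store(0, std::memory_order_release);
}
```

Core 1 would call try_enter_fenced(a1, a2) and core 2 would call
try_enter_fenced(a2, a1), each followed by leave() on its own flag.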

PT


On Aug 29, 2010, at 11:45 AM, Pedro Miguel Sequeira de Justo Teixeira wrote:

> 
> 
> Acquire operations are allowed to occur before Release operations. There is 
> nothing preventing that from happening. If this code's synchronization safety 
> (I am not looking at the code) is based on st.rel being a full fence, then it 
> is wrong.
> 
> 
> Initial [addr1] = [addr2] = 0x0
> 
> Core1:
> st            [addr1] = 0x5A
> st.rel        [addr2] = 0xA5
> 
> Core2:
> ld.acq        r3 = [addr2]
> ld            r4 = [addr1]
> 
> 
> It is possible to have this resulting in r3 = 0 and r4 = 0x5A.
> 
> This is why Dekker's/Peterson's doesn't work when "volatile" is simply 
> implemented by st.rel and ld.acq.
> 
> PT
> 
> 
> On Aug 29, 2010, at 5:34 AM, Petr Tesarik wrote:
> 
>> On Saturday 28 of August 2010 00:30:11 you wrote:
>>> Sorry to barge in but... what is preventing fetchadd4.acq from reaching the
>>> value present before st4.rel?
>> 
>> First, I'm no expert on ia64 low-level detail, such as the formal 
>> specification of the memory ordering. So I don't know. ;)
>> 
>> Second, there is no st4.rel, there is only st2.rel on the upper half of the 
>> double-word. The main problem here is that a subsequent ld4.acq still sees 
>> the unincremented value.
>> 
>> Petr Tesarik
>> 
>>> On Fri, Aug 27, 2010 at 3:13 PM, Petr Tesarik <[email protected]> wrote:
>>>> On Friday 27 of August 2010 23:11:55 Luck, Tony wrote:
>>>>>> One more idea. The wrap-around case is the only one when the high
>>>>>> word is modified. This is in fact the only case when the fetchadd.acq
>>>>>> competes with the st2.rel about the actual contents of that location.
>>>>>> I don't know if it matters...
>>>>> 
>>>>> I pondered that for a while - but I have difficulty believing that
>>>>> fetchadd looks at which bits changed and only writes back the bytes
>>>>> that did.
>>>> 
>>>> OTOH the counter is only 15-bit, so it also wraps around at 0xfffe7fff,
>>>> but I have never seen it fail there. It always fails after the
>>>> wrap-around from 0xfffeffff.
>>>> 
>>>> Petr Tesarik
> 

