On Mar 25, 2013, at 12:37 PM, Brian Geffon <[email protected]> wrote:

> That's great news. When you commit the change w/ sync_val_compare_and_swap
> will you reenable 128 bit freelist cas in configure.ac?

Done.

> 
> Brian
> 
> On Mon, Mar 25, 2013 at 12:26 PM, James Peach <[email protected]> wrote:
> 
>> On Mar 25, 2013, at 11:46 AM, John Plevyak <[email protected]> wrote:
>> 
>>> Sure, whatever works.  I did a little looking around, but I couldn't find
>>> any other way to do an atomic read of 128 bits on x86_64.
>> 
>> Yeh, __sync_val_compare_and_swap seems to work ...
>> 
>>> 
>>> john
>>> 
>>> On Mon, Mar 25, 2013 at 11:44 AM, Brian Geffon <[email protected]>
>> wrote:
>>> 
>>>> It turns out that __sync_fetch_and_add(__int128_t*, int) just uses
>>>> cmpxchg16b under the covers anyway...
>>>> 
>>>> int main ( void ) {
>>>> __int128_t src=10, dest=20;
>>>> *(__int128_t*)&(dest) = __sync_fetch_and_add((__int128_t*)&src, 0);
>>>> 
>>>> return 0;
>>>> }
>>>> 
>>>> When building using gcc -mcx16 -S -o test.S test.c, we get the following
>>>> assembly:
>>>> 
>>>> main:
>>>> .LFB0:
>>>>       .cfi_startproc
>>>>       pushq   %rbp
>>>>       .cfi_def_cfa_offset 16
>>>>       .cfi_offset 6, -16
>>>>       movq    %rsp, %rbp
>>>>       .cfi_def_cfa_register 6
>>>>       pushq   %rbx
>>>>       movq    $10, -32(%rbp)
>>>>       movq    $0, -24(%rbp)
>>>>       movq    $20, -48(%rbp)
>>>>       movq    $0, -40(%rbp)
>>>>       leaq    -32(%rbp), %r8
>>>>       movq    (%r8), %rax
>>>>       movq    8(%r8), %rdx
>>>> .L2:
>>>>       movq    %rax, %rsi
>>>>       movq    %rdx, %rdi
>>>>       movq    %rax, %rcx
>>>>       movq    %rdx, %rbx
>>>>       .cfi_offset 3, -24
>>>>       addq    $0, %rcx
>>>>       adcq    $0, %rbx
>>>>       movq    %rcx, %r9
>>>>       movq    %rbx, %rcx
>>>>       movq    %r9, %rbx
>>>>       lock cmpxchg16b (%r8)
>>>>       jne     .L2
>>>>       movq    %rsi, %rax
>>>>       movq    %rdi, %rdx
>>>>       movq    %rax, -48(%rbp)
>>>>       movq    %rdx, -40(%rbp)
>>>>       movl    $0, %eax
>>>>       popq    %rbx
>>>>       leave
>>>>       .cfi_def_cfa 7, 8
>>>>       ret
>>>>       .cfi_endproc
>>>> 
>>>> So instead of __sync_fetch_and_add() it appears we can use
>>>> __sync_val_compare_and_swap() which should have no problems with
>> clang...?
>>>> 
>>>> Brian
>>>> 
>>>> 
>>>> On Mon, Mar 25, 2013 at 8:43 AM, James Peach <[email protected]> wrote:
>>>> 
>>>>> On Mar 24, 2013, at 2:11 PM, Brian Geffon <[email protected]> wrote:
>>>>> 
>>>>>> I'm confused, I reenabled 128bit cas in configure.ac and verified
>> that
>>>>> it
>>>>>> is building with cmpxchg16b and it's running perfectly. The
>>>> test_freelist
>>>>>> regression tests pass with no problems. So what were the issues
>>>> remaining
>>>>>> that required the revert? How can I possibly reproduce them so I can
>>>> get
>>>>>> this back on track?
>>>>> 
>>>>> From my testing, the changes from John and Weijin fixed all the tests.
>>>> The
>>>>> last remaining issue that I have is that John's fix doesn't build with
>>>>> clang. I guess that we can disable this code path with clang though ...
>>>>> 
>>>>>> 
>>>>>> Brian
>>>>>> 
>>>>>> 
>>>>>> On Fri, Mar 22, 2013 at 10:06 AM, James Peach <[email protected]>
>>>> wrote:
>>>>>> 
>>>>>>> On Mar 21, 2013, at 1:54 PM, [email protected] wrote:
>>>>>>> 
>>>>>>>> Updated Branches:
>>>>>>>> refs/heads/master 1a2ebccb1 -> f41323e01
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Fix bug with 128 bit compare and swap.
>>>>>>> 
>>>>>>> Hmm, this doesn't build with clang ... it makes clang generate a call
>>>>> to a
>>>>>>> ___sync_fetch_and_add_1 stub that doesn't exist:
>>>>>>> 
>>>>>>> Undefined symbols for architecture x86_64:
>>>>>>> "___sync_fetch_and_add_1", referenced from:
>>>>>>>    _ink_freelist_new in ink_queue.o
>>>>>>>    _ink_freelist_free in ink_queue.o
>>>>>>>    _ink_atomiclist_pop in ink_queue.o
>>>>>>>    _ink_atomiclist_popall in ink_queue.o
>>>>>>>    _ink_atomiclist_push in ink_queue.o
>>>>>>>    _ink_atomiclist_remove in ink_queue.o
>>>>>>> 
>>>>>>> I have an email out to the clang devs, but no response yet.
>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Project: http://git-wip-us.apache.org/repos/asf/trafficserver/repo
>>>>>>>> Commit:
>>>>>>> http://git-wip-us.apache.org/repos/asf/trafficserver/commit/4a090831
>>>>>>>> Tree:
>>>>> http://git-wip-us.apache.org/repos/asf/trafficserver/tree/4a090831
>>>>>>>> Diff:
>>>>> http://git-wip-us.apache.org/repos/asf/trafficserver/diff/4a090831
>>>>>>>> 
>>>>>>>> Branch: refs/heads/master
>>>>>>>> Commit: 4a0908314acb301f1e9eb691bbcd8d956e86c321
>>>>>>>> Parents: 57ffdf5
>>>>>>>> Author: [email protected] <[email protected]>
>>>>>>>> Authored: Thu Mar 21 10:10:16 2013 -0700
>>>>>>>> Committer: John Plevyak <[email protected]>
>>>>>>>> Committed: Thu Mar 21 10:10:16 2013 -0700
>>>>>>>> 
>>>>>>>> 
>>>> ----------------------------------------------------------------------
>>>>>>>> lib/ts/ink_queue.h |    2 +-
>>>>>>>> 1 files changed, 1 insertions(+), 1 deletions(-)
>>>>>>>> 
>>>> ----------------------------------------------------------------------
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>> http://git-wip-us.apache.org/repos/asf/trafficserver/blob/4a090831/lib/ts/ink_queue.h
>>>>>>>> 
>>>> ----------------------------------------------------------------------
>>>>>>>> diff --git a/lib/ts/ink_queue.h b/lib/ts/ink_queue.h
>>>>>>>> index 6ac9945..20fbf9a 100644
>>>>>>>> --- a/lib/ts/ink_queue.h
>>>>>>>> +++ b/lib/ts/ink_queue.h
>>>>>>>> @@ -72,7 +72,7 @@ extern "C"
>>>>>>>> #endif
>>>>>>>> 
>>>>>>>> #if TS_HAS_128BIT_CAS
>>>>>>>> -#define INK_QUEUE_LD(dst,src) *((__int128_t*)&(dst)) =
>>>>>>> *((__int128_t*)&(src))
>>>>>>>> +#define INK_QUEUE_LD(dst,src) *(__int128_t*)&(dst) =
>>>>>>> __sync_fetch_and_add((__int128_t*)&(src), 0)
>>>>>>>> #else
>>>>>>>> #define INK_QUEUE_LD(dst,src) INK_QUEUE_LD64(dst,src)
>>>>>>>> #endif
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>> 
>>>>> 
>>>> 
>> 
>> 

Reply via email to