On Mar 25, 2013, at 12:37 PM, Brian Geffon <[email protected]> wrote:
> That's great news. When you commit the change w/ sync_val_compare_and_swap > will you reenable 128 bit freelist cas in configure.ac? Done. > > Brian > > On Mon, Mar 25, 2013 at 12:26 PM, James Peach <[email protected]> wrote: > >> On Mar 25, 2013, at 11:46 AM, John Plevyak <[email protected]> wrote: >> >>> Sure, whatever works. I did a little looking around, but I couldn't find >>> any other way to do an atomic read of 128 bits on x86_64. >> >> Yeh, __sync_val_compare_and_swap seems to work ... >> >>> >>> john >>> >>> On Mon, Mar 25, 2013 at 11:44 AM, Brian Geffon <[email protected]> >> wrote: >>> >>>> It turns out that __sync_fetch_and_add(__int128_t*, int) just uses >>>> cmpxchg16b under the covers anyway... >>>> >>>> int main ( void ) { >>>> __int128_t src=10, dest=20; >>>> *(__int128_t*)&(dest) = __sync_fetch_and_add((__int128_t*)&src, 0); >>>> >>>> return 0; >>>> } >>>> >>>> When building using gcc -mcx16 -S -o test.S test.c, we get the following >>>> assembly: >>>> >>>> main: >>>> .LFB0: >>>> .cfi_startproc >>>> pushq %rbp >>>> .cfi_def_cfa_offset 16 >>>> .cfi_offset 6, -16 >>>> movq %rsp, %rbp >>>> .cfi_def_cfa_register 6 >>>> pushq %rbx >>>> movq $10, -32(%rbp) >>>> movq $0, -24(%rbp) >>>> movq $20, -48(%rbp) >>>> movq $0, -40(%rbp) >>>> leaq -32(%rbp), %r8 >>>> movq (%r8), %rax >>>> movq 8(%r8), %rdx >>>> .L2: >>>> movq %rax, %rsi >>>> movq %rdx, %rdi >>>> movq %rax, %rcx >>>> movq %rdx, %rbx >>>> .cfi_offset 3, -24 >>>> addq $0, %rcx >>>> adcq $0, %rbx >>>> movq %rcx, %r9 >>>> movq %rbx, %rcx >>>> movq %r9, %rbx >>>> lock cmpxchg16b (%r8) >>>> jne .L2 >>>> movq %rsi, %rax >>>> movq %rdi, %rdx >>>> movq %rax, -48(%rbp) >>>> movq %rdx, -40(%rbp) >>>> movl $0, %eax >>>> popq %rbx >>>> leave >>>> .cfi_def_cfa 7, 8 >>>> ret >>>> .cfi_endproc >>>> >>>> So instead of __sync_fetch_and_add() it appears we can use >>>> __sync_val_compare_and_swap() which should have no problems with >> clang...? >>>> >>>> Brian >>>> >>>> >>>> On Mon, Mar 25, 2013 at 8:43 AM, James Peach <[email protected]> wrote: >>>> >>>>> On Mar 24, 2013, at 2:11 PM, Brian Geffon <[email protected]> wrote: >>>>> >>>>>> I'm confused, I reenabled 128bit cas in configure.ac and verified >> that >>>>> it >>>>>> is building with cmpxchg16b and it's running perfectly. The >>>> test_freelist >>>>>> regression tests pass with no problems. So what were the issues >>>> remaining >>>>>> that required the revert? How can I possibly reproduce them so I can >>>> get >>>>>> this back on track? >>>>> >>>>> From my testing, the changes from John and Weijin fixed all the tests. >>>> The >>>>> last remaining issue that I have is that John's fix doesn't build with >>>>> clang. I guess that we can disable this code path with clang though ... >>>>> >>>>>> >>>>>> Brian >>>>>> >>>>>> >>>>>> On Fri, Mar 22, 2013 at 10:06 AM, James Peach <[email protected]> >>>> wrote: >>>>>> >>>>>>> On Mar 21, 2013, at 1:54 PM, [email protected] wrote: >>>>>>> >>>>>>>> Updated Branches: >>>>>>>> refs/heads/master 1a2ebccb1 -> f41323e01 >>>>>>>> >>>>>>>> >>>>>>>> Fix bug with 128 bit compare and swap. >>>>>>> >>>>>>> Hmm, this doesn't build with clang ... it makes clang generate a call >>>>> to a >>>>>>> ___sync_fetch_and_add_1 stub that doesn't exist: >>>>>>> >>>>>>> Undefined symbols for architecture x86_64: >>>>>>> "___sync_fetch_and_add_1", referenced from: >>>>>>> _ink_freelist_new in ink_queue.o >>>>>>> _ink_freelist_free in ink_queue.o >>>>>>> _ink_atomiclist_pop in ink_queue.o >>>>>>> _ink_atomiclist_popall in ink_queue.o >>>>>>> _ink_atomiclist_push in ink_queue.o >>>>>>> _ink_atomiclist_remove in ink_queue.o >>>>>>> >>>>>>> I have an email out to the clang devs, but no response yet. >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Project: http://git-wip-us.apache.org/repos/asf/trafficserver/repo >>>>>>>> Commit: >>>>>>> http://git-wip-us.apache.org/repos/asf/trafficserver/commit/4a090831 >>>>>>>> Tree: >>>>> http://git-wip-us.apache.org/repos/asf/trafficserver/tree/4a090831 >>>>>>>> Diff: >>>>> http://git-wip-us.apache.org/repos/asf/trafficserver/diff/4a090831 >>>>>>>> >>>>>>>> Branch: refs/heads/master >>>>>>>> Commit: 4a0908314acb301f1e9eb691bbcd8d956e86c321 >>>>>>>> Parents: 57ffdf5 >>>>>>>> Author: [email protected] <[email protected]> >>>>>>>> Authored: Thu Mar 21 10:10:16 2013 -0700 >>>>>>>> Committer: John Plevyak <[email protected]> >>>>>>>> Committed: Thu Mar 21 10:10:16 2013 -0700 >>>>>>>> >>>>>>>> >>>> ---------------------------------------------------------------------- >>>>>>>> lib/ts/ink_queue.h | 2 +- >>>>>>>> 1 files changed, 1 insertions(+), 1 deletions(-) >>>>>>>> >>>> ---------------------------------------------------------------------- >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>> >>>> >> http://git-wip-us.apache.org/repos/asf/trafficserver/blob/4a090831/lib/ts/ink_queue.h >>>>>>>> >>>> ---------------------------------------------------------------------- >>>>>>>> diff --git a/lib/ts/ink_queue.h b/lib/ts/ink_queue.h >>>>>>>> index 6ac9945..20fbf9a 100644 >>>>>>>> --- a/lib/ts/ink_queue.h >>>>>>>> +++ b/lib/ts/ink_queue.h >>>>>>>> @@ -72,7 +72,7 @@ extern "C" >>>>>>>> #endif >>>>>>>> >>>>>>>> #if TS_HAS_128BIT_CAS >>>>>>>> -#define INK_QUEUE_LD(dst,src) *((__int128_t*)&(dst)) = >>>>>>> *((__int128_t*)&(src)) >>>>>>>> +#define INK_QUEUE_LD(dst,src) *(__int128_t*)&(dst) = >>>>>>> __sync_fetch_and_add((__int128_t*)&(src), 0) >>>>>>>> #else >>>>>>>> #define INK_QUEUE_LD(dst,src) INK_QUEUE_LD64(dst,src) >>>>>>>> #endif >>>>>>>> >>>>>>> >>>>>>> >>>>> >>>>> >>>> >> >>
