That's great news. When you commit the change w/ sync_val_compare_and_swap
will you reenable 128 bit freelist cas in configure.ac?

Brian

On Mon, Mar 25, 2013 at 12:26 PM, James Peach <[email protected]> wrote:

> On Mar 25, 2013, at 11:46 AM, John Plevyak <[email protected]> wrote:
>
> > Sure, whatever works.  I did a little looking around, but I couldn't find
> > any other way to do an atomic read of 128 bits on x86_64.
>
> Yeh, __sync_val_compare_and_swap seems to work ...
>
> >
> > john
> >
> > On Mon, Mar 25, 2013 at 11:44 AM, Brian Geffon <[email protected]>
> wrote:
> >
> >> It turns out that __sync_fetch_and_add(__int128_t*, int) just uses
> >> cmpxchg16b under the covers anyway...
> >>
> >> int main ( void ) {
> >> __int128_t src=10, dest=20;
> >> *(__int128_t*)&(dest) = __sync_fetch_and_add((__int128_t*)&src, 0);
> >>
> >> return 0;
> >> }
> >>
> >> When building using gcc -mcx16 -S -o test.S test.c, we get the following
> >> assembly:
> >>
> >> main:
> >> .LFB0:
> >>        .cfi_startproc
> >>        pushq   %rbp
> >>        .cfi_def_cfa_offset 16
> >>        .cfi_offset 6, -16
> >>        movq    %rsp, %rbp
> >>        .cfi_def_cfa_register 6
> >>        pushq   %rbx
> >>        movq    $10, -32(%rbp)
> >>        movq    $0, -24(%rbp)
> >>        movq    $20, -48(%rbp)
> >>        movq    $0, -40(%rbp)
> >>        leaq    -32(%rbp), %r8
> >>        movq    (%r8), %rax
> >>        movq    8(%r8), %rdx
> >> .L2:
> >>        movq    %rax, %rsi
> >>        movq    %rdx, %rdi
> >>        movq    %rax, %rcx
> >>        movq    %rdx, %rbx
> >>        .cfi_offset 3, -24
> >>        addq    $0, %rcx
> >>        adcq    $0, %rbx
> >>        movq    %rcx, %r9
> >>        movq    %rbx, %rcx
> >>        movq    %r9, %rbx
> >>        lock cmpxchg16b (%r8)
> >>        jne     .L2
> >>        movq    %rsi, %rax
> >>        movq    %rdi, %rdx
> >>        movq    %rax, -48(%rbp)
> >>        movq    %rdx, -40(%rbp)
> >>        movl    $0, %eax
> >>        popq    %rbx
> >>        leave
> >>        .cfi_def_cfa 7, 8
> >>        ret
> >>        .cfi_endproc
> >>
> >> So instead of __sync_fetch_and_add() it appears we can use
> >> __sync_val_compare_and_swap() which should have no problems with
> clang...?
> >>
> >> Brian
> >>
> >>
> >> On Mon, Mar 25, 2013 at 8:43 AM, James Peach <[email protected]> wrote:
> >>
> >>> On Mar 24, 2013, at 2:11 PM, Brian Geffon <[email protected]> wrote:
> >>>
> >>>> I'm confused, I reenabled 128bit cas in configure.ac and verified
> that
> >>> it
> >>>> is building with cmpxchg16b and it's running perfectly. The
> >> test_freelist
> >>>> regression tests pass with no problems. So what were the issues
> >> remaining
> >>>> that required the revert? How can I possibly reproduce them so I can
> >> get
> >>>> this back on track?
> >>>
> >>> From my testing, the changes from John and Weijin fixed all the tests.
> >> The
> >>> last remaining issue that I have is that John's fix doesn't build with
> >>> clang. I guess that we can disable this code path with clang though ...
> >>>
> >>>>
> >>>> Brian
> >>>>
> >>>>
> >>>> On Fri, Mar 22, 2013 at 10:06 AM, James Peach <[email protected]>
> >> wrote:
> >>>>
> >>>>> On Mar 21, 2013, at 1:54 PM, [email protected] wrote:
> >>>>>
> >>>>>> Updated Branches:
> >>>>>> refs/heads/master 1a2ebccb1 -> f41323e01
> >>>>>>
> >>>>>>
> >>>>>> Fix bug with 128 bit compare and swap.
> >>>>>
> >>>>> Hmm, this doesn't build with clang ... it makes clang generate a call
> >>> to a
> >>>>> ___sync_fetch_and_add_1 stub that doesn't exist:
> >>>>>
> >>>>> Undefined symbols for architecture x86_64:
> >>>>> "___sync_fetch_and_add_1", referenced from:
> >>>>>     _ink_freelist_new in ink_queue.o
> >>>>>     _ink_freelist_free in ink_queue.o
> >>>>>     _ink_atomiclist_pop in ink_queue.o
> >>>>>     _ink_atomiclist_popall in ink_queue.o
> >>>>>     _ink_atomiclist_push in ink_queue.o
> >>>>>     _ink_atomiclist_remove in ink_queue.o
> >>>>>
> >>>>> I have an email out to the clang devs, but no response yet.
> >>>>>
> >>>>>>
> >>>>>>
> >>>>>> Project: http://git-wip-us.apache.org/repos/asf/trafficserver/repo
> >>>>>> Commit:
> >>>>> http://git-wip-us.apache.org/repos/asf/trafficserver/commit/4a090831
> >>>>>> Tree:
> >>> http://git-wip-us.apache.org/repos/asf/trafficserver/tree/4a090831
> >>>>>> Diff:
> >>> http://git-wip-us.apache.org/repos/asf/trafficserver/diff/4a090831
> >>>>>>
> >>>>>> Branch: refs/heads/master
> >>>>>> Commit: 4a0908314acb301f1e9eb691bbcd8d956e86c321
> >>>>>> Parents: 57ffdf5
> >>>>>> Author: [email protected] <[email protected]>
> >>>>>> Authored: Thu Mar 21 10:10:16 2013 -0700
> >>>>>> Committer: John Plevyak <[email protected]>
> >>>>>> Committed: Thu Mar 21 10:10:16 2013 -0700
> >>>>>>
> >>>>>>
> >> ----------------------------------------------------------------------
> >>>>>> lib/ts/ink_queue.h |    2 +-
> >>>>>> 1 files changed, 1 insertions(+), 1 deletions(-)
> >>>>>>
> >> ----------------------------------------------------------------------
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>
> >>
> http://git-wip-us.apache.org/repos/asf/trafficserver/blob/4a090831/lib/ts/ink_queue.h
> >>>>>>
> >> ----------------------------------------------------------------------
> >>>>>> diff --git a/lib/ts/ink_queue.h b/lib/ts/ink_queue.h
> >>>>>> index 6ac9945..20fbf9a 100644
> >>>>>> --- a/lib/ts/ink_queue.h
> >>>>>> +++ b/lib/ts/ink_queue.h
> >>>>>> @@ -72,7 +72,7 @@ extern "C"
> >>>>>> #endif
> >>>>>>
> >>>>>> #if TS_HAS_128BIT_CAS
> >>>>>> -#define INK_QUEUE_LD(dst,src) *((__int128_t*)&(dst)) =
> >>>>> *((__int128_t*)&(src))
> >>>>>> +#define INK_QUEUE_LD(dst,src) *(__int128_t*)&(dst) =
> >>>>> __sync_fetch_and_add((__int128_t*)&(src), 0)
> >>>>>> #else
> >>>>>> #define INK_QUEUE_LD(dst,src) INK_QUEUE_LD64(dst,src)
> >>>>>> #endif
> >>>>>>
> >>>>>
> >>>>>
> >>>
> >>>
> >>
>
>

Reply via email to