It turns out that __sync_fetch_and_add(__int128_t*, int) just uses
cmpxchg16b under the covers anyway...
int main ( void ) {
__int128_t src=10, dest=20;
*(__int128_t*)&(dest) = __sync_fetch_and_add((__int128_t*)&src, 0);
return 0;
}
When building using gcc -mcx16 -S -o test.S test.c, we get the following
assembly:
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
pushq %rbx
movq $10, -32(%rbp)
movq $0, -24(%rbp)
movq $20, -48(%rbp)
movq $0, -40(%rbp)
leaq -32(%rbp), %r8
movq (%r8), %rax
movq 8(%r8), %rdx
.L2:
movq %rax, %rsi
movq %rdx, %rdi
movq %rax, %rcx
movq %rdx, %rbx
.cfi_offset 3, -24
addq $0, %rcx
adcq $0, %rbx
movq %rcx, %r9
movq %rbx, %rcx
movq %r9, %rbx
lock cmpxchg16b (%r8)
jne .L2
movq %rsi, %rax
movq %rdi, %rdx
movq %rax, -48(%rbp)
movq %rdx, -40(%rbp)
movl $0, %eax
popq %rbx
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
So instead of __sync_fetch_and_add() it appears we can use
__sync_val_compare_and_swap() which should have no problems with clang...?
Brian
On Mon, Mar 25, 2013 at 8:43 AM, James Peach <[email protected]> wrote:
> On Mar 24, 2013, at 2:11 PM, Brian Geffon <[email protected]> wrote:
>
> > I'm confused, I reenabled 128bit cas in configure.ac and verified that
> it
> > is building with cmpxchg16b and it's running perfectly. The test_freelist
> > regression tests pass with no problems. So what were the issues remaining
> > that required the revert? How can I possibly reproduce them so I can get
> > this back on track?
>
> From my testing, the changes from John and Weijin fixed all the tests. The
> last remaining issue that I have is that John's fix doesn't build with
> clang. I guess that we can disable this code path with clang though ...
>
> >
> > Brian
> >
> >
> > On Fri, Mar 22, 2013 at 10:06 AM, James Peach <[email protected]> wrote:
> >
> >> On Mar 21, 2013, at 1:54 PM, [email protected] wrote:
> >>
> >>> Updated Branches:
> >>> refs/heads/master 1a2ebccb1 -> f41323e01
> >>>
> >>>
> >>> Fix bug with 128 bit compare and swap.
> >>
> >> Hmm, this doesn't build with clang ... it makes clang generate a call
> to a
> >> ___sync_fetch_and_add_1 stub that doesn't exist:
> >>
> >> Undefined symbols for architecture x86_64:
> >> "___sync_fetch_and_add_1", referenced from:
> >> _ink_freelist_new in ink_queue.o
> >> _ink_freelist_free in ink_queue.o
> >> _ink_atomiclist_pop in ink_queue.o
> >> _ink_atomiclist_popall in ink_queue.o
> >> _ink_atomiclist_push in ink_queue.o
> >> _ink_atomiclist_remove in ink_queue.o
> >>
> >> I have an email out to the clang devs, but no response yet.
> >>
> >>>
> >>>
> >>> Project: http://git-wip-us.apache.org/repos/asf/trafficserver/repo
> >>> Commit:
> >> http://git-wip-us.apache.org/repos/asf/trafficserver/commit/4a090831
> >>> Tree:
> http://git-wip-us.apache.org/repos/asf/trafficserver/tree/4a090831
> >>> Diff:
> http://git-wip-us.apache.org/repos/asf/trafficserver/diff/4a090831
> >>>
> >>> Branch: refs/heads/master
> >>> Commit: 4a0908314acb301f1e9eb691bbcd8d956e86c321
> >>> Parents: 57ffdf5
> >>> Author: [email protected] <[email protected]>
> >>> Authored: Thu Mar 21 10:10:16 2013 -0700
> >>> Committer: John Plevyak <[email protected]>
> >>> Committed: Thu Mar 21 10:10:16 2013 -0700
> >>>
> >>> ----------------------------------------------------------------------
> >>> lib/ts/ink_queue.h | 2 +-
> >>> 1 files changed, 1 insertions(+), 1 deletions(-)
> >>> ----------------------------------------------------------------------
> >>>
> >>>
> >>>
> >>
> http://git-wip-us.apache.org/repos/asf/trafficserver/blob/4a090831/lib/ts/ink_queue.h
> >>> ----------------------------------------------------------------------
> >>> diff --git a/lib/ts/ink_queue.h b/lib/ts/ink_queue.h
> >>> index 6ac9945..20fbf9a 100644
> >>> --- a/lib/ts/ink_queue.h
> >>> +++ b/lib/ts/ink_queue.h
> >>> @@ -72,7 +72,7 @@ extern "C"
> >>> #endif
> >>>
> >>> #if TS_HAS_128BIT_CAS
> >>> -#define INK_QUEUE_LD(dst,src) *((__int128_t*)&(dst)) =
> >> *((__int128_t*)&(src))
> >>> +#define INK_QUEUE_LD(dst,src) *(__int128_t*)&(dst) =
> >> __sync_fetch_and_add((__int128_t*)&(src), 0)
> >>> #else
> >>> #define INK_QUEUE_LD(dst,src) INK_QUEUE_LD64(dst,src)
> >>> #endif
> >>>
> >>
> >>
>
>