I was hoping to see the actual inlined assembly for the code. Here is what gcc output:
LFB3:
        pushq   %rbp
LCFI0:
        movq    %rsp, %rbp
LCFI1:
        movl    $0, -4(%rbp)
        leaq    -4(%rbp), %rcx
        .align 4,0x90
L2:
        movl    -4(%rbp), %eax
        leal    1(%rax), %edx
        lock;cmpxchgl %edx,(%rcx)
        sete    %al
        testb   %al, %al
        je      L2
        movl    %edx, %eax
        leave
        ret

As you can see, there is no explicit call; opal_atomic_cmpset_32 really is inlined. I think the problem is that you didn't specify the -O3 flag on your command line.

OK, now that the assembly code is here, I can tell you what I was looking for. The PGI compiler generated two warnings: one about oldval being initialized but not used, and a second one about the cc being ignored. However, if we suppose that the assembly code generated by PGI is correct, then there are two things we should see in the assembly output:

1. Initialization of %eax shortly before the cmpxchgl, but inside the internal loop (in my example, two lines before). **This is the place where the oldval is supposed to be used**

2. An internal loop exit condition based on the condition-code register (the sete instruction in my code). **This is the place where the cc is important**

  george.

On Jun 8, 2010, at 12:28, Jeff Squyres wrote:

> What exactly do you need?  Your first mail said:
>
>>>> Can you send the assembly instructions generated by the PGI compiler for
>>>> the following code:
>>>>
>>>> int32_t oldval;
>>>>
>>>> do {
>>>>     oldval = *addr;
>>>> } while (0 == opal_atomic_cmpset_32(addr, oldval, oldval + delta));
>>>> return (oldval + delta);
>
>
> Which is what I sent...?
>
>
> On Jun 8, 2010, at 8:22 AM, George Bosilca wrote:
>
>> The inline was ignored, and the code for the opal_atomic_cmpset_32 is not in
>> there ...
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
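
For reference, here is a minimal C sketch of the loop discussed above, written with GCC-style inline assembly. It is not the actual Open MPI source: sketch_cmpset_32 and sketch_atomic_add_32 are hypothetical names, and the constraint list is an assumption about how such a wrapper is typically written for x86. The "+a" operand is what forces oldval into %eax before the cmpxchgl, and the sete together with the "cc" clobber is where the condition code matters. On non-x86 targets the sketch falls back to the GCC builtin __sync_bool_compare_and_swap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for opal_atomic_cmpset_32 (not the real source).
 * Returns 1 if *addr held oldval and was replaced by newval, 0 otherwise. */
static inline int sketch_cmpset_32(volatile int32_t *addr,
                                   int32_t oldval, int32_t newval)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    unsigned char ret;
    __asm__ __volatile__(
        "lock; cmpxchgl %3, %2\n\t" /* compares %eax (oldval) with *addr */
        "sete %0"                   /* ret = ZF, set by cmpxchgl on success */
        : "=q"(ret), "+a"(oldval), "+m"(*addr) /* "+a" pins oldval to %eax */
        : "r"(newval)
        : "memory", "cc");          /* cmpxchgl modifies the flags */
    return (int)ret;
#else
    /* Assumed fallback for other targets: GCC/Clang builtin. */
    return __sync_bool_compare_and_swap(addr, oldval, newval);
#endif
}

/* The loop from the original request: retry until the compare-and-set
 * succeeds, then return the value that was stored. */
static inline int32_t sketch_atomic_add_32(volatile int32_t *addr,
                                           int32_t delta)
{
    int32_t oldval;

    assert(addr != NULL);
    do {
        oldval = *addr; /* reloaded on every iteration, inside the loop */
    } while (0 == sketch_cmpset_32(addr, oldval, oldval + delta));
    return oldval + delta;
}
```

Calling sketch_atomic_add_32(&counter, 1) on a counter holding 0 returns 1 and leaves 1 in counter. If a compiler drops the "+a" operand or ignores the "cc" clobber, the load into %eax or the sete-based exit test would be missing from the generated loop, which is exactly what to look for in the PGI output.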