Re: [PATCH] [OPENMP] CodeGen for "omp atomic read [seq_cst]" directive.

Bataev, Alexey Tue, 06 Jan 2015 03:27:52 -0800

Hi John,

I think global registers are generally thread-local, aren't they?


Oh, yes, you're right, I missed it. I'll fix it.

Given this structure:

   struct S { int x: 32; int y: 32; };

Clang will currently emit an access to x by masking a 64-bit load.  This is an 
essentially arbitrary implementation choice in IRGen; we've changed it before, 
and we may change it again in the future.  You're generating code that depends 
on this arbitrary choice, because it ends up being the pointee type of the 
address stored in the bitfield LValue.


Ahh, you're talking about compatibility between different versions of 
clang/LLVM compilers? I see then. Ok, I'll try to fix it somehow.

It only seems more convenient because you're doing this logic in a deep place 
within the atomics code.  If you had a high-level routine that reasoned about 
the kind of LValue it was working with *before* committing to an evaluation 
strategy, and then just called lower-level atomic routines as if it was doing 
an atomic operation on a char/short/int (for a bitfield) or the entire vector 
(for a vector element), this would fall out more naturally.


Agree, I'll try to improve the code.

Best regards,
Alexey Bataev
=============
Software Engineer
Intel Compiler Team

06.01.2015 4:44, John McCall wrote:

In http://reviews.llvm.org/D6431#105585, @ABataev wrote:

Hi John, thanks for the review.

How are you planning to implement stores for any of the non-simple l-value 
cases?  Compare-and-swap loops?


Yes, that's the plan. Except for global registers: I did not find a
  compare-and-swap op in LLVM IR for them, so I decided to use global
  locks for them.
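
For illustration, the compare-and-swap approach for a bitfield store could look roughly like the sketch below. It is written against std::atomic rather than LLVM IR, and the helper name atomic_store_bits, the 32-bit storage unit, and the offset/width parameters are assumptions made for the example, not anything taken from the patch.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper (not from the patch): atomically replace `width` bits
// starting at bit `offset` inside a 32-bit storage unit, leaving all other
// bits of the unit untouched. Assumes width >= 1 and offset + width <= 32.
inline void atomic_store_bits(std::atomic<std::uint32_t> &storage,
                              unsigned offset, unsigned width,
                              std::uint32_t value) {
  // Mask covering exactly the bits that belong to the field.
  const std::uint32_t field_mask =
      (width >= 32 ? ~0u : ((1u << width) - 1u)) << offset;

  std::uint32_t expected = storage.load(std::memory_order_relaxed);
  std::uint32_t desired;
  do {
    // Rebuild the whole unit from the latest observed value, so that a
    // concurrent update to a neighbouring field is never overwritten.
    desired = (expected & ~field_mask) | ((value << offset) & field_mask);
  } while (!storage.compare_exchange_weak(expected, desired,
                                          std::memory_order_seq_cst,
                                          std::memory_order_relaxed));
}
```

On failure compare_exchange_weak refreshes `expected` with the current contents, so each retry recomputes `desired` from up-to-date neighbouring bits.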


I think global registers are generally thread-local, aren't they?

... You need to use narrower bounds than that because you need something that's 
guaranteed stable: ...

Hmm, I did not catch why there can be trouble with bitfields. Why would one
  compiler use a 12-byte atomic access while another produces a
  4-byte access? I think all atomic accesses will be the same.


Given this structure:

   struct S { int x: 32; int y: 32; };

Clang will currently emit an access to x by masking a 64-bit load.  This is an 
essentially arbitrary implementation choice in IRGen; we've changed it before, 
and we may change it again in the future.  You're generating code that depends 
on this arbitrary choice, because it ends up being the pointee type of the 
address stored in the bitfield LValue.
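
To make the two possible choices concrete, here is a small source-level sketch of reading x either through the full 64-bit storage unit or through a 4-byte load. The function names read_x_wide and read_x_narrow are made up for the example, and it assumes a little-endian target with the usual layout where x occupies the first four bytes of S; it does not show what IRGen actually emits.

```cpp
#include <cstdint>
#include <cstring>

struct S { int x : 32; int y : 32; };
static_assert(sizeof(S) == 8,
              "sketch assumes two 32-bit fields packed into 8 bytes");

// Strategy A: load the whole 64-bit storage unit, then mask out x.
// Assumes little-endian, so x sits in the low 32 bits of the unit.
inline std::int32_t read_x_wide(const S &s) {
  std::uint64_t unit;
  std::memcpy(&unit, &s, sizeof unit);
  return static_cast<std::int32_t>(
      static_cast<std::uint32_t>(unit & 0xFFFFFFFFu));
}

// Strategy B: load only the 4 bytes that hold x.
inline std::int32_t read_x_narrow(const S &s) {
  std::uint32_t bits;
  std::memcpy(&bits, &s, sizeof bits);
  return static_cast<std::int32_t>(bits);
}
```

Both functions return the same value, but an atomic version of each would touch a different number of bytes, which is why code keyed to one choice breaks if the other is adopted later.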

Also, both bitfields and vector elements can often be accessed more efficiently 
than just a libcall, depending on how much space they need.

I thought about it. I agree, but it may also significantly complicate
  the code itself.  That's why I decided to use only libcalls, taking into
  account that atomic operations on bitfields/vector elements are very
  rarely used, if at all (I have not actually seen any, but it is good to
  have a working solution for all kinds of lvalues).


It only seems more convenient because you're doing this logic in a deep place 
within the atomics code.  If you had a high-level routine that reasoned about 
the kind of LValue it was working with *before* committing to an evaluation 
strategy, and then just called lower-level atomic routines as if it was doing 
an atomic operation on a char/short/int (for a bitfield) or the entire vector 
(for a vector element), this would fall out more naturally.
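
A rough sketch of what such a high-level planning step might look like follows. All names here (LValueKind, LValueInfo, AtomicAccess, plan_atomic_access) are hypothetical and are not Clang's CodeGen API; the policy of picking the narrowest naturally aligned 1/2/4/8-byte unit that covers a bitfield is likewise just one assumption about what a stable bound could be.

```cpp
#include <cstdint>

// Hypothetical description of an l-value; not Clang's LValue class.
enum class LValueKind { Simple, BitField, VectorElement };

struct LValueInfo {
  LValueKind kind;
  unsigned value_bytes; // scalar size (Simple) or whole-vector size
  unsigned bit_offset;  // BitField only: offset from the record start
  unsigned bit_width;   // BitField only: must be >= 1
};

// The byte range a lower-level atomic routine would be asked to operate on.
struct AtomicAccess {
  unsigned byte_offset;
  unsigned bytes;
};

// Decide the access *before* committing to an evaluation strategy.
inline AtomicAccess plan_atomic_access(const LValueInfo &lv) {
  switch (lv.kind) {
  case LValueKind::Simple:
  case LValueKind::VectorElement:
    // Operate on the scalar itself, or on the entire vector.
    return {0, lv.value_bytes};
  case LValueKind::BitField:
    // Narrowest naturally aligned unit that fully contains the field.
    for (unsigned bytes = 1; bytes <= 8; bytes *= 2) {
      const unsigned bits = bytes * 8;
      const unsigned first = lv.bit_offset / bits;
      const unsigned last = (lv.bit_offset + lv.bit_width - 1) / bits;
      if (first == last)
        return {first * bytes, bytes};
    }
    break;
  }
  // No single unit of up to 8 bytes covers the field; the caller would
  // need some other fallback.
  return {0, 0};
}
```

With a plan like this in hand, the bitfield or vector-element extraction happens after an ordinary integer or whole-vector atomic load, so the low-level atomic routines never need to know about the l-value kind.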

John.


http://reviews.llvm.org/D6431




_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
