[PATCH] D66122: [CodeGen] Emit dynamic initializers for static TLS vars in outlined scopes

2019-08-22 Thread Princeton Ferro via Phabricator via cfe-commits
Prince781 added a comment.

The more I think about this, the more I have doubts about whether this should 
be supported. For example, what happens in cases like this?:

  #include 
  #include 
  
  struct Object {
  int i;
  Object() : i(3) {}
  Object(int v) : i(3 + v) {}
  };
  
  int main(void) {
  int w = 4;
  static thread_local Object o(w);
  
  std::cout << "[main] o.i = " << o.i << std::endl;
  std::thread([] {
  std::cout << "[new thread] o.i = " << o.i << std::endl;
  }).join();
  }

Should `w` be captured or not? Furthermore, if o referenced another block-scope 
thread-local that had an initializer referencing another local variable, that 
would have to be captured too. So I now think this should be an error.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66122/new/

https://reviews.llvm.org/D66122



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66122: [CodeGen] Emit dynamic initializers for static TLS vars in outlined scopes

2019-08-22 Thread Princeton Ferro via Phabricator via cfe-commits
Prince781 marked an inline comment as done.
Prince781 added a comment.

In D66122#1639947 , @efriedma wrote:

> But given that, I think we should submit a core issue, and hold off on 
> merging this until we hear back from the committee.


I agree here. There does appear to be some previous discussion on this matter, 
but the spec itself still doesn't contain any language addressing this issue. I 
will submit a core issue.




Comment at: clang/lib/CodeGen/CodeGenFunction.cpp:317
+  deps.insert(V);
+  auto V_Refs = enumerateVarInitDependencies(V);
+  deps.insert(V_Refs.begin(), V_Refs.end());

efriedma wrote:
> Do you need to recurse here?  It looks like the caller should handle that.
Oops, I think you might be right.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66122/new/

https://reviews.llvm.org/D66122



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66122: [CodeGen] Emit dynamic initializers for static TLS vars in outlined scopes

2019-08-21 Thread Princeton Ferro via Phabricator via cfe-commits
Prince781 marked 3 inline comments as done.
Prince781 added inline comments.



Comment at: clang/lib/CodeGen/CodeGenFunction.cpp:479
+return a_deps.find(b) != a_deps.end() 
+|| b->getLocation() < a->getLocation(); // ensure deterministic 
ordering
+  });

efriedma wrote:
> Prince781 wrote:
> > efriedma wrote:
> > > Is the call to a_deps.find() here actually necessary?  It shouldn't be 
> > > possible for an initializer to directly refer to a variable declared 
> > > later.
> > > 
> > > "<" on SourceLocations isn't source order, in general; you need 
> > > isBeforeInTranslationUnit.  (This should be documented somewhere, but I'm 
> > > not finding the documentation, unfortunately.  Any suggestions for where 
> > > it should be documented?)
> > > It shouldn't be possible for an initializer to directly refer to a 
> > > variable declared later.
> > 
> > That's true. I was using `deps.find()` to order the initialization of the 
> > variables. But since you mention `isBeforeInTranslationUnit`, I can use 
> > that instead. It appears to be documented [[ 
> > https://clang.llvm.org/doxygen/classclang_1_1SourceManager.html#af0ffe5c3a34c93204accb74f0f4717c5
> >  | here ]].
> The documentation question was more referring to the lack of documentation 
> for operator< on SourceLocation.
I see. I misinterpreted what you said. Well, then, I think that 
`operator<(SourceLocation &, SourceLocation &)` should have Doxygen comments 
and maybe a note saying, "this probably isn't what you want; see 
SourceManager::isBeforeInTranslationUnit()"


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66122/new/

https://reviews.llvm.org/D66122



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66122: [CodeGen] Emit dynamic initializers for static TLS vars in outlined scopes

2019-08-21 Thread Princeton Ferro via Phabricator via cfe-commits
Prince781 updated this revision to Diff 216333.
Prince781 added a comment.

Use SourceManager to order inits.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66122/new/

https://reviews.llvm.org/D66122

Files:
  clang/lib/CodeGen/CGExpr.cpp
  clang/lib/CodeGen/CodeGenFunction.cpp
  clang/lib/CodeGen/CodeGenFunction.h
  clang/test/CodeGenCXX/cxx11-thread-local.cpp

Index: clang/test/CodeGenCXX/cxx11-thread-local.cpp
===
--- clang/test/CodeGenCXX/cxx11-thread-local.cpp
+++ clang/test/CodeGenCXX/cxx11-thread-local.cpp
@@ -268,6 +268,33 @@
   return this->n;
 }
 
+namespace static_tls_in_lambda {
+  struct X {
+X() {}
+  };
+
+
+  X (*f())() {
+static thread_local X x;
+
+return [] { return x; };
+  }
+
+  auto y = f();
+
+  void g() { y(); }
+
+  void bar(X**, X**, X**);
+  void baz(void());
+  void f2() {
+  thread_local X x;
+  thread_local X* p = 
+  thread_local X* q = p;
+  thread_local X* r = q;
+  baz([]{bar(, , );});
+  }
+}
+
 namespace {
 thread_local int anon_i{1};
 }
@@ -303,6 +330,42 @@
 // CHECK: store i64 1, i64* @_ZGVN1XIiE1mE
 // CHECK: br label
 
+// CHECK: define internal void @"_ZZN20static_tls_in_lambda1fEvENK3$_1clEv"
+// CHECK: %[[static_tls_guard_val:.*]] = load i8, i8* @_ZGVZN20static_tls_in_lambda1fEvE1x
+// CHECK: %[[static_tls_guard_init:.*]] = icmp eq i8 %[[static_tls_guard_val]], 0
+// CHECK: br i1 %[[static_tls_guard_init]],
+// CHECK: call void @_ZN20static_tls_in_lambda1XC1Ev(%"struct.static_tls_in_lambda::X"* @_ZZN20static_tls_in_lambda1fEvE1x)
+// CHECK: store i8 1, i8* @_ZGVZN20static_tls_in_lambda1fEvE1x, align 1
+
+// CHECK: define internal void @"_ZZN20static_tls_in_lambda2f2EvENK3$_2clEv"
+// init x
+// CHECK: %[[static_tls_guard_val:.*]] = load i8, i8* @_ZGVZN20static_tls_in_lambda2f2EvE1x
+// CHECK: %[[static_tls_guard_init:.*]] = icmp eq i8 %[[static_tls_guard_val]], 0
+// CHECK: br i1 %[[static_tls_guard_init]],
+// CHECK: call void @_ZN20static_tls_in_lambda1XC1Ev(%"struct.static_tls_in_lambda::X"* @_ZZN20static_tls_in_lambda2f2EvE1x)
+// CHECK: store i8 1, i8* @_ZGVZN20static_tls_in_lambda2f2EvE1x, align 1
+// init p
+// CHECK: %[[static_tls_guard_val:.*]] = load i8, i8* @_ZGVZN20static_tls_in_lambda2f2EvE1p
+// CHECK: %[[static_tls_guard_init:.*]] = icmp eq i8 %[[static_tls_guard_val]], 0
+// CHECK: br i1 %[[static_tls_guard_init]],
+// CHECK: store %"struct.static_tls_in_lambda::X"* @_ZZN20static_tls_in_lambda2f2EvE1x, %"struct.static_tls_in_lambda::X"** @_ZZN20static_tls_in_lambda2f2EvE1p
+// CHECK: store i8 1, i8* @_ZGVZN20static_tls_in_lambda2f2EvE1p, align 1
+// init q
+// CHECK: %[[static_tls_guard_val:.*]] = load i8, i8* @_ZGVZN20static_tls_in_lambda2f2EvE1q
+// CHECK: %[[static_tls_guard_init:.*]] = icmp eq i8 %[[static_tls_guard_val]], 0
+// CHECK: br i1 %[[static_tls_guard_init]],
+// CHECK: %[[static_tls_var_prev:.*]] = load %"struct.static_tls_in_lambda::X"*, %"struct.static_tls_in_lambda::X"** @_ZZN20static_tls_in_lambda2f2EvE1p, align 8
+// CHECK: store %"struct.static_tls_in_lambda::X"* %[[static_tls_var_prev]], %"struct.static_tls_in_lambda::X"** @_ZZN20static_tls_in_lambda2f2EvE1q, align 8
+// CHECK: store i8 1, i8* @_ZGVZN20static_tls_in_lambda2f2EvE1q, align 1
+// init r
+// CHECK: %[[static_tls_guard_val:.*]] = load i8, i8* @_ZGVZN20static_tls_in_lambda2f2EvE1r
+// CHECK: %[[static_tls_guard_init:.*]] = icmp eq i8 %[[static_tls_guard_val]], 0
+// CHECK: br i1 %[[static_tls_guard_init]],
+// CHECK: %[[static_tls_var_prev:.*]] = load %"struct.static_tls_in_lambda::X"*, %"struct.static_tls_in_lambda::X"** @_ZZN20static_tls_in_lambda2f2EvE1q, align 8
+// CHECK: store %"struct.static_tls_in_lambda::X"* %[[static_tls_var_prev]], %"struct.static_tls_in_lambda::X"** @_ZZN20static_tls_in_lambda2f2EvE1r, align 8
+// CHECK: store i8 1, i8* @_ZGVZN20static_tls_in_lambda2f2EvE1r, align 1
+
+
 // CHECK: define {{.*}}@[[GLOBAL_INIT:.*]]()
 // CHECK: call void @[[C_INIT]]()
 // CHECK: call void @[[E_INIT]]()
Index: clang/lib/CodeGen/CodeGenFunction.h
===
--- clang/lib/CodeGen/CodeGenFunction.h
+++ clang/lib/CodeGen/CodeGenFunction.h
@@ -467,6 +467,10 @@
   /// should emit cleanups.
   bool CurFuncIsThunk = false;
 
+  /// static thread-local variables we've referenced that were declared in a
+  /// parent function.
+  llvm::SmallSet ForeignStaticTLSVars;
+
   /// In ARC, whether we should autorelease the return value.
   bool AutoreleaseResult = false;
 
Index: clang/lib/CodeGen/CodeGenFunction.cpp
===
--- clang/lib/CodeGen/CodeGenFunction.cpp
+++ clang/lib/CodeGen/CodeGenFunction.cpp
@@ -31,6 +31,7 @@
 #include "clang/Basic/TargetInfo.h"
 #include "clang/CodeGen/CGFunctionInfo.h"
 #include "clang/Frontend/FrontendDiagnostic.h"
+#include "llvm/ADT/STLExtras.h"
 #include "llvm/IR/DataLayout.h"

[PATCH] D66122: [CodeGen] Emit dynamic initializers for static TLS vars in outlined scopes

2019-08-20 Thread Princeton Ferro via Phabricator via cfe-commits
Prince781 marked an inline comment as done.
Prince781 added inline comments.



Comment at: clang/lib/CodeGen/CodeGenFunction.cpp:479
+return a_deps.find(b) != a_deps.end() 
+|| b->getLocation() < a->getLocation(); // ensure deterministic 
ordering
+  });

efriedma wrote:
> Is the call to a_deps.find() here actually necessary?  It shouldn't be 
> possible for an initializer to directly refer to a variable declared later.
> 
> "<" on SourceLocations isn't source order, in general; you need 
> isBeforeInTranslationUnit.  (This should be documented somewhere, but I'm not 
> finding the documentation, unfortunately.  Any suggestions for where it 
> should be documented?)
> It shouldn't be possible for an initializer to directly refer to a variable 
> declared later.

That's true. I was using `deps.find()` to order the initialization of the 
variables. But since you mention `isBeforeInTranslationUnit`, I can use that 
instead. It appears to be documented [[ 
https://clang.llvm.org/doxygen/classclang_1_1SourceManager.html#af0ffe5c3a34c93204accb74f0f4717c5
 | here ]].


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66122/new/

https://reviews.llvm.org/D66122



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66122: [CodeGen] Emit dynamic initializers for static TLS vars in outlined scopes

2019-08-19 Thread Princeton Ferro via Phabricator via cfe-commits
Prince781 added a comment.

In D66122#1633990 , @efriedma wrote:

> I think we should send a defect report to the C++ standards committee to 
> clarify the ambiguity here.


I followed the instructions on this page  
and sent it to std-discussion first.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66122/new/

https://reviews.llvm.org/D66122



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66122: [CodeGen] Emit dynamic initializers for static TLS vars in outlined scopes

2019-08-19 Thread Princeton Ferro via Phabricator via cfe-commits
Prince781 updated this revision to Diff 215912.
Prince781 added a comment.

I think this should order the initializers deterministically according to their 
var declaration order. Let me know if there's something I haven't considered.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66122/new/

https://reviews.llvm.org/D66122

Files:
  clang/lib/CodeGen/CGExpr.cpp
  clang/lib/CodeGen/CodeGenFunction.cpp
  clang/lib/CodeGen/CodeGenFunction.h
  clang/test/CodeGenCXX/cxx11-thread-local.cpp

Index: clang/test/CodeGenCXX/cxx11-thread-local.cpp
===
--- clang/test/CodeGenCXX/cxx11-thread-local.cpp
+++ clang/test/CodeGenCXX/cxx11-thread-local.cpp
@@ -268,6 +268,33 @@
   return this->n;
 }
 
+namespace static_tls_in_lambda {
+  struct X {
+X() {}
+  };
+
+
+  X (*f())() {
+static thread_local X x;
+
+return [] { return x; };
+  }
+
+  auto y = f();
+
+  void g() { y(); }
+
+  void bar(X**, X**, X**);
+  void baz(void());
+  void f2() {
+  thread_local X x;
+  thread_local X* p = 
+  thread_local X* q = p;
+  thread_local X* r = q;
+  baz([]{bar(, , );});
+  }
+}
+
 namespace {
 thread_local int anon_i{1};
 }
@@ -303,6 +330,42 @@
 // CHECK: store i64 1, i64* @_ZGVN1XIiE1mE
 // CHECK: br label
 
+// CHECK: define internal void @"_ZZN20static_tls_in_lambda1fEvENK3$_1clEv"
+// CHECK: %[[static_tls_guard_val:.*]] = load i8, i8* @_ZGVZN20static_tls_in_lambda1fEvE1x
+// CHECK: %[[static_tls_guard_init:.*]] = icmp eq i8 %[[static_tls_guard_val]], 0
+// CHECK: br i1 %[[static_tls_guard_init]],
+// CHECK: call void @_ZN20static_tls_in_lambda1XC1Ev(%"struct.static_tls_in_lambda::X"* @_ZZN20static_tls_in_lambda1fEvE1x)
+// CHECK: store i8 1, i8* @_ZGVZN20static_tls_in_lambda1fEvE1x, align 1
+
+// CHECK: define internal void @"_ZZN20static_tls_in_lambda2f2EvENK3$_2clEv"
+// init x
+// CHECK: %[[static_tls_guard_val:.*]] = load i8, i8* @_ZGVZN20static_tls_in_lambda2f2EvE1x
+// CHECK: %[[static_tls_guard_init:.*]] = icmp eq i8 %[[static_tls_guard_val]], 0
+// CHECK: br i1 %[[static_tls_guard_init]],
+// CHECK: call void @_ZN20static_tls_in_lambda1XC1Ev(%"struct.static_tls_in_lambda::X"* @_ZZN20static_tls_in_lambda2f2EvE1x)
+// CHECK: store i8 1, i8* @_ZGVZN20static_tls_in_lambda2f2EvE1x, align 1
+// init p
+// CHECK: %[[static_tls_guard_val:.*]] = load i8, i8* @_ZGVZN20static_tls_in_lambda2f2EvE1p
+// CHECK: %[[static_tls_guard_init:.*]] = icmp eq i8 %[[static_tls_guard_val]], 0
+// CHECK: br i1 %[[static_tls_guard_init]],
+// CHECK: store %"struct.static_tls_in_lambda::X"* @_ZZN20static_tls_in_lambda2f2EvE1x, %"struct.static_tls_in_lambda::X"** @_ZZN20static_tls_in_lambda2f2EvE1p
+// CHECK: store i8 1, i8* @_ZGVZN20static_tls_in_lambda2f2EvE1p, align 1
+// init q
+// CHECK: %[[static_tls_guard_val:.*]] = load i8, i8* @_ZGVZN20static_tls_in_lambda2f2EvE1q
+// CHECK: %[[static_tls_guard_init:.*]] = icmp eq i8 %[[static_tls_guard_val]], 0
+// CHECK: br i1 %[[static_tls_guard_init]],
+// CHECK: %[[static_tls_var_prev:.*]] = load %"struct.static_tls_in_lambda::X"*, %"struct.static_tls_in_lambda::X"** @_ZZN20static_tls_in_lambda2f2EvE1p, align 8
+// CHECK: store %"struct.static_tls_in_lambda::X"* %[[static_tls_var_prev]], %"struct.static_tls_in_lambda::X"** @_ZZN20static_tls_in_lambda2f2EvE1q, align 8
+// CHECK: store i8 1, i8* @_ZGVZN20static_tls_in_lambda2f2EvE1q, align 1
+// init r
+// CHECK: %[[static_tls_guard_val:.*]] = load i8, i8* @_ZGVZN20static_tls_in_lambda2f2EvE1r
+// CHECK: %[[static_tls_guard_init:.*]] = icmp eq i8 %[[static_tls_guard_val]], 0
+// CHECK: br i1 %[[static_tls_guard_init]],
+// CHECK: %[[static_tls_var_prev:.*]] = load %"struct.static_tls_in_lambda::X"*, %"struct.static_tls_in_lambda::X"** @_ZZN20static_tls_in_lambda2f2EvE1q, align 8
+// CHECK: store %"struct.static_tls_in_lambda::X"* %[[static_tls_var_prev]], %"struct.static_tls_in_lambda::X"** @_ZZN20static_tls_in_lambda2f2EvE1r, align 8
+// CHECK: store i8 1, i8* @_ZGVZN20static_tls_in_lambda2f2EvE1r, align 1
+
+
 // CHECK: define {{.*}}@[[GLOBAL_INIT:.*]]()
 // CHECK: call void @[[C_INIT]]()
 // CHECK: call void @[[E_INIT]]()
Index: clang/lib/CodeGen/CodeGenFunction.h
===
--- clang/lib/CodeGen/CodeGenFunction.h
+++ clang/lib/CodeGen/CodeGenFunction.h
@@ -467,6 +467,10 @@
   /// should emit cleanups.
   bool CurFuncIsThunk = false;
 
+  /// static thread-local variables we've referenced that were declared in a
+  /// parent function.
+  llvm::SmallSet ForeignStaticTLSVars;
+
   /// In ARC, whether we should autorelease the return value.
   bool AutoreleaseResult = false;
 
Index: clang/lib/CodeGen/CodeGenFunction.cpp
===
--- clang/lib/CodeGen/CodeGenFunction.cpp
+++ clang/lib/CodeGen/CodeGenFunction.cpp
@@ -31,6 +31,7 @@
 #include "clang/Basic/TargetInfo.h"
 #include 

[PATCH] D66122: [CodeGen] Emit dynamic initializers for static TLS vars in outlined scopes

2019-08-16 Thread Princeton Ferro via Phabricator via cfe-commits
Prince781 updated this revision to Diff 215709.
Prince781 added a comment.

Use range-based version of llvm::sort


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66122/new/

https://reviews.llvm.org/D66122

Files:
  clang/lib/CodeGen/CGExpr.cpp
  clang/lib/CodeGen/CodeGenFunction.cpp
  clang/lib/CodeGen/CodeGenFunction.h
  clang/test/CodeGenCXX/cxx11-thread-local.cpp

Index: clang/test/CodeGenCXX/cxx11-thread-local.cpp
===
--- clang/test/CodeGenCXX/cxx11-thread-local.cpp
+++ clang/test/CodeGenCXX/cxx11-thread-local.cpp
@@ -268,6 +268,33 @@
   return this->n;
 }
 
+namespace static_tls_in_lambda {
+  struct X {
+X() {}
+  };
+
+
+  X (*f())() {
+static thread_local X x;
+
+return [] { return x; };
+  }
+
+  auto y = f();
+
+  void g() { y(); }
+
+  void bar(X**, X**, X**);
+  void baz(void());
+  void f2() {
+  thread_local X x;
+  thread_local X* p = 
+  thread_local X* q = p;
+  thread_local X* r = q;
+  baz([]{bar(, , );});
+  }
+}
+
 namespace {
 thread_local int anon_i{1};
 }
@@ -303,6 +330,42 @@
 // CHECK: store i64 1, i64* @_ZGVN1XIiE1mE
 // CHECK: br label
 
+// CHECK: define internal void @"_ZZN20static_tls_in_lambda1fEvENK3$_1clEv"
+// CHECK: %[[static_tls_guard_val:.*]] = load i8, i8* @_ZGVZN20static_tls_in_lambda1fEvE1x
+// CHECK: %[[static_tls_guard_init:.*]] = icmp eq i8 %[[static_tls_guard_val]], 0
+// CHECK: br i1 %[[static_tls_guard_init]],
+// CHECK: call void @_ZN20static_tls_in_lambda1XC1Ev(%"struct.static_tls_in_lambda::X"* @_ZZN20static_tls_in_lambda1fEvE1x)
+// CHECK: store i8 1, i8* @_ZGVZN20static_tls_in_lambda1fEvE1x, align 1
+
+// CHECK: define internal void @"_ZZN20static_tls_in_lambda2f2EvENK3$_2clEv"
+// init x
+// CHECK: %[[static_tls_guard_val:.*]] = load i8, i8* @_ZGVZN20static_tls_in_lambda2f2EvE1x
+// CHECK: %[[static_tls_guard_init:.*]] = icmp eq i8 %[[static_tls_guard_val]], 0
+// CHECK: br i1 %[[static_tls_guard_init]],
+// CHECK: call void @_ZN20static_tls_in_lambda1XC1Ev(%"struct.static_tls_in_lambda::X"* @_ZZN20static_tls_in_lambda2f2EvE1x)
+// CHECK: store i8 1, i8* @_ZGVZN20static_tls_in_lambda2f2EvE1x, align 1
+// init p
+// CHECK: %[[static_tls_guard_val:.*]] = load i8, i8* @_ZGVZN20static_tls_in_lambda2f2EvE1p
+// CHECK: %[[static_tls_guard_init:.*]] = icmp eq i8 %[[static_tls_guard_val]], 0
+// CHECK: br i1 %[[static_tls_guard_init]],
+// CHECK: store %"struct.static_tls_in_lambda::X"* @_ZZN20static_tls_in_lambda2f2EvE1x, %"struct.static_tls_in_lambda::X"** @_ZZN20static_tls_in_lambda2f2EvE1p
+// CHECK: store i8 1, i8* @_ZGVZN20static_tls_in_lambda2f2EvE1p, align 1
+// init q
+// CHECK: %[[static_tls_guard_val:.*]] = load i8, i8* @_ZGVZN20static_tls_in_lambda2f2EvE1q
+// CHECK: %[[static_tls_guard_init:.*]] = icmp eq i8 %[[static_tls_guard_val]], 0
+// CHECK: br i1 %[[static_tls_guard_init]],
+// CHECK: %[[static_tls_var_prev:.*]] = load %"struct.static_tls_in_lambda::X"*, %"struct.static_tls_in_lambda::X"** @_ZZN20static_tls_in_lambda2f2EvE1p, align 8
+// CHECK: store %"struct.static_tls_in_lambda::X"* %[[static_tls_var_prev]], %"struct.static_tls_in_lambda::X"** @_ZZN20static_tls_in_lambda2f2EvE1q, align 8
+// CHECK: store i8 1, i8* @_ZGVZN20static_tls_in_lambda2f2EvE1q, align 1
+// init r
+// CHECK: %[[static_tls_guard_val:.*]] = load i8, i8* @_ZGVZN20static_tls_in_lambda2f2EvE1r
+// CHECK: %[[static_tls_guard_init:.*]] = icmp eq i8 %[[static_tls_guard_val]], 0
+// CHECK: br i1 %[[static_tls_guard_init]],
+// CHECK: %[[static_tls_var_prev:.*]] = load %"struct.static_tls_in_lambda::X"*, %"struct.static_tls_in_lambda::X"** @_ZZN20static_tls_in_lambda2f2EvE1q, align 8
+// CHECK: store %"struct.static_tls_in_lambda::X"* %[[static_tls_var_prev]], %"struct.static_tls_in_lambda::X"** @_ZZN20static_tls_in_lambda2f2EvE1r, align 8
+// CHECK: store i8 1, i8* @_ZGVZN20static_tls_in_lambda2f2EvE1r, align 1
+
+
 // CHECK: define {{.*}}@[[GLOBAL_INIT:.*]]()
 // CHECK: call void @[[C_INIT]]()
 // CHECK: call void @[[E_INIT]]()
Index: clang/lib/CodeGen/CodeGenFunction.h
===
--- clang/lib/CodeGen/CodeGenFunction.h
+++ clang/lib/CodeGen/CodeGenFunction.h
@@ -467,6 +467,10 @@
   /// should emit cleanups.
   bool CurFuncIsThunk = false;
 
+  /// static thread-local variables we've referenced that were declared in a
+  /// parent function.
+  llvm::SmallSet ForeignStaticTLSVars;
+
   /// In ARC, whether we should autorelease the return value.
   bool AutoreleaseResult = false;
 
Index: clang/lib/CodeGen/CodeGenFunction.cpp
===
--- clang/lib/CodeGen/CodeGenFunction.cpp
+++ clang/lib/CodeGen/CodeGenFunction.cpp
@@ -31,6 +31,7 @@
 #include "clang/Basic/TargetInfo.h"
 #include "clang/CodeGen/CGFunctionInfo.h"
 #include "clang/Frontend/FrontendDiagnostic.h"
+#include "llvm/ADT/STLExtras.h"
 #include 

[PATCH] D66122: [CodeGen] Emit dynamic initializers for static TLS vars in outlined scopes

2019-08-16 Thread Princeton Ferro via Phabricator via cfe-commits
Prince781 updated this revision to Diff 215695.
Prince781 added a comment.
Herald added a subscriber: mgrang.

I've updated the patch to initialize, in the proper order, all foreign static 
TLS variables and the variables they depend on for initialization. I've also 
cleaned up the patch a bit.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66122/new/

https://reviews.llvm.org/D66122

Files:
  clang/lib/CodeGen/CGExpr.cpp
  clang/lib/CodeGen/CodeGenFunction.cpp
  clang/lib/CodeGen/CodeGenFunction.h
  clang/test/CodeGenCXX/cxx11-thread-local.cpp

Index: clang/test/CodeGenCXX/cxx11-thread-local.cpp
===
--- clang/test/CodeGenCXX/cxx11-thread-local.cpp
+++ clang/test/CodeGenCXX/cxx11-thread-local.cpp
@@ -268,6 +268,33 @@
   return this->n;
 }
 
+namespace static_tls_in_lambda {
+  struct X {
+X() {}
+  };
+
+
+  X (*f())() {
+static thread_local X x;
+
+return [] { return x; };
+  }
+
+  auto y = f();
+
+  void g() { y(); }
+
+  void bar(X**, X**, X**);
+  void baz(void());
+  void f2() {
+  thread_local X x;
+  thread_local X* p = 
+  thread_local X* q = p;
+  thread_local X* r = q;
+  baz([]{bar(, , );});
+  }
+}
+
 namespace {
 thread_local int anon_i{1};
 }
@@ -303,6 +330,42 @@
 // CHECK: store i64 1, i64* @_ZGVN1XIiE1mE
 // CHECK: br label
 
+// CHECK: define internal void @"_ZZN20static_tls_in_lambda1fEvENK3$_1clEv"
+// CHECK: %[[static_tls_guard_val:.*]] = load i8, i8* @_ZGVZN20static_tls_in_lambda1fEvE1x
+// CHECK: %[[static_tls_guard_init:.*]] = icmp eq i8 %[[static_tls_guard_val]], 0
+// CHECK: br i1 %[[static_tls_guard_init]],
+// CHECK: call void @_ZN20static_tls_in_lambda1XC1Ev(%"struct.static_tls_in_lambda::X"* @_ZZN20static_tls_in_lambda1fEvE1x)
+// CHECK: store i8 1, i8* @_ZGVZN20static_tls_in_lambda1fEvE1x, align 1
+
+// CHECK: define internal void @"_ZZN20static_tls_in_lambda2f2EvENK3$_2clEv"
+// init x
+// CHECK: %[[static_tls_guard_val:.*]] = load i8, i8* @_ZGVZN20static_tls_in_lambda2f2EvE1x
+// CHECK: %[[static_tls_guard_init:.*]] = icmp eq i8 %[[static_tls_guard_val]], 0
+// CHECK: br i1 %[[static_tls_guard_init]],
+// CHECK: call void @_ZN20static_tls_in_lambda1XC1Ev(%"struct.static_tls_in_lambda::X"* @_ZZN20static_tls_in_lambda2f2EvE1x)
+// CHECK: store i8 1, i8* @_ZGVZN20static_tls_in_lambda2f2EvE1x, align 1
+// init p
+// CHECK: %[[static_tls_guard_val:.*]] = load i8, i8* @_ZGVZN20static_tls_in_lambda2f2EvE1p
+// CHECK: %[[static_tls_guard_init:.*]] = icmp eq i8 %[[static_tls_guard_val]], 0
+// CHECK: br i1 %[[static_tls_guard_init]],
+// CHECK: store %"struct.static_tls_in_lambda::X"* @_ZZN20static_tls_in_lambda2f2EvE1x, %"struct.static_tls_in_lambda::X"** @_ZZN20static_tls_in_lambda2f2EvE1p
+// CHECK: store i8 1, i8* @_ZGVZN20static_tls_in_lambda2f2EvE1p, align 1
+// init q
+// CHECK: %[[static_tls_guard_val:.*]] = load i8, i8* @_ZGVZN20static_tls_in_lambda2f2EvE1q
+// CHECK: %[[static_tls_guard_init:.*]] = icmp eq i8 %[[static_tls_guard_val]], 0
+// CHECK: br i1 %[[static_tls_guard_init]],
+// CHECK: %[[static_tls_var_prev:.*]] = load %"struct.static_tls_in_lambda::X"*, %"struct.static_tls_in_lambda::X"** @_ZZN20static_tls_in_lambda2f2EvE1p, align 8
+// CHECK: store %"struct.static_tls_in_lambda::X"* %[[static_tls_var_prev]], %"struct.static_tls_in_lambda::X"** @_ZZN20static_tls_in_lambda2f2EvE1q, align 8
+// CHECK: store i8 1, i8* @_ZGVZN20static_tls_in_lambda2f2EvE1q, align 1
+// init r
+// CHECK: %[[static_tls_guard_val:.*]] = load i8, i8* @_ZGVZN20static_tls_in_lambda2f2EvE1r
+// CHECK: %[[static_tls_guard_init:.*]] = icmp eq i8 %[[static_tls_guard_val]], 0
+// CHECK: br i1 %[[static_tls_guard_init]],
+// CHECK: %[[static_tls_var_prev:.*]] = load %"struct.static_tls_in_lambda::X"*, %"struct.static_tls_in_lambda::X"** @_ZZN20static_tls_in_lambda2f2EvE1q, align 8
+// CHECK: store %"struct.static_tls_in_lambda::X"* %[[static_tls_var_prev]], %"struct.static_tls_in_lambda::X"** @_ZZN20static_tls_in_lambda2f2EvE1r, align 8
+// CHECK: store i8 1, i8* @_ZGVZN20static_tls_in_lambda2f2EvE1r, align 1
+
+
 // CHECK: define {{.*}}@[[GLOBAL_INIT:.*]]()
 // CHECK: call void @[[C_INIT]]()
 // CHECK: call void @[[E_INIT]]()
Index: clang/lib/CodeGen/CodeGenFunction.h
===
--- clang/lib/CodeGen/CodeGenFunction.h
+++ clang/lib/CodeGen/CodeGenFunction.h
@@ -467,6 +467,10 @@
   /// should emit cleanups.
   bool CurFuncIsThunk = false;
 
+  /// static thread-local variables we've referenced that were declared in a
+  /// parent function.
+  llvm::SmallSet ForeignStaticTLSVars;
+
   /// In ARC, whether we should autorelease the return value.
   bool AutoreleaseResult = false;
 
Index: clang/lib/CodeGen/CodeGenFunction.cpp
===
--- clang/lib/CodeGen/CodeGenFunction.cpp
+++ clang/lib/CodeGen/CodeGenFunction.cpp
@@ -31,6 +31,7 @@
 

[PATCH] D66122: [CodeGen] Emit dynamic initializers for static TLS vars in outlined scopes

2019-08-13 Thread Princeton Ferro via Phabricator via cfe-commits
Prince781 added a comment.

In D66122#1627252 , @efriedma wrote:

> > If variable A's initializer references variable B, then it will call B's 
> > initializer.
>
> I don't think this patch adds any code that would address that, although I 
> could be missing something.


No, you're absolutely right. I'll fix my patch later to address this oversight.

> Thinking about it a bit more, I also have a general question: how is this 
> supposed to work?  What do other compilers do?  Does the C++ standard say 
> when the initializer is supposed to run?  [stmt.dcl]p4 just says "Dynamic 
> initialization [...] is performed the first time control passes through its 
> declaration."

This I was not too sure about. gcc also does the same thing. I was not able to 
find anything addressing this situation directly. clang should either generate 
an error or do "the right thing." If the latter, then I take this from the 
standard to mean, "we can initialize at any time before the variable is 
referenced":

[ 3.7.2 Thread storage duration 
 ] [basic.std.thread] 
A variable with thread storage duration shall be initialized before its first 
odr-use (3.2) and, if constructed, shall be destroyed on thread exit.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66122/new/

https://reviews.llvm.org/D66122



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66122: [CodeGen] Emit dynamic initializers for static TLS vars in outlined scopes

2019-08-12 Thread Princeton Ferro via Phabricator via cfe-commits
Prince781 added a comment.

In D66122#1626412 , @efriedma wrote:

> This might be a silly question, but what happens if the initializer for a 
> thread-local variable refers to another thread-local variable?  Do you need 
> to initialize both variables?  In what order?


If variable A's initializer references variable B, then it will call B's 
initializer. So when we call A's initializer, B's initialization completes 
before A's.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66122/new/

https://reviews.llvm.org/D66122



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66122: [CodeGen] Emit dynamic initializers for static TLS vars in outlined scopes

2019-08-12 Thread Princeton Ferro via Phabricator via cfe-commits
Prince781 created this revision.
Prince781 added reviewers: ABataev, rsmith.
Herald added subscribers: cfe-commits, jfb.
Herald added a reviewer: jdoerfert.
Herald added a project: clang.

For static TLS vars only visible inside a function, clang will only generate an 
initializer inside the function body where the variable was declared. However, 
it is possible for the variable to be indirectly referenced without ever 
calling the function it was declared in, if a scope referring to the variable 
gets outlined into a function that is executed on a new thread. Here are two 
examples that demonstrate this:

  #include 
  #include 
  
  struct Object {
  int i;
  Object() : i(3) {}
  };
  
  int main(void) {
  static thread_local Object o;
  
  std::cout << "[main] o.i = " << o.i << std::endl;
  std::thread t([] { std::cout << "[new thread] o.i = " << o.i << 
std::endl; });
  t.join();
  }



  #include 
  #include 
  
  struct Object {
  int i;
  Object() : i(3) {}
  };
  
  int main(void) {
  static thread_local Object o;
  
  #pragma omp parallel
  #pragma omp critical
  std::cout << "[" << omp_get_thread_num() << "] o.i = " << o.i << 
std::endl;
  }

In this patch, we generate an initializer in a function for every unique 
reference to a static TLS var that was declared in a different function.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D66122

Files:
  clang/lib/CodeGen/CGExpr.cpp
  clang/lib/CodeGen/CodeGenFunction.cpp
  clang/lib/CodeGen/CodeGenFunction.h
  clang/test/CodeGenCXX/cxx11-thread-local.cpp

Index: clang/test/CodeGenCXX/cxx11-thread-local.cpp
===
--- clang/test/CodeGenCXX/cxx11-thread-local.cpp
+++ clang/test/CodeGenCXX/cxx11-thread-local.cpp
@@ -268,15 +268,37 @@
   return this->n;
 }
 
+namespace static_tls_in_lambda {
+  struct X {
+X() {}
+  };
+
+
+  X (*f())() {
+static thread_local X x;
+
+return [] {
+return x;
+};
+  }
+
+  auto y = f();
+
+  void g() { y(); }
+}
+
 namespace {
 thread_local int anon_i{1};
 }
 void set_anon_i() {
   anon_i = 2;
 }
+
+
 // LINUX-LABEL: define internal i32* @_ZTWN12_GLOBAL__N_16anon_iE()
 // DARWIN-LABEL: define internal cxx_fast_tlscc i32* @_ZTWN12_GLOBAL__N_16anon_iE()
 
+
 // LINUX: define internal void @[[V_M_INIT]]()
 // DARWIN: define internal cxx_fast_tlscc void @[[V_M_INIT]]()
 // LINUX-SAME: comdat($_ZN1VIiE1mE)
@@ -290,6 +312,8 @@
 // CHECK: store i64 1, i64* @_ZGVN1VIiE1mE
 // CHECK: br label
 
+
+
 // LINUX: define internal void @[[X_M_INIT]]()
 // DARWIN: define internal cxx_fast_tlscc void @[[X_M_INIT]]()
 // LINUX-SAME: comdat($_ZN1XIiE1mE)
@@ -303,6 +327,14 @@
 // CHECK: store i64 1, i64* @_ZGVN1XIiE1mE
 // CHECK: br label
 
+// CHECK: define internal void @"_ZZN20static_tls_in_lambda1fEvENK3$_1clEv"
+// CHECK: %[[static_tls_guard_val:.*]] = load i8, i8* @_ZGVZN20static_tls_in_lambda1fEvE1x
+// CHECK: %[[static_tls_guard_init:.*]] = icmp eq i8 %[[static_tls_guard_val]], 0
+// CHECK: br i1 %[[static_tls_guard_init]],
+// CHECK: call void @_ZN20static_tls_in_lambda1XC1Ev(%"struct.static_tls_in_lambda::X"* @_ZZN20static_tls_in_lambda1fEvE1x)
+// CHECK: store i8 1, i8* @_ZGVZN20static_tls_in_lambda1fEvE1x, align 1
+
+
 // CHECK: define {{.*}}@[[GLOBAL_INIT:.*]]()
 // CHECK: call void @[[C_INIT]]()
 // CHECK: call void @[[E_INIT]]()
Index: clang/lib/CodeGen/CodeGenFunction.h
===
--- clang/lib/CodeGen/CodeGenFunction.h
+++ clang/lib/CodeGen/CodeGenFunction.h
@@ -467,6 +467,10 @@
   /// should emit cleanups.
   bool CurFuncIsThunk = false;
 
+  /// static thread-local variables we've referenced that were declared in a
+  /// parent function.
+  llvm::SmallSet ForeignStaticTLSVars;
+
   /// In ARC, whether we should autorelease the return value.
   bool AutoreleaseResult = false;
 
Index: clang/lib/CodeGen/CodeGenFunction.cpp
===
--- clang/lib/CodeGen/CodeGenFunction.cpp
+++ clang/lib/CodeGen/CodeGenFunction.cpp
@@ -31,12 +31,16 @@
 #include "clang/Basic/TargetInfo.h"
 #include "clang/CodeGen/CGFunctionInfo.h"
 #include "clang/Frontend/FrontendDiagnostic.h"
+#include "llvm/IR/BasicBlock.h"
+#include "llvm/IR/CFG.h"
 #include "llvm/IR/DataLayout.h"
 #include "llvm/IR/Dominators.h"
+#include "llvm/IR/GlobalVariable.h"
 #include "llvm/IR/Intrinsics.h"
 #include "llvm/IR/MDBuilder.h"
 #include "llvm/IR/Operator.h"
 #include "llvm/Transforms/Utils/PromoteMemToReg.h"
+#include "llvm/IR/ValueSymbolTable.h"
 using namespace clang;
 using namespace CodeGen;
 
@@ -384,6 +388,64 @@
 CGBuilderTy(*this, AllocaInsertPt).CreateCall(FrameEscapeFn, EscapeArgs);
   }
 
+  // Emit initializers for static local variables that we referenced that are
+  // declared in another function, which may be uninitialized on entry if this
+  // function may execute on a separate 

[PATCH] D64585: [OpenMP] With nested parallelism, threadprivate variables become shared on outer parallel when appearing in inner parallel copyin clause

2019-07-18 Thread Princeton Ferro via Phabricator via cfe-commits
Prince781 added a comment.

In D64585#1592207 , @ABataev wrote:

> Fixed this bug myself to be sure it will be merged with 9.0 release, sorry.


That's cool. Thanks for the fix.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64585/new/

https://reviews.llvm.org/D64585



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64889: [OPENMP] getDSA(): handle loop control variables

2019-07-17 Thread Princeton Ferro via Phabricator via cfe-commits
Prince781 updated this revision to Diff 210459.
Prince781 added a comment.

Added a lit test.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64889/new/

https://reviews.llvm.org/D64889

Files:
  clang/lib/Sema/SemaOpenMP.cpp
  clang/test/OpenMP/loop_control_var_nested_task.cpp


Index: clang/test/OpenMP/loop_control_var_nested_task.cpp
===
--- /dev/null
+++ clang/test/OpenMP/loop_control_var_nested_task.cpp
@@ -0,0 +1,18 @@
+// RUN: %clang_cc1 %s -fopenmp -emit-llvm -o - | FileCheck %s
+// CHECK: {{%struct}}.kmp_task_t_with_privates = type { 
{{.*%struct\.\.kmp_privates.t.*}} }
+// CHECK: define internal void 
@.omp_task_privates_map.({{.*%struct\.\.kmp_privates.t.*}})
+#define N 100
+int main(void) {
+// declare this variable outside, so that it will be shared on the outer 
parallel construct
+int i;
+int arr[N];
+
+#pragma omp parallel // shared(i) shared(arr)
+#pragma omp for // private(i) - i should be privatized because it is an 
iteration variable
+for (i = 0; i < N; i++) {
+#pragma omp task
+// CHECK: %.firstpriv.ptr.addr.i = alloca i32*, align 8
+// CHECK: {{%[0-9]+}} = load i32*, i32** %.firstpriv.ptr.addr.i, align 
8
+arr[i] = i;
+}
+}
Index: clang/lib/Sema/SemaOpenMP.cpp
===
--- clang/lib/Sema/SemaOpenMP.cpp
+++ clang/lib/Sema/SemaOpenMP.cpp
@@ -424,6 +424,11 @@
   /// for-loops (from outer to inner).
   const LCDeclInfo isLoopControlVariable(const ValueDecl *D) const;
   /// Check if the specified variable is a loop control variable for
+  /// given region.
+  /// \return The index of the loop control variable in the list of associated
+  /// for-loops (from outer to inner).
+  const LCDeclInfo isLoopControlVariable(const ValueDecl *D, const 
SharingMapTy ) const;
+  /// Check if the specified variable is a loop control variable for
   /// parent region.
   /// \return The index of the loop control variable in the list of associated
   /// for-loops (from outer to inner).
@@ -946,11 +951,24 @@
   case DSA_none:
 return DVar;
   case DSA_unspecified:
+DVar.ImplicitDSALoc = Iter->DefaultAttrLoc;
+// OpenMP [2.9.1.1, Data-sharing Attribute Rules for Variables Referenced
+// in a Construct, implicitly determined]
+//  The loop iteration variable(s) in the associated for-loop(s) of a for 
or
+//  parallel for construct is (are) private.
+// OpenMP 5.0 includes taskloop and distribute directives
+if (!isOpenMPSimdDirective(DVar.DKind) &&
+isOpenMPLoopDirective(DVar.DKind) &&
+isLoopControlVariable(D, *Iter).first) {
+  DVar.CKind = OMPC_private;
+  // TODO: OpenMP 5.0: if (Dvar.DKind == OMPD_loop) DVar.CKind = 
OMPC_lastprivate;
+  return DVar;
+}
+
 // OpenMP [2.9.1.1, Data-sharing Attribute Rules for Variables Referenced
 // in a Construct, implicitly determined, p.2]
 //  In a parallel construct, if no default clause is present, these
 //  variables are shared.
-DVar.ImplicitDSALoc = Iter->DefaultAttrLoc;
 if (isOpenMPParallelDirective(DVar.DKind) ||
 isOpenMPTeamsDirective(DVar.DKind)) {
   DVar.CKind = OMPC_shared;
@@ -1018,8 +1036,13 @@
 const DSAStackTy::LCDeclInfo
 DSAStackTy::isLoopControlVariable(const ValueDecl *D) const {
   assert(!isStackEmpty() && "Data-sharing attributes stack is empty");
+  return isLoopControlVariable(D, getTopOfStack());
+}
+
+const DSAStackTy::LCDeclInfo
+DSAStackTy::isLoopControlVariable(const ValueDecl *D, const SharingMapTy 
) const {
   D = getCanonicalDecl(D);
-  const SharingMapTy  = getTopOfStack();
+  const SharingMapTy  = Region;
   auto It = StackElem.LCVMap.find(D);
   if (It != StackElem.LCVMap.end())
 return It->second;


Index: clang/test/OpenMP/loop_control_var_nested_task.cpp
===
--- /dev/null
+++ clang/test/OpenMP/loop_control_var_nested_task.cpp
@@ -0,0 +1,18 @@
+// RUN: %clang_cc1 %s -fopenmp -emit-llvm -o - | FileCheck %s
+// CHECK: {{%struct}}.kmp_task_t_with_privates = type { {{.*%struct\.\.kmp_privates.t.*}} }
+// CHECK: define internal void @.omp_task_privates_map.({{.*%struct\.\.kmp_privates.t.*}})
+#define N 100
+int main(void) {
+// declare this variable outside, so that it will be shared on the outer parallel construct
+int i;
+int arr[N];
+
+#pragma omp parallel // shared(i) shared(arr)
+#pragma omp for // private(i) - i should be privatized because it is an iteration variable
+for (i = 0; i < N; i++) {
+#pragma omp task
+// CHECK: %.firstpriv.ptr.addr.i = alloca i32*, align 8
+// CHECK: {{%[0-9]+}} = load i32*, i32** %.firstpriv.ptr.addr.i, align 8
+arr[i] = i;
+}
+}
Index: clang/lib/Sema/SemaOpenMP.cpp
===
--- 

[PATCH] D64889: [OPENMP] getDSA(): handle loop control variables

2019-07-17 Thread Princeton Ferro via Phabricator via cfe-commits
Prince781 created this revision.
Prince781 added reviewers: ABataev, rsmith.
Prince781 added projects: clang, OpenMP.
Herald added subscribers: cfe-commits, jdoerfert, guansong.

The following example compiles incorrectly since at least clang 8.0.0:

  #include 
  #include 
  
  #define N 100
  
  int main(void) {
  int i;
  int arr[N];
  
  #pragma omp parallel // shared(i) shared(arr)
  #pragma omp for // private(i)
  for (i = 0; i < N; i++) {
  #pragma omp task // firstprivate(i) shared(arr)
  {
  printf("[thread %2d] i = %d\n", omp_get_thread_num(), i);
  arr[i] = i;
  }
  }
  
  for (i = 0; i < N; i++) {
  if (arr[i] != i) {
  fprintf(stderr, "FAIL: arr[%d] == %d\n", i, arr[i]);
  }
  }
  }

The iteration variable, `i`, should become `private` at the `omp for` construct 
and then become implicit `firstprivate` within the task region. What happens 
instead is that `i` is never privatized within the task construct. As the task 
construct is parsed, when a reference to `i` is determined, the implicit 
data-sharing attributes are computed incorrectly.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D64889

Files:
  clang/lib/Sema/SemaOpenMP.cpp


Index: clang/lib/Sema/SemaOpenMP.cpp
===
--- clang/lib/Sema/SemaOpenMP.cpp
+++ clang/lib/Sema/SemaOpenMP.cpp
@@ -424,6 +424,11 @@
   /// for-loops (from outer to inner).
   const LCDeclInfo isLoopControlVariable(const ValueDecl *D) const;
   /// Check if the specified variable is a loop control variable for
+  /// given region.
+  /// \return The index of the loop control variable in the list of associated
+  /// for-loops (from outer to inner).
+  const LCDeclInfo isLoopControlVariable(const ValueDecl *D, const 
SharingMapTy ) const;
+  /// Check if the specified variable is a loop control variable for
   /// parent region.
   /// \return The index of the loop control variable in the list of associated
   /// for-loops (from outer to inner).
@@ -946,11 +951,24 @@
   case DSA_none:
 return DVar;
   case DSA_unspecified:
+DVar.ImplicitDSALoc = Iter->DefaultAttrLoc;
+// OpenMP [2.9.1.1, Data-sharing Attribute Rules for Variables Referenced
+// in a Construct, implicitly determined]
+//  The loop iteration variable(s) in the associated for-loop(s) of a for 
or
+//  parallel for construct is (are) private.
+// OpenMP 5.0 includes taskloop and distribute directives
+if (!isOpenMPSimdDirective(DVar.DKind) &&
+isOpenMPLoopDirective(DVar.DKind) &&
+isLoopControlVariable(D, *Iter).first) {
+  DVar.CKind = OMPC_private;
+  // TODO: OpenMP 5.0: if (Dvar.DKind == OMPD_loop) DVar.CKind = 
OMPC_lastprivate;
+  return DVar;
+}
+
 // OpenMP [2.9.1.1, Data-sharing Attribute Rules for Variables Referenced
 // in a Construct, implicitly determined, p.2]
 //  In a parallel construct, if no default clause is present, these
 //  variables are shared.
-DVar.ImplicitDSALoc = Iter->DefaultAttrLoc;
 if (isOpenMPParallelDirective(DVar.DKind) ||
 isOpenMPTeamsDirective(DVar.DKind)) {
   DVar.CKind = OMPC_shared;
@@ -1018,8 +1036,13 @@
 const DSAStackTy::LCDeclInfo
 DSAStackTy::isLoopControlVariable(const ValueDecl *D) const {
   assert(!isStackEmpty() && "Data-sharing attributes stack is empty");
+  return isLoopControlVariable(D, getTopOfStack());
+}
+
+const DSAStackTy::LCDeclInfo
+DSAStackTy::isLoopControlVariable(const ValueDecl *D, const SharingMapTy 
) const {
   D = getCanonicalDecl(D);
-  const SharingMapTy  = getTopOfStack();
+  const SharingMapTy  = Region;
   auto It = StackElem.LCVMap.find(D);
   if (It != StackElem.LCVMap.end())
 return It->second;


Index: clang/lib/Sema/SemaOpenMP.cpp
===
--- clang/lib/Sema/SemaOpenMP.cpp
+++ clang/lib/Sema/SemaOpenMP.cpp
@@ -424,6 +424,11 @@
   /// for-loops (from outer to inner).
   const LCDeclInfo isLoopControlVariable(const ValueDecl *D) const;
   /// Check if the specified variable is a loop control variable for
+  /// given region.
+  /// \return The index of the loop control variable in the list of associated
+  /// for-loops (from outer to inner).
+  const LCDeclInfo isLoopControlVariable(const ValueDecl *D, const SharingMapTy ) const;
+  /// Check if the specified variable is a loop control variable for
   /// parent region.
   /// \return The index of the loop control variable in the list of associated
   /// for-loops (from outer to inner).
@@ -946,11 +951,24 @@
   case DSA_none:
 return DVar;
   case DSA_unspecified:
+DVar.ImplicitDSALoc = Iter->DefaultAttrLoc;
+// OpenMP [2.9.1.1, Data-sharing Attribute Rules for Variables Referenced
+// in a Construct, implicitly determined]
+//  The loop iteration variable(s) in the associated for-loop(s) of a for or
+

[PATCH] D64585: [OpenMP] With nested parallelism, threadprivate variables become shared on outer parallel when appearing in inner parallel copyin clause

2019-07-11 Thread Princeton Ferro via Phabricator via cfe-commits
Prince781 added inline comments.



Comment at: clang/lib/Sema/SemaExpr.cpp:15329
 
+if (getLangOpts().OpenMP && getLangOpts().OpenMPUseTLS) {
+  // Avoid capturing TLS-backed threadprivate variables in outer scopes.

ABataev wrote:
> this is not the right place to fix this bug, it must be fixed in 
> SemaOpenMP.cpp. If you want, I can try to fix it.
I see. Thanks for your comment. I noticed elsewhere in this function 
`tryCaptureVariable` that there is a `if (getLangOpts().OpenMP) ...`, so I 
thought it would be okay.

If you don't mind, I wish to take another look at this and get back to you 
later.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64585/new/

https://reviews.llvm.org/D64585



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64585: [OpenMP] With nested parallelism, threadprivate variables become shared on outer parallel when appearing in inner parallel copyin clause

2019-07-11 Thread Princeton Ferro via Phabricator via cfe-commits
Prince781 created this revision.
Prince781 added reviewers: ABataev, faisalv, malcolm.parsons, efriedma, 
eli.friedman, maskray0, MaskRay, tareqsiraj, rsmith.
Prince781 added projects: clang, OpenMP.
Herald added subscribers: jdoerfert, jfb, guansong.

There is a bug since at least clang 8.0.0 wherein a static threadprivate 
variable appearing in a copyin() clause on a parallel construct (that is nested 
within another parallel construct) becomes shared on the outer parallel. This 
happens only when the threadprivate variable is backed by TLS and does not 
appear in global scope. Here is an example that compiles incorrectly:

  #include 
  #include 
  #include 
  #include 
  #define NT 4
   
  int main(void) {
  static int threadprivate_var = 1;
  #pragma omp threadprivate(threadprivate_var)
   
  omp_set_dynamic(false);
  omp_set_num_threads(NT);
  omp_set_nested(true);
   
  #pragma omp parallel
  {
  threadprivate_var = 1;
  printf("[B] thread %d: val %d: threadprivate @ %p\n", 
omp_get_thread_num(), threadprivate_var, _var);
   
  #pragma omp master
  {
  threadprivate_var = 2;
  #pragma omp parallel copyin(threadprivate_var)
  {
  printf("[B] thread %d, %d: val %d: threadprivate @ %p\n", 
omp_get_ancestor_thread_num(1), omp_get_thread_num(), threadprivate_var, 
_var);
  // check that copyin succeeded
  assert(threadprivate_var == 2);
  }
  }
  #pragma omp barrier
  printf("[A] thread %d: val %d: threadprivate @ %p\n", 
omp_get_thread_num(), threadprivate_var, _var);
  if (omp_get_thread_num() != 0)  // 0 is the master thread
  // non-master threads should not have seen changes
  assert(threadprivate_var == 1);
  }
  }

The resulting IR looks something like this:

  @main.threadprivate_var = internal thread_local global i32 1, align 4
  …
  main() {
 call void __kmpc_fork_call(omp_outlined_outer_parallel_region, 
_var)
  }
  …
  omp_outlined_outer_parallel_region(…, i32* %threadprivate_var) {
  if (I am the master thread)
  call void __kmpc_fork_call(omp_outlined_inner_parallel_region, 
%threadprivate_var)
  }
  …
  omp_outlined_inner_parallel_region(…, i32* %threadprivate_var) {
  }

When it should look something like this:

  @main.threadprivate_var = internal thread_local global i32 1, align 4
  …
  main() {
 call void __kmpc_fork_call(omp_outlined_outer_parallel_region)
  }
  …
  omp_outlined_outer_parallel_region(…) {
  if (I am the master thread)
  call void __kmpc_fork_call(omp_outlined_inner_parallel_region, 
_var)
  }
  …
  omp_outlined_inner_parallel_region(…, i32* %threadprivate_var) {
  }

Without the copyin, the function for the outer parallel region does not have 
the extra parameter. For the copyin clause above to work, the inner parallel 
needs a reference to the thread-local variable of the encountering thread (in 
this case, the master thread) in an extra parameter. It does not make sense for 
the outer parallel function(s) to capture the thread-local variable.

I’ve made a patch that prevents TLS-backed threadprivate variables from being 
captured in outer scopes. I don’t know if this is the best way to go about it, 
so I welcome feedback from someone with much more knowledge on clang’s OpenMP 
backend.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D64585

Files:
  clang/lib/Sema/SemaExpr.cpp
  clang/test/OpenMP/nested_parallel_threadprivate_copyin.cpp


Index: clang/test/OpenMP/nested_parallel_threadprivate_copyin.cpp
===
--- /dev/null
+++ clang/test/OpenMP/nested_parallel_threadprivate_copyin.cpp
@@ -0,0 +1,48 @@
+// RUN: %clang_cc1 %s -fopenmp -emit-llvm -o - | FileCheck %s
+#define NT 4   /* default number of threads */
+
+extern "C" {
+extern int printf(const char *, ...);
+extern void assert(int);
+extern void omp_set_dynamic(bool);
+extern void omp_set_num_threads(int);
+extern void omp_set_nested(bool);
+extern int omp_get_thread_num(void);
+extern int omp_get_ancestor_thread_num(int);
+};
+
+int main(void) {
+static int threadprivate_var = 1;
+#pragma omp threadprivate(threadprivate_var)
+
+// These commands are not strictly necessary, but they make it easier to
+// see when things go wrong.
+omp_set_dynamic(false);
+omp_set_num_threads(NT);
+omp_set_nested(true);
+
+// CHECK-NOT: call 
void.*@__kmpc_fork_call({{.*}}%{{\w+}}threadprivate{{.*}})
+// CHECK-NOT: define internal void 
@.omp_outlined.({{.*}}%threadprivate_var{{.*}})
+#pragma omp parallel
+{
+threadprivate_var = 1;
+printf("[B] thread %d: val %d: threadprivate @ %p\n", 
omp_get_thread_num(), threadprivate_var, _var);
+
+#pragma omp master
+{
+threadprivate_var = 2;
+//