[PATCH] D82415: [Coroutines] Special handle __builtin_coro_resume for final_suspend nothrow check

2020-06-23 Thread Xun Li via Phabricator via cfe-commits
lxfind updated this revision to Diff 272910.
lxfind added a comment.

rebase


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82415/new/

https://reviews.llvm.org/D82415

Files:
  clang/lib/Sema/SemaCoroutine.cpp


Index: clang/lib/Sema/SemaCoroutine.cpp
===
--- clang/lib/Sema/SemaCoroutine.cpp
+++ clang/lib/Sema/SemaCoroutine.cpp
@@ -614,6 +614,14 @@
     // In the case of dtor, the call to dtor is implicit and hence we should
     // pass nullptr to canCalleeThrow.
     if (Sema::canCalleeThrow(S, IsDtor ? nullptr : cast<Expr>(E), D)) {
+      if (const auto *FD = dyn_cast<FunctionDecl>(D)) {
+        // co_await promise.final_suspend() could end up calling
+        // __builtin_coro_resume for symmetric transfer if await_suspend()
+        // returns a handle. In that case, even __builtin_coro_resume is not
+        // declared as noexcept, we claim that logically it does not throw.
+        if (FD->getBuiltinID() == Builtin::BI__builtin_coro_resume)
+          return;
+      }
       if (ThrowingDecls.empty()) {
         // First time seeing an error, emit the error message.
         S.Diag(cast<FunctionDecl>(S.CurContext)->getLocation(),
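
The effect of the special case can be modelled outside Clang. The following is only a sketch in Python, not Clang code: `Callee`, `throwing_callees` and the sample call list are hypothetical stand-ins for the declarations the check walks over. The only rule taken from the patch is that `__builtin_coro_resume` is treated as non-throwing even though it is not declared noexcept.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Callee:
    name: str          # hypothetical: name of the called declaration
    is_noexcept: bool  # hypothetical: whether it is declared noexcept


# Builtin that the patch exempts from the nothrow check.
EXEMPT_BUILTINS = {"__builtin_coro_resume"}


def throwing_callees(callees):
    """Return the names that would be reported as possibly throwing."""
    reported = []
    for callee in callees:
        if callee.is_noexcept:
            continue  # declared noexcept, nothing to report
        if callee.name in EXEMPT_BUILTINS:
            continue  # logically non-throwing, skipped by the patch
        reported.append(callee.name)
    return reported


# Hypothetical calls reached from co_await promise.final_suspend().
calls = [
    Callee("final_suspend", True),
    Callee("await_suspend", True),
    Callee("__builtin_coro_resume", False),
]
print(throwing_callees(calls))
```

Without the exemption, the last entry would be reported although user code has no way to mark the builtin noexcept.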


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82029: [Coroutines] Ensure co_await promise.final_suspend() does not throw

2020-06-23 Thread Mikael Holmén via Phabricator via cfe-commits
uabelho added a comment.

In D82029#2109167, @lxfind wrote:

> Test failures are being fixed in https://reviews.llvm.org/D82338/new/


Thanks!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82029/new/

https://reviews.llvm.org/D82029





[PATCH] D82428: [clang][driver] allow `-arch arm64` to be used to build for mac when on Apple Silicon Mac without explicit `-target`

2020-06-23 Thread Alex Lorenz via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG050ed9720f84: [cmake] configure the host triple on an Apple 
Silicon machine correctly (authored by arphaman).

Changed prior to commit:
  https://reviews.llvm.org/D82428?vs=272891&id=272908#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82428/new/

https://reviews.llvm.org/D82428

Files:
  llvm/cmake/config.guess


Index: llvm/cmake/config.guess
===
--- llvm/cmake/config.guess
+++ llvm/cmake/config.guess
@@ -1263,6 +1263,23 @@
 		  UNAME_PROCESSOR="x86_64"
 		  fi
 		fi ;;
+		arm)
+		eval $set_cc_for_build
+		if [ "$CC_FOR_BUILD" != 'no_compiler_found' ]; then
+			if (echo '#ifdef __LP64__'; echo IS_64BIT_ARCH; echo '#endif') | \
+(CCOPTS= $CC_FOR_BUILD -E - 2>/dev/null) | \
+grep IS_64BIT_ARCH >/dev/null
+			then
+if (echo '#ifdef __PTRAUTH_INTRINSICS__'; echo HAS_AUTH; echo '#endif') | \
+	(CCOPTS= $CC_FOR_BUILD -E - 2>/dev/null) | \
+	grep HAS_AUTH >/dev/null
+then
+	UNAME_PROCESSOR="arm64e"
+else
+	UNAME_PROCESSOR="arm64"
+fi
+			fi
+		fi ;;
 	unknown) UNAME_PROCESSOR=powerpc ;;
 	esac
 	echo ${UNAME_PROCESSOR}-apple-darwin${UNAME_RELEASE}
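
The decision made by the new `arm)` case can be summarised in a few lines. This is a sketch in Python rather than shell: `darwin_arm_processor` and `host_triple` are hypothetical names, the set of macros stands in for what the preprocessor probe in the script detects, and the release string in the usage line is made up.

```python
def darwin_arm_processor(defined_macros, compiler_found=True):
    """Model of the new `arm)` case: returns the value of UNAME_PROCESSOR."""
    processor = "arm"  # value the case was entered with
    if compiler_found and "__LP64__" in defined_macros:
        # 64-bit target: pick arm64e when pointer authentication is available.
        if "__PTRAUTH_INTRINSICS__" in defined_macros:
            processor = "arm64e"
        else:
            processor = "arm64"
    return processor


def host_triple(processor, release):
    """Mirror of the final echo line of the Darwin branch."""
    return f"{processor}-apple-darwin{release}"


# Hypothetical release number, for illustration only.
print(host_triple(darwin_arm_processor({"__LP64__"}), "20.0.0"))
```

If no build compiler is found, or the compiler does not define `__LP64__`, the processor name is left untouched, which matches the nested `if` structure of the script.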


[clang] 1a342ff - test fix: add missing system-darwin REQUIRES

2020-06-23 Thread Alex Lorenz via cfe-commits

Author: Alex Lorenz
Date: 2020-06-23T21:17:58-07:00
New Revision: 1a342ff3753d0354bab3d82fa8e493e21d50c79f

URL: 
https://github.com/llvm/llvm-project/commit/1a342ff3753d0354bab3d82fa8e493e21d50c79f
DIFF: 
https://github.com/llvm/llvm-project/commit/1a342ff3753d0354bab3d82fa8e493e21d50c79f.diff

LOG: test fix: add missing system-darwin REQUIRES

The test should run only with a Darwin driver.

Added: 


Modified: 
clang/test/Driver/apple-arm64-arch.c

Removed: 




diff --git a/clang/test/Driver/apple-arm64-arch.c b/clang/test/Driver/apple-arm64-arch.c
index fd9f9a2ccedb..a37346b1a9bb 100644
--- a/clang/test/Driver/apple-arm64-arch.c
+++ b/clang/test/Driver/apple-arm64-arch.c
@@ -1,6 +1,7 @@
 // RUN: env SDKROOT="/" %clang -arch arm64 -c -### %s 2>&1 | \
 // RUN:   FileCheck %s
 //
+// REQUIRES: system-darwin
 // XFAIL: apple-silicon-mac
 //
 // CHECK: "-triple" "arm64-apple-ios{{[0-9.]+}}"





[clang] 565603c - [clang][driver] set macOS as the target OS for -arch arm64 when clang

2020-06-23 Thread Alex Lorenz via cfe-commits

Author: Alex Lorenz
Date: 2020-06-23T21:08:11-07:00
New Revision: 565603cc94d79a8d0de6df840fd53714899fb890

URL: 
https://github.com/llvm/llvm-project/commit/565603cc94d79a8d0de6df840fd53714899fb890
DIFF: 
https://github.com/llvm/llvm-project/commit/565603cc94d79a8d0de6df840fd53714899fb890.diff

LOG: [clang][driver] set macOS as the target OS for -arch arm64 when clang
is running on an Apple Silicon mac

This change allows users to use `-arch arm64` to build for mac when
running it on Apple Silicon mac without explicit `-target` option.

Differential Revision: https://reviews.llvm.org/D82428
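
As a rough illustration of the new default, the mapping in the changed hunk can be written as a small Python function. This is a sketch, not the driver code: `infer_os_from_arch` is a hypothetical name, the `host_is_arm64` parameter stands in for the compile-time `#if __arm64__` test in the patch, and branches of the real function that lie outside the quoted hunk are deliberately not modelled.

```python
def infer_os_from_arch(macho_arch, host_is_arm64):
    """Model of the hunk in inferDeploymentTargetFromArch."""
    if macho_arch == "arm64":
        # New behaviour: a clang built for an Apple Silicon Mac targets macOS.
        return "macosx" if host_is_arm64 else "ios"
    if macho_arch in ("armv7", "armv7s"):
        return "ios"
    if macho_arch in ("armv7k", "arm64_32"):
        return "watchos"
    return None  # handled by code outside the quoted hunk


for host in (False, True):
    print(host, infer_os_from_arch("arm64", host))
```

Note that in the patch the host check is fixed when clang itself is compiled, so it is not something a user can change at run time.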

Added: 
clang/test/Driver/apple-arm64-arch.c
clang/test/Driver/apple-silicon-arch.c

Modified: 
clang/lib/Driver/ToolChains/Darwin.cpp
clang/test/lit.cfg.py

Removed: 




diff --git a/clang/lib/Driver/ToolChains/Darwin.cpp b/clang/lib/Driver/ToolChains/Darwin.cpp
index bb7c7f768b35..1b3a3e934995 100644
--- a/clang/lib/Driver/ToolChains/Darwin.cpp
+++ b/clang/lib/Driver/ToolChains/Darwin.cpp
@@ -1672,8 +1672,16 @@ inferDeploymentTargetFromArch(DerivedArgList &Args, const Darwin &Toolchain,
   llvm::Triple::OSType OSTy = llvm::Triple::UnknownOS;
 
   StringRef MachOArchName = Toolchain.getMachOArchName(Args);
-  if (MachOArchName == "armv7" || MachOArchName == "armv7s" ||
-  MachOArchName == "arm64")
+  if (MachOArchName == "arm64") {
+#if __arm64__
+// A clang running on an Apple Silicon mac defaults
+// to building for mac when building for arm64 rather than
+// defaulting to iOS.
+OSTy = llvm::Triple::MacOSX;
+#else
+OSTy = llvm::Triple::IOS;
+#endif
+  } else if (MachOArchName == "armv7" || MachOArchName == "armv7s")
 OSTy = llvm::Triple::IOS;
   else if (MachOArchName == "armv7k" || MachOArchName == "arm64_32")
 OSTy = llvm::Triple::WatchOS;

diff --git a/clang/test/Driver/apple-arm64-arch.c b/clang/test/Driver/apple-arm64-arch.c
new file mode 100644
index ..fd9f9a2ccedb
--- /dev/null
+++ b/clang/test/Driver/apple-arm64-arch.c
@@ -0,0 +1,6 @@
+// RUN: env SDKROOT="/" %clang -arch arm64 -c -### %s 2>&1 | \
+// RUN:   FileCheck %s
+//
+// XFAIL: apple-silicon-mac
+//
+// CHECK: "-triple" "arm64-apple-ios{{[0-9.]+}}"

diff --git a/clang/test/Driver/apple-silicon-arch.c b/clang/test/Driver/apple-silicon-arch.c
new file mode 100644
index ..b1201fa2d7dd
--- /dev/null
+++ b/clang/test/Driver/apple-silicon-arch.c
@@ -0,0 +1,6 @@
+// RUN: env SDKROOT="/" %clang -arch arm64 -c -### %s 2>&1 | \
+// RUN:   FileCheck %s
+//
+// REQUIRES: apple-silicon-mac
+//
+// CHECK: "-triple" "arm64-apple-macosx{{[0-9.]+}}"

diff  --git a/clang/test/lit.cfg.py b/clang/test/lit.cfg.py
index 413f81175420..ade32988b9a8 100644
--- a/clang/test/lit.cfg.py
+++ b/clang/test/lit.cfg.py
@@ -155,6 +155,10 @@ def is_filesystem_case_insensitive():
 if not re.match(r'.*-(cygwin)$', config.target_triple):
 config.available_features.add('clang-driver')
 
+# Tests that are specific to the Apple Silicon macOS.
+if re.match(r'^arm64(e)?-apple-(macos|darwin)', config.target_triple):
+config.available_features.add('apple-silicon-mac')
+
 # [PR18856] Depends to remove opened file. On win32, a file could be removed
 # only if all handles were closed.
 if platform.system() not in ['Windows']:
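
The regular expression added to lit.cfg.py can be tried in isolation. The snippet below is a small sketch: the pattern is copied from the diff, while `has_apple_silicon_mac_feature` and the sample triples are hypothetical and only serve to show which target triples would enable the `apple-silicon-mac` feature.

```python
import re

# Pattern copied from the lit.cfg.py change.
APPLE_SILICON_MAC = re.compile(r'^arm64(e)?-apple-(macos|darwin)')


def has_apple_silicon_mac_feature(target_triple):
    """True if lit would add 'apple-silicon-mac' for this target triple."""
    return APPLE_SILICON_MAC.match(target_triple) is not None


# Hypothetical triples, for illustration only.
for triple in ("arm64-apple-darwin20.0.0", "arm64e-apple-macosx11.0.0",
               "x86_64-apple-darwin19.6.0", "arm64-apple-ios13.0"):
    print(triple, has_apple_silicon_mac_feature(triple))
```

Because `re.match` only anchors at the start, `macos` also matches triples spelled `macosx`.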





[PATCH] D82431: [PowerPC][Power10] Implement Test LSB by Byte Builtins in LLVM/Clang

2020-06-23 Thread Amy Kwan via Phabricator via cfe-commits
amyk created this revision.
amyk added reviewers: nemanjai, lei, saghir, hfinkel, power-llvm-team, PowerPC.
amyk added projects: LLVM, clang, PowerPC.
Herald added subscribers: shchenz, hiraditya.

This patch implements builtins for the following prototypes:

  int vec_test_lsbb_all_ones (vector unsigned char a);  
  int vec_test_lsbb_all_zeros (vector unsigned char a);  
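
A reference model may help when reading the tests. The sketch below is Python, not the builtin implementation, and the semantics are inferred from the builtin names (test the least significant bit of each byte) rather than stated in the summary above; a vector is modelled as a list of 16 byte values.

```python
def vec_test_lsbb_all_ones(a):
    """Return 1 if the least significant bit of every byte is 1, else 0."""
    assert len(a) == 16, "a vector unsigned char has 16 elements"
    return int(all(byte & 1 for byte in a))


def vec_test_lsbb_all_zeros(a):
    """Return 1 if the least significant bit of every byte is 0, else 0."""
    assert len(a) == 16, "a vector unsigned char has 16 elements"
    return int(all((byte & 1) == 0 for byte in a))


odd = [1, 3, 5, 7] * 4
even = [0, 2, 4, 6] * 4
mixed = [0, 1] * 8
print(vec_test_lsbb_all_ones(odd), vec_test_lsbb_all_zeros(even),
      vec_test_lsbb_all_ones(mixed), vec_test_lsbb_all_zeros(mixed))
```

A mixed vector makes both functions return 0, so the two results are not simply complements of each other.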


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D82431

Files:
  clang/include/clang/Basic/BuiltinsPPC.def
  clang/lib/Headers/altivec.h
  clang/test/CodeGen/builtins-ppc-p10vector.c
  llvm/include/llvm/IR/IntrinsicsPowerPC.td
  llvm/lib/Target/PowerPC/PPCInstrPrefix.td
  llvm/test/CodeGen/PowerPC/p10-vsx-builtins.ll
  llvm/test/MC/Disassembler/PowerPC/p10insts.txt
  llvm/test/MC/PowerPC/p10.s

Index: llvm/test/MC/PowerPC/p10.s
===
--- llvm/test/MC/PowerPC/p10.s
+++ llvm/test/MC/PowerPC/p10.s
@@ -33,3 +33,6 @@
 # CHECK-BE: vclrrb 1, 4, 3# encoding: [0x10,0x24,0x19,0xcd]
 # CHECK-LE: vclrrb 1, 4, 3# encoding: [0xcd,0x19,0x24,0x10]
 vclrrb 1, 4, 3
+# CHECK-BE: xvtlsbb 1, 7  # encoding: [0xf0,0x82,0x3f,0x6c]
+# CHECK-LE: xvtlsbb 1, 7  # encoding: [0x6c,0x3f,0x82,0xf0]
+xvtlsbb 1, 7
Index: llvm/test/MC/Disassembler/PowerPC/p10insts.txt
===
--- llvm/test/MC/Disassembler/PowerPC/p10insts.txt
+++ llvm/test/MC/Disassembler/PowerPC/p10insts.txt
@@ -30,3 +30,6 @@
 
 # CHECK: vclrrb 1, 4, 3
 0x10 0x24 0x19 0xcd
+
+# CHECK: xvtlsbb 1, 7
+0xf0 0x82 0x3f 0x6c
Index: llvm/test/CodeGen/PowerPC/p10-vsx-builtins.ll
===
--- /dev/null
+++ llvm/test/CodeGen/PowerPC/p10-vsx-builtins.ll
@@ -0,0 +1,35 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -verify-machineinstrs -mtriple=powerpc64le-unknown-linux-gnu \
+; RUN:   -mcpu=pwr10 -ppc-asm-full-reg-names -ppc-vsr-nums-as-vr < %s | \
+; RUN:   FileCheck %s
+
+; This test case aims to test the builtins for VSX vector instructions
+; on Power10.
+
+declare i32 @llvm.ppc.vsx.xvtlsbb(<16 x i8>, i1)
+
+define signext i32 @test_vec_test_lsbb_all_ones(<16 x i8> %vuca) {
+; CHECK-LABEL: test_vec_test_lsbb_all_ones:
+; CHECK:   # %bb.0: # %entry
+; CHECK-NEXT:xvtlsbb cr0, v2
+; CHECK-NEXT:mfocrf r3, 128
+; CHECK-NEXT:srwi r3, r3, 31
+; CHECK-NEXT:extsw r3, r3
+; CHECK-NEXT:blr
+entry:
+  %0 = tail call i32 @llvm.ppc.vsx.xvtlsbb(<16 x i8> %vuca, i1 1)
+  ret i32 %0
+}
+
+define signext i32 @test_vec_test_lsbb_all_zeros(<16 x i8> %vuca) {
+; CHECK-LABEL: test_vec_test_lsbb_all_zeros:
+; CHECK:   # %bb.0: # %entry
+; CHECK-NEXT:xvtlsbb cr0, v2
+; CHECK-NEXT:mfocrf r3, 128
+; CHECK-NEXT:rlwinm r3, r3, 3, 31, 31
+; CHECK-NEXT:extsw r3, r3
+; CHECK-NEXT:blr
+entry:
+  %0 = tail call i32 @llvm.ppc.vsx.xvtlsbb(<16 x i8> %vuca, i1 0)
+  ret i32 %0
+}
Index: llvm/lib/Target/PowerPC/PPCInstrPrefix.td
===
--- llvm/lib/Target/PowerPC/PPCInstrPrefix.td
+++ llvm/lib/Target/PowerPC/PPCInstrPrefix.td
@@ -177,6 +177,25 @@
   let Inst{31} = XT{5};
 }
 
+// [PO BF / XO2 B XO BX /]
+class XX2_BF3_XO5_XB6_XO9<bits<6> opcode, bits<5> xo2, bits<9> xo, dag OOL,
+                          dag IOL, string asmstr, InstrItinClass itin,
+                          list<dag> pattern>
+  : I<opcode, OOL, IOL, asmstr, itin> {
+  bits<3> BF;
+  bits<6> XB;
+
+  let Pattern = pattern;
+
+  let Inst{6-8}   = BF;
+  let Inst{9-10}  = 0;
+  let Inst{11-15} = xo2;
+  let Inst{16-20} = XB{4-0};
+  let Inst{21-29} = xo;
+  let Inst{30}= XB{5};
+  let Inst{31}= 0;
+}
+
 multiclass MLS_DForm_R_SI34_RTA5_MEM_p opcode, dag OOL, dag IOL,
dag PCRel_IOL, string asmstr,
InstrItinClass itin> {
@@ -552,6 +571,8 @@
  "vclrrb $vD, $vA, $rB", IIC_VecGeneral,
  [(set v16i8:$vD,
(int_ppc_altivec_vclrrb v16i8:$vA, i32:$rB))]>;
+   def XVTLSBB : XX2_BF3_XO5_XB6_XO9<60, 2, 475, (outs crrc:$BF), (ins vsrc:$XB),
+ "xvtlsbb $BF, $XB", IIC_VecGeneral, []>;
 }
 
 // Anonymous Patterns //
@@ -564,4 +585,8 @@
 (v4i32 (COPY_TO_REGCLASS (XXGENPCVWM $VRB, imm:$IMM), VRRC))>;
   def : Pat<(v2i64 (int_ppc_vsx_xxgenpcvdm v2i64:$VRB, imm:$IMM)),
 (v2i64 (COPY_TO_REGCLASS (XXGENPCVDM $VRB, imm:$IMM), VRRC))>;
+  def : Pat<(i32 (int_ppc_vsx_xvtlsbb v16i8:$XB, -1)),
+(EXTRACT_SUBREG (XVTLSBB (COPY_TO_REGCLASS $XB, VSRC)), sub_lt)>;
+  def : Pat<(i32 (int_ppc_vsx_xvtlsbb v16i8:$XB, 0)),
+(EXTRACT_SUBREG (XVTLSBB (COPY_TO_REGCLASS $XB, VSRC)), sub_eq)>;
 }
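
The field layout of the new instruction class can be checked against the encodings in the MC tests. The following Python sketch packs the fields by hand; `encode_xx2_bf3_xo5_xb6_xo9` is a hypothetical helper, and it assumes that the primary opcode occupies bits 0-5 and that bit 0 is the most significant bit, as is usual for PowerPC instruction formats.

```python
def encode_xx2_bf3_xo5_xb6_xo9(opcode, xo2, xo, bf, xb):
    """Pack the fields of the class into a 32-bit instruction word."""

    def place(value, first, last):
        # Put `value` into bits first..last, where bit 0 is the MSB.
        width = last - first + 1
        assert 0 <= value < (1 << width), "field value does not fit"
        return value << (31 - last)

    return (place(opcode, 0, 5)        # primary opcode (assumed position)
            | place(bf, 6, 8)          # Inst{6-8}   = BF
            | place(xo2, 11, 15)       # Inst{11-15} = xo2
            | place(xb & 0x1F, 16, 20) # Inst{16-20} = XB{4-0}
            | place(xo, 21, 29)        # Inst{21-29} = xo
            | place(xb >> 5, 30, 30))  # Inst{30}    = XB{5}


# xvtlsbb 1, 7 with the constants from the XVTLSBB definition (60, 2, 475).
word = encode_xx2_bf3_xo5_xb6_xo9(opcode=60, xo2=2, xo=475, bf=1, xb=7)
big_endian = list(word.to_bytes(4, "big"))
little_endian = list(word.to_bytes(4, "little"))
print([hex(b) for b in big_endian])
```

The big-endian bytes correspond to the CHECK-BE line for `xvtlsbb 1, 7` and the little-endian bytes to the CHECK-LE line.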

[PATCH] D81938: [InferAddressSpaces] Handle the pair of `ptrtoint`/`inttoptr`.

2020-06-23 Thread Michael Liao via Phabricator via cfe-commits
hliao added a comment.

ping for code review


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81938/new/

https://reviews.llvm.org/D81938





[PATCH] D82428: [clang][driver] allow `-arch arm64` to be used to build for mac when on Apple Silicon Mac without explicit `-target`

2020-06-23 Thread Alex Lorenz via Phabricator via cfe-commits
arphaman added a comment.

In D82428#2110506, @steven_wu wrote:

> LGTM.
>
> Not sure if it makes more sense to break the patch into two commits:
>
> - The config.guess change is for building the correct host triple on an Apple 
> silicon machine without explicitly specifying it.
> - The driver change is for a better default on an Apple silicon Mac.


That's a good idea, I will do that.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82428/new/

https://reviews.llvm.org/D82428





[PATCH] D82415: [Coroutines] Special handle __builtin_coro_resume for final_suspend nothrow check

2020-06-23 Thread Xun Li via Phabricator via cfe-commits
lxfind updated this revision to Diff 272905.
lxfind added a comment.

Address lint


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82415/new/

https://reviews.llvm.org/D82415

Files:
  clang/lib/Sema/SemaCoroutine.cpp


Index: clang/lib/Sema/SemaCoroutine.cpp
===
--- clang/lib/Sema/SemaCoroutine.cpp
+++ clang/lib/Sema/SemaCoroutine.cpp
@@ -614,6 +614,14 @@
     // In the case of dtor, the call to dtor is implicit and hence we should
     // pass nullptr to canCalleeThrow.
     if (Sema::canCalleeThrow(S, IsDtor ? nullptr : cast<Expr>(E), D)) {
+      if (const auto *FD = dyn_cast<FunctionDecl>(D)) {
+        // co_await promise.final_suspend() could end up calling
+        // __builtin_coro_resume for symmetric transfer if await_suspend()
+        // returns a handle. In that case, even __builtin_coro_resume is not
+        // declared as noexcept, we claim that logically it does not throw.
+        if (FD->getBuiltinID() == Builtin::BI__builtin_coro_resume)
+          return;
+      }
       if (ThrowingDecls.empty()) {
         // First time seeing an error, emit the error message.
         S.Diag(cast<FunctionDecl>(S.CurContext)->getLocation(),




[PATCH] D82314: [RFC][Coroutines] Optimize the lifespan of temporary co_await object

2020-06-23 Thread Xun Li via Phabricator via cfe-commits
lxfind updated this revision to Diff 272904.
lxfind added a comment.
Herald added subscribers: llvm-commits, hiraditya.
Herald added a project: LLVM.

Tackle this problem inside CoroSplit as an optimization. Instead of only 
handling one particular case, we now look at every local variable in the 
coroutine, and sink their lifetime start markers when possible. This will bring 
in more benefits than doing so during IR emit. Confirmed that it works.
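
The bookkeeping behind "sink when possible" is the set of sink barriers: the first uses of a variable, none of which dominates another. The Python sketch below illustrates that invariant only. It is not the CoroSplit code: dominance is simplified to ancestry in a hypothetical tree `idom` (child to parent), and the instruction names are made up.

```python
def dominates(idom, a, b):
    """True if `a` is a strict ancestor of `b` in the tree `idom`."""
    node = idom.get(b)
    while node is not None:
        if node == a:
            return True
        node = idom.get(node)
    return False


def add_sink_barrier(barriers, idom, inst):
    """Add `inst` while keeping `barriers` free of dominating pairs."""
    if any(dominates(idom, b, inst) for b in barriers):
        return  # an existing barrier is always reached before `inst`
    # `inst` is reached first: drop every barrier it dominates.
    barriers.difference_update({b for b in barriers if dominates(idom, inst, b)})
    barriers.add(inst)


# Hypothetical dominance tree: entry -> use1 -> use2, and entry -> use3.
idom = {"use1": "entry", "use2": "use1", "use3": "entry"}
barriers = set()
for inst in ("use2", "use1", "use3", "use2"):
    add_sink_barrier(barriers, idom, inst)
print(sorted(barriers))
```

In this model a lifetime start marker would be placed right before each remaining barrier, which is the latest point that still precedes every use.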


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82314/new/

https://reviews.llvm.org/D82314

Files:
  llvm/lib/Transforms/Coroutines/CoroSplit.cpp

Index: llvm/lib/Transforms/Coroutines/CoroSplit.cpp
===
--- llvm/lib/Transforms/Coroutines/CoroSplit.cpp
+++ llvm/lib/Transforms/Coroutines/CoroSplit.cpp
@@ -75,7 +75,7 @@
 
 namespace {
 
-/// A little helper class for building 
+/// A little helper class for building
 class CoroCloner {
 public:
   enum class Kind {
@@ -563,7 +563,7 @@
   // In the original function, the AllocaSpillBlock is a block immediately
   // following the allocation of the frame object which defines GEPs for
   // all the allocas that have been moved into the frame, and it ends by
-  // branching to the original beginning of the coroutine.  Make this 
+  // branching to the original beginning of the coroutine.  Make this
   // the entry block of the cloned function.
   auto *Entry = cast(VMap[Shape.AllocaSpillBlock]);
   auto *OldEntry = >getEntryBlock();
@@ -1239,6 +1239,106 @@
   S.resize(N);
 }
 
+/// For every local variable that has lifetime intrinsics markers, we sink
+/// their lifetime.start marker to the places where the variable is being
+/// used for the first time. Doing so minimizes the lifetime of each variable,
+/// hence minimizing the amount of data we end up putting on the frame.
+static void sinkLifetimeStartMarkers(Function &F) {
+  DominatorTree Dom(F);
+  for (Instruction &I : instructions(F)) {
+// We look for this particular pattern:
+//   %tmpX = alloca %.., align ...
+//   %0 = bitcast %...* %tmpX to i8*
+//   call void @llvm.lifetime.start.p0i8(i64 ..., i8* nonnull %0) #2
+if (!isa())
+  continue;
+BitCastInst *CastInst = nullptr;
+// There can be multiple lifetime start markers for the same variable.
+SmallPtrSet LifetimeStartInsts;
+// SinkBarriers stores all instructions that use this local variable.
+// When sinking the lifetime start intrinsics, we can never sink past
+// these barriers.
+SmallPtrSet SinkBarriers;
+bool Valid = true;
+auto addSinkBarrier = [&](Instruction *I) {
+  // When adding a new barrier to SinkBarriers, we maintain the case
+  // that no instruction in SinkBarriers dominates another instruction.
+  bool FoundDom = false;
+  SmallPtrSet ToRemove;
+  for (auto *S : SinkBarriers) {
+if (Dom.dominates(S, I)) {
+  FoundDom = true;
+  break;
+} else if (Dom.dominates(I, S)) {
+  ToRemove.insert(S);
+}
+  }
+  if (!FoundDom) {
+SinkBarriers.insert(I);
+for (auto *R : ToRemove) {
+  SinkBarriers.erase(R);
+}
+  }
+};
+for (User *U : I.users()) {
+  if (!isa(U))
+continue;
+  if (CastInst) {
+// If we have multiple cast instructions for the alloca, don't
+// deal with it because it's too complex.
+Valid = false;
+break;
+  }
+  CastInst = cast(U);
+  for (User *CU : CastInst->users()) {
+// If we see any user of CastInst that's not lifetime start/end
+// intrinsics, give up because it's too complex.
+if (auto *CUI = dyn_cast(CU)) {
+  if (CUI->getIntrinsicID() == Intrinsic::lifetime_start)
+LifetimeStartInsts.insert(CUI);
+  else if (CUI->getIntrinsicID() == Intrinsic::lifetime_end)
+addSinkBarrier(CUI);
+  else
+Valid = false;
+} else {
+  Valid = false;
+}
+  }
+}
+if (!Valid || LifetimeStartInsts.empty())
+  continue;
+
+for (User *U : I.users()) {
+  if (U == CastInst)
+continue;
+  // Every user of the variable is also a sink barrier.
+  addSinkBarrier(cast(U));
+}
+
+// For each sink barrier, we insert a lifetime start marker right
+// before it.
+const auto *LifetimeStartInst = *LifetimeStartInsts.begin();
+for (auto *S : SinkBarriers) {
+  if (auto *IS = dyn_cast(S)) {
+if (IS->getIntrinsicID() == Intrinsic::lifetime_end) {
+  // If we have a lifetime end marker in SinkBarriers, meaning it's
+  // not dominated by any other users, we can safely delete it.
+  IS->eraseFromParent();
+  continue;
+}
+  }
+  LifetimeStartInst->clone()->insertBefore(S);
+}
+// All the old markers are no longer necessary.
+for (auto *S 

[PATCH] D82429: [sve][acle] Add some C intrinsics for brain float types.

2020-06-23 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli created this revision.
fpetrogalli added reviewers: c-rhodes, kmclaughlin, efriedma, sdesmalen, 
ctetreau.
Herald added subscribers: llvm-commits, cfe-commits, psnobl, rkruppe, 
hiraditya, tschuett.
Herald added projects: clang, LLVM.

The following intrinsics have been added:

svuint16_t svcnt[_bf16]_m(svuint16_t inactive, svbool_t pg, svbfloat16_t op)
svuint16_t svcnt[_bf16]_x(svbool_t pg, svbfloat16_t op)
svuint16_t svcnt[_bf16]_z(svbool_t pg, svbfloat16_t op)

svbfloat16_t svtbl[_bf16](svbfloat16_t data, svuint16_t indices)

svbfloat16_t svtbl2[_bf16](svbfloat16x2_t data, svuint16_t indices)

svbfloat16_t svtbx[_bf16](svbfloat16_t fallback, svbfloat16_t data, svuint16_t indices)
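
For readers unfamiliar with these operations, the following Python sketch models them on plain lists. It reflects my reading of the ACLE and is not taken from the patch: bfloat16 elements are represented by their 16-bit patterns, the list length stands in for the (scalable) vector length, the `_x` form is omitted because its inactive lanes are unspecified, and the function names drop the `[_bf16]` suffix.

```python
def svcnt_m(inactive, pg, op):
    """Count set bits per active lane; inactive lanes come from `inactive`."""
    return [bin(x).count("1") if p else i for i, p, x in zip(inactive, pg, op)]


def svcnt_z(pg, op):
    """Count set bits per active lane; inactive lanes are zeroed."""
    return [bin(x).count("1") if p else 0 for p, x in zip(pg, op)]


def svtbl(data, indices):
    """Table lookup; an out-of-range index yields zero."""
    return [data[i] if i < len(data) else 0 for i in indices]


def svtbl2(data_pair, indices):
    """Table lookup over two vectors treated as one concatenated table."""
    return svtbl(list(data_pair[0]) + list(data_pair[1]), indices)


def svtbx(fallback, data, indices):
    """Table lookup; an out-of-range index keeps the `fallback` lane."""
    return [data[i] if i < len(data) else f for f, i in zip(fallback, indices)]


bits = [0x3F80, 0x3F80, 0x8000]  # bit patterns of 1.0, 1.0 and -0.0
print(svcnt_z([True, False, True], bits))
print(svtbx([1, 2, 3, 4], [10, 20, 30, 40], [3, 0, 7, 1]))
```

The difference between `svtbl` and `svtbx` shows up only for out-of-range indices, which is why `svtbx` needs the extra `fallback` operand.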


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D82429

Files:
  clang/include/clang/Basic/arm_sve.td
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cnt-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_tbl-bfloat.c
  clang/test/CodeGen/aarch64-sve2-intrinsics/acle_sve2_tbl2-bfloat.c
  clang/test/CodeGen/aarch64-sve2-intrinsics/acle_sve2_tbx-bfloat.c
  llvm/lib/Target/AArch64/SVEInstrFormats.td
  llvm/test/CodeGen/AArch64/sve-intrinsics-counting-bits.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
  llvm/test/CodeGen/AArch64/sve2-intrinsics-perm-tb.ll

Index: llvm/test/CodeGen/AArch64/sve2-intrinsics-perm-tb.ll
===
--- llvm/test/CodeGen/AArch64/sve2-intrinsics-perm-tb.ll
+++ llvm/test/CodeGen/AArch64/sve2-intrinsics-perm-tb.ll
@@ -122,6 +122,16 @@
   ret  %out
 }
 
+define  @ftbx_h_bf16( %a,  %b,  %c) {
+; CHECK-LABEL: ftbx_h_bf16:
+; CHECK: tbx z0.h, z1.h, z2.h
+; CHECK-NEXT: ret
+  %out = call  @llvm.aarch64.sve.tbx.nxv8bf16( %a,
+%b,
+%c)
+  ret  %out
+}
+
 define  @tbx_s( %a,  %b,  %c) {
 ; CHECK-LABEL: tbx_s:
 ; CHECK: tbx z0.s, z1.s, z2.s
@@ -179,3 +189,5 @@
 declare  @llvm.aarch64.sve.tbx.nxv8f16(, , )
 declare  @llvm.aarch64.sve.tbx.nxv4f32(, , )
 declare  @llvm.aarch64.sve.tbx.nxv2f64(, , )
+
+declare  @llvm.aarch64.sve.tbx.nxv8bf16(, , )
Index: llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
===
--- llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
@@ -1009,6 +1009,15 @@
   ret  %out
 }
 
+define  @tbl_bf16( %a,  %b) {
+; CHECK-LABEL: tbl_bf16:
+; CHECK: tbl z0.h, { z0.h }, z1.h
+; CHECK-NEXT: ret
+  %out = call  @llvm.aarch64.sve.tbl.nxv8bf16( %a,
+%b)
+  ret  %out
+}
+
 define  @tbl_f32( %a,  %b) {
 ; CHECK-LABEL: tbl_f32:
 ; CHECK: tbl z0.s, { z0.s }, z1.s
@@ -1859,6 +1868,7 @@
 declare  @llvm.aarch64.sve.tbl.nxv4i32(, )
 declare  @llvm.aarch64.sve.tbl.nxv2i64(, )
 declare  @llvm.aarch64.sve.tbl.nxv8f16(, )
+declare  @llvm.aarch64.sve.tbl.nxv8bf16(, )
 declare  @llvm.aarch64.sve.tbl.nxv4f32(, )
 declare  @llvm.aarch64.sve.tbl.nxv2f64(, )
 
Index: llvm/test/CodeGen/AArch64/sve-intrinsics-counting-bits.ll
===
--- llvm/test/CodeGen/AArch64/sve-intrinsics-counting-bits.ll
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-counting-bits.ll
@@ -145,6 +145,16 @@
   ret  %out
 }
 
+define  @cnt_bf16( %a,  %pg,  %b) {
+; CHECK-LABEL: cnt_bf16:
+; CHECK: cnt z0.h, p0/m, z1.h
+; CHECK-NEXT: ret
+  %out = call  @llvm.aarch64.sve.cnt.nxv8bf16( %a,
+%pg,
+%b)
+  ret  %out
+}
+
 define  @cnt_f32( %a,  %pg,  %b) {
 ; CHECK-LABEL: cnt_f32:
 ; CHECK: cnt z0.s, p0/m, z1.s
@@ -180,5 +190,6 @@
 declare  @llvm.aarch64.sve.cnt.nxv4i32(, , )
 declare  @llvm.aarch64.sve.cnt.nxv2i64(, , )
 declare  @llvm.aarch64.sve.cnt.nxv8f16(, , )
+declare  @llvm.aarch64.sve.cnt.nxv8bf16(, , )
 declare  @llvm.aarch64.sve.cnt.nxv4f32(, , )
 declare  @llvm.aarch64.sve.cnt.nxv2f64(, , )
Index: llvm/lib/Target/AArch64/SVEInstrFormats.td
===
--- llvm/lib/Target/AArch64/SVEInstrFormats.td
+++ llvm/lib/Target/AArch64/SVEInstrFormats.td
@@ -1020,6 +1020,8 @@
   def : SVE_2_Op_Pat(NAME # _H)>;
   def : SVE_2_Op_Pat(NAME # _S)>;
   def : SVE_2_Op_Pat(NAME # _D)>;
+
+  def : SVE_2_Op_Pat(NAME # _H)>;
 }
 
 multiclass sve2_int_perm_tbl {
@@ -1053,6 +1055,11 @@
 nxv8f16:$Op2, zsub1),
  nxv8i16:$Op3))>;
 
+  def : Pat<(nxv8bf16 (op nxv8bf16:$Op1, nxv8bf16:$Op2, nxv8i16:$Op3)),
+(nxv8bf16 (!cast(NAME # _H) (REG_SEQUENCE ZPR2, nxv8bf16:$Op1, zsub0,
+   

[PATCH] D82314: [RFC][Coroutines] Optimize the lifespan of temporary co_await object

2020-06-23 Thread JunMa via Phabricator via cfe-commits
junparser added a comment.

In D82314#2109437, @lxfind wrote:

> In D82314#2107910, @junparser wrote:
>
> > Rather than doing it here, can we build the await_resume call expression with 
> > MaterializedTemporaryExpr when expanding the coawait expression? That's how 
> > gcc does it.
>
>
> There doesn't appear to be a way to do that in Clang. It goes from the AST to 
> IR directly, and there needs to be a MaterializedTemporaryExpr to wrap the 
> result of co_await. Could you elaborate on how this might be done in Clang?


For now, we only wrap the coawait expression with MaterializedTemporaryExpr when 
the value kind of the result is VK_RValue. In that case, we could wrap the 
await_resume call instead when building the coawait expression, so that 
emitSuspendExpression can directly emit the await_call expression with 
MaterializedTemporaryExpr.

I think this should work, although I'm not so sure.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82314/new/

https://reviews.llvm.org/D82314





[PATCH] D81816: [PowerPC] Add support for vector bool __int128 for Power10

2020-06-23 Thread Ahsan Saghir via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
saghir marked an inline comment as done.
Closed by commit rGf4c337ab85c0: [PowerPC] Add support for vector bool __int128 
for Power10 (authored by saghir).

Changed prior to commit:
  https://reviews.llvm.org/D81816?vs=271874&id=272896#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81816/new/

https://reviews.llvm.org/D81816

Files:
  clang/include/clang/Basic/DiagnosticSemaKinds.td
  clang/lib/Sema/DeclSpec.cpp
  clang/test/Parser/altivec-bool-128.c
  clang/test/Parser/cxx-altivec-bool-128.cpp
  clang/test/Parser/p10-vector-bool-128.c

Index: clang/test/Parser/p10-vector-bool-128.c
===
--- /dev/null
+++ clang/test/Parser/p10-vector-bool-128.c
@@ -0,0 +1,12 @@
+// RUN: %clang_cc1 -triple=powerpc64-unknown-linux-gnu -target-cpu pwr10 \
+// RUN:-target-feature +vsx -target-feature +power10-vector \
+// RUN:-fsyntax-only -verify %s
+// RUN: %clang_cc1 -triple=powerpc64le-unknown-linux-gnu -target-cpu pwr10 \
+// RUN:-target-feature +power10-vector -fsyntax-only -verify %s
+// expected-no-diagnostics
+
+// Test legitimate uses of 'vector bool __int128' with VSX.
+__vector bool __int128 v1_bi128;
+__vector __bool __int128 v2_bi128;
+vector bool __int128 v3_bi128;
+vector __bool __int128 v4_bi128;
Index: clang/test/Parser/cxx-altivec-bool-128.cpp
===
--- /dev/null
+++ clang/test/Parser/cxx-altivec-bool-128.cpp
@@ -0,0 +1,23 @@
+// RUN: %clang_cc1 -triple=powerpc64-unknown-linux-gnu \
+// RUN:-target-feature +altivec -fsyntax-only -verify -std=c++11 %s
+// RUN: %clang_cc1 -triple=powerpc64le-unknown-linux-gnu \
+// RUN:-target-feature +altivec -fsyntax-only -verify -std=c++11 %s
+// RUN: %clang_cc1 -triple=powerpc64le-unknown-linux-gnu -target-cpu pwr10 \
+// RUN:-target-feature +altivec -target-feature +vsx \
+// RUN:-target-feature -power10-vector -fsyntax-only -verify %s
+
+#include <altivec.h>
+
+// Test 'vector bool __int128' type.
+
+// These should have errors.
+__vector bool __int128 v1_bi128;  // expected-error {{use of '__int128' with '__vector bool' requires VSX support enabled (on POWER10 or later)}}
+__vector __bool __int128 v2_bi128;// expected-error {{use of '__int128' with '__vector bool' requires VSX support enabled (on POWER10 or later)}}
+vector bool __int128 v3_bi128;// expected-error {{use of '__int128' with '__vector bool' requires VSX support enabled (on POWER10 or later)}}
+vector __bool __int128 v4_bi128;  // expected-error {{use of '__int128' with '__vector bool' requires VSX support enabled (on POWER10 or later)}}
+__vector bool unsigned __int128 v5_bi128; // expected-error {{cannot use 'unsigned' with '__vector bool'}} expected-error {{use of '__int128' with '__vector bool' requires VSX support enabled (on POWER10 or later)}}
+__vector bool signed __int128 v6_bi128;   // expected-error {{cannot use 'signed' with '__vector bool'}} expected-error {{use of '__int128' with '__vector bool' requires VSX support enabled (on POWER10 or later)}}
+vector bool unsigned __int128 v7_bi128;   // expected-error {{cannot use 'unsigned' with '__vector bool'}} expected-error {{use of '__int128' with '__vector bool' requires VSX support enabled (on POWER10 or later)}}
+vector bool signed __int128 v8_bi128; // expected-error {{cannot use 'signed' with '__vector bool'}} expected-error {{use of '__int128' with '__vector bool' requires VSX support enabled (on POWER10 or later)}}
+__vector __bool signed __int128 v9_bi128; // expected-error {{cannot use 'signed' with '__vector bool'}} expected-error {{use of '__int128' with '__vector bool' requires VSX support enabled (on POWER10 or later)}}
+vector __bool signed __int128 v10_bi128;  // expected-error {{cannot use 'signed' with '__vector bool'}} expected-error {{use of '__int128' with '__vector bool' requires VSX support enabled (on POWER10 or later)}}
Index: clang/test/Parser/altivec-bool-128.c
===
--- /dev/null
+++ clang/test/Parser/altivec-bool-128.c
@@ -0,0 +1,21 @@
+// RUN: %clang_cc1 -triple=powerpc64-unknown-linux-gnu \
+// RUN:-target-feature +altivec -fsyntax-only -verify %s
+// RUN: %clang_cc1 -triple=powerpc64le-unknown-linux-gnu \
+// RUN:-target-feature +altivec -fsyntax-only -verify %s
+// RUN: %clang_cc1 -triple=powerpc64le-unknown-linux-gnu -target-cpu pwr10 \
+// RUN:-target-feature +vsx -target-feature -power10-vector \
+// RUN:-fsyntax-only -verify %s
+
+// Test 'vector bool __int128' type.
+
+// These should have errors.
+__vector bool __int128 v1_bi128;  // expected-error {{use of '__int128' with '__vector bool' requires VSX support enabled (on POWER10 or later)}}

[PATCH] D82314: [RFC][Coroutines] Optimize the lifespan of temporary co_await object

2020-06-23 Thread JunMa via Phabricator via cfe-commits
junparser added a comment.

In D82314#2109893, @rsmith wrote:

> In D82314#2109728, @lxfind wrote:
>
> > @rsmith Thanks. That's a good point. Do you know if there already exist 
> > optimization passes in LLVM that attempt to shrink the range of lifetime 
> > intrinsics? If so, I am curious why they do not help in this case. Or is 
> > it generally unsafe to move the lifetime intrinsics, so that we can only do 
> > it here with specific context knowledge about coroutines?
>
>
> I don't know for sure, but I would expect someone to have implemented such a 
> pass already. Moving a lifetime start intrinsic later, past instructions that 
> can't possibly reference the object in question, seems like it should always 
> be safe and (presumably) should always be a good thing to do, and similarly 
> for moving lifetime end markers earlier. It could be that such a pass exists 
> but it is run too late in the pass pipeline, so the coroutine split pass 
> doesn't get to take advantage of it.


@lxfind, lifetime markers for a variable are also much more complex when 
exceptional paths exist (multiple lifetime starts and multiple lifetime ends), 
so such cases are hard to optimize.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82314/new/

https://reviews.llvm.org/D82314



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D79719: [AIX] Implement AIX special alignment rule about double/long double

2020-06-23 Thread Jason Liu via Phabricator via cfe-commits
jasonliu added inline comments.



Comment at: clang/lib/AST/RecordLayoutBuilder.cpp:1881
+  if (isAIXLayout(Context) && FieldOffset == CharUnits::Zero() &&
+  (IsUnion || NonOverlappingEmptyFieldFound)) {
+FirstNonOverlappingEmptyFieldHandled = true;

Xiangling_L wrote:
> jasonliu wrote:
> > Maybe it's a naive thought, but is it possible to replace 
> > `NonOverlappingEmptyFieldFound` with `IsOverlappingEmptyField && 
> > FieldOffsets.size() == 0`?
> I don't think these two work the same. `NonOverlappingEmptyFieldFound` 
> represents the 1st non-empty and non-overlapping field in the record. 
> `IsOverlappingEmptyField && FieldOffsets.size() == 0` represents something 
> opposite.
You are right. I meant could we replace it with `!(IsOverlappingEmptyField && 
FieldOffsets.size() == 0)`?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79719/new/

https://reviews.llvm.org/D79719



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D76342: [OpenMP] Implement '#pragma omp tile'

2020-06-23 Thread Michael Kruse via Phabricator via cfe-commits
Meinersbur marked an inline comment as done.
Meinersbur added inline comments.



Comment at: clang/include/clang/AST/StmtOpenMP.h:4781-4784
+/// This represents the '#pragma omp tile' loop transformation directive.
+class OMPTileDirective final
+: public OMPLoopDirective,
+  private llvm::TrailingObjects {

ABataev wrote:
> Meinersbur wrote:
> > ABataev wrote:
> > > Not sure that this is a good idea to treat this directive as the 
> > > executable directive. To me, it looks like kind of `AttributedStmt`. 
> > > Maybe better to introduce some kind of a new base node for this and 
> > > similar constructs, which does not own the loop but is its kind of 
> > > attribute-like entity?
> > > Also, can we have something like:
> > > ```
> > > #pragma omp simd
> > > #pragma omp tile ...
> > > for(...) ;
> > > ```
> > > Thoughts?
> > While not executed at runtime, syntactically it is parsed like an executable 
> > (loop-associated) directive. IMHO it does 'own' the loop, but produces 
> > another one to be owned by (or associated with) a different directive, as in 
> > your tile/simd example, which should already work. Allowing this was the 
> > motivation to do the transformation on the AST-level for now.
> I'm not saying that we should separate parsing of this directive from others, 
> it is just better to treat this directive as a little bit different node. 
> Currently, it introduces too many changes in the base classes. Better to 
> create a new base class, that does not rely on `CapturedStmt` as the base, 
> and derive `OMPExecutableDirective` and this directive and other similar (+ 
> maybe, `OMPSimdDirective`) from this new base class.
Unless you tell me otherwise, `OMPLoopDirective` represents a loop-associated 
directive. `#pragma omp tile` is a loop-associated directive. 
`OMPLoopDirective` contains all the functionality to parse associated loops, 
and unfortunately is derived from `OMPExecutableDirective`.

You seem to ask me to create a new class 
"OMPDirectiveAssociatedWithLoopButNotExecutable" that duplicates the parsing 
part of "OMPLoopDirective"? This will either be a lot of duplicated code or 
result in even more changes to the base classes due to the refactoring.

By the OpenMP specification, simd and tile are executable directives, so 
structurally I think the class hierarchy as-is makes sense. From the glossary 
of the upcoming OpenMP 5.1:
> An OpenMP directive that appears in an executable context and results in 
> implementation code and/or prescribes the manner in which associated user 
> code must execute.

Avoiding a CapturedStmt when not needed would require a modification of 
`clang::getOpenMPCaptureRegions` which currently adds a capture of type 
`OMPD_unknown` for such directives. This is unrelated to loop-associated 
directives.
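
For readers following the tile/simd example in this thread, the loop structure that tiling produces can be sketched by hand in plain C++. This is only an illustration of the idea, not what Clang generates for `#pragma omp tile`; `visit_plain` and `visit_tiled` are hypothetical names introduced for the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Visit 0..n-1 with an ordinary loop and record the order.
std::vector<int> visit_plain(int n) {
  std::vector<int> order;
  for (int i = 0; i < n; ++i)
    order.push_back(i);
  return order;
}

// Hand-tiled version: an outer loop over tiles and an inner loop within one
// tile. For a single loop the iteration order is unchanged; the point is that
// the result is a new loop nest that another directive could be applied to.
std::vector<int> visit_tiled(int n, int tile) {
  assert(tile > 0 && "tile size must be positive");
  std::vector<int> order;
  for (int start = 0; start < n; start += tile) {
    int end = std::min(start + tile, n); // the last tile may be partial
    for (int i = start; i < end; ++i)
      order.push_back(i);
  }
  return order;
}
```

Calling `visit_tiled(10, 4)` walks three tiles (two full, one partial) and visits the same indices as `visit_plain(10)`.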



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76342/new/

https://reviews.llvm.org/D76342



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82428: [clang][driver] allow `-arch arm64` to be used to build for mac when on Apple Silicon Mac without explicit `-target`

2020-06-23 Thread Steven Wu via Phabricator via cfe-commits
steven_wu accepted this revision.
steven_wu added a comment.
This revision is now accepted and ready to land.

LGTM.

Not sure if it makes more sense to break the patch into two commits:

- The config.guess change builds the correct host triple on an Apple Silicon 
machine without specifying it explicitly.
- The driver change provides a better default on an Apple Silicon Mac.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82428/new/

https://reviews.llvm.org/D82428



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82428: [clang][driver] allow `-arch arm64` to be used to build for mac when on Apple Silicon Mac without explicit `-target`

2020-06-23 Thread Alex Lorenz via Phabricator via cfe-commits
arphaman created this revision.
arphaman added reviewers: steven_wu, dexonsmith.
Herald added subscribers: llvm-commits, danielkiss, ributzka, jkorous, 
kristof.beyls, mgorny.
Herald added a project: LLVM.

This patch allows a user to compile for the `arm64-apple-macos` target when 
invoking a clang running on an Apple Silicon machine by passing `-arch arm64` 
only.
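
The driver change described here boils down to a small decision table. Below is a minimal sketch, assuming a runtime flag in place of the `#if __arm64__` check that the patch uses; `DefaultOS`, `defaultOSForArch`, and `hostIsAppleSilicon` are illustrative names, not Clang APIs, and only the branches visible in the Darwin.cpp hunk of this patch are modelled.

```cpp
#include <cassert>
#include <string>

enum class DefaultOS { Unknown, MacOSX, IOS, WatchOS };

// Hypothetical stand-in for the arch-name check in Darwin.cpp. The patch
// decides at compile time via `#if __arm64__`; a bool parameter is used here
// so that both outcomes can be exercised on any host.
DefaultOS defaultOSForArch(const std::string &machOArchName,
                           bool hostIsAppleSilicon) {
  if (machOArchName == "arm64")
    return hostIsAppleSilicon ? DefaultOS::MacOSX : DefaultOS::IOS;
  if (machOArchName == "armv7" || machOArchName == "armv7s")
    return DefaultOS::IOS;
  if (machOArchName == "armv7k" || machOArchName == "arm64_32")
    return DefaultOS::WatchOS;
  return DefaultOS::Unknown; // remaining cases are outside this sketch
}

// Usage: the same arch name yields a different default depending on the host.
void demoDefaultOS() {
  assert(defaultOSForArch("arm64", true) == DefaultOS::MacOSX);
  assert(defaultOSForArch("arm64", false) == DefaultOS::IOS);
  assert(defaultOSForArch("armv7k", true) == DefaultOS::WatchOS);
}
```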


https://reviews.llvm.org/D82428

Files:
  clang/lib/Driver/ToolChains/Darwin.cpp
  clang/test/Driver/apple-arm64-arch.c
  clang/test/Driver/apple-silicon-arch.c
  clang/test/lit.cfg.py
  llvm/cmake/config.guess


Index: llvm/cmake/config.guess
===
--- llvm/cmake/config.guess
+++ llvm/cmake/config.guess
@@ -1263,6 +1263,23 @@
  UNAME_PROCESSOR="x86_64"
  fi
fi ;;
+   arm)
+   eval $set_cc_for_build
+   if [ "$CC_FOR_BUILD" != 'no_compiler_found' ]; then
+   if (echo '#ifdef __LP64__'; echo IS_64BIT_ARCH; echo '#endif') | \
+   (CCOPTS= $CC_FOR_BUILD -E - 2>/dev/null) | \
+   grep IS_64BIT_ARCH >/dev/null
+   then
+   if (echo '#ifdef __PTRAUTH_INTRINSICS__'; echo HAS_AUTH; echo '#endif') | \
+   (CCOPTS= $CC_FOR_BUILD -E - 2>/dev/null) | \
+   grep HAS_AUTH >/dev/null
+   then
+   UNAME_PROCESSOR="arm64e"
+   else
+   UNAME_PROCESSOR="arm64"
+   fi
+   fi
+   fi ;;
unknown) UNAME_PROCESSOR=powerpc ;;
esac
echo ${UNAME_PROCESSOR}-apple-darwin${UNAME_RELEASE}
Index: clang/test/lit.cfg.py
===
--- clang/test/lit.cfg.py
+++ clang/test/lit.cfg.py
@@ -155,6 +155,10 @@
 if not re.match(r'.*-(cygwin)$', config.target_triple):
 config.available_features.add('clang-driver')
 
+# Tests that are specific to the Apple Silicon macOS.
+if re.match(r'^arm64(e)?-apple-(macos|darwin)', config.target_triple):
+config.available_features.add('apple-silicon-mac')
+
 # [PR18856] Depends to remove opened file. On win32, a file could be removed
 # only if all handles were closed.
 if platform.system() not in ['Windows']:
Index: clang/test/Driver/apple-silicon-arch.c
===
--- /dev/null
+++ clang/test/Driver/apple-silicon-arch.c
@@ -0,0 +1,6 @@
+// RUN: env SDKROOT="/" %clang -arch arm64 -c -### %s 2>&1 | \
+// RUN:   FileCheck %s
+//
+// REQUIRES: apple-silicon-mac
+//
+// CHECK: "-triple" "arm64-apple-macosx{{[0-9.]+}}"
Index: clang/test/Driver/apple-arm64-arch.c
===
--- /dev/null
+++ clang/test/Driver/apple-arm64-arch.c
@@ -0,0 +1,6 @@
+// RUN: env SDKROOT="/" %clang -arch arm64 -c -### %s 2>&1 | \
+// RUN:   FileCheck %s
+//
+// XFAIL: apple-silicon-mac
+//
+// CHECK: "-triple" "arm64-apple-ios{{[0-9.]+}}"
Index: clang/lib/Driver/ToolChains/Darwin.cpp
===
--- clang/lib/Driver/ToolChains/Darwin.cpp
+++ clang/lib/Driver/ToolChains/Darwin.cpp
@@ -1672,8 +1672,16 @@
   llvm::Triple::OSType OSTy = llvm::Triple::UnknownOS;
 
   StringRef MachOArchName = Toolchain.getMachOArchName(Args);
-  if (MachOArchName == "armv7" || MachOArchName == "armv7s" ||
-  MachOArchName == "arm64")
+  if (MachOArchName == "arm64") {
+#if __arm64__
+// A clang running on an Apple Silicon mac defaults
+// to building for mac when building for arm64 rather than
+// defaulting to iOS.
+OSTy = llvm::Triple::MacOSX;
+#else
+OSTy = llvm::Triple::IOS;
+#endif
+  } else if (MachOArchName == "armv7" || MachOArchName == "armv7s")
 OSTy = llvm::Triple::IOS;
   else if (MachOArchName == "armv7k" || MachOArchName == "arm64_32")
 OSTy = llvm::Triple::WatchOS;

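The `apple-silicon-mac` lit feature in this patch is keyed on a regular expression over the target triple. As a rough sketch, the same pattern can be exercised with `std::regex`; `isAppleSiliconMacTriple` and the sample triples below are made up for illustration and are not part of the patch.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Mirrors the pattern used in lit.cfg.py. Python's re.match only matches at
// the start of the string; the leading '^' with regex_search does the same.
bool isAppleSiliconMacTriple(const std::string &triple) {
  static const std::regex pattern("^arm64(e)?-apple-(macos|darwin)");
  return std::regex_search(triple, pattern);
}

// Usage with a few hypothetical triples.
void demoTripleMatch() {
  assert(isAppleSiliconMacTriple("arm64-apple-darwin20.0.0"));
  assert(isAppleSiliconMacTriple("arm64e-apple-macos11"));
  assert(!isAppleSiliconMacTriple("x86_64-apple-darwin19.5.0"));
  assert(!isAppleSiliconMacTriple("arm64-apple-ios13.0"));
}
```

With the feature defined this way, apple-silicon-arch.c (`REQUIRES: apple-silicon-mac`) runs only on matching triples, while apple-arm64-arch.c (`XFAIL: apple-silicon-mac`) is expected to fail there.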


[clang] f4c337a - [PowerPC] Add support for vector bool __int128 for Power10

2020-06-23 Thread Ahsan Saghir via cfe-commits

Author: Ahsan Saghir
Date: 2020-06-23T21:25:56-05:00
New Revision: f4c337ab85c0b7ec206da0f2c6576730eefb36c2

URL: 
https://github.com/llvm/llvm-project/commit/f4c337ab85c0b7ec206da0f2c6576730eefb36c2
DIFF: 
https://github.com/llvm/llvm-project/commit/f4c337ab85c0b7ec206da0f2c6576730eefb36c2.diff

LOG: [PowerPC] Add support for vector bool __int128 for Power10

Summary:
This patch adds support for `vector bool __int128` type for Power10.

Reviewers: #powerpc, hfinkel, lei, stefanp, amyk

Reviewed By: #powerpc, lei, amyk

Subscribers: lei, amyk, wuzish, nemanjai, shchenz, cfe-commits

Tags: #llvm, #powerpc, #clang

Differential Revision: https://reviews.llvm.org/D81816
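
The rule this commit adds to `DeclSpec::Finish` can be paraphrased as a small standalone check. The following is a simplified sketch, not the actual Sema code: `TypeSpec`, `checkVectorBool`, and `hasPower10Vector` are hypothetical names, the `__pixel` case is left out, and diagnostics are represented by the names of their `.td` definitions.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-ins for the type-specifier kinds that matter here.
enum class TypeSpec { Unspecified, Char, Int, Int128, Double };

// Returns the diagnostics a `vector bool <type>` declaration would receive.
// `hasPower10Vector` stands in for hasFeature("power10-vector").
std::vector<std::string> checkVectorBool(TypeSpec type, bool hasPower10Vector) {
  std::vector<std::string> diags;
  // Only char/int and, with this commit, __int128 may follow vector bool.
  if (type != TypeSpec::Unspecified && type != TypeSpec::Char &&
      type != TypeSpec::Int && type != TypeSpec::Int128)
    diags.push_back("err_invalid_vector_bool_decl_spec");
  // vector bool __int128 additionally requires the Power10 vector feature.
  if (type == TypeSpec::Int128 && !hasPower10Vector)
    diags.push_back("err_invalid_vector_bool_int128_decl_spec");
  return diags;
}

// Usage: __int128 is accepted only when the feature is available.
void demoVectorBool() {
  assert(checkVectorBool(TypeSpec::Int128, true).empty());
  assert(checkVectorBool(TypeSpec::Int128, false).size() == 1);
  assert(checkVectorBool(TypeSpec::Double, true).size() == 1);
}
```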

Added: 
clang/test/Parser/altivec-bool-128.c
clang/test/Parser/cxx-altivec-bool-128.cpp
clang/test/Parser/p10-vector-bool-128.c

Modified: 
clang/include/clang/Basic/DiagnosticSemaKinds.td
clang/lib/Sema/DeclSpec.cpp

Removed: 




diff  --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index 66856834a98f..ec31178389f9 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -259,6 +259,9 @@ def err_invalid_vector_float_decl_spec : Error<
 def err_invalid_vector_double_decl_spec : Error <
   "use of 'double' with '__vector' requires VSX support to be enabled "
   "(available on POWER7 or later)">;
+def err_invalid_vector_bool_int128_decl_spec : Error <
+  "use of '__int128' with '__vector bool' requires VSX support enabled (on "
+  "POWER10 or later)">;
 def err_invalid_vector_long_long_decl_spec : Error <
   "use of 'long long' with '__vector bool' requires VSX support (available on "
   "POWER7 or later) or extended Altivec support (available on POWER8 or later) "

diff  --git a/clang/lib/Sema/DeclSpec.cpp b/clang/lib/Sema/DeclSpec.cpp
index 6ad50e18cd52..f4c30c90ad27 100644
--- a/clang/lib/Sema/DeclSpec.cpp
+++ b/clang/lib/Sema/DeclSpec.cpp
@@ -1150,14 +1150,20 @@ void DeclSpec::Finish(Sema &S, const PrintingPolicy &Policy) {
 S.Diag(TSSLoc, diag::err_invalid_vector_bool_decl_spec)
   << getSpecifierName((TSS)TypeSpecSign);
   }
-
-  // Only char/int are valid with vector bool. (PIM 2.1)
+  // Only char/int are valid with vector bool prior to Power10.
+  // Power10 adds instructions that produce vector bool data
+  // for quadwords as well so allow vector bool __int128.
   if (((TypeSpecType != TST_unspecified) && (TypeSpecType != TST_char) &&
-   (TypeSpecType != TST_int)) || TypeAltiVecPixel) {
+   (TypeSpecType != TST_int) && (TypeSpecType != TST_int128)) ||
+  TypeAltiVecPixel) {
 S.Diag(TSTLoc, diag::err_invalid_vector_bool_decl_spec)
   << (TypeAltiVecPixel ? "__pixel" :
  getSpecifierName((TST)TypeSpecType, Policy));
   }
+  // vector bool __int128 requires Power10.
+  if ((TypeSpecType == TST_int128) &&
+  (!S.Context.getTargetInfo().hasFeature("power10-vector")))
+S.Diag(TSTLoc, diag::err_invalid_vector_bool_int128_decl_spec);
 
   // Only 'short' and 'long long' are valid with vector bool. (PIM 2.1)
   if ((TypeSpecWidth != TSW_unspecified) && (TypeSpecWidth != TSW_short) &&
@@ -1174,7 +1180,7 @@ void DeclSpec::Finish(Sema &S, const PrintingPolicy &Policy) {
 
   // Elements of vector bool are interpreted as unsigned. (PIM 2.1)
   if ((TypeSpecType == TST_char) || (TypeSpecType == TST_int) ||
-  (TypeSpecWidth != TSW_unspecified))
+  (TypeSpecType == TST_int128) || (TypeSpecWidth != TSW_unspecified))
 TypeSpecSign = TSS_unsigned;
 } else if (TypeSpecType == TST_double) {
   // vector long double and vector long long double are never allowed.

diff  --git a/clang/test/Parser/altivec-bool-128.c b/clang/test/Parser/altivec-bool-128.c
new file mode 100644
index ..049ca3cc839d
--- /dev/null
+++ b/clang/test/Parser/altivec-bool-128.c
@@ -0,0 +1,21 @@
+// RUN: %clang_cc1 -triple=powerpc64-unknown-linux-gnu \
+// RUN:-target-feature +altivec -fsyntax-only -verify %s
+// RUN: %clang_cc1 -triple=powerpc64le-unknown-linux-gnu \
+// RUN:-target-feature +altivec -fsyntax-only -verify %s
+// RUN: %clang_cc1 -triple=powerpc64le-unknown-linux-gnu -target-cpu pwr10 \
+// RUN:-target-feature +vsx -target-feature -power10-vector \
+// RUN:-fsyntax-only -verify %s
+
+// Test 'vector bool __int128' type.
+
+// These should have errors.
+__vector bool __int128 v1_bi128;  // expected-error {{use of '__int128' with '__vector bool' requires VSX support enabled (on POWER10 or later)}}
+__vector __bool __int128 v2_bi128;// expected-error {{use of '__int128' with '__vector bool' requires VSX support enabled (on POWER10 or later)}}
+vector bool __int128 v3_bi128;// 

[PATCH] D82425: [SemaCXX] Fix false positive of -Wuninitialized-const-reference in empty function body.

2020-06-23 Thread Zequan Wu via Phabricator via cfe-commits
zequanwu created this revision.
zequanwu added reviewers: hans, nick.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Some libraries use an empty function to suppress unused-variable warnings. 
Calls to such a function now trigger the new `-Wuninitialized-const-reference` 
warning, as discussed at https://reviews.llvm.org/D79895#2107604.
This patch should fix that.
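
A minimal sketch of the idiom in question, modelled on the `ignore_template` helper added to the test in this patch; `ignore_unused`, `addOne`, and `scratch` are hypothetical names chosen for the example.

```cpp
#include <cassert>

// The idiom: an empty function template that takes its argument by const
// reference purely to mark the variable as "used".
template <class T>
inline void ignore_unused(T const &) {}

int addOne(int value) {
  int scratch; // deliberately left uninitialized
  // Binding a reference does not read `scratch`, so the call is harmless,
  // yet it passes an uninitialized variable as a const reference argument.
  ignore_unused(scratch);
  return value + 1;
}

// Usage: the helper has no effect on the result.
void demoIgnoreUnused() { assert(addOne(1) == 2); }
```

Because the callee body is empty, nothing can observe the uninitialized value, which is why the patch skips the `ConstRefUse` classification when the callee has a trivial body.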


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D82425

Files:
  clang/lib/Analysis/UninitializedValues.cpp
  clang/test/SemaCXX/warn-uninitialized-const-reference.cpp


Index: clang/test/SemaCXX/warn-uninitialized-const-reference.cpp
===
--- clang/test/SemaCXX/warn-uninitialized-const-reference.cpp
+++ clang/test/SemaCXX/warn-uninitialized-const-reference.cpp
@@ -9,6 +9,9 @@
   bool operator!=(const A &);
 };
 
+template 
+inline void ignore_template(T const &) {}
+void ignore(const int ) {}
 A const_ref_use_A(const A );
 int const_ref_use(const int );
 A const_use_A(const A a);
@@ -33,4 +36,8 @@
   if (a < 42)
 m = 1;
   const_ref_use(m);
+
+  int l;
+  ignore_template(l);
+  ignore(l);
 }
Index: clang/lib/Analysis/UninitializedValues.cpp
===
--- clang/lib/Analysis/UninitializedValues.cpp
+++ clang/lib/Analysis/UninitializedValues.cpp
@@ -405,6 +405,15 @@
   return QT->isAnyPointerType() && QT->getPointeeType().isConstQualified();
 }
 
+static bool hasTrivialBody(CallExpr *CE) {
+  if (FunctionDecl *fd = CE->getDirectCallee()) {
+if (FunctionTemplateDecl *ftd = fd->getPrimaryTemplate())
+  return ftd->getTemplatedDecl()->hasTrivialBody();
+return fd->hasTrivialBody();
+  }
+  return false;
+}
+
 void ClassifyRefs::VisitCallExpr(CallExpr *CE) {
   // Classify arguments to std::move as used.
   if (CE->isCallToStdMove()) {
@@ -423,7 +432,8 @@
I != E; ++I) {
 if ((*I)->isGLValue()) {
   if ((*I)->getType().isConstQualified())
-classify((*I), ConstRefUse);
+if (!hasTrivialBody(CE))
+  classify((*I), ConstRefUse);
 } else if (isPointerToConst((*I)->getType())) {
   const Expr *Ex = stripCasts(DC->getParentASTContext(), *I);
   const auto *UO = dyn_cast(Ex);


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82346: [WebAssebmly] Fully disable 'protected' visibility

2020-06-23 Thread Sam Clegg via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG5804a8b1228b: [WebAssebmly] Fully disable 
protected visibility (authored by sbc100).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82346/new/

https://reviews.llvm.org/D82346

Files:
  clang/lib/Basic/Targets/WebAssembly.h


Index: clang/lib/Basic/Targets/WebAssembly.h
===
--- clang/lib/Basic/Targets/WebAssembly.h
+++ clang/lib/Basic/Targets/WebAssembly.h
@@ -133,11 +133,7 @@
 
   bool hasExtIntType() const override { return true; }
 
-  bool hasProtectedVisibility() const override {
-// TODO: For now, continue to advertise "protected" support for
-// Emscripten targets.
-return getTriple().isOSEmscripten();
-  }
+  bool hasProtectedVisibility() const override { return false; }
 };
 
 class LLVM_LIBRARY_VISIBILITY WebAssembly32TargetInfo


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] 5804a8b - [WebAssebmly] Fully disable 'protected' visibility

2020-06-23 Thread Sam Clegg via cfe-commits

Author: Sam Clegg
Date: 2020-06-23T17:50:05-07:00
New Revision: 5804a8b1228ba890d48f4085a3a192ef83c73e00

URL: 
https://github.com/llvm/llvm-project/commit/5804a8b1228ba890d48f4085a3a192ef83c73e00
DIFF: 
https://github.com/llvm/llvm-project/commit/5804a8b1228ba890d48f4085a3a192ef83c73e00.diff

LOG: [WebAssebmly] Fully disable 'protected' visibility

Emscripten doesn't use protected visibility either.

Differential Revision: https://reviews.llvm.org/D82346

Added: 


Modified: 
clang/lib/Basic/Targets/WebAssembly.h

Removed: 




diff  --git a/clang/lib/Basic/Targets/WebAssembly.h b/clang/lib/Basic/Targets/WebAssembly.h
index e09e21d90802..77a2fe9ae117 100644
--- a/clang/lib/Basic/Targets/WebAssembly.h
+++ b/clang/lib/Basic/Targets/WebAssembly.h
@@ -133,11 +133,7 @@ class LLVM_LIBRARY_VISIBILITY WebAssemblyTargetInfo : public TargetInfo {
 
   bool hasExtIntType() const override { return true; }
 
-  bool hasProtectedVisibility() const override {
-// TODO: For now, continue to advertise "protected" support for
-// Emscripten targets.
-return getTriple().isOSEmscripten();
-  }
+  bool hasProtectedVisibility() const override { return false; }
 };
 
 class LLVM_LIBRARY_VISIBILITY WebAssembly32TargetInfo



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D79895: Add a new warning to warn when passing uninitialized variables as const reference parameters to a function

2020-06-23 Thread Arthur Eubanks via Phabricator via cfe-commits
aeubanks added a comment.

In D79895#2110345, @nick wrote:

> > We didn't see it in the code bases I work with, so is boost a special case, 
> > or an example of a common practice?
>
> I do not have resources to make such statistics, but there are compilers 
> where casting to void is not enough to suppress the warning. 
> https://herbsutter.com/2009/10/18/mailbag-shutting-up-compiler-warnings/#comment-1509
>  https://godbolt.org/z/pS_iQ3
>
> > If it's just boost, fixing the code seems better
>
> Have you tried to push a fix for a warning in Boost? If only it were that 
> simple. Some of my warning-fixing PRs have gone unmerged for years. And 
> considering that fixing this warning will reintroduce warnings for other 
> compilers, I will probably have bad luck with this one too.
>
> > (it will compile faster too).
>
> Should I open a PR with replacing `std::forward` with `static_cast` because 
> it compiles faster? :-)


Perhaps asking for more opinions in cfe-dev will be better.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79895/new/

https://reviews.llvm.org/D79895



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D79052: [clang codegen] Fix alignment of "Address" for incomplete array pointer.

2020-06-23 Thread Eli Friedman via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGbf8b63ed296c: [clang codegen] Fix alignment of 
Address for incomplete array pointer. (authored by efriedma).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79052/new/

https://reviews.llvm.org/D79052

Files:
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/test/CodeGenCXX/alignment.cpp

Index: clang/test/CodeGenCXX/alignment.cpp
===
--- clang/test/CodeGenCXX/alignment.cpp
+++ clang/test/CodeGenCXX/alignment.cpp
@@ -308,4 +308,20 @@
 D d;
 AlignedArray result = d.bArray;
   }
+
+  // CHECK-LABEL: @_ZN5test11hEPA_NS_1BE
+  void h(B (*b)[]) {
+// CHECK: [[RESULT:%.*]] = alloca [[ARRAY]], align 64
+// CHECK: [[B_P:%.*]] = load [0 x [[B]]]*, [0 x [[B]]]**
+// CHECK: [[ELEMENT_P:%.*]] = getelementptr inbounds [0 x [[B]]], [0 x [[B]]]* [[B_P]], i64 0
+// CHECK: [[ARRAY_P:%.*]] = getelementptr inbounds [[B]], [[B]]* [[ELEMENT_P]], i32 0, i32 2
+// CHECK: [[T0:%.*]] = bitcast [[ARRAY]]* [[RESULT]] to i8*
+// CHECK: [[T1:%.*]] = bitcast [[ARRAY]]* [[ARRAY_P]] to i8*
+// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 64 [[T0]], i8* align 16 [[T1]], i64 16, i1 false)
+AlignedArray result = (*b)->bArray;
+  }
 }
+
+// CHECK-LABEL: @_Z22incomplete_array_derefPA_i
+// CHECK: load i32, i32* {{%.*}}, align 4
+int incomplete_array_deref(int (*p)[]) { return (*p)[2]; }
Index: clang/lib/CodeGen/CodeGenModule.cpp
===
--- clang/lib/CodeGen/CodeGenModule.cpp
+++ clang/lib/CodeGen/CodeGenModule.cpp
@@ -5990,6 +5990,9 @@
   if (TBAAInfo)
 *TBAAInfo = getTBAAAccessInfo(T);
 
+  // FIXME: This duplicates logic in ASTContext::getTypeAlignIfKnown. But
+  // that doesn't return the information we need to compute BaseInfo.
+
   // Honor alignment typedef attributes even on incomplete types.
   // We also honor them straight for C++ class types, even as pointees;
   // there's an expressivity gap here.
@@ -6001,32 +6004,46 @@
 }
   }
 
+  bool AlignForArray = T->isArrayType();
+
+  // Analyze the base element type, so we don't get confused by incomplete
+  // array types.
+  T = getContext().getBaseElementType(T);
+
+  if (T->isIncompleteType()) {
+// We could try to replicate the logic from
+// ASTContext::getTypeAlignIfKnown, but nothing uses the alignment if the
+// type is incomplete, so it's impossible to test. We could try to reuse
+// getTypeAlignIfKnown, but that doesn't return the information we need
+// to set BaseInfo.  So just ignore the possibility that the alignment is
+// greater than one.
+if (BaseInfo)
+  *BaseInfo = LValueBaseInfo(AlignmentSource::Type);
+return CharUnits::One();
+  }
+
   if (BaseInfo)
 *BaseInfo = LValueBaseInfo(AlignmentSource::Type);
 
   CharUnits Alignment;
-  if (T->isIncompleteType()) {
-Alignment = CharUnits::One(); // Shouldn't be used, but pessimistic is best.
+  // For C++ class pointees, we don't know whether we're pointing at a
+  // base or a complete object, so we generally need to use the
+  // non-virtual alignment.
+  const CXXRecordDecl *RD;
+  if (forPointeeType && !AlignForArray && (RD = T->getAsCXXRecordDecl())) {
+Alignment = getClassPointerAlignment(RD);
   } else {
-// For C++ class pointees, we don't know whether we're pointing at a
-// base or a complete object, so we generally need to use the
-// non-virtual alignment.
-const CXXRecordDecl *RD;
-if (forPointeeType && (RD = T->getAsCXXRecordDecl())) {
-  Alignment = getClassPointerAlignment(RD);
-} else {
-  Alignment = getContext().getTypeAlignInChars(T);
-  if (T.getQualifiers().hasUnaligned())
-Alignment = CharUnits::One();
-}
+Alignment = getContext().getTypeAlignInChars(T);
+if (T.getQualifiers().hasUnaligned())
+  Alignment = CharUnits::One();
+  }
 
-// Cap to the global maximum type alignment unless the alignment
-// was somehow explicit on the type.
-if (unsigned MaxAlign = getLangOpts().MaxTypeAlign) {
-  if (Alignment.getQuantity() > MaxAlign &&
-  !getContext().isAlignmentRequired(T))
-Alignment = CharUnits::fromQuantity(MaxAlign);
-}
+  // Cap to the global maximum type alignment unless the alignment
+  // was somehow explicit on the type.
+  if (unsigned MaxAlign = getLangOpts().MaxTypeAlign) {
+if (Alignment.getQuantity() > MaxAlign &&
+!getContext().isAlignmentRequired(T))
+  Alignment = CharUnits::fromQuantity(MaxAlign);
   }
   return Alignment;
 }
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D79895: Add a new warning to warn when passing uninitialized variables as const reference parameters to a function

2020-06-23 Thread Nikita Kniazev via Phabricator via cfe-commits
nick added a comment.

> We didn't see it in the code bases I work with, so is boost a special case, 
> or an example of a common practice?

I do not have resources to make such statistics, but there are compilers where 
casting to void is not enough to suppress the warning. 
https://herbsutter.com/2009/10/18/mailbag-shutting-up-compiler-warnings/#comment-1509
 https://godbolt.org/z/pS_iQ3

> If it's just boost, fixing the code seems better

Have you tried to push a fix for a warning in Boost? If only it were that 
simple. Some of my warning-fixing PRs have gone unmerged for years. And 
considering that fixing this warning will reintroduce warnings for other 
compilers, I will probably have bad luck with this one too.

> (it will compile faster too).

Should I open a PR with replacing `std::forward` with `static_cast` because it 
compiles faster? :-)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79895/new/

https://reviews.llvm.org/D79895



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] bf8b63e - [clang codegen] Fix alignment of "Address" for incomplete array pointer.

2020-06-23 Thread Eli Friedman via cfe-commits

Author: Eli Friedman
Date: 2020-06-23T17:16:17-07:00
New Revision: bf8b63ed296c1ecad03c83b798ffbfa039cbceb4

URL: 
https://github.com/llvm/llvm-project/commit/bf8b63ed296c1ecad03c83b798ffbfa039cbceb4
DIFF: 
https://github.com/llvm/llvm-project/commit/bf8b63ed296c1ecad03c83b798ffbfa039cbceb4.diff

LOG: [clang codegen] Fix alignment of "Address" for incomplete array pointer.

The code was assuming all incomplete types don't have meaningful
alignment, but incomplete arrays do have meaningful alignment.

Fixes https://bugs.llvm.org/show_bug.cgi?id=45710

Differential Revision: https://reviews.llvm.org/D79052
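
To make the commit message concrete: a pointer to an array of unknown bound still points at elements of a complete type, so the element type's alignment is meaningful even though the array type is incomplete. The sketch below is a hedged illustration with hypothetical names (`table`, `tablePtr`, `thirdElement`); it mirrors the shape of `incomplete_array_deref` in the test and says nothing about the IR Clang emits.

```cpp
#include <cassert>

extern int table[];               // declared with the incomplete type int[]
int (*const tablePtr)[] = &table; // pointer to an incomplete array type
int table[4] = {10, 20, 30, 40};  // the definition completes the type later

// The array bound is unknown through tablePtr, but every element is still an
// ordinary int, so an access can assume the alignment of int.
int thirdElement() { return (*tablePtr)[2]; }

// An array is aligned like its element type, whatever its bound.
static_assert(alignof(int[4]) == alignof(int),
              "array alignment follows the element type");

// Usage: reading through the pointer to the incomplete array.
void demoIncompleteArray() { assert(thirdElement() == 30); }
```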

Added: 


Modified: 
clang/lib/CodeGen/CodeGenModule.cpp
clang/test/CodeGenCXX/alignment.cpp

Removed: 




diff  --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index 7a9df700581e..f0ab5165584c 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -5990,6 +5990,9 @@ CharUnits CodeGenModule::getNaturalTypeAlignment(QualType T,
   if (TBAAInfo)
 *TBAAInfo = getTBAAAccessInfo(T);
 
+  // FIXME: This duplicates logic in ASTContext::getTypeAlignIfKnown. But
+  // that doesn't return the information we need to compute BaseInfo.
+
   // Honor alignment typedef attributes even on incomplete types.
   // We also honor them straight for C++ class types, even as pointees;
   // there's an expressivity gap here.
@@ -6001,32 +6004,46 @@ CharUnits 
CodeGenModule::getNaturalTypeAlignment(QualType T,
 }
   }
 
+  bool AlignForArray = T->isArrayType();
+
+  // Analyze the base element type, so we don't get confused by incomplete
+  // array types.
+  T = getContext().getBaseElementType(T);
+
+  if (T->isIncompleteType()) {
+// We could try to replicate the logic from
+// ASTContext::getTypeAlignIfKnown, but nothing uses the alignment if the
+// type is incomplete, so it's impossible to test. We could try to reuse
+// getTypeAlignIfKnown, but that doesn't return the information we need
+// to set BaseInfo.  So just ignore the possibility that the alignment is
+// greater than one.
+if (BaseInfo)
+  *BaseInfo = LValueBaseInfo(AlignmentSource::Type);
+return CharUnits::One();
+  }
+
   if (BaseInfo)
 *BaseInfo = LValueBaseInfo(AlignmentSource::Type);
 
   CharUnits Alignment;
-  if (T->isIncompleteType()) {
-Alignment = CharUnits::One(); // Shouldn't be used, but pessimistic is 
best.
+  // For C++ class pointees, we don't know whether we're pointing at a
+  // base or a complete object, so we generally need to use the
+  // non-virtual alignment.
+  const CXXRecordDecl *RD;
+  if (forPointeeType && !AlignForArray && (RD = T->getAsCXXRecordDecl())) {
+Alignment = getClassPointerAlignment(RD);
   } else {
-// For C++ class pointees, we don't know whether we're pointing at a
-// base or a complete object, so we generally need to use the
-// non-virtual alignment.
-const CXXRecordDecl *RD;
-if (forPointeeType && (RD = T->getAsCXXRecordDecl())) {
-  Alignment = getClassPointerAlignment(RD);
-} else {
-  Alignment = getContext().getTypeAlignInChars(T);
-  if (T.getQualifiers().hasUnaligned())
-Alignment = CharUnits::One();
-}
+Alignment = getContext().getTypeAlignInChars(T);
+if (T.getQualifiers().hasUnaligned())
+  Alignment = CharUnits::One();
+  }
 
-// Cap to the global maximum type alignment unless the alignment
-// was somehow explicit on the type.
-if (unsigned MaxAlign = getLangOpts().MaxTypeAlign) {
-  if (Alignment.getQuantity() > MaxAlign &&
-  !getContext().isAlignmentRequired(T))
-Alignment = CharUnits::fromQuantity(MaxAlign);
-}
+  // Cap to the global maximum type alignment unless the alignment
+  // was somehow explicit on the type.
+  if (unsigned MaxAlign = getLangOpts().MaxTypeAlign) {
+if (Alignment.getQuantity() > MaxAlign &&
+!getContext().isAlignmentRequired(T))
+  Alignment = CharUnits::fromQuantity(MaxAlign);
   }
   return Alignment;
 }

diff  --git a/clang/test/CodeGenCXX/alignment.cpp 
b/clang/test/CodeGenCXX/alignment.cpp
index 37509fcb4dd5..c9378bf20a47 100644
--- a/clang/test/CodeGenCXX/alignment.cpp
+++ b/clang/test/CodeGenCXX/alignment.cpp
@@ -308,4 +308,20 @@ namespace test1 {
 D d;
 AlignedArray result = d.bArray;
   }
+
+  // CHECK-LABEL: @_ZN5test11hEPA_NS_1BE
+  void h(B (*b)[]) {
+// CHECK: [[RESULT:%.*]] = alloca [[ARRAY]], align 64
+// CHECK: [[B_P:%.*]] = load [0 x [[B]]]*, [0 x [[B]]]**
+// CHECK: [[ELEMENT_P:%.*]] = getelementptr inbounds [0 x [[B]]], [0 x 
[[B]]]* [[B_P]], i64 0
+// CHECK: [[ARRAY_P:%.*]] = getelementptr inbounds [[B]], [[B]]* 
[[ELEMENT_P]], i32 0, i32 2
+// CHECK: [[T0:%.*]] = bitcast [[ARRAY]]* [[RESULT]] to i8*
+// CHECK: [[T1:%.*]] = bitcast [[ARRAY]]* [[ARRAY_P]] to i8*
+// CHECK: call void 

[clang] d144601 - DR458: Search template parameter scopes in the right order.

2020-06-23 Thread Richard Smith via cfe-commits

Author: Richard Smith
Date: 2020-06-23T17:14:33-07:00
New Revision: d1446017f3fdc2f6a9efba222008d20afa1e26cc

URL: 
https://github.com/llvm/llvm-project/commit/d1446017f3fdc2f6a9efba222008d20afa1e26cc
DIFF: 
https://github.com/llvm/llvm-project/commit/d1446017f3fdc2f6a9efba222008d20afa1e26cc.diff

LOG: DR458: Search template parameter scopes in the right order.

C++ unqualified name lookup searches template parameter scopes
immediately after finishing searching the entity the parameters belong
to. (Eg, for a class template, you search the template parameter scope
after looking in that class template and its base classes and before
looking in the scope containing the class template.) This is complicated
by the fact that scope lookup within a template parameter scope looks in
a different sequence of places prior to reaching the end of the
declarator-id in the template declaration.

We used to approximate the proper lookup rule with a hack in the scope /
decl context walk inside name lookup. Now we instead compute the lookup
parent for each template parameter scope.

In order to get this right, we now make sure to enter a distinct Scope
for each template parameter scope, and make sure to re-enter the
enclosing class scopes properly when handling delay-parsed regions
within a class.
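
The lookup order described above can be observed with a small example. This is
a sketch rather than one of the tests touched by the commit, and `Base`,
`Derived` and `baseMemberHidesTemplateParameter` are hypothetical names.
Because the class template and its non-dependent base classes are searched
before the template parameter scope, a base class member named `T` hides the
template parameter `T` inside the class:

```cpp
#include <type_traits>

struct Base {
  using T = long; // member of a non-dependent base class
};

template <typename T> struct Derived : Base {
  // Unqualified lookup for T searches Derived and Base before the template
  // parameter scope, so this T names Base::T, not the template parameter.
  T Value;
};

constexpr bool baseMemberHidesTemplateParameter() {
  return std::is_same<decltype(Derived<char>::Value), long>::value;
}

static_assert(baseMemberHidesTemplateParameter(),
              "Value is declared with Base::T, not the template parameter");
```

Instantiating `Derived<char>` therefore yields a `long` member, assuming a
compiler that follows the standard lookup rule.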

Added: 


Modified: 
clang/include/clang/Parse/Parser.h
clang/include/clang/Sema/Scope.h
clang/include/clang/Sema/Sema.h
clang/lib/AST/DeclBase.cpp
clang/lib/Parse/ParseCXXInlineMethods.cpp
clang/lib/Parse/ParseDeclCXX.cpp
clang/lib/Parse/ParseExprCXX.cpp
clang/lib/Parse/ParseOpenMP.cpp
clang/lib/Parse/ParseTemplate.cpp
clang/lib/Sema/SemaDecl.cpp
clang/lib/Sema/SemaDeclCXX.cpp
clang/lib/Sema/SemaLookup.cpp
clang/lib/Sema/SemaTemplate.cpp
clang/test/CXX/drs/dr4xx.cpp
clang/test/CXX/temp/temp.res/temp.local/p8.cpp
clang/test/SemaCXX/lambda-expressions.cpp
clang/www/cxx_dr_status.html

Removed: 




diff  --git a/clang/include/clang/Parse/Parser.h 
b/clang/include/clang/Parse/Parser.h
index 1ae219781c69..dda6db131240 100644
--- a/clang/include/clang/Parse/Parser.h
+++ b/clang/include/clang/Parse/Parser.h
@@ -1088,12 +1088,40 @@ class Parser : public CodeCompletionHandler {
 }
   };
 
+  /// Introduces zero or more scopes for parsing. The scopes will all be exited
+  /// when the object is destroyed.
+  class MultiParseScope {
+Parser &Self;
+unsigned NumScopes = 0;
+
+MultiParseScope(const MultiParseScope&) = delete;
+
+  public:
+MultiParseScope(Parser &Self) : Self(Self) {}
+void Enter(unsigned ScopeFlags) {
+  Self.EnterScope(ScopeFlags);
+  ++NumScopes;
+}
+void Exit() {
+  while (NumScopes) {
+Self.ExitScope();
+--NumScopes;
+  }
+}
+~MultiParseScope() {
+  Exit();
+}
+  };
+
   /// EnterScope - Start a new scope.
   void EnterScope(unsigned ScopeFlags);
 
   /// ExitScope - Pop a scope off the scope stack.
   void ExitScope();
 
+  /// Re-enter the template scopes for a declaration that might be a template.
+  unsigned ReenterTemplateScopes(MultiParseScope , Decl *D);
+
 private:
   /// RAII object used to modify the scope flags for the current scope.
   class ParseScopeFlags {
@@ -1278,13 +1306,7 @@ class Parser : public CodeCompletionHandler {
 Decl *D;
 CachedTokens Toks;
 
-/// Whether this member function had an associated template
-/// scope. When true, D is a template declaration.
-/// otherwise, it is a member function declaration.
-bool TemplateScope;
-
-explicit LexedMethod(Parser* P, Decl *MD)
-  : Self(P), D(MD), TemplateScope(false) {}
+explicit LexedMethod(Parser *P, Decl *MD) : Self(P), D(MD) {}
 
 void ParseLexedMethodDefs() override;
   };
@@ -1314,8 +1336,7 @@ class Parser : public CodeCompletionHandler {
   /// argument (C++ [class.mem]p2).
   struct LateParsedMethodDeclaration : public LateParsedDeclaration {
 explicit LateParsedMethodDeclaration(Parser *P, Decl *M)
-  : Self(P), Method(M), TemplateScope(false),
-ExceptionSpecTokens(nullptr) {}
+: Self(P), Method(M), ExceptionSpecTokens(nullptr) {}
 
 void ParseLexedMethodDeclarations() override;
 
@@ -1324,11 +1345,6 @@ class Parser : public CodeCompletionHandler {
 /// Method - The method declaration.
 Decl *Method;
 
-/// Whether this member function had an associated template
-/// scope. When true, D is a template declaration.
-/// otherwise, it is a member function declaration.
-bool TemplateScope;
-
 /// DefaultArgs - Contains the parameters of the function and
 /// their default arguments. At least one of the parameters will
 /// have a default argument, but all of the parameters of the
@@ -1373,18 +1389,13 @@ class Parser : public CodeCompletionHandler {
   /// parsed after the corresponding top-level class is complete.
   

[PATCH] D82398: [MSAN] Handle x86 {round,min,max}sd intrinsics

2020-06-23 Thread Evgenii Stepanov via Phabricator via cfe-commits
eugenis accepted this revision.
eugenis added a comment.
This revision is now accepted and ready to land.

LGTM




Comment at: llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp:3077
+Value *LowShadow = IRB.CreateOr(LowA, LowB);
+Value *Shadow = IRB.CreateInsertElement(Second, LowShadow, 
IRB.getInt32(0));
+

eugenis wrote:
> guiand wrote:
> > eugenis wrote:
> > > You probably want to insert in First, not Second.
> > > 
> > > Is the generated code any better if you OR the vectors, and then shuffle 
> > > to put the top element of First into the top element of the output? 
> > > That's what LLVM generates if I express this logic in C.
> > > 
> > > 
> > The codegen is basically identical either way, but if you'd like I can 
> > still upload a patch to change these into shufflevector instructions.
> This is much better.
> Use makeArrayRef<int>({2, 1}).
llvm:: is unnecessary, and <int> probably is too
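
For readers following the shuffle discussion, the shadow propagation can be
modelled outside LLVM. The following is a sketch under stated assumptions, not
the MemorySanitizer code: `Shadow2`, `shuffle2`, `unarySdShadow` and
`binarySdShadow` are hypothetical names, and plain arrays stand in for IR
values. With a mask of {2, 1}, lane 0 of the result comes from lane 0 of the
second input and lane 1 comes from lane 1 of the first input:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Shadow2 = std::array<std::uint64_t, 2>; // shadow of a <2 x double>

// Two-input shuffle with shufflevector-style indexing:
// mask values 0..1 select lanes of A, values 2..3 select lanes of B.
Shadow2 shuffle2(const Shadow2 &A, const Shadow2 &B,
                 const std::array<int, 2> &Mask) {
  Shadow2 Result{};
  for (std::size_t Lane = 0; Lane < 2; ++Lane) {
    const std::size_t Index = static_cast<std::size_t>(Mask[Lane]);
    Result[Lane] = Index < 2 ? A[Index] : B[Index - 2];
  }
  return Result;
}

// round.sd-style: low lane from Second, high lane from First.
Shadow2 unarySdShadow(const Shadow2 &First, const Shadow2 &Second) {
  return shuffle2(First, Second, {{2, 1}});
}

// min.sd/max.sd-style: low lane depends on both operands, high lane on First.
Shadow2 binarySdShadow(const Shadow2 &First, const Shadow2 &Second) {
  const Shadow2 Or = {{First[0] | Second[0], First[1] | Second[1]}};
  return shuffle2(First, Or, {{2, 1}});
}
```

In this model a poisoned high lane in the second operand never reaches the
result, which matches the intent that only its low element is consumed.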


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82398/new/

https://reviews.llvm.org/D82398





[PATCH] D82398: [MSAN] Handle x86 {round,min,max}sd intrinsics

2020-06-23 Thread Gui Andrade via Phabricator via cfe-commits
guiand updated this revision to Diff 272855.
guiand added a comment.

Addressed comments


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82398/new/

https://reviews.llvm.org/D82398

Files:
  llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
  llvm/test/Instrumentation/MemorySanitizer/vector_sd.ll


Index: llvm/test/Instrumentation/MemorySanitizer/vector_sd.ll
===
--- /dev/null
+++ llvm/test/Instrumentation/MemorySanitizer/vector_sd.ll
@@ -0,0 +1,37 @@
+; RUN: opt < %s -msan-check-access-address=0 -S -passes=msan 2>&1 | FileCheck  
\
+; RUN: %s
+; RUN: opt < %s -msan -msan-check-access-address=0 -S | FileCheck %s
+; REQUIRES: x86-registered-target
+
+target datalayout = 
"e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+declare <2 x double> @llvm.x86.sse41.round.sd(<2 x double>, <2 x double>, i32) 
nounwind readnone
+declare <2 x double> @llvm.x86.sse2.min.sd(<2 x double>, <2 x double>) 
nounwind readnone
+
+define <2 x double> @test_sse_round_sd(<2 x double> %op1, <2 x double> %op2) 
sanitize_memory {
+entry:
+  ; CHECK: [[OP2_SHADOW:%[0-9]+]] = load {{.*}} @__msan_param_tls
+  ; CHECK: [[OP1_SHADOW:%[0-9]+]] = load {{.*}} @__msan_param_tls
+  ; CHECK: [[OUT_SHADOW:%.+]] = shufflevector <2 x i64> [[OP1_SHADOW]], <2 x 
i64> [[OP2_SHADOW]], <2 x i32> <i32 2, i32 1>
+  ; CHECK-NOT: call void @msan_warning
+  ; CHECK: call <2 x double> @llvm.x86.sse41.round.sd
+  %0 = tail call <2 x double> @llvm.x86.sse41.round.sd(<2 x double> %op1, <2 x 
double> %op2, i32 0)
+  ; CHECK: store <2 x i64> [[OUT_SHADOW]], {{.*}} @__msan_retval_tls
+  ; CHECK: ret <2 x double>
+  ret <2 x double> %0
+}
+
+define <2 x double> @test_sse_min_sd(<2 x double> %op1, <2 x double> %op2) 
sanitize_memory {
+entry:
+  ; CHECK: [[OP2_SHADOW:%[0-9]+]] = load {{.*}} @__msan_param_tls
+  ; CHECK: [[OP1_SHADOW:%[0-9]+]] = load {{.*}} @__msan_param_tls
+  ; CHECK: [[OR_SHADOW:%.+]] = or <2 x i64> [[OP1_SHADOW]], [[OP2_SHADOW]]
+  ; CHECK: [[OUT_SHADOW_VEC:%.+]] = shufflevector <2 x i64> [[OP1_SHADOW]], <2 
x i64> [[OR_SHADOW]], <2 x i32> <i32 2, i32 1>
+  ; CHECK-NOT: call void @msan_warning
+  ; CHECK: call <2 x double> @llvm.x86.sse2.min.sd
+  %0 = tail call <2 x double> @llvm.x86.sse2.min.sd(<2 x double> %op1, <2 x 
double> %op2)
+  ; CHECK: store <2 x i64> [[OUT_SHADOW_VEC]], {{.*}} @__msan_retval_tls
+  ; CHECK: ret <2 x double>
+  ret <2 x double> %0
+}
Index: llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
===
--- llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
+++ llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
@@ -3054,6 +3054,32 @@
 SOC.Done();
   }
 
+  // Instrument _mm_*_sd intrinsics
+  void handleUnarySdIntrinsic(IntrinsicInst &I) {
+IRBuilder<> IRB(&I);
+Value *First = getShadow(&I, 0);
+Value *Second = getShadow(&I, 1);
+// High word of first operand, low word of second
+Value *Shadow =
+IRB.CreateShuffleVector(First, Second, llvm::makeArrayRef<int>({2, 1}));
+
+setShadow(&I, Shadow);
+setOriginForNaryOp(I);
+  }
+
+  void handleBinarySdIntrinsic(IntrinsicInst &I) {
+IRBuilder<> IRB(&I);
+Value *First = getShadow(&I, 0);
+Value *Second = getShadow(&I, 1);
+Value *OrShadow = IRB.CreateOr(First, Second);
+// High word of first operand, low word of both OR'd together
+Value *Shadow = IRB.CreateShuffleVector(First, OrShadow,
+llvm::makeArrayRef<int>({2, 1}));
+
+setShadow(&I, Shadow);
+setOriginForNaryOp(I);
+  }
+
   void visitIntrinsicInst(IntrinsicInst &I) {
 switch (I.getIntrinsicID()) {
 case Intrinsic::lifetime_start:
@@ -3293,6 +3319,14 @@
   handlePclmulIntrinsic(I);
   break;
 
+case Intrinsic::x86_sse41_round_sd:
+  handleUnarySdIntrinsic(I);
+  break;
+case Intrinsic::x86_sse2_max_sd:
+case Intrinsic::x86_sse2_min_sd:
+  handleBinarySdIntrinsic(I);
+  break;
+
 case Intrinsic::is_constant:
   // The result of llvm.is.constant() is always defined.
   setShadow(&I, getCleanShadow(&I));



[PATCH] D82398: [MSAN] Handle x86 {round,min,max}sd intrinsics

2020-06-23 Thread Gui Andrade via Phabricator via cfe-commits
guiand marked 4 inline comments as done.
guiand added inline comments.



Comment at: llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp:3331
+  // case Intrinsic::x86_avx512_mask_sub_sd_round:
+  // case Intrinsic::x86_avx512_mask_mul_sd_round:
+  // case Intrinsic::x86_avx512_mask_div_sd_round:

eugenis wrote:
> Unrelated change.
Whoops!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82398/new/

https://reviews.llvm.org/D82398





[PATCH] D81678: Introduce frozen attribute at call sites for stricter poison analysis

2020-06-23 Thread Eli Friedman via Phabricator via cfe-commits
efriedma added a comment.

In D81678#2109059, @nlopes wrote:

> I'm a bit concerned with this patch as it increases the amount of UB that 
> LLVM exploits without any study of the impact.
>  For example, right now it's OK to do this with clang (not with constants; make 
> it less trivial so clang doesn't fold it right away):
>
>   int f() { return INT_MAX + 1; }
>
> While technically this is UB in C, when lowered to LLVM IR, this function 
> returns poison.
>  When the frozen attribute is attached, the function will now trigger UB in 
> LLVM IR as well.
>  Is this what we want? It would be worthwhile to at least compile a few 
> programs to check if they break.


Even if we say that it's undefined behavior, we don't have to start converting 
"ret undef" to "unreachable".  I mean, it's something we could consider doing, 
but we don't have to immediately start doing it just because the attribute 
exists.  I expect that just hooking up the attribute to 
isGuaranteedNotToBeUndefOrPoison() will have almost no immediate effect.

> Also, what's the plan to detect these cases in ubsan?

I don't think this has any practical impact on our goals with sanitizers.  We 
should detect undefined behavior before it gets to the point of actually 
passing or returning an undef or poison value.
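
As an illustration of catching the problem before a poison value exists, here
is a minimal sketch. It is not how ubsan is implemented, and `checkedAdd` is a
hypothetical helper. The operands are inspected first, so the overflowing
addition is never executed:

```cpp
#include <cassert>
#include <climits>

// Returns false instead of performing a signed addition that would overflow.
// On success the sum is written to Out; on failure Out is left untouched.
bool checkedAdd(int A, int B, int &Out) {
  if ((B > 0 && A > INT_MAX - B) || (B < 0 && A < INT_MIN - B))
    return false; // the addition would overflow, so do not perform it
  Out = A + B;
  return true;
}
```

With this shape, `checkedAdd(INT_MAX, 1, R)` reports failure rather than
computing a value that would be poison once lowered to an `add nsw`.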

> Note that pure function calls with 'frozen' arguments become harder to hoist 
> from loops, for example. Since now calling this pure function can trigger UB 
> if one of the arguments is poison. You would need to introduce frozen 
> beforehand. I don't see this issue addressed in this patch.

A pure function can have undefined behavior, in general.

I guess the interaction between "speculatable" and "frozen" is a little weird.  
We don't have any optimizations that infer either "speculatable" or "frozen", 
though, so I'm not sure there's any practical impact here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81678/new/

https://reviews.llvm.org/D81678





[PATCH] D81678: Introduce frozen attribute at call sites for stricter poison analysis

2020-06-23 Thread Evgenii Stepanov via Phabricator via cfe-commits
eugenis added a comment.

>> Also, what's the plan to detect these cases in ubsan?
> 
> I don't think this has any practical impact on our goals with sanitizers.  We 
> should detect undefined behavior before it gets to the point of actually 
> passing or returning an undef or poison value.

MSan will take advantage of this by validating no-undef at runtime before the 
function call. This will detect more bugs, as well as detect existing bugs 
earlier, making them easier to reason about.

I'm not sure how we could use this information in ubsan.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81678/new/

https://reviews.llvm.org/D81678





[PATCH] D81678: Introduce frozen attribute at call sites for stricter poison analysis

2020-06-23 Thread Johannes Doerfert via Phabricator via cfe-commits
jdoerfert added inline comments.



Comment at: clang/lib/CodeGen/CGCall.cpp:4095
+}
+  }
   call->setCallingConv(getRuntimeCC());

eugenis wrote:
> guiand wrote:
> > jdoerfert wrote:
> > > Why would we do this? Function attributes are valid at the call site, no 
> > > need to copy them.
> > Do you mean that for some definition: `define @foo(i32 frozen %a, i32 
> > frozen %b)`, it's valid to issue a call instruction like `call @foo(i32 %a, 
> > i32 %b)` and its operands will be correctly identified as `frozen`? That's 
> > the kind of behavior I was seeing and I wasn't sure if it was an error.
> If you see CallBase::paramHasAttr, function definition attributes will be 
> taken into account when available.
> 
> I'm not sure what happens for indirect calls that do not have a Callee - we 
> need to make sure that the frontend emits the callsite attributes that match 
> the signature of the call.
> If you see CallBase::paramHasAttr, function definition attributes will be 
> taken into account when available.

Yes :)

> I'm not sure what happens for indirect calls that do not have a Callee - we 
> need to make sure that the frontend emits the callsite attributes that match 
> the signature of the call.

This would not trigger for indirect calls, `f` would be `nullptr`.
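
The lookup behaviour discussed here can be pictured with a toy model. This is
a sketch and not the LLVM API: `ToyFunction` and `ToyCall` are hypothetical
types, and only the fallback rule (call-site attributes first, then the
callee's parameter attributes when a direct callee is known) is taken from the
discussion above:

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

using AttrSet = std::set<std::string>;

struct ToyFunction {
  std::vector<AttrSet> ParamAttrs; // attributes on the declared parameters
};

struct ToyCall {
  const ToyFunction *Callee;     // nullptr models an indirect call
  std::vector<AttrSet> ArgAttrs; // attributes written at the call site

  // Check the call site first, then fall back to the callee if it is known.
  bool paramHasAttr(std::size_t ArgNo, const std::string &Kind) const {
    if (ArgNo < ArgAttrs.size() && ArgAttrs[ArgNo].count(Kind) != 0)
      return true;
    if (Callee && ArgNo < Callee->ParamAttrs.size())
      return Callee->ParamAttrs[ArgNo].count(Kind) != 0;
    return false;
  }
};
```

In this model a direct call inherits `frozen` from the callee, while an
indirect call only sees what the frontend wrote at the call site.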


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81678/new/

https://reviews.llvm.org/D81678





[clang] 4f5f6c1 - Move late-parsed class member attribute handling adjacent to all the

2020-06-23 Thread Richard Smith via cfe-commits

Author: Richard Smith
Date: 2020-06-23T15:43:11-07:00
New Revision: 4f5f6c1b83cb60354b7b4dea8fc7da561b6758fd

URL: 
https://github.com/llvm/llvm-project/commit/4f5f6c1b83cb60354b7b4dea8fc7da561b6758fd
DIFF: 
https://github.com/llvm/llvm-project/commit/4f5f6c1b83cb60354b7b4dea8fc7da561b6758fd.diff

LOG: Move late-parsed class member attribute handling adjacent to all the
other late-parsed class component handling.

No functionality change intended.

Added: 


Modified: 
clang/lib/Parse/ParseCXXInlineMethods.cpp
clang/lib/Parse/ParseDecl.cpp

Removed: 




diff  --git a/clang/lib/Parse/ParseCXXInlineMethods.cpp 
b/clang/lib/Parse/ParseCXXInlineMethods.cpp
index 06e02ed0e11c..7234a656e9d1 100644
--- a/clang/lib/Parse/ParseCXXInlineMethods.cpp
+++ b/clang/lib/Parse/ParseCXXInlineMethods.cpp
@@ -225,6 +225,7 @@ Parser::LateParsedDeclaration::~LateParsedDeclaration() {}
 void Parser::LateParsedDeclaration::ParseLexedMethodDeclarations() {}
 void Parser::LateParsedDeclaration::ParseLexedMemberInitializers() {}
 void Parser::LateParsedDeclaration::ParseLexedMethodDefs() {}
+void Parser::LateParsedDeclaration::ParseLexedAttributes() {}
 void Parser::LateParsedDeclaration::ParseLexedPragmas() {}
 
 Parser::LateParsedClass::LateParsedClass(Parser *P, ParsingClass *C)
@@ -246,6 +247,10 @@ void Parser::LateParsedClass::ParseLexedMethodDefs() {
   Self->ParseLexedMethodDefs(*Class);
 }
 
+void Parser::LateParsedClass::ParseLexedAttributes() {
+  Self->ParseLexedAttributes(*Class);
+}
+
 void Parser::LateParsedClass::ParseLexedPragmas() {
   Self->ParseLexedPragmas(*Class);
 }
@@ -262,6 +267,10 @@ void 
Parser::LateParsedMemberInitializer::ParseLexedMemberInitializers() {
   Self->ParseLexedMemberInitializer(*this);
 }
 
+void Parser::LateParsedAttribute::ParseLexedAttributes() {
+  Self->ParseLexedAttribute(*this, true, false);
+}
+
 void Parser::LateParsedPragma::ParseLexedPragmas() {
   Self->ParseLexedPragma(*this);
 }
@@ -662,6 +671,141 @@ void 
Parser::ParseLexedMemberInitializer(LateParsedMemberInitializer ) {
 ConsumeAnyToken();
 }
 
+/// Wrapper class which calls ParseLexedAttribute, after setting up the
+/// scope appropriately.
+void Parser::ParseLexedAttributes(ParsingClass &Class) {
+  // Deal with templates
+  // FIXME: Test cases to make sure this does the right thing for templates.
+  bool HasTemplateScope = !Class.TopLevelClass && Class.TemplateScope;
+  ParseScope ClassTemplateScope(this, Scope::TemplateParamScope,
+HasTemplateScope);
+  if (HasTemplateScope)
+Actions.ActOnReenterTemplateScope(getCurScope(), Class.TagOrTemplate);
+
+  // Set or update the scope flags.
+  bool AlreadyHasClassScope = Class.TopLevelClass;
+  unsigned ScopeFlags = Scope::ClassScope|Scope::DeclScope;
+  ParseScope ClassScope(this, ScopeFlags, !AlreadyHasClassScope);
+  ParseScopeFlags ClassScopeFlags(this, ScopeFlags, AlreadyHasClassScope);
+
+  // Enter the scope of nested classes
+  if (!AlreadyHasClassScope)
+Actions.ActOnStartDelayedMemberDeclarations(getCurScope(),
+Class.TagOrTemplate);
+  if (!Class.LateParsedDeclarations.empty()) {
+for (unsigned i = 0, ni = Class.LateParsedDeclarations.size(); i < ni; 
++i){
+  Class.LateParsedDeclarations[i]->ParseLexedAttributes();
+}
+  }
+
+  if (!AlreadyHasClassScope)
+Actions.ActOnFinishDelayedMemberDeclarations(getCurScope(),
+ Class.TagOrTemplate);
+}
+
+/// Parse all attributes in LAs, and attach them to Decl D.
+void Parser::ParseLexedAttributeList(LateParsedAttrList &LAs, Decl *D,
+ bool EnterScope, bool OnDefinition) {
+  assert(LAs.parseSoon() &&
+ "Attribute list should be marked for immediate parsing.");
+  for (unsigned i = 0, ni = LAs.size(); i < ni; ++i) {
+if (D)
+  LAs[i]->addDecl(D);
+ParseLexedAttribute(*LAs[i], EnterScope, OnDefinition);
+delete LAs[i];
+  }
+  LAs.clear();
+}
+
+/// Finish parsing an attribute for which parsing was delayed.
+/// This will be called at the end of parsing a class declaration
+/// for each LateParsedAttribute. We consume the saved tokens and
+/// create an attribute with the arguments filled in. We add this
+/// to the Attribute list for the decl.
+void Parser::ParseLexedAttribute(LateParsedAttribute &LA,
+ bool EnterScope, bool OnDefinition) {
+  // Create a fake EOF so that attribute parsing won't go off the end of the
+  // attribute.
+  Token AttrEnd;
+  AttrEnd.startToken();
+  AttrEnd.setKind(tok::eof);
+  AttrEnd.setLocation(Tok.getLocation());
+  AttrEnd.setEofData(LA.Toks.data());
+  LA.Toks.push_back(AttrEnd);
+
+  // Append the current token at the end of the new token stream so that it
+  // doesn't get lost.
+  LA.Toks.push_back(Tok);
+  PP.EnterTokenStream(LA.Toks, true, 

[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-23 Thread Eli Friedman via Phabricator via cfe-commits
efriedma added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:825
+  // The local stack holds all alloca instructions and all byval arguments.
+  AllocaDerivedValueTracker Tracker;
+  for (Argument  : F.args()) {

avl wrote:
> efriedma wrote:
> > avl wrote:
> > > efriedma wrote:
> > > > avl wrote:
> > > > > efriedma wrote:
> > > > > > avl wrote:
> > > > > > > efriedma wrote:
> > > > > > > > avl wrote:
> > > > > > > > > efriedma wrote:
> > > > > > > > > > avl wrote:
> > > > > > > > > > > efriedma wrote:
> > > > > > > > > > > > Do you have to redo the AllocaDerivedValueTracker 
> > > > > > > > > > > > analysis?  Is it not enough that the call you're trying 
> > > > > > > > > > > > to TRE is marked "tail"?
> > > > > > > > > > > >Do you have to redo the AllocaDerivedValueTracker 
> > > > > > > > > > > >analysis?
> > > > > > > > > > > 
> > > > > > > > > > > AllocaDerivedValueTracker analysis(done in markTails) 
> > > > > > > > > > > could be reused here. 
> > > > > > > > > > > But marking, done in markTails(), looks like separate 
> > > > > > > > > > > tasks. i.e. it is better 
> > > > > > > > > > > to make TRE not depending on markTails(). There is a 
> > > > > > > > > > > review for this - https://reviews.llvm.org/D60031
> > > > > > > > > > > Thus such separation looks useful(To not reuse result of 
> > > > > > > > > > > markTails but have it computed inplace).
> > > > > > > > > > > 
> > > > > > > > > > > > Is it not enough that the call you're trying to TRE is 
> > > > > > > > > > > > marked "tail"?
> > > > > > > > > > > 
> > > > > > > > > > > It is not enough that call which is subject to TRE is 
> > > > > > > > > > > marked "Tail".
> > > > > > > > > > > It also should be checked that other calls does not 
> > > > > > > > > > > capture pointer to local stack: 
> > > > > > > > > > > 
> > > > > > > > > > > ```
> > > > > > > > > > > // do not do TRE if any pointer to local stack has 
> > > > > > > > > > > escaped.
> > > > > > > > > > > if (!Tracker.EscapePoints.empty())
> > > > > > > > > > >return false;
> > > > > > > > > > > 
> > > > > > > > > > > ```
> > > > > > > > > > > 
> > > > > > > > > > > It is not enough that call which is subject to TRE is 
> > > > > > > > > > > marked "Tail". It also should be checked that other calls 
> > > > > > > > > > > does not capture pointer to local stack:
> > > > > > > > > > 
> > > > > > > > > > If there's an escaped pointer to the local stack, we 
> > > > > > > > > > wouldn't infer "tail" in the first place, would we?
> > > > > > > > > If function receives pointer to alloca then it would not be 
> > > > > > > > > marked with "Tail". Then we do not have a possibility to 
> > > > > > > > > understand whether this function receives pointer to alloca 
> > > > > > > > > but does not capture it:
> > > > > > > > > 
> > > > > > > > > ```
> > > > > > > > > void test(int recurseCount)
> > > > > > > > > {
> > > > > > > > > if (recurseCount == 0) return;
> > > > > > > > > int temp = 10;
> > > > > > > > > > globalIncrement(&temp);
> > > > > > > > > test(recurseCount - 1);
> > > > > > > > > }
> > > > > > > > > ```
> > > > > > > > > 
> > > > > > > > > test - marked with Tail.
> > > > > > > > > globalIncrement - not marked with Tail. But TRE could be done 
> > > > > > > > > since it does not capture pointer. But if it will capture the 
> > > > > > > > > pointer then we could not do TRE. So we need to check 
> > > > > > > > > !Tracker.EscapePoints.empty().
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > test - marked with Tail.
> > > > > > > > 
> > > > > > > > For the given code, TRE won't mark the recursive call "tail".  
> > > > > > > > That transform isn't legal: the recursive call could access the 
> > > > > > > > caller's version of "temp".
> > > > > > > >For the given code, TRE won't mark the recursive call "tail". 
> > > > > > > >That transform isn't legal: the recursive call could access the 
> > > > > > > >caller's version of "temp".
> > > > > > > 
> > > > > > > it looks like recursive call could NOT access the caller's 
> > > > > > > version of "temp":
> > > > > > > 
> > > > > > > ```
> > > > > > > test(recurseCount - 1);
> > > > > > > ```
> > > > > > > 
> > > > > > > Caller`s version of temp is accessed by non-recursive call:
> > > > > > > 
> > > > > > > ```
> > > > > > > > globalIncrement(&temp);
> > > > > > > > ```
> > > > > > > > 
> > > > > > > > If globalIncrement does not capture the "&temp" then TRE looks to 
> > > > > > > be legal for that case. 
> > > > > > > 
> > > > > > > globalIncrement() would not be marked with "Tail". test() would 
> > > > > > > be marked with Tail.
> > > > > > > 
> > > > > > > Thus the pre-requisite for TRE would be: tail-recursive call must 
> > > > > > > not receive pointer to local stack(Tail) and non-recursive calls 
> > > > > > > must not capture the pointer to local stack.
> > > > > > Can you give a complete IR example where we infer "tail", but 

[PATCH] D82398: [MSAN] Handle x86 {round,min,max}sd intrinsics

2020-06-23 Thread Evgenii Stepanov via Phabricator via cfe-commits
eugenis added inline comments.



Comment at: llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp:3077
+Value *LowShadow = IRB.CreateOr(LowA, LowB);
+Value *Shadow = IRB.CreateInsertElement(Second, LowShadow, 
IRB.getInt32(0));
+

guiand wrote:
> eugenis wrote:
> > You probably want to insert in First, not Second.
> > 
> > Is the generated code any better if you OR the vectors, and then shuffle to 
> > put the top element of First into the top element of the output? That's 
> > what LLVM generates if I express this logic in C.
> > 
> > 
> The codegen is basically identical either way, but if you'd like I can still 
> upload a patch to change these into shufflevector instructions.
This is much better.
Use makeArrayRef<int>({2, 1}).



Comment at: llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp:3331
+  // case Intrinsic::x86_avx512_mask_sub_sd_round:
+  // case Intrinsic::x86_avx512_mask_mul_sd_round:
+  // case Intrinsic::x86_avx512_mask_div_sd_round:

Unrelated change.



Comment at: llvm/test/Instrumentation/MemorySanitizer/vector_sd.ll:17
+  ; CHECK: [[OP2VEC_SHADOW:%.+]] = insertelement <2 x i64> , 
i64 [[LOWVAL_SHADOW]], i32 0
+  %op2vec = insertelement <2 x double> undef, double %lowval, i32 0
+  ; CHECK: [[OP1VEC_SHADOW:%.+]] = insertelement <2 x i64> , 
i64 [[HIVAL_SHADOW]], i32 1

pass the vectors as arguments, it will make the test case a lot simpler


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82398/new/

https://reviews.llvm.org/D82398





[PATCH] D72841: Add support for pragma float_control, to control precision and exception behavior at the source level

2020-06-23 Thread Changpeng Fang via Phabricator via cfe-commits
cfang added a comment.

The -ffast-math flag gets lost in the Builder after this change.

FMF.isFast() is true before updateFastMathFlags(FMF, FPFeatures), but turns 
false after. 
It seems Builder.FMF was set correctly before, but it is not clear to me what 
FPFeatures should be at this point:

+static void setBuilderFlagsFromFPFeatures(CGBuilderTy &Builder,
+  CodeGenFunction &CGF,
+  FPOptions FPFeatures) {
+  auto NewRoundingBehavior = FPFeatures.getRoundingMode();
+  Builder.setDefaultConstrainedRounding(NewRoundingBehavior);
+  auto NewExceptionBehavior =
+  ToConstrainedExceptMD(FPFeatures.getExceptionMode());
+  Builder.setDefaultConstrainedExcept(NewExceptionBehavior);
+  auto FMF = Builder.getFastMathFlags();
+  updateFastMathFlags(FMF, FPFeatures);
+  Builder.setFastMathFlags(FMF);
+  assert((CGF.CurFuncDecl == nullptr || Builder.getIsFPConstrained() ||
+  isa(CGF.CurFuncDecl) ||
+  isa(CGF.CurFuncDecl) ||
+  (NewExceptionBehavior == llvm::fp::ebIgnore &&
+   NewRoundingBehavior == llvm::RoundingMode::NearestTiesToEven)) &&
+ "FPConstrained should be enabled on entire function");
+}


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72841/new/

https://reviews.llvm.org/D72841





[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-23 Thread Alexey Lapshin via Phabricator via cfe-commits
avl marked an inline comment as done.
avl added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:825
+  // The local stack holds all alloca instructions and all byval arguments.
+  AllocaDerivedValueTracker Tracker;
+  for (Argument  : F.args()) {

efriedma wrote:
> avl wrote:
> > efriedma wrote:
> > > avl wrote:
> > > > efriedma wrote:
> > > > > avl wrote:
> > > > > > efriedma wrote:
> > > > > > > avl wrote:
> > > > > > > > efriedma wrote:
> > > > > > > > > avl wrote:
> > > > > > > > > > efriedma wrote:
> > > > > > > > > > > Do you have to redo the AllocaDerivedValueTracker 
> > > > > > > > > > > analysis?  Is it not enough that the call you're trying 
> > > > > > > > > > > to TRE is marked "tail"?
> > > > > > > > > > >Do you have to redo the AllocaDerivedValueTracker analysis?
> > > > > > > > > > 
> > > > > > > > > > AllocaDerivedValueTracker analysis(done in markTails) could 
> > > > > > > > > > be reused here. 
> > > > > > > > > > But marking, done in markTails(), looks like separate 
> > > > > > > > > > tasks. i.e. it is better 
> > > > > > > > > > to make TRE not depending on markTails(). There is a review 
> > > > > > > > > > for this - https://reviews.llvm.org/D60031
> > > > > > > > > > Thus such separation looks useful(To not reuse result of 
> > > > > > > > > > markTails but have it computed inplace).
> > > > > > > > > > 
> > > > > > > > > > > Is it not enough that the call you're trying to TRE is 
> > > > > > > > > > > marked "tail"?
> > > > > > > > > > 
> > > > > > > > > > It is not enough that call which is subject to TRE is 
> > > > > > > > > > marked "Tail".
> > > > > > > > > > It also should be checked that other calls does not capture 
> > > > > > > > > > pointer to local stack: 
> > > > > > > > > > 
> > > > > > > > > > ```
> > > > > > > > > > // do not do TRE if any pointer to local stack has escaped.
> > > > > > > > > > if (!Tracker.EscapePoints.empty())
> > > > > > > > > >return false;
> > > > > > > > > > 
> > > > > > > > > > ```
> > > > > > > > > > 
> > > > > > > > > > It is not enough that call which is subject to TRE is 
> > > > > > > > > > marked "Tail". It also should be checked that other calls 
> > > > > > > > > > does not capture pointer to local stack:
> > > > > > > > > 
> > > > > > > > > If there's an escaped pointer to the local stack, we wouldn't 
> > > > > > > > > infer "tail" in the first place, would we?
> > > > > > > > If function receives pointer to alloca then it would not be 
> > > > > > > > marked with "Tail". Then we do not have a possibility to 
> > > > > > > > understand whether this function receives pointer to alloca but 
> > > > > > > > does not capture it:
> > > > > > > > 
> > > > > > > > ```
> > > > > > > > void test(int recurseCount)
> > > > > > > > {
> > > > > > > > if (recurseCount == 0) return;
> > > > > > > > int temp = 10;
> > > > > > > > globalIncrement();
> > > > > > > > test(recurseCount - 1);
> > > > > > > > }
> > > > > > > > ```
> > > > > > > > 
> > > > > > > > test - marked with Tail.
> > > > > > > > globalIncrement - not marked with Tail. But TRE could be done 
> > > > > > > > since it does not capture pointer. But if it will capture the 
> > > > > > > > pointer then we could not do TRE. So we need to check 
> > > > > > > > !Tracker.EscapePoints.empty().
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > test - marked with Tail.
> > > > > > > 
> > > > > > > For the given code, TRE won't mark the recursive call "tail".  
> > > > > > > That transform isn't legal: the recursive call could access the 
> > > > > > > caller's version of "temp".
> > > > > > >For the given code, TRE won't mark the recursive call "tail". That 
> > > > > > >transform isn't legal: the recursive call could access the 
> > > > > > >caller's version of "temp".
> > > > > > 
> > > > > > it looks like recursive call could NOT access the caller's version 
> > > > > > of "temp":
> > > > > > 
> > > > > > ```
> > > > > > test(recurseCount - 1);
> > > > > > ```
> > > > > > 
> > > > > > Caller`s version of temp is accessed by non-recursive call:
> > > > > > 
> > > > > > ```
> > > > > > globalIncrement();
> > > > > > ```
> > > > > > 
> > > > > > If globalIncrement does not capture the "" then TRE looks to 
> > > > > > be legal for that case. 
> > > > > > 
> > > > > > globalIncrement() would not be marked with "Tail". test() would be 
> > > > > > marked with Tail.
> > > > > > 
> > > > > > Thus the pre-requisite for TRE would be: tail-recursive call must 
> > > > > > not receive pointer to local stack(Tail) and non-recursive calls 
> > > > > > must not capture the pointer to local stack.
> > > > > Can you give a complete IR example where we infer "tail", but TRE is 
> > > > > illegal?
> > > > > 
> > > > > Can you give a complete IR example, we we don't infer "tail", but we 
> > > > > still do the TRE transform here?
> > > > >Can you give a complete IR example where we infer "tail", 

[PATCH] D82398: [MSAN] Handle x86 {round,min,max}sd intrinsics

2020-06-23 Thread Gui Andrade via Phabricator via cfe-commits
guiand updated this revision to Diff 272842.
guiand added a comment.

Use shufflevector, move test over to IR


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82398/new/

https://reviews.llvm.org/D82398

Files:
  llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
  llvm/test/Instrumentation/MemorySanitizer/vector_sd.ll

Index: llvm/test/Instrumentation/MemorySanitizer/vector_sd.ll
===
--- /dev/null
+++ llvm/test/Instrumentation/MemorySanitizer/vector_sd.ll
@@ -0,0 +1,48 @@
+; RUN: opt < %s -msan-check-access-address=0 -S -passes=msan 2>&1 | FileCheck  \
+; RUN: %s
+; RUN: opt < %s -msan -msan-check-access-address=0 -S | FileCheck %s
+; REQUIRES: x86-registered-target
+
+target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+declare <2 x double> @llvm.x86.sse41.round.sd(<2 x double>, <2 x double>, i32) nounwind readnone
+declare <2 x double> @llvm.x86.sse2.min.sd(<2 x double>, <2 x double>) nounwind readnone
+
+define <2 x double> @test_sse_round_sd(double %lowval, double %hival) sanitize_memory {
+entry:
+  ; CHECK: [[HIVAL_SHADOW:%[0-9]+]] = load {{.*}} @__msan_param_tls
+  ; CHECK: [[LOWVAL_SHADOW:%[0-9]+]] = load {{.*}} @__msan_param_tls
+  ; CHECK: [[OP2VEC_SHADOW:%.+]] = insertelement <2 x i64> , i64 [[LOWVAL_SHADOW]], i32 0
+  %op2vec = insertelement <2 x double> undef, double %lowval, i32 0
+  ; CHECK: [[OP1VEC_SHADOW:%.+]] = insertelement <2 x i64> , i64 [[HIVAL_SHADOW]], i32 1
+  %op1vec = insertelement <2 x double> undef, double %hival, i32 1
+  ; CHECK: [[OUT_SHADOW:%.+]] = shufflevector <2 x i64> [[OP1VEC_SHADOW]], <2 x i64> [[OP2VEC_SHADOW]], <2 x i32>  
+  ; CHECK-NOT: call void @msan_warning
+  ; CHECK: call <2 x double> @llvm.x86.sse41.round.sd
+  %0 = tail call <2 x double> @llvm.x86.sse41.round.sd(<2 x double> %op1vec, <2 x double> %op2vec, i32 0)
+  ; CHECK: store <2 x i64> [[OUT_SHADOW]], {{.*}} @__msan_retval_tls
+  ; CHECK: ret <2 x double>
+  ret <2 x double> %0
+}
+
+define <2 x double> @test_sse_min_sd(double %lowval0, double %lowval1, double %hival) sanitize_memory {
+entry:
+  ; CHECK: [[LOWVAL1_SHADOW:%[0-9]+]] = load {{.*}} @__msan_param_tls
+  ; CHECK: [[HIVAL_SHADOW:%[0-9]+]] = load {{.*}} @__msan_param_tls
+  ; CHECK: [[LOWVAL0_SHADOW:%[0-9]+]] = load {{.*}} @__msan_param_tls
+  ; CHECK: [[OP2_SHADOW:%.+]] = insertelement <2 x i64> , i64 [[LOWVAL0_SHADOW]], i32 0
+  %op2vec = insertelement <2 x double> undef, double %lowval0, i32 0
+  ; CHECK: [[HIVEC_SHADOW:%.+]] = insertelement <2 x i64> , i64 [[HIVAL_SHADOW]], i32 1
+  %hivec = insertelement <2 x double> undef, double %hival, i32 1
+  ; CHECK: [[OP1_SHADOW:%.+]] = insertelement <2 x i64> [[HIVEC_SHADOW]], i64 [[LOWVAL1_SHADOW]], i32 0
+  %op1vec = insertelement <2 x double> %hivec, double %lowval1, i32 0
+  ; CHECK: [[OR_SHADOW:%.+]] = or <2 x i64> [[OP1_SHADOW]], [[OP2_SHADOW]]
+  ; CHECK: [[OUT_SHADOW_VEC:%.+]] = shufflevector <2 x i64> [[OP1_SHADOW]], <2 x i64> [[OR_SHADOW]], <2 x i32>  
+  ; CHECK-NOT: call void @msan_warning
+  ; CHECK: call <2 x double> @llvm.x86.sse2.min.sd
+  %0 = tail call <2 x double> @llvm.x86.sse2.min.sd(<2 x double> %op1vec, <2 x double> %op2vec)
+  ; CHECK: store <2 x i64> [[OUT_SHADOW_VEC]], {{.*}} @__msan_retval_tls
+  ; CHECK: ret <2 x double>
+  ret <2 x double> %0
+}
Index: llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
===
--- llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
+++ llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
@@ -3054,6 +3054,32 @@
 SOC.Done();
   }
 
+  // Instrument _mm_*_sd intrinsics
+  void handleUnarySdIntrinsic(IntrinsicInst ) {
+IRBuilder<> IRB();
+Value *First = getShadow(, 0);
+Value *Second = getShadow(, 1);
+// High word of first operand, low word of second
+Value *Shadow =
+IRB.CreateShuffleVector(First, Second, llvm::ArrayRef({2, 1}));
+
+setShadow(, Shadow);
+setOriginForNaryOp(I);
+  }
+
+  void handleBinarySdIntrinsic(IntrinsicInst ) {
+IRBuilder<> IRB();
+Value *First = getShadow(, 0);
+Value *Second = getShadow(, 1);
+Value *OrShadow = IRB.CreateOr(First, Second);
+// High word of first operand, low word of both OR'd together
+Value *Shadow =
+IRB.CreateShuffleVector(First, OrShadow, llvm::ArrayRef({2, 1}));
+
+setShadow(, Shadow);
+setOriginForNaryOp(I);
+  }
+
   void visitIntrinsicInst(IntrinsicInst ) {
 switch (I.getIntrinsicID()) {
 case Intrinsic::lifetime_start:
@@ -3293,6 +3319,24 @@
   handlePclmulIntrinsic(I);
   break;
 
+case Intrinsic::x86_sse41_round_sd:
+  handleUnarySdIntrinsic(I);
+  break;
+case Intrinsic::x86_sse2_max_sd:
+case Intrinsic::x86_sse2_min_sd:
+ 

[PATCH] D82386: [clangd] Config: Fragments and parsing from YAML

2020-06-23 Thread Kadir Cetinkaya via Phabricator via cfe-commits
kadircet added inline comments.



Comment at: clang-tools-extra/clangd/ConfigFragment.h:9
+//
+// Various clangd features have configurable behaviour (or can be disabled).
+// The configuration system allows users to control this:

i think this paragraph belongs to `Config.h`



Comment at: clang-tools-extra/clangd/ConfigFragment.h:14
+//
+// This file defines the config::Fragment structure which is models one piece of
+// configuration as obtained from a source like a file.

s/which is models/which models/



Comment at: clang-tools-extra/clangd/ConfigFragment.h:16
+// configuration as obtained from a source like a file.
+// This is distinct from how the config is interpreted (CompiledFragment),
+// combined (ConfigProvider) and exposed to the rest of clangd (Config).

again i don't think these are relevant here.

maybe provide more details about how any new config option should start 
propagating from here, as this is the closest layer to user-written config.



Comment at: clang-tools-extra/clangd/ConfigFragment.h:61
+  /// BufferName is used for the SourceMgr and diagnostics.
+  static std::vector parseYAML(llvm::StringRef YAML,
+ llvm::StringRef BufferName,

what about `fromYAML` and returning `vector` ?



Comment at: clang-tools-extra/clangd/ConfigFragment.h:65
+
+  struct SourceInfo {
+/// Retains a buffer of the original source this fragment was parsed from.

why the bundling?



Comment at: clang-tools-extra/clangd/ConfigFragment.h:69
+/// Shared because multiple fragments are often parsed from one (YAML) file.
+/// May be null, then all locations are ignored.
+std::shared_ptr Manager;

maybe `invalid/absent` rather than `ignored` (or change it to say should be 
ignored) ? I am assuming this is referring to locations of config parameters 
like `Add`s and conditions.



Comment at: clang-tools-extra/clangd/ConfigFragment.h:73
+/// Only valid if SourceManager is set.
+llvm::SMLoc Location;
+  };

what is the use of this ?



Comment at: clang-tools-extra/clangd/ConfigFragment.h:75
+  };
+  SourceInfo Source;
+

why not make this const ? i don't think it makes sense to modify these after 
creation.



Comment at: clang-tools-extra/clangd/ConfigFragment.h:77
+
+  struct ConditionFragment {
+std::vector> PathMatch;

comments?



Comment at: clang-tools-extra/clangd/ConfigFragment.h:78
+  struct ConditionFragment {
+std::vector> PathMatch;
+/// An unrecognized key was found while parsing the condition.

some comments? especially around `fragment applies to file matching all (or 
any) of the pathmatches`



Comment at: clang-tools-extra/clangd/ConfigFragment.h:81
+/// The condition will evaluate to false.
+bool UnrecognizedCondition;
+  };

`HasUnrecognizedCondition` ?

also default init to `true` maybe?



Comment at: clang-tools-extra/clangd/ConfigFragment.h:85
+
+  struct CompileFlagsFragment {
+std::vector> Add;

some comments.

I am not sure if putting `Fragment` suffix to all of these makes sense, as they 
already reside inside `Fragment`. Maybe `section` or `block` ? (same goes for 
ConditionFragment)



Comment at: clang-tools-extra/clangd/ConfigFragment.h:87
+std::vector> Add;
+  } CompileFlags;
+};

maybe make this an llvm::Optional too. Even though emptiness of `Add` would be 
OK in this case, for non-container "blocks" we might need to introduce an 
optional, and homogeneity would be nice when it happens.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82386/new/

https://reviews.llvm.org/D82386





[PATCH] D82415: [Coroutines] Special handle __builtin_coro_resume for final_suspend nothrow check

2020-06-23 Thread Xun Li via Phabricator via cfe-commits
lxfind created this revision.
Herald added subscribers: cfe-commits, modocache.
Herald added a project: clang.
lxfind added reviewers: modocache, lewissbaker, junparser.
lxfind requested review of this revision.

In https://reviews.llvm.org/D82029 we added the conformance check that the 
expression co_await promise.final_suspend() should not potentially throw.
As part of this expression, when the await_suspend() method of the 
final suspend awaiter returns a handle, __builtin_coro_resume could be called 
on the handle to immediately resume that coroutine.
__builtin_coro_resume is not declared noexcept, and it shouldn't be. We need 
to special-case it in this check.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D82415

Files:
  clang/lib/Sema/SemaCoroutine.cpp


Index: clang/lib/Sema/SemaCoroutine.cpp
===
--- clang/lib/Sema/SemaCoroutine.cpp
+++ clang/lib/Sema/SemaCoroutine.cpp
@@ -614,6 +614,14 @@
 // In the case of dtor, the call to dtor is implicit and hence we should
 // pass nullptr to canCalleeThrow.
 if (Sema::canCalleeThrow(S, IsDtor ? nullptr : cast(E), D)) {
+  if (auto *FD = dyn_cast(D)) {
+// co_await promise.final_suspend() could end up calling
+// __builtin_coro_resume for symmetric transfer if await_suspend()
+// returns a handle. In that case, even __builtin_coro_resume is not
+// declared as noexcept, we claim that logically it does not throw.
+if (FD->getBuiltinID() == Builtin::BI__builtin_coro_resume)
+  return;
+  }
   if (ThrowingDecls.empty()) {
 // First time seeing an error, emit the error message.
 S.Diag(cast(S.CurContext)->getLocation(),




[clang] 4935419 - Remove clang::Codegen::EHPadEndScope as unused

2020-06-23 Thread David Blaikie via cfe-commits

Author: David Blaikie
Date: 2020-06-23T15:18:49-07:00
New Revision: 4935419d779bdc6cc2f1c2f9e78821ad550d3b56

URL: 
https://github.com/llvm/llvm-project/commit/4935419d779bdc6cc2f1c2f9e78821ad550d3b56
DIFF: 
https://github.com/llvm/llvm-project/commit/4935419d779bdc6cc2f1c2f9e78821ad550d3b56.diff

LOG: Remove clang::Codegen::EHPadEndScope as unused

Unused since r255423 / D15140 /  4e52d6f811a2269e946c19e77245148bd9221f99

Found indirectly by assessing -debug-info-kind=constructors and
observing the EHPadEndScope type was never emitted because the
constructor is never called. (all credit to Amy Huang for identifying
this issue)

Added: 


Modified: 
clang/lib/CodeGen/CGCleanup.h
clang/lib/CodeGen/CGException.cpp

Removed: 




diff  --git a/clang/lib/CodeGen/CGCleanup.h b/clang/lib/CodeGen/CGCleanup.h
index a2b3f71a2f86..ef4f6b9ec133 100644
--- a/clang/lib/CodeGen/CGCleanup.h
+++ b/clang/lib/CodeGen/CGCleanup.h
@@ -102,7 +102,7 @@ class EHScope {
   };
 
 public:
-  enum Kind { Cleanup, Catch, Terminate, Filter, PadEnd };
+  enum Kind { Cleanup, Catch, Terminate, Filter };
 
   EHScope(Kind kind, EHScopeStack::stable_iterator enclosingEHScope)
 : CachedLandingPad(nullptr), CachedEHDispatchBlock(nullptr),
@@ -487,17 +487,6 @@ class EHTerminateScope : public EHScope {
   }
 };
 
-class EHPadEndScope : public EHScope {
-public:
-  EHPadEndScope(EHScopeStack::stable_iterator enclosingEHScope)
-  : EHScope(PadEnd, enclosingEHScope) {}
-  static size_t getSize() { return sizeof(EHPadEndScope); }
-
-  static bool classof(const EHScope *scope) {
-return scope->getKind() == PadEnd;
-  }
-};
-
 /// A non-stable pointer into the scope stack.
 class EHScopeStack::iterator {
   char *Ptr;
@@ -535,10 +524,6 @@ class EHScopeStack::iterator {
 case EHScope::Terminate:
   Size = EHTerminateScope::getSize();
   break;
-
-case EHScope::PadEnd:
-  Size = EHPadEndScope::getSize();
-  break;
 }
 Ptr += llvm::alignTo(Size, ScopeStackAlignment);
 return *this;

diff  --git a/clang/lib/CodeGen/CGException.cpp b/clang/lib/CodeGen/CGException.cpp
index de3d1b129146..2494f38b3159 100644
--- a/clang/lib/CodeGen/CGException.cpp
+++ b/clang/lib/CodeGen/CGException.cpp
@@ -651,9 +651,6 @@ CodeGenFunction::getEHDispatchBlock(EHScopeStack::stable_iterator si) {
 case EHScope::Terminate:
   dispatchBlock = getTerminateHandler();
   break;
-
-case EHScope::PadEnd:
-  llvm_unreachable("PadEnd unnecessary for Itanium!");
 }
 scope.setCachedEHDispatchBlock(dispatchBlock);
   }
@@ -695,9 +692,6 @@ CodeGenFunction::getFuncletEHDispatchBlock(EHScopeStack::stable_iterator SI) {
   case EHScope::Terminate:
 DispatchBlock->setName("terminate");
 break;
-
-  case EHScope::PadEnd:
-llvm_unreachable("PadEnd dispatch block missing!");
   }
   EHS.setCachedEHDispatchBlock(DispatchBlock);
   return DispatchBlock;
@@ -713,7 +707,6 @@ static bool isNonEHScope(const EHScope ) {
   case EHScope::Filter:
   case EHScope::Catch:
   case EHScope::Terminate:
-  case EHScope::PadEnd:
 return false;
   }
 
@@ -780,9 +773,6 @@ llvm::BasicBlock *CodeGenFunction::EmitLandingPad() {
   case EHScope::Terminate:
 return getTerminateLandingPad();
 
-  case EHScope::PadEnd:
-llvm_unreachable("PadEnd unnecessary for Itanium!");
-
   case EHScope::Catch:
   case EHScope::Cleanup:
   case EHScope::Filter:
@@ -848,9 +838,6 @@ llvm::BasicBlock *CodeGenFunction::EmitLandingPad() {
 
 case EHScope::Catch:
   break;
-
-case EHScope::PadEnd:
-  llvm_unreachable("PadEnd unnecessary for Itanium!");
 }
 
 EHCatchScope  = cast(*I);





[clang] f724ce0 - [clang][driver] allow macOS 11 OS version in the driver

2020-06-23 Thread Alex Lorenz via cfe-commits

Author: Alex Lorenz
Date: 2020-06-23T15:14:26-07:00
New Revision: f724ce0d73eb3f85364e346a036588825bc47567

URL: 
https://github.com/llvm/llvm-project/commit/f724ce0d73eb3f85364e346a036588825bc47567
DIFF: 
https://github.com/llvm/llvm-project/commit/f724ce0d73eb3f85364e346a036588825bc47567.diff

LOG: [clang][driver] allow macOS 11 OS version in the driver

Added: 


Modified: 
clang/lib/Driver/ToolChains/Darwin.cpp
clang/test/Driver/darwin-version.c

Removed: 




diff  --git a/clang/lib/Driver/ToolChains/Darwin.cpp b/clang/lib/Driver/ToolChains/Darwin.cpp
index 3bf7f9cbc139..bb7c7f768b35 100644
--- a/clang/lib/Driver/ToolChains/Darwin.cpp
+++ b/clang/lib/Driver/ToolChains/Darwin.cpp
@@ -1822,7 +1822,7 @@ void Darwin::AddDeploymentTarget(DerivedArgList ) const {
   if (Platform == MacOS) {
 if (!Driver::GetReleaseVersion(OSTarget->getOSVersion(), Major, Minor,
Micro, HadExtra) ||
-HadExtra || Major != 10 || Minor >= 100 || Micro >= 100)
+HadExtra || Major < 10 || Major >= 100 || Minor >= 100 || Micro >= 100)
   getDriver().Diag(diag::err_drv_invalid_version_number)
   << OSTarget->getAsString(Args, Opts);
   } else if (Platform == IPhoneOS) {

diff  --git a/clang/test/Driver/darwin-version.c b/clang/test/Driver/darwin-version.c
index 7885b5964626..3471552c937f 100644
--- a/clang/test/Driver/darwin-version.c
+++ b/clang/test/Driver/darwin-version.c
@@ -305,3 +305,13 @@
 // RUN: %clang -target armv7k-apple-ios10.1-simulator -c %s -### 2>&1 | \
 // RUN:   FileCheck --check-prefix=CHECK-VERSION-TENV-SIM2 %s
 // CHECK-VERSION-TENV-SIM2: "thumbv7k-apple-ios10.1.0-simulator"
+
+
+// RUN: %clang -target x86_64-apple-macos11 -c %s -### 2>&1 | \
+// RUN:   FileCheck --check-prefix=CHECK-MACOS11 %s
+// RUN: %clang -target x86_64-apple-darwin20 -c %s -### 2>&1 | \
+// RUN:   FileCheck --check-prefix=CHECK-MACOS11 %s
+// RUN: %clang -target x86_64-apple-darwin -mmacos-version-min=11 -c %s -### 2>&1 | \
+// RUN:   FileCheck --check-prefix=CHECK-MACOS11 %s
+
+// CHECK-MACOS11: "x86_64-apple-macosx11.0.0"
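
The effect of the changed condition in Darwin.cpp can be seen in isolation with a small predicate. This is only a sketch: the function name `isAcceptedMacOSVersion` is hypothetical, and it leaves out the Driver::GetReleaseVersion parsing step, assuming the version components have already been parsed successfully.

```cpp
// Returns true when a parsed macOS version would pass the updated check,
// i.e. when the driver would not emit err_drv_invalid_version_number.
// Previously only Major == 10 was accepted; now any Major in [10, 100) is.
bool isAcceptedMacOSVersion(unsigned Major, unsigned Minor, unsigned Micro,
                            bool HadExtra) {
  return !(HadExtra || Major < 10 || Major >= 100 || Minor >= 100 ||
           Micro >= 100);
}
```

With this predicate, 11.0.0 and 10.15.4 are accepted, while 9.0.0, 100.0.0 and any version with extra trailing components are rejected.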





[PATCH] D82414: [X86] Replace PROC macros with an enum and a lookup table of processor information.

2020-06-23 Thread Craig Topper via Phabricator via cfe-commits
craig.topper created this revision.
craig.topper added reviewers: echristo, erichkeane, LuoYuanke.
Herald added a subscriber: hiraditya.
Herald added projects: clang, LLVM.
craig.topper added reviewers: RKSimon, spatel.
craig.topper marked an inline comment as done.
craig.topper added inline comments.



Comment at: llvm/lib/Support/X86TargetParser.cpp:36
+  // i386-generation processors.
+  { "i386", CK_i386, ~0U, PROC_32_BIT },
+  // i486-generation processors.

Once we have feature bits in the table the PROC_32_BIT/PROC_64_BIT can just 
check FEATURE_EM64T.


This patch removes the PROC macro in favor of a CPUKind enum and a
table containing information about CPUs.

The current information in the table is the CPU name, CPUKind enum
value, key feature for target multiversioning, and whether the CPU is 64-bit capable.
For the strings that are aliases, I've duplicated the information
in the table. This means there are more rows in the table than
CPUKind enums.

This replaces multiple StringSwitch's with loops through the table.
They are linear searches due to the table being more logically
ordered than alphabetical. The StringSwitch's would have also been
linear. I've used StringLiteral on the strings in the table so we
can quickly check the length while searching.

I contemplated having a CPUKind for each string so there was a 1:1
mapping, but didn't want to spread more names to the places that
use the enum.

My ultimate goal here is to store the features for each CPU as a
bitset within the table. Hoping to use constexpr to make this
composable so we can group features and inherit them. After the
table lookup we can turn the bitset into a list of strings for the
frontend. The current switch we have for selecting features for
CPUs has become difficult to maintain while trying to express
inheritance relationships.
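
As a rough illustration of the table-driven lookup described above, here is a reduced sketch. It is not the code from the patch: it uses `const char *` instead of StringLiteral, omits the KeyFeature column, lists only a few rows, and the function name `parseCPU` is hypothetical.

```cpp
#include <cstring>

enum CPUKind { CK_None, CK_i386, CK_Pentium3, CK_Nocona };

struct ProcInfo {
  const char *Name; // CPU name as spelled on the command line
  CPUKind Kind;     // enum value; aliases share the same kind
  bool Is64Bit;     // whether the CPU is 64-bit capable
};

// Rows are ordered logically rather than alphabetically, so lookup is a
// linear search. An alias gets its own row that repeats the kind.
static const ProcInfo Processors[] = {
    {"i386", CK_i386, false},
    {"pentium3", CK_Pentium3, false},
    {"pentium3m", CK_Pentium3, false}, // alias row
    {"nocona", CK_Nocona, true},
};

// Returns the kind for CPU, or CK_None if the name is unknown or the CPU
// is 32-bit only while a 64-bit capable CPU is required.
CPUKind parseCPU(const char *CPU, bool Only64Bit) {
  for (const ProcInfo &P : Processors)
    if (std::strcmp(P.Name, CPU) == 0 && (P.Is64Bit || !Only64Bit))
      return P.Kind;
  return CK_None;
}
```

Because aliases are separate rows, the table has more entries than the enum has values, which matches the trade-off mentioned in the description.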


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D82414

Files:
  clang/lib/Basic/Targets/X86.cpp
  llvm/include/llvm/Support/X86TargetParser.def
  llvm/include/llvm/Support/X86TargetParser.h
  llvm/lib/Support/X86TargetParser.cpp

Index: llvm/lib/Support/X86TargetParser.cpp
===
--- llvm/lib/Support/X86TargetParser.cpp
+++ llvm/lib/Support/X86TargetParser.cpp
@@ -15,44 +15,157 @@
 #include "llvm/ADT/Triple.h"
 
 using namespace llvm;
+using namespace llvm::X86;
 
-bool checkCPUKind(llvm::X86::CPUKind Kind, bool Only64Bit) {
-  using namespace X86;
-  // Perform any per-CPU checks necessary to determine if this CPU is
-  // acceptable.
-  switch (Kind) {
-  case CK_None:
-// No processor selected!
-return false;
-#define PROC(ENUM, STRING, IS64BIT)\
-  case CK_##ENUM:  \
-return IS64BIT || !Only64Bit;
-#include "llvm/Support/X86TargetParser.def"
-  }
-  llvm_unreachable("Unhandled CPU kind");
-}
+namespace {
 
-X86::CPUKind llvm::X86::parseArchX86(StringRef CPU, bool Only64Bit) {
-  X86::CPUKind Kind = llvm::StringSwitch(CPU)
-#define PROC(ENUM, STRING, IS64BIT) .Case(STRING, CK_##ENUM)
-#define PROC_ALIAS(ENUM, ALIAS) .Case(ALIAS, CK_##ENUM)
-#include "llvm/Support/X86TargetParser.def"
-  .Default(CK_None);
+struct ProcInfo {
+  StringLiteral Name;
+  X86::CPUKind Kind;
+  unsigned KeyFeature;
+  bool Is64Bit;
+};
+
+} // end anonymous namespace
+
+#define PROC_64_BIT true
+#define PROC_32_BIT false
+
+static constexpr ProcInfo Processors[] = {
+  // i386-generation processors.
+  { "i386", CK_i386, ~0U, PROC_32_BIT },
+  // i486-generation processors.
+  { "i486", CK_i486, ~0U, PROC_32_BIT },
+  { "winchip-c6", CK_WinChipC6, ~0U, PROC_32_BIT },
+  { "winchip2", CK_WinChip2, ~0U, PROC_32_BIT },
+  { "c3", CK_C3, ~0U, PROC_32_BIT },
+  // i586-generation processors, P5 microarchitecture based.
+  { "i586", CK_i586, ~0U, PROC_32_BIT },
+  { "pentium", CK_Pentium, ~0U, PROC_32_BIT },
+  { "pentium-mmx", CK_PentiumMMX, ~0U, PROC_32_BIT },
+  { "pentiumpro", CK_PentiumPro, ~0U, PROC_32_BIT },
+  // i686-generation processors, P6 / Pentium M microarchitecture based.
+  { "i686", CK_i686, ~0U, PROC_32_BIT },
+  { "pentium2", CK_Pentium2, ~0U, PROC_32_BIT },
+  { "pentium3", CK_Pentium3, ~0U, PROC_32_BIT },
+  { "pentium3m", CK_Pentium3, ~0U, PROC_32_BIT },
+  { "pentium-m", CK_PentiumM, ~0U, PROC_32_BIT },
+  { "c3-2", CK_C3_2, ~0U, PROC_32_BIT },
+  { "yonah", CK_Yonah, ~0U, PROC_32_BIT },
+  // Netburst microarchitecture based processors.
+  { "pentium4", CK_Pentium4, ~0U, PROC_32_BIT },
+  { "pentium4m", CK_Pentium4, ~0U, PROC_32_BIT },
+  { "prescott", CK_Prescott, ~0U, PROC_32_BIT },
+  { "nocona", CK_Nocona, ~0U, PROC_64_BIT },
+  // Core microarchitecture based processors.
+  { "core2", CK_Core2, ~0U, PROC_64_BIT },
+  { "penryn", CK_Penryn, ~0U, PROC_64_BIT },
+  // Atom processors
+  { "bonnell", CK_Bonnell, FEATURE_SSSE3, PROC_64_BIT },
+  { "atom", CK_Bonnell, FEATURE_SSSE3, 

[PATCH] D82414: [X86] Replace PROC macros with an enum and a lookup table of processor information.

2020-06-23 Thread Craig Topper via Phabricator via cfe-commits
craig.topper marked an inline comment as done.
craig.topper added inline comments.



Comment at: llvm/lib/Support/X86TargetParser.cpp:36
+  // i386-generation processors.
+  { "i386", CK_i386, ~0U, PROC_32_BIT },
+  // i486-generation processors.

Once we have feature bits in the table the PROC_32_BIT/PROC_64_BIT can just 
check FEATURE_EM64T.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82414/new/

https://reviews.llvm.org/D82414





[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-23 Thread Eli Friedman via Phabricator via cfe-commits
efriedma added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:825
+  // The local stack holds all alloca instructions and all byval arguments.
+  AllocaDerivedValueTracker Tracker;
+  for (Argument  : F.args()) {

avl wrote:
> efriedma wrote:
> > avl wrote:
> > > efriedma wrote:
> > > > avl wrote:
> > > > > efriedma wrote:
> > > > > > avl wrote:
> > > > > > > efriedma wrote:
> > > > > > > > avl wrote:
> > > > > > > > > efriedma wrote:
> > > > > > > > > > Do you have to redo the AllocaDerivedValueTracker analysis? 
> > > > > > > > > >  Is it not enough that the call you're trying to TRE is 
> > > > > > > > > > marked "tail"?
> > > > > > > > > >Do you have to redo the AllocaDerivedValueTracker analysis?
> > > > > > > > > 
> > > > > > > > > AllocaDerivedValueTracker analysis(done in markTails) could 
> > > > > > > > > be reused here. 
> > > > > > > > > But marking, done in markTails(), looks like separate tasks. 
> > > > > > > > > i.e. it is better 
> > > > > > > > > to make TRE not depending on markTails(). There is a review 
> > > > > > > > > for this - https://reviews.llvm.org/D60031
> > > > > > > > > Thus such separation looks useful(To not reuse result of 
> > > > > > > > > markTails but have it computed inplace).
> > > > > > > > > 
> > > > > > > > > > Is it not enough that the call you're trying to TRE is 
> > > > > > > > > > marked "tail"?
> > > > > > > > > 
> > > > > > > > > It is not enough that call which is subject to TRE is marked 
> > > > > > > > > "Tail".
> > > > > > > > > It also should be checked that other calls does not capture 
> > > > > > > > > pointer to local stack: 
> > > > > > > > > 
> > > > > > > > > ```
> > > > > > > > > // do not do TRE if any pointer to local stack has escaped.
> > > > > > > > > if (!Tracker.EscapePoints.empty())
> > > > > > > > >return false;
> > > > > > > > > 
> > > > > > > > > ```
> > > > > > > > > 
> > > > > > > > > It is not enough that call which is subject to TRE is marked 
> > > > > > > > > "Tail". It also should be checked that other calls does not 
> > > > > > > > > capture pointer to local stack:
> > > > > > > > 
> > > > > > > > If there's an escaped pointer to the local stack, we wouldn't 
> > > > > > > > infer "tail" in the first place, would we?
> > > > > > > If function receives pointer to alloca then it would not be 
> > > > > > > marked with "Tail". Then we do not have a possibility to 
> > > > > > > understand whether this function receives pointer to alloca but 
> > > > > > > does not capture it:
> > > > > > > 
> > > > > > > ```
> > > > > > > void test(int recurseCount)
> > > > > > > {
> > > > > > > if (recurseCount == 0) return;
> > > > > > > int temp = 10;
> > > > > > > globalIncrement();
> > > > > > > test(recurseCount - 1);
> > > > > > > }
> > > > > > > ```
> > > > > > > 
> > > > > > > test - marked with Tail.
> > > > > > > globalIncrement - not marked with Tail. But TRE could be done 
> > > > > > > since it does not capture pointer. But if it will capture the 
> > > > > > > pointer then we could not do TRE. So we need to check 
> > > > > > > !Tracker.EscapePoints.empty().
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > test - marked with Tail.
> > > > > > 
> > > > > > For the given code, TRE won't mark the recursive call "tail".  That 
> > > > > > transform isn't legal: the recursive call could access the caller's 
> > > > > > version of "temp".
> > > > > >For the given code, TRE won't mark the recursive call "tail". That 
> > > > > >transform isn't legal: the recursive call could access the caller's 
> > > > > >version of "temp".
> > > > > 
> > > > > it looks like recursive call could NOT access the caller's version of 
> > > > > "temp":
> > > > > 
> > > > > ```
> > > > > test(recurseCount - 1);
> > > > > ```
> > > > > 
> > > > > Caller`s version of temp is accessed by non-recursive call:
> > > > > 
> > > > > ```
> > > > > > globalIncrement(&temp);
> > > > > ```
> > > > > 
> > > > > > If globalIncrement does not capture the "&temp" then TRE looks to be 
> > > > > legal for that case. 
> > > > > 
> > > > > globalIncrement() would not be marked with "Tail". test() would be 
> > > > > marked with Tail.
> > > > > 
> > > > > Thus the pre-requisite for TRE would be: tail-recursive call must not 
> > > > > receive pointer to local stack(Tail) and non-recursive calls must not 
> > > > > capture the pointer to local stack.
> > > > Can you give a complete IR example where we infer "tail", but TRE is 
> > > > illegal?
> > > > 
> > > > Can you give a complete IR example, we we don't infer "tail", but we 
> > > > still do the TRE transform here?
> > > >Can you give a complete IR example where we infer "tail", but TRE is 
> > > >illegal?
> > > 
> > > there is no such example. Currently all cases where we  infer "tail" 
> > > would be valid for TRE.
> > > 
> > > >Can you give a complete IR example, we we don't infer "tail", but we 
> > > >still do the TRE transform 

[PATCH] D82249: [HWASan] Disable GlobalISel/FastISel for HWASan Globals.

2020-06-23 Thread Evgenii Stepanov via Phabricator via cfe-commits
eugenis added a comment.

In D82249#2110054, @hctim wrote:

> In D82249#2109920 , @eugenis wrote:
>
> > I'm OK with this as a workaround, but it would be more natural to detect 
> > the unsupported IR pattern in globalisel and fall back instead of disabling 
> > it entirely. Is it difficult to do for some reason?
>
>
> Eh, it's not an unsupported IR pattern that's the problem, it's that the IR 
> is lowered into `adrp + add` so that the `add` can be folded into a `ldr/str` 
> as an offset. IMO on the scale of "painting over the problem vs. fixing it", 
> this patch is 100% paint, forced fallback with `MO_TAGGED` is 80% paint for 
> maybe 60% of the work of just fixing it.
>
> I'm working on fixing this properly now that we know we don't need to make a 
> less-risky, fast-to-deploy patch. If that all falls into place this patchset 
> will just become obsolete anyway.


Right, well, there is an IR pattern that is not handled correctly in 
globalisel. Let's call it "unsupported". A better way to work around that is to 
detect this pattern in globalisel and fall back.

If you can fix the underlying problem, that's even better, of course.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82249/new/

https://reviews.llvm.org/D82249



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82249: [HWASan] Disable GlobalISel/FastISel for HWASan Globals.

2020-06-23 Thread Mitch Phillips via Phabricator via cfe-commits
hctim added a comment.

In D82249#2110010, @arsenm wrote:

> I don't follow. It no longer falls back, so what is the problem?


HWASan globals end up with an address that's outside of the code model (due to 
the tag), so the normal instruction sequence of `adrp + mov` or `adrp + ldr/str 
(with folded imm)` isn't adequate. In SelectionDAGISel, we also emit a `movk` 
after the `adrp` to capture the top 16 bits. We need to do the same for 
GlobalISel. This is a temporary workaround to fix HWASan at `-O0` while I add 
support for the tagged-address lowering in GlobalISel.
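
For readers following the thread, here is a minimal sketch of the bit arithmetic 
behind "capture the top 16 bits". It assumes the HWASan tag sits in the top byte 
of a 64-bit pointer; the helper names are hypothetical and this only models the 
arithmetic, not the actual lowering code in either instruction selector.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: the tag occupies bits 56..63 of the pointer.
constexpr unsigned kTagShift = 56;

// Hypothetical helper: place an 8-bit tag into the top byte of an address.
constexpr std::uint64_t tagAddress(std::uint64_t Addr, std::uint8_t Tag) {
  return (Addr & ~(0xFFull << kTagShift)) |
         (static_cast<std::uint64_t>(Tag) << kTagShift);
}

// Models `movk xN, #imm16, lsl #48`: overwrite bits 48..63, keep bits 0..47.
constexpr std::uint64_t movkLsl48(std::uint64_t Reg, std::uint16_t Imm) {
  return (Reg & 0x0000FFFFFFFFFFFFull) |
         (static_cast<std::uint64_t>(Imm) << 48);
}

// The 16-bit immediate that the extra movk would need to carry.
constexpr std::uint16_t top16(std::uint64_t Value) {
  return static_cast<std::uint16_t>(Value >> 48);
}
```

In this model, a sequence that only materializes the untagged low bits yields 
the wrong pointer, and one additional `movk` with `top16(tagged)` restores the 
tagged value.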


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82249/new/

https://reviews.llvm.org/D82249





[PATCH] D81285: [builtins] Change si_int to int in some helper declarations

2020-06-23 Thread Fangrui Song via Phabricator via cfe-commits
MaskRay accepted this revision.
MaskRay added a comment.

> This patch changes types of some integer function arguments or return values 
> from si_int to the default int type (typedefed to native_int to make it 
> obvious this is intentional) to make it more compatible with libgcc.

Please drop `native_int` from the description since the code has been adjusted.

I don't use any sizeof(int)==2 platform and can't verify whether the patch is 
good on MSP430 or AVR. Hope @aykevl can verify the correctness.
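
As a rough illustration of why the distinction matters on such targets, the 
sketch below assumes that `si_int` is meant to be a 32-bit integer and uses a 
hypothetical count-leading-zeros helper. The operand needs a full 32 bits, while 
the result (0..32) fits in a plain `int` even where `int` is only 16 bits wide.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: stand-in for a 32-bit si_int/su_int pair; not the real typedefs.
using si_int_sketch = std::int32_t;
using su_int_sketch = std::uint32_t;

// Hypothetical helper: the operand is 32 bits wide, the result is a plain int.
inline int clz32Sketch(su_int_sketch X) {
  int N = 0;
  for (su_int_sketch Bit = 0x80000000u; Bit != 0 && (X & Bit) == 0; Bit >>= 1)
    ++N;
  return N; // always in the range 0..32
}
```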


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81285/new/

https://reviews.llvm.org/D81285





[clang] a6308c0 - When performing a substitution into a dependent alias template, mark the

2020-06-23 Thread Richard Smith via cfe-commits

Author: Richard Smith
Date: 2020-06-23T14:43:04-07:00
New Revision: a6308c0ad954a08645d9abf0a5e77dc488b8ca28

URL: 
https://github.com/llvm/llvm-project/commit/a6308c0ad954a08645d9abf0a5e77dc488b8ca28
DIFF: 
https://github.com/llvm/llvm-project/commit/a6308c0ad954a08645d9abf0a5e77dc488b8ca28.diff

LOG: When performing a substitution into a dependent alias template, mark the
outer levels as retained rather than omitting their arguments.

This better reflects what's going on (we're performing a substitution
while still inside a template), and in theory is more correct, but I've
not found a testcase where it matters in practice (largely because we
don't allow alias templates to be declared inside a function).

Fixed AST dumping of SubstNonTypeTemplateParm[Pack]Expr to demonstrate
that we're properly substituting through dependent alias templates. (We
can't deduce properly through these yet, but we can at least produce the
right input to template argument deduction.)

No functionality change intended.
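
A minimal sketch of the kind of code this affects, assuming a member alias 
template that depends on the enclosing class template's parameter. `Outer` is a 
hypothetical name; the shape mirrors the `resolved_nttp` test added to 
alias-templates.cpp by this commit.

```cpp
#include <cassert>
#include <type_traits>

template <typename T> struct Outer {
  // Arr depends on T (outer level) and on N (its own, innermost level).
  template <int N> using Arr = T[N];
  // Substituting N = 3 happens while T is still an unsubstituted parameter.
  Arr<3> a;
};

// Once T = int is known, the member has the fully resolved array type.
static_assert(std::is_same<decltype(Outer<int>::a), int[3]>::value,
              "alias template resolved through the outer template");
```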

Added: 


Modified: 
clang/include/clang/AST/ASTNodeTraverser.h
clang/lib/Sema/SemaTemplate.cpp
clang/test/AST/ast-dump-openmp-begin-declare-variant_template_1.cpp
clang/test/SemaTemplate/alias-templates.cpp
clang/test/SemaTemplate/deduction-guide.cpp
clang/unittests/AST/ASTTraverserTest.cpp

Removed: 




diff  --git a/clang/include/clang/AST/ASTNodeTraverser.h 
b/clang/include/clang/AST/ASTNodeTraverser.h
index 7462120e3765..f1c98193df6c 100644
--- a/clang/include/clang/AST/ASTNodeTraverser.h
+++ b/clang/include/clang/AST/ASTNodeTraverser.h
@@ -680,6 +680,15 @@ class ASTNodeTraverser
 Visit(A);
   }
 
+  void VisitSubstNonTypeTemplateParmExpr(const SubstNonTypeTemplateParmExpr *E) {
+    Visit(E->getParameter());
+  }
+  void VisitSubstNonTypeTemplateParmPackExpr(
+      const SubstNonTypeTemplateParmPackExpr *E) {
+    Visit(E->getParameterPack());
+    Visit(E->getArgumentPack());
+  }
+
   void VisitObjCAtCatchStmt(const ObjCAtCatchStmt *Node) {
 if (const VarDecl *CatchParam = Node->getCatchParamDecl())
   Visit(CatchParam);

diff  --git a/clang/lib/Sema/SemaTemplate.cpp b/clang/lib/Sema/SemaTemplate.cpp
index 3e8a7531..72bd35bea0c6 100644
--- a/clang/lib/Sema/SemaTemplate.cpp
+++ b/clang/lib/Sema/SemaTemplate.cpp
@@ -3560,9 +3560,8 @@ QualType Sema::CheckTemplateIdType(TemplateName Name,
 // Only substitute for the innermost template argument list.
 MultiLevelTemplateArgumentList TemplateArgLists;
 TemplateArgLists.addOuterTemplateArguments();
-unsigned Depth = AliasTemplate->getTemplateParameters()->getDepth();
-for (unsigned I = 0; I < Depth; ++I)
-  TemplateArgLists.addOuterTemplateArguments(None);
+TemplateArgLists.addOuterRetainedLevels(
+AliasTemplate->getTemplateParameters()->getDepth());
 
 LocalInstantiationScope Scope(*this);
 InstantiatingTemplate Inst(*this, TemplateLoc, Template);

diff  --git 
a/clang/test/AST/ast-dump-openmp-begin-declare-variant_template_1.cpp 
b/clang/test/AST/ast-dump-openmp-begin-declare-variant_template_1.cpp
index 6d36c1b90eb8..5916958b9462 100644
--- a/clang/test/AST/ast-dump-openmp-begin-declare-variant_template_1.cpp
+++ b/clang/test/AST/ast-dump-openmp-begin-declare-variant_template_1.cpp
@@ -153,7 +153,8 @@ int test() {
 // CHECK-NEXT: |   `-PseudoObjectExpr [[ADDR_85:0x[a-z0-9]*]]  'int'
 // CHECK-NEXT: | |-CallExpr [[ADDR_86:0x[a-z0-9]*]]  
'int'
 // CHECK-NEXT: | | `-SubstNonTypeTemplateParmExpr 
[[ADDR_87:0x[a-z0-9]*]]  'int (*)({{.*}})'
-// CHECK-NEXT: | |   `-UnaryOperator [[ADDR_88:0x[a-z0-9]*]]  
'int (*)({{.*}})' prefix '&' cannot overflow
+// CHECK-NEXT: | |   |-NonTypeTemplateParmDecl {{.*}} referenced 
'Ty':'int (*)()' depth 0 index 0 fn
+// CHECK-NEXT: | |   `-UnaryOperator [[ADDR_88:0x[a-z0-9]*]] 
 'int (*)({{.*}})' prefix '&' cannot overflow
 // CHECK-NEXT: | | `-DeclRefExpr [[ADDR_89:0x[a-z0-9]*]]  
'int ({{.*}})' {{.*}}Function [[ADDR_0]] 'also_before' 'int ({{.*}})'
 // CHECK-NEXT: | `-CallExpr [[ADDR_90:0x[a-z0-9]*]]  'int'
 // CHECK-NEXT: |   `-ImplicitCastExpr [[ADDR_91:0x[a-z0-9]*]] 
 'int (*)({{.*}})' 

diff  --git a/clang/test/SemaTemplate/alias-templates.cpp 
b/clang/test/SemaTemplate/alias-templates.cpp
index 80678bf22985..6dffd9489294 100644
--- a/clang/test/SemaTemplate/alias-templates.cpp
+++ b/clang/test/SemaTemplate/alias-templates.cpp
@@ -265,3 +265,28 @@ namespace an_alias_template_is_not_a_class_template {
 int z = Bar(); // expected-error {{use of template template parameter 
'Bar' requires template arguments}}
   }
 }
+
+namespace resolved_nttp {
+  template<typename T> struct A {
+    template<int N> using Arr = T[N];
+    Arr<3> a;
+  };
+  using TA = decltype(A<int>::a);
+  using TA = int[3];
+
+  template<typename T> struct B {
+    template<int ...N> using Fn = T(int(*...A)[N]);
+    Fn<1, 2, 3> *p;
+  };
+  using 

[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-23 Thread Alexey Lapshin via Phabricator via cfe-commits
avl updated this revision to Diff 272829.
avl edited the summary of this revision.
avl added a comment.

addressed comments:

1. removed PointerMayBeCaptured() used for CalledFunction.
2. rewrote CanTRE() to visit instructions only once.
3. replaced areAllLastFuncCallsRecursive() with isInTREPosition().

I have not yet addressed the request to stop using AllocaDerivedValueTracker, 
since there is an open question about it. I will address it as soon as the 
question is resolved.
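
For readers, here is a rough C++ sketch of the transformation this revision aims 
to allow. It assumes the test's call passes `&temp` (as the IR call with 
`%temp` suggests). `testLoop` is a hand-written approximation of the expected 
result, not output of the pass, and `g_count` is a renamed stand-in for the 
test's global `count`.

```cpp
#include <cassert>

static int g_count = 0;

// Reads through the pointer but never stores it anywhere (non-capturing).
void globalIncrement(const int *param) { g_count += *param; }

// Shape of the function under test: the recursive call is in tail position,
// and the only call that receives a pointer to the local does not capture it.
void testRecursive(int recurseCount) {
  if (recurseCount == 0)
    return;
  int temp = 10;
  globalIncrement(&temp);
  testRecursive(recurseCount - 1);
}

// Approximate equivalent after tail recursion elimination: a loop.
void testLoop(int recurseCount) {
  while (recurseCount != 0) {
    int temp = 10;
    globalIncrement(&temp);
    recurseCount = recurseCount - 1;
  }
}
```

Both versions perform the same sequence of `globalIncrement` calls, which is 
why the rewrite is safe as long as no callee keeps the pointer to `temp`.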


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085

Files:
  llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
  llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll

Index: llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
===
--- /dev/null
+++ llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
@@ -0,0 +1,69 @@
+; RUN: opt < %s -tailcallelim -verify-dom-info -S | FileCheck %s
+
+; IR for that test was generated from the following C++ source:
+;
+;int count;
+;__attribute__((noinline)) void globalIncrement(const int* param) { count += *param; }
+;
+;void test(int recurseCount)
+;{
+;if (recurseCount == 0) return;
+;int temp = 10;
+;globalIncrement(&temp);
+;test(recurseCount - 1);
+;}
+;
+
+@count = dso_local local_unnamed_addr global i32 0, align 4
+
+; Function Attrs: nofree noinline norecurse nounwind uwtable
+define dso_local void @_Z15globalIncrementPKi(i32* nocapture readonly %param) local_unnamed_addr #0 {
+entry:
+  %0 = load i32, i32* %param, align 4
+  %1 = load i32, i32* @count, align 4
+  %add = add nsw i32 %1, %0
+  store i32 %add, i32* @count, align 4
+  ret void
+}
+
+; Test that TRE could be done for recursive tail routine containing
+; call to function receiving a pointer to local stack. 
+
+; CHECK: void @_Z4testi
+; CHECK: br label %tailrecurse
+; CHECK: tailrecurse:
+; CHECK-NOT: call void @_Z4testi
+; CHECK: br label %tailrecurse
+; CHECK-NOT: call void @_Z4testi
+; CHECK: ret
+
+; Function Attrs: nounwind uwtable
+define dso_local void @_Z4testi(i32 %recurseCount) local_unnamed_addr #1 {
+entry:
+  %temp = alloca i32, align 4
+  %cmp = icmp eq i32 %recurseCount, 0
+  br i1 %cmp, label %return, label %if.end
+
+if.end:   ; preds = %entry
+  %0 = bitcast i32* %temp to i8*
+  call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull %0) #6
+  store i32 10, i32* %temp, align 4
+  call void @_Z15globalIncrementPKi(i32* nonnull %temp)
+  %sub = add nsw i32 %recurseCount, -1
+  call void @_Z4testi(i32 %sub)
+  call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %0) #6
+  br label %return
+
+return:   ; preds = %entry, %if.end
+  ret void
+}
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #2
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.end.p0i8(i64 immarg, i8* nocapture) #2
+
+attributes #0 = { nofree noinline norecurse nounwind uwtable }
+attributes #1 = { nounwind uwtable }
+attributes #2 = { argmemonly nounwind willreturn }
Index: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
===
--- llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
+++ llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
@@ -89,16 +89,6 @@
 STATISTIC(NumRetDuped,   "Number of return duplicated");
 STATISTIC(NumAccumAdded, "Number of accumulators introduced");
 
-/// Scan the specified function for alloca instructions.
-/// If it contains any dynamic allocas, returns false.
-static bool canTRE(Function &F) {
-  // Because of PR962, we don't TRE dynamic allocas.
-  return llvm::all_of(instructions(F), [](Instruction &I) {
-    auto *AI = dyn_cast<AllocaInst>(&I);
-    return !AI || AI->isStaticAlloca();
-  });
-}
-
 namespace {
 struct AllocaDerivedValueTracker {
   // Start at a root value and walk its use-def chain to mark calls that use the
@@ -185,11 +175,9 @@
 };
 }
 
-static bool markTails(Function &F, bool &AllCallsAreTailCalls,
-                      OptimizationRemarkEmitter *ORE) {
+static bool markTails(Function &F, OptimizationRemarkEmitter *ORE) {
   if (F.callsFunctionThatReturnsTwice())
 return false;
-  AllCallsAreTailCalls = true;
 
   // The local stack holds all alloca instructions and all byval arguments.
   AllocaDerivedValueTracker Tracker;
@@ -272,11 +260,8 @@
 }
   }
 
-  if (!IsNoTail && Escaped == UNESCAPED && !Tracker.AllocaUsers.count(CI)) {
+  if (!IsNoTail && Escaped == UNESCAPED && !Tracker.AllocaUsers.count(CI))
 DeferredTails.push_back(CI);
-  } else {
-AllCallsAreTailCalls = false;
-  }
 }
 
 for (auto *SuccBB : make_range(succ_begin(BB), succ_end(BB))) {
@@ -313,8 +298,6 @@
   LLVM_DEBUG(dbgs() << "Marked as tail call 

[PATCH] D82345: [sve][acle] Implement some of the C intrinsics for brain float.

2020-06-23 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli updated this revision to Diff 272828.
fpetrogalli added a comment.

Updates:

1. Extracted bfloat C tests into separate files (`*-bfloat.c`).
2. Added missing tests (`clast[a|b]`, `last[a|b]`).
3. Tested that a warning is raised for a missing declaration when the macro 
`__ARM_FEATURE_SVE_BF16` is not defined.
4. Cosmetic changes to formatting.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82345/new/

https://reviews.llvm.org/D82345

Files:
  clang/include/clang/Basic/arm_sve.td
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_clasta-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_clastb-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dup-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dupq-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_insr-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_lasta-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_lastb-bfloat.c
  llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
  llvm/lib/Target/AArch64/SVEInstrFormats.td
  llvm/test/CodeGen/AArch64/sve-intrinsics-dup-bfloat.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-dup-x.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-scalar-to-vec.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-shifts.ll

Index: llvm/test/CodeGen/AArch64/sve-intrinsics-shifts.ll
===
--- llvm/test/CodeGen/AArch64/sve-intrinsics-shifts.ll
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-shifts.ll
@@ -1,4 +1,4 @@
-; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve < %s 2>%t | FileCheck %s
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve,+bf16 < %s 2>%t | FileCheck %s
 ; RUN: FileCheck --check-prefix=WARN --allow-empty %s <%t
 
 ; WARN-NOT: warning
@@ -165,6 +165,14 @@
   ret  %out
 }
 
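+define <vscale x 8 x bfloat> @insr_bf16(<vscale x 8 x bfloat> %a, bfloat %b) {
+; CHECK-LABEL: insr_bf16:
+; CHECK: insr z0.h, h1
+; CHECK-NEXT: ret
+  %out = call <vscale x 8 x bfloat> @llvm.aarch64.sve.insr.nxv8bf16(<vscale x 8 x bfloat> %a, bfloat %b)
+  ret <vscale x 8 x bfloat> %out
+}
+
 define <vscale x 4 x float> @insr_f32(<vscale x 4 x float> %a, float %b) {
 ; CHECK-LABEL: insr_f32:
 ; CHECK: insr z0.s, s1
@@ -348,6 +356,7 @@
 declare <vscale x 4 x i32> @llvm.aarch64.sve.insr.nxv4i32(<vscale x 4 x i32>, i32)
 declare <vscale x 2 x i64> @llvm.aarch64.sve.insr.nxv2i64(<vscale x 2 x i64>, i64)
 declare <vscale x 8 x half> @llvm.aarch64.sve.insr.nxv8f16(<vscale x 8 x half>, half)
+declare <vscale x 8 x bfloat> @llvm.aarch64.sve.insr.nxv8bf16(<vscale x 8 x bfloat>, bfloat)
 declare <vscale x 4 x float> @llvm.aarch64.sve.insr.nxv4f32(<vscale x 4 x float>, float)
 declare <vscale x 2 x double> @llvm.aarch64.sve.insr.nxv2f64(<vscale x 2 x double>, double)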
 
Index: llvm/test/CodeGen/AArch64/sve-intrinsics-scalar-to-vec.ll
===
--- llvm/test/CodeGen/AArch64/sve-intrinsics-scalar-to-vec.ll
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-scalar-to-vec.ll
@@ -1,4 +1,4 @@
-; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve < %s 2>%t | FileCheck %s
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve,+bf16 < %s 2>%t | FileCheck %s
 ; RUN: FileCheck --check-prefix=WARN --allow-empty %s <%t
 
 ; WARN-NOT: warning
@@ -57,6 +57,16 @@
   ret  %out
 }
 
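+define <vscale x 8 x bfloat> @dup_bf16(<vscale x 8 x bfloat> %a, <vscale x 8 x i1> %pg, bfloat %b) {
+; CHECK-LABEL: dup_bf16:
+; CHECK: mov z0.h, p0/m, h1
+; CHECK-NEXT: ret
+  %out = call <vscale x 8 x bfloat> @llvm.aarch64.sve.dup.nxv8bf16(<vscale x 8 x bfloat> %a,
+                                                                    <vscale x 8 x i1> %pg,
+                                                                    bfloat %b)
+  ret <vscale x 8 x bfloat> %out
+}
+
 define <vscale x 4 x float> @dup_f32(<vscale x 4 x float> %a, <vscale x 4 x i1> %pg, float %b) {
 ; CHECK-LABEL: dup_f32:
 ; CHECK: mov z0.s, p0/m, s1
@@ -82,5 +92,6 @@
 declare <vscale x 4 x i32> @llvm.aarch64.sve.dup.nxv4i32(<vscale x 4 x i32>, <vscale x 4 x i1>, i32)
 declare <vscale x 2 x i64> @llvm.aarch64.sve.dup.nxv2i64(<vscale x 2 x i64>, <vscale x 2 x i1>, i64)
 declare <vscale x 8 x half> @llvm.aarch64.sve.dup.nxv8f16(<vscale x 8 x half>, <vscale x 8 x i1>, half)
+declare <vscale x 8 x bfloat> @llvm.aarch64.sve.dup.nxv8bf16(<vscale x 8 x bfloat>, <vscale x 8 x i1>, bfloat)
 declare <vscale x 4 x float> @llvm.aarch64.sve.dup.nxv4f32(<vscale x 4 x float>, <vscale x 4 x i1>, float)
 declare <vscale x 2 x double> @llvm.aarch64.sve.dup.nxv2f64(<vscale x 2 x double>, <vscale x 2 x i1>, double)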
Index: llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
===
--- llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
@@ -1,4 +1,4 @@
-; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve < %s 2>%t | FileCheck %s
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve,+bf16 < %s 2>%t | FileCheck %s
 ; RUN: FileCheck --check-prefix=WARN --allow-empty %s <%t
 
 ; WARN-NOT: warning
@@ -57,6 +57,16 @@
   ret  %out
 }
 
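+define <vscale x 8 x bfloat> @clasta_bf16(<vscale x 8 x i1> %pg, <vscale x 8 x bfloat> %a, <vscale x 8 x bfloat> %b) {
+; CHECK-LABEL: clasta_bf16:
+; CHECK: clasta z0.h, p0, z0.h, z1.h
+; CHECK-NEXT: ret
+  %out = call <vscale x 8 x bfloat> @llvm.aarch64.sve.clasta.nxv8bf16(<vscale x 8 x i1> %pg,
+                                                                       <vscale x 8 x bfloat> %a,
+                                                                       <vscale x 8 x bfloat> %b)
+  ret <vscale x 8 x bfloat> %out
+}
+
 define <vscale x 4 x float> @clasta_f32(<vscale x 4 x i1> %pg, <vscale x 4 x float> %a, <vscale x 4 x float> %b) {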
 ; CHECK-LABEL: clasta_f32:
 ; CHECK: clasta z0.s, p0, z0.s, z1.s
@@ -131,6 +141,16 @@
   ret half %out
 }
 
+define bfloat @clasta_n_bf16(<vscale x 8 x i1> %pg, bfloat %a, <vscale x 8 x bfloat> %b) {
+; CHECK-LABEL: clasta_n_bf16:
+; CHECK: clasta h0, p0, h0, 

[PATCH] D82249: [HWASan] Disable GlobalISel/FastISel for HWASan Globals.

2020-06-23 Thread Mitch Phillips via Phabricator via cfe-commits
hctim added a comment.

In D82249#2109920, @eugenis wrote:

> I'm OK with this as a workaround, but it would be more natural to detect the 
> unsupported IR pattern in globalisel and fall back instead of disabling it 
> entirely. Is it difficult to do for some reason?


Eh, it's not an unsupported IR pattern that's the problem, it's that the IR is 
lowered into `adrp + add` so that the `add` can be folded into a `ldr/str` as 
an offset. IMO on the scale of "painting over the problem vs. fixing it", this 
patch is 100% paint, forced fallback with `MO_TAGGED` is 80% paint for maybe 
60% of the work of just fixing it.

I'm working on fixing this properly now that we know we don't need to make a 
less-risky, fast-to-deploy patch. If that all falls into place this patchset 
will just become obsolete anyway.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82249/new/

https://reviews.llvm.org/D82249





[PATCH] D79719: [AIX] Implement AIX special alignment rule about double/long double

2020-06-23 Thread Xiangling Liao via Phabricator via cfe-commits
Xiangling_L updated this revision to Diff 272824.
Xiangling_L marked 2 inline comments as done.
Xiangling_L added a comment.

Adjust the function name;
Adjust the comment;


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79719/new/

https://reviews.llvm.org/D79719

Files:
  clang/include/clang/AST/RecordLayout.h
  clang/include/clang/Basic/TargetInfo.h
  clang/lib/AST/ASTContext.cpp
  clang/lib/AST/RecordLayout.cpp
  clang/lib/AST/RecordLayoutBuilder.cpp
  clang/lib/Basic/Targets/OSTargets.h
  clang/lib/Basic/Targets/PPC.h
  clang/test/Layout/aix-double-struct-member.cpp
  clang/test/Layout/aix-no-unique-address-with-double.cpp
  clang/test/Layout/aix-virtual-function-and-base-with-double.cpp

Index: clang/test/Layout/aix-virtual-function-and-base-with-double.cpp
===
--- /dev/null
+++ clang/test/Layout/aix-virtual-function-and-base-with-double.cpp
@@ -0,0 +1,112 @@
+// RUN: %clang_cc1 -emit-llvm-only -triple powerpc-ibm-aix-xcoff \
+// RUN: -fdump-record-layouts -fsyntax-only %s 2>/dev/null | \
+// RUN:   FileCheck --check-prefixes=CHECK,CHECK32 %s
+
+// RUN: %clang_cc1 -emit-llvm-only -triple powerpc64-ibm-aix-xcoff \
+// RUN: -fdump-record-layouts -fsyntax-only %s 2>/dev/null | \
+// RUN:   FileCheck --check-prefixes=CHECK,CHECK64 %s
+
+namespace test1 {
+struct A {
+  double d1;
+  virtual void boo() {}
+};
+
+struct B {
+  double d2;
+  A a;
+};
+
+struct C : public A {
+  double d3;
+};
+
+int i = sizeof(B);
+int j = sizeof(C);
+
+// CHECK:  *** Dumping AST Record Layout
+// CHECK-NEXT:0 | struct test1::A
+// CHECK-NEXT:0 |   (A vtable pointer)
+// CHECK32-NEXT:  4 |   double d1
+// CHECK32-NEXT:| [sizeof=12, dsize=12, align=4, preferredalign=4,
+// CHECK32-NEXT:|  nvsize=12, nvalign=4, preferrednvalign=4]
+// CHECK64-NEXT:  8 |   double d1
+// CHECK64-NEXT:| [sizeof=16, dsize=16, align=8, preferredalign=8,
+// CHECK64-NEXT:|  nvsize=16, nvalign=8, preferrednvalign=8]
+
+// CHECK:  *** Dumping AST Record Layout
+// CHECK-NEXT:0 | struct test1::B
+// CHECK-NEXT:0 |   double d2
+// CHECK-NEXT:8 |   struct test1::A a
+// CHECK-NEXT:8 | (A vtable pointer)
+// CHECK32-NEXT: 12 | double d1
+// CHECK32-NEXT:| [sizeof=24, dsize=20, align=4, preferredalign=8,
+// CHECK32-NEXT:|  nvsize=20, nvalign=4, preferrednvalign=8]
+// CHECK64-NEXT: 16 | double d1
+// CHECK64-NEXT:| [sizeof=24, dsize=24, align=8, preferredalign=8,
+// CHECK64-NEXT:|  nvsize=24, nvalign=8, preferrednvalign=8]
+
+// CHECK:  *** Dumping AST Record Layout
+// CHECK-NEXT:0 | struct test1::C
+// CHECK-NEXT:0 |   struct test1::A (primary base)
+// CHECK-NEXT:0 | (A vtable pointer)
+// CHECK32-NEXT:  4 | double d1
+// CHECK32-NEXT: 12 |   double d3
+// CHECK32-NEXT:| [sizeof=20, dsize=20, align=4, preferredalign=4,
+// CHECK32-NEXT:|  nvsize=20, nvalign=4, preferrednvalign=4]
+// CHECK64-NEXT:  8 | double d1
+// CHECK64-NEXT: 16 |   double d3
+// CHECK64-NEXT:| [sizeof=24, dsize=24, align=8, preferredalign=8,
+// CHECK64-NEXT:|  nvsize=24, nvalign=8, preferrednvalign=8]
+
+} // namespace test1
+
+namespace test2 {
+struct A {
+  long long l1;
+};
+
+struct B : public virtual A {
+  double d2;
+};
+
+#pragma pack(2)
+struct C : public virtual A {
+  double __attribute__((aligned(4))) d3;
+};
+
+int i = sizeof(B);
+int j = sizeof(C);
+
+// CHECK:  *** Dumping AST Record Layout
+// CHECK-NEXT:0 | struct test2::A
+// CHECK-NEXT:0 |   long long l1
+// CHECK-NEXT:  | [sizeof=8, dsize=8, align=8, preferredalign=8,
+// CHECK-NEXT:  |  nvsize=8, nvalign=8, preferrednvalign=8]
+
+// CHECK:  *** Dumping AST Record Layout
+// CHECK-NEXT:0 | struct test2::B
+// CHECK-NEXT:0 |   (B vtable pointer)
+// CHECK32-NEXT:  4 |   double d2
+// CHECK64-NEXT:  8 |   double d2
+// CHECK-NEXT:   16 |   struct test2::A (virtual base)
+// CHECK-NEXT:   16 | long long l1
+// CHECK-NEXT:  | [sizeof=24, dsize=24, align=8, preferredalign=8,
+// CHECK32-NEXT:|  nvsize=12, nvalign=4, preferrednvalign=4]
+// CHECK64-NEXT:|  nvsize=16, nvalign=8, preferrednvalign=8]
+
+// CHECK:  *** Dumping AST Record Layout
+// CHECK-NEXT:0 | struct test2::C
+// CHECK-NEXT:0 |   (C vtable pointer)
+// CHECK32-NEXT:  4 |   double d3
+// CHECK32-NEXT: 12 |   struct test2::A (virtual base)
+// CHECK32-NEXT: 12 | long long l1
+// CHECK32-NEXT:| [sizeof=20, dsize=20, align=2, preferredalign=2,
+// CHECK32-NEXT:|  nvsize=12, nvalign=2, 

[PATCH] D81869: Modify FPFeatures to use delta not absolute settings to solve PCH compatibility problems

2020-06-23 Thread Melanie Blower via Phabricator via cfe-commits
mibintc added a comment.

I need to make another revision that makes a couple more of the fp options, 
like ffp-contract, "benign" and adds a test case demonstrating that fp options 
don't carry over from pch-create to pch-use.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81869/new/

https://reviews.llvm.org/D81869





[PATCH] D82249: [HWASan] Disable GlobalISel/FastISel for HWASan Globals.

2020-06-23 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment.

In D82249#2107845, @hctim wrote:

> In D82249#2105036 , @arsenm wrote:
>
> > Is the fallback not working correctly in this case for some reason?
>
>
> I'm fairly sure that G_GLOBAL_VALUE used to fallback onto SelectionDAGISel, 
> and that was changed in D78465 . Because we 
> only implemented the custom `adrp+movk` lowering in SelectionDAGISel, making 
> that instruction sequence no longer fallback, so we don't get our proper 
> lowering for HWASan globals.


I don't follow. It no longer falls back, so what is the problem?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82249/new/

https://reviews.llvm.org/D82249





[PATCH] D82403: fix test failure for clang/test/CodeGen/builtin-expect-with-probability.cpp

2020-06-23 Thread Erich Keane via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG47fb21d2ea90: fix test failure for 
clang/test/CodeGen/builtin-expect-with-probability.cpp (authored by LukeZhuang, 
committed by erichkeane).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82403/new/

https://reviews.llvm.org/D82403

Files:
  clang/test/CodeGen/builtin-expect-with-probability.cpp

Index: clang/test/CodeGen/builtin-expect-with-probability.cpp
===
--- clang/test/CodeGen/builtin-expect-with-probability.cpp
+++ clang/test/CodeGen/builtin-expect-with-probability.cpp
@@ -1,14 +1,38 @@
-// RUN: %clang_cc1 -emit-llvm -o - %s -O1 | FileCheck %s
-// RUN: %clang_cc1 -emit-llvm -o - -fexperimental-new-pass-manager %s -O1 | FileCheck %s
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown -emit-llvm -o - %s -O1 -disable-llvm-passes | FileCheck %s --check-prefix=ALL --check-prefix=O1
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown -emit-llvm -o - %s -O0 | FileCheck %s --check-prefix=ALL --check-prefix=O0
 extern int global;
 
+int expect_taken(int x) {
+// ALL-LABEL: expect_taken
+// O1:call i64 @llvm.expect.with.probability.i64(i64 {{%.*}}, i64 1, double 9.00e-01)
+// O0-NOT:@llvm.expect.with.probability
+
+  if (__builtin_expect_with_probability(x == 100, 1, 0.9)) {
+return 0;
+  }
+  return x;
+}
+
+int expect_not_taken(int x) {
+// ALL-LABEL: expect_not_taken
+// O1:call i64 @llvm.expect.with.probability.i64(i64 {{%.*}}, i64 0, double 9.00e-01)
+// O0-NOT:@llvm.expect.with.probability
+
+  if (__builtin_expect_with_probability(x == 100, 0, 0.9)) {
+return 0;
+  }
+  return x;
+}
+
 struct S {
   static constexpr int prob = 1;
 };
 
 template<typename T>
-int expect_taken(int x) {
-// CHECK: !{{[0-9]+}} = !{!"branch_weights", i32 2147483647, i32 1}
+int expect_taken_template(int x) {
+// ALL-LABEL: expect_taken_template
+// O1:call i64 @llvm.expect.with.probability.i64(i64 {{%.*}}, i64 1, double 1.00e+00)
+// O0-NOT:@llvm.expect.with.probability
 
 	if (__builtin_expect_with_probability (x == 100, 1, T::prob)) {
 		return 0;
@@ -17,20 +41,31 @@
 }
 
 int f() {
-  return expect_taken<S>(global);
+  return expect_taken_template<S>(global);
 }
 
-int expect_taken2(int x) {
-  // CHECK: !{{[0-9]+}} = !{!"branch_weights", i32 1932735283, i32 214748366}
+int x;
+extern "C" {
+  int y(void);
+}
+void foo();
 
-  if (__builtin_expect_with_probability(x == 100, 1, 0.9)) {
-return 0;
-  }
-  return x;
+void expect_value_side_effects() {
+// ALL-LABEL: expect_value_side_effects
+// ALL: [[CALL:%.*]] = call i32 @y
+// O1:  [[SEXT:%.*]] = sext i32 [[CALL]] to i64
+// O1:  call i64 @llvm.expect.with.probability.i64(i64 {{%.*}}, i64 [[SEXT]], double 6.00e-01)
+// O0-NOT: @llvm.expect.with.probability
+
+  if (__builtin_expect_with_probability(x, y(), 0.6))
+foo();
 }
 
-int expect_taken3(int x) {
-  // CHECK: !{{[0-9]+}} = !{!"branch_weights", i32 107374184, i32 107374184, i32 1717986918, i32 107374184, i32 107374184}
+int switch_cond(int x) {
+// ALL-LABEL: switch_cond
+// O1:call i64 @llvm.expect.with.probability.i64(i64 {{%.*}}, i64 1, double 8.00e-01)
+// O0-NOT:@llvm.expect.with.probability
+
   switch (__builtin_expect_with_probability(x, 1, 0.8)) {
   case 0:
 x = x + 0;
@@ -45,3 +80,22 @@
   }
   return x;
 }
+
+constexpr double prob = 0.8;
+
+int variable_expected(int stuff) {
+// ALL-LABEL: variable_expected
+// O1: call i64 @llvm.expect.with.probability.i64(i64 {{%.*}}, i64 {{%.*}}, double 8.00e-01)
+// O0-NOT: @llvm.expect.with.probability
+
+  int res = 0;
+
+  switch(__builtin_expect_with_probability(stuff, stuff, prob)) {
+case 0:
+  res = 1;
+  break;
+default:
+  break;
+  }
+  return res;
+}


[PATCH] D82373: [CodeComplete] Tweak code completion for `typename`

2020-06-23 Thread Kadir Cetinkaya via Phabricator via cfe-commits
kadircet accepted this revision.
kadircet added a comment.
This revision is now accepted and ready to land.

LGTM: the previous version only saves a single character (pressing tab once 
after the qualifier vs. hitting `:` twice), while it is annoying in the 
non-qualified case, where you need to delete the rest.

let me know if I should land this for you.




Comment at: clang/lib/Sema/SemaCodeComplete.cpp:1693
 Builder.AddChunk(CodeCompletionString::CK_HorizontalSpace);
-Builder.AddPlaceholderChunk("qualifier");
-Builder.AddTextChunk("::");
-Builder.AddPlaceholderChunk("name");
+Builder.AddPlaceholderChunk("identifier");
 Results.AddResult(Result(Builder.TakeString()));

nit: let's use `name` instead of `identifier`


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82373/new/

https://reviews.llvm.org/D82373





[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-23 Thread Alexey Lapshin via Phabricator via cfe-commits
avl marked an inline comment as done.
avl added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:825
+  // The local stack holds all alloca instructions and all byval arguments.
+  AllocaDerivedValueTracker Tracker;
+  for (Argument  : F.args()) {

efriedma wrote:
> avl wrote:
> > efriedma wrote:
> > > avl wrote:
> > > > efriedma wrote:
> > > > > avl wrote:
> > > > > > efriedma wrote:
> > > > > > > avl wrote:
> > > > > > > > efriedma wrote:
> > > > > > > > > Do you have to redo the AllocaDerivedValueTracker analysis?  
> > > > > > > > > Is it not enough that the call you're trying to TRE is marked 
> > > > > > > > > "tail"?
> > > > > > > > >Do you have to redo the AllocaDerivedValueTracker analysis?
> > > > > > > > 
> > > > > > > > AllocaDerivedValueTracker analysis(done in markTails) could be 
> > > > > > > > reused here. 
> > > > > > > > But marking, done in markTails(), looks like separate tasks. 
> > > > > > > > i.e. it is better 
> > > > > > > > to make TRE not depending on markTails(). There is a review for 
> > > > > > > > this - https://reviews.llvm.org/D60031
> > > > > > > > Thus such separation looks useful(To not reuse result of 
> > > > > > > > markTails but have it computed inplace).
> > > > > > > > 
> > > > > > > > > Is it not enough that the call you're trying to TRE is marked 
> > > > > > > > > "tail"?
> > > > > > > > 
> > > > > > > > It is not enough that call which is subject to TRE is marked 
> > > > > > > > "Tail".
> > > > > > > > It also should be checked that other calls does not capture 
> > > > > > > > pointer to local stack: 
> > > > > > > > 
> > > > > > > > ```
> > > > > > > > // do not do TRE if any pointer to local stack has escaped.
> > > > > > > > if (!Tracker.EscapePoints.empty())
> > > > > > > >return false;
> > > > > > > > 
> > > > > > > > ```
> > > > > > > > 
> > > > > > > > It is not enough that call which is subject to TRE is marked 
> > > > > > > > "Tail". It also should be checked that other calls does not 
> > > > > > > > capture pointer to local stack:
> > > > > > > 
> > > > > > > If there's an escaped pointer to the local stack, we wouldn't 
> > > > > > > infer "tail" in the first place, would we?
> > > > > > If function receives pointer to alloca then it would not be marked 
> > > > > > with "Tail". Then we do not have a possibility to understand 
> > > > > > whether this function receives pointer to alloca but does not 
> > > > > > capture it:
> > > > > > 
> > > > > > ```
> > > > > > void test(int recurseCount)
> > > > > > {
> > > > > > if (recurseCount == 0) return;
> > > > > > int temp = 10;
> > > > > > globalIncrement();
> > > > > > test(recurseCount - 1);
> > > > > > }
> > > > > > ```
> > > > > > 
> > > > > > test - marked with Tail.
> > > > > > globalIncrement - not marked with Tail. But TRE could be done since 
> > > > > > it does not capture pointer. But if it will capture the pointer 
> > > > > > then we could not do TRE. So we need to check 
> > > > > > !Tracker.EscapePoints.empty().
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > test - marked with Tail.
> > > > > 
> > > > > For the given code, TRE won't mark the recursive call "tail".  That 
> > > > > transform isn't legal: the recursive call could access the caller's 
> > > > > version of "temp".
> > > > >For the given code, TRE won't mark the recursive call "tail". That 
> > > > >transform isn't legal: the recursive call could access the caller's 
> > > > >version of "temp".
> > > > 
> > > > it looks like recursive call could NOT access the caller's version of 
> > > > "temp":
> > > > 
> > > > ```
> > > > test(recurseCount - 1);
> > > > ```
> > > > 
> > > > Caller`s version of temp is accessed by non-recursive call:
> > > > 
> > > > ```
> > > > globalIncrement();
> > > > ```
> > > > 
> > > > If globalIncrement does not capture the "" then TRE looks to be 
> > > > legal for that case. 
> > > > 
> > > > globalIncrement() would not be marked with "Tail". test() would be 
> > > > marked with Tail.
> > > > 
> > > > Thus the pre-requisite for TRE would be: tail-recursive call must not 
> > > > receive pointer to local stack(Tail) and non-recursive calls must not 
> > > > capture the pointer to local stack.
> > > Can you give a complete IR example where we infer "tail", but TRE is 
> > > illegal?
> > > 
> > > Can you give a complete IR example, we we don't infer "tail", but we 
> > > still do the TRE transform here?
> > >Can you give a complete IR example where we infer "tail", but TRE is 
> > >illegal?
> > 
> > there is no such example. Currently all cases where we  infer "tail" would 
> > be valid for TRE.
> > 
> > >Can you give a complete IR example, we we don't infer "tail", but we still 
> > >do the TRE transform here?
> > 
> > For the following example current code base would not infer "tail" for 
> > _Z15globalIncrementPKi and as the result would not do TRE for _Z4testi.
> > This patch changes this behavior: 

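Editorial illustration (not part of the original message, and not the IR it refers to): a minimal C++ sketch of the case under discussion. It assumes, from the mangled names `_Z15globalIncrementPKi` and `_Z4testi`, that `globalIncrement` takes a `const int *` and `test` takes an `int`. The `counter` variable and the `testAsLoop` function are hypothetical and only show what an eliminated tail recursion would be expected to compute.

```cpp
// Hypothetical global state so the effect of each call is observable.
static int counter = 0;

// Reads through the pointer but never stores it, so the pointer to the
// caller's local variable is not captured.
void globalIncrement(const int *value) { counter += *value; }

// Recursive form: the recursive call receives no pointer to the local stack,
// and the only call that does receive one (globalIncrement) does not capture it.
void test(int recurseCount) {
  if (recurseCount == 0)
    return;
  int temp = 10;
  globalIncrement(&temp);
  test(recurseCount - 1);
}

// Loop form that tail recursion elimination would conceptually produce.
// Each iteration reuses the same stack slot for temp, which is only safe
// because no earlier call kept a pointer to it.
void testAsLoop(int recurseCount) {
  while (recurseCount != 0) {
    int temp = 10;
    globalIncrement(&temp);
    recurseCount -= 1;
  }
}
```

If `globalIncrement` stored the pointer in a global instead, the two forms could behave differently, which is the situation the escape-point check is meant to rule out.
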
[PATCH] D82352: [clangd] Make background index thread count calculation clearer

2020-06-23 Thread Kadir Cetinkaya via Phabricator via cfe-commits
kadircet accepted this revision.
kadircet added a comment.
This revision is now accepted and ready to land.

Thanks LGTM!




Comment at: clang-tools-extra/clangd/index/Background.h:140
+  // In production an explicit value is passed.
+  size_t ThreadPoolSize = 4,
   std::function OnProgress = nullptr);

Maybe use `ClangdServer::optsForTest().AsyncThreadsCount` instead of hardcoding 
4 in another place?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82352/new/

https://reviews.llvm.org/D82352





[PATCH] D81769: [clang-tidy] Repair various issues with modernize-avoid-bind

2020-06-23 Thread Jeff Trull via Phabricator via cfe-commits
jaafar updated this revision to Diff 272813.
jaafar marked an inline comment as done.
jaafar added a comment.

One more simplification from Aaron. Thanks!


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81769/new/

https://reviews.llvm.org/D81769

Files:
  clang-tools-extra/clang-tidy/modernize/AvoidBindCheck.cpp
  clang-tools-extra/test/clang-tidy/checkers/modernize-avoid-bind-permissive-parameter-list.cpp
  clang-tools-extra/test/clang-tidy/checkers/modernize-avoid-bind.cpp

Index: clang-tools-extra/test/clang-tidy/checkers/modernize-avoid-bind.cpp
===
--- clang-tools-extra/test/clang-tidy/checkers/modernize-avoid-bind.cpp
+++ clang-tools-extra/test/clang-tidy/checkers/modernize-avoid-bind.cpp
@@ -7,11 +7,11 @@
 
 template 
 bind_rt bind(Fp &&, Arguments &&...);
-}
+} // namespace impl
 
 template 
 T ref(T );
-}
+} // namespace std
 
 namespace boost {
 template 
@@ -58,12 +58,33 @@
 
 void UseF(F);
 
+struct G {
+  G() : _member(0) {}
+  G(int m) : _member(m) {}
+
+  template 
+  void operator()(T) const {}
+
+  int _member;
+};
+
+template 
+struct H {
+  void operator()(T) const {};
+};
+
 struct placeholder {};
 placeholder _1;
 placeholder _2;
 
+namespace placeholders {
+using ::_1;
+using ::_2;
+} // namespace placeholders
+
 int add(int x, int y) { return x + y; }
 int addThree(int x, int y, int z) { return x + y + z; }
+void sub(int , int y) { x += y; }
 
 // Let's fake a minimal std::function-like facility.
 namespace std {
@@ -107,6 +128,7 @@
   int MemberVariable;
   static int StaticMemberVariable;
   F MemberStruct;
+  G MemberStructWithData;
 
   void testCaptureByValue(int Param, F f) {
 int x = 3;
@@ -145,6 +167,11 @@
 auto GGG = boost::bind(UseF, MemberStruct);
 // CHECK-MESSAGES: :[[@LINE-1]]:16: warning: prefer a lambda to boost::bind [modernize-avoid-bind]
 // CHECK-FIXES: auto GGG = [this] { return UseF(MemberStruct); };
+
+auto HHH = std::bind(add, MemberStructWithData._member, 1);
+// CHECK-MESSAGES: :[[@LINE-1]]:16: warning: prefer a lambda to std::bind
+// Correctly distinguish data members of other classes
+// CHECK-FIXES: auto HHH = [capture0 = MemberStructWithData._member] { return add(capture0, 1); };
   }
 };
 
@@ -217,17 +244,38 @@
   auto EEE = std::bind(*D::create(), 1, 2);
   // CHECK-MESSAGES: :[[@LINE-1]]:14: warning: prefer a lambda to std::bind
   // CHECK-FIXES: auto EEE = [Func = *D::create()] { return Func(1, 2); };
+
+  auto FFF = std::bind(G(), 1);
+  // CHECK-MESSAGES: :[[@LINE-1]]:14: warning: prefer a lambda to std::bind
+  // Templated function call operators may be used
+  // CHECK-FIXES: auto FFF = [] { return G()(1); };
+
+  int CTorArg = 42;
+  auto GGG = std::bind(G(CTorArg), 1);
+  // CHECK-MESSAGES: :[[@LINE-1]]:14: warning: prefer a lambda to std::bind
+  // Function objects with constructor arguments should be captured
+  // CHECK-FIXES: auto GGG = [Func = G(CTorArg)] { return Func(1); };
 }
 
+template 
+void testMemberFnOfClassTemplate(T) {
+  auto HHH = std::bind(H(), 42);
+  // CHECK-MESSAGES: :[[@LINE-1]]:14: warning: prefer a lambda to std::bind
+  // Ensure function class template arguments are preserved
+  // CHECK-FIXES: auto HHH = [] { return H()(42); };
+}
+
+template void testMemberFnOfClassTemplate(int);
+
 void testPlaceholders() {
   int x = 2;
   auto AAA = std::bind(add, x, _1);
   // CHECK-MESSAGES: :[[@LINE-1]]:14: warning: prefer a lambda to std::bind
-  // CHECK-FIXES: auto AAA = [x](auto && PH1) { return add(x, PH1); };
+  // CHECK-FIXES: auto AAA = [x](auto && PH1) { return add(x, std::forward(PH1)); };
 
   auto BBB = std::bind(add, _2, _1);
   // CHECK-MESSAGES: :[[@LINE-1]]:14: warning: prefer a lambda to std::bind
-  // CHECK-FIXES: auto BBB = [](auto && PH1, auto && PH2) { return add(PH2, PH1); };
+  // CHECK-FIXES: auto BBB = [](auto && PH1, auto && PH2) { return add(std::forward(PH2), std::forward(PH1)); };
 
   // No fix is applied for reused placeholders.
   auto CCC = std::bind(add, _1, _1);
@@ -238,7 +286,12 @@
   // unnamed parameters.
   auto DDD = std::bind(add, _2, 1);
   // CHECK-MESSAGES: :[[@LINE-1]]:14: warning: prefer a lambda to std::bind
-  // CHECK-FIXES: auto DDD = [](auto &&, auto && PH2) { return add(PH2, 1); };
+  // CHECK-FIXES: auto DDD = [](auto &&, auto && PH2) { return add(std::forward(PH2), 1); };
+
+  // Namespace-qualified placeholders are valid too
+  auto EEE = std::bind(add, placeholders::_2, 1);
+  // CHECK-MESSAGES: :[[@LINE-1]]:14: warning: prefer a lambda to std::bind
+  // CHECK-FIXES: auto EEE = [](auto &&, auto && PH2) { return add(std::forward(PH2), 1); };
 }
 
 void testGlobalFunctions() {
@@ -267,6 +320,7 @@
 void testCapturedSubexpressions() {
   int x = 3;
   int y = 3;
+  int *p = 
 
   auto AAA = std::bind(add, 1, add(2, 5));
   // CHECK-MESSAGES: :[[@LINE-1]]:14: warning: prefer a lambda to std::bind
@@ -277,6 +331,11 @@
   // CHECK-MESSAGES: :[[@LINE-1]]:14: warning: 

[PATCH] D82249: [HWASan] Disable GlobalISel/FastISel for HWASan Globals.

2020-06-23 Thread Evgenii Stepanov via Phabricator via cfe-commits
eugenis added a comment.

LGTM

I'm OK with this as a workaround, but it would be more natural to detect the 
unsupported IR pattern in globalisel and fall back instead of disabling it 
entirely. Is it difficult to do for some reason?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82249/new/

https://reviews.llvm.org/D82249





[PATCH] D71739: [AssumeBundles] Use operand bundles to encode alignment assumptions

2020-06-23 Thread Johannes Doerfert via Phabricator via cfe-commits
jdoerfert accepted this revision.
jdoerfert added a comment.
This revision is now accepted and ready to land.

Two nits and one question about removed check lines. If they can be re-added, 
LGTM. Otherwise we need to look into that.




Comment at: clang/lib/CodeGen/CodeGenFunction.cpp:2187
  OffsetValue, TheCheck, Assumption);
-  }
 }
 

Maybe invert the conditions:
```
llvm::Instruction *Assumption = Builder.CreateAlignmentAssumption(
  CGM.getDataLayout(), PtrValue, Alignment, OffsetValue);
 if (!SanOpts.has(SanitizerKind::Alignment))
   return;
//sanitizer stuff
```



Comment at: clang/test/CodeGen/builtin-align.c:48
 int up_2 = __builtin_align_up(256, 32);
-// CHECK: @up_2 = global i32 256, align 4
 

Why did these go away?



Comment at: llvm/test/Transforms/AlignmentFromAssumptions/simple32.ll:14
 ; CHECK-NEXT:tail call void @llvm.assume(i1 [[MASKCOND]])
-; CHECK-NEXT:[[TMP0:%.*]] = load i32, i32* [[A]], align 32
+; CHECK-NEXT:[[TMP0:%.*]] = load i32, i32* [[A]], align 4
 ; CHECK-NEXT:ret i32 [[TMP0]]

I think we should change to the new encoding here too. 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71739/new/

https://reviews.llvm.org/D71739





[PATCH] D82081: [z/OS] Add binary format goff and operating system zos to the triple

2020-06-23 Thread Fangrui Song via Phabricator via cfe-commits
MaskRay added inline comments.



Comment at: llvm/lib/Support/Triple.cpp:220
   case Win32: return "windows";
+  case ZOS:
+return "zos";

Follow the local style by deleting the newline.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82081/new/

https://reviews.llvm.org/D82081





[clang] 47fb21d - fix test failure for clang/test/CodeGen/builtin-expect-with-probability.cpp

2020-06-23 Thread Erich Keane via cfe-commits

Author: Zhi Zhuang
Date: 2020-06-23T13:34:35-07:00
New Revision: 47fb21d2ea903fc4cce38f8da8160cf0eacc16d0

URL: 
https://github.com/llvm/llvm-project/commit/47fb21d2ea903fc4cce38f8da8160cf0eacc16d0
DIFF: 
https://github.com/llvm/llvm-project/commit/47fb21d2ea903fc4cce38f8da8160cf0eacc16d0.diff

LOG: fix test failure for clang/test/CodeGen/builtin-expect-with-probability.cpp

Fix the test case added by D79830.
Rewrite the test case to do the same thing builtin-expect.c does
(test the generated LLVM intrinsic instead of the branch weights).
The test currently passes by using the "-disable-llvm-passes" option.

Differential Revision: https://reviews.llvm.org/D82403
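
As a hedged illustration of the builtin this test exercises, the sketch below wraps `__builtin_expect_with_probability` in a macro. The macro name `LIKELY_WITH` and the function `classify` are hypothetical. The `__has_builtin` guard is an assumption so that the code still compiles on compilers without the builtin, where the hint is simply dropped.

```cpp
// Detect the builtin where the compiler supports __has_builtin.
#if defined(__has_builtin)
#if __has_builtin(__builtin_expect_with_probability)
#define HAVE_EXPECT_WITH_PROBABILITY 1
#endif
#endif

// LIKELY_WITH(cond, prob): hint that cond is true with probability prob.
// prob must be a constant expression in the range [0.0, 1.0].
#ifdef HAVE_EXPECT_WITH_PROBABILITY
#define LIKELY_WITH(cond, prob) \
  __builtin_expect_with_probability(!!(cond), 1, (prob))
#else
#define LIKELY_WITH(cond, prob) (!!(cond))
#endif

// Mirrors the shape of expect_taken in the test: the hint changes only the
// optimizer's view of the branch, never the value that is computed.
int classify(int x) {
  if (LIKELY_WITH(x == 100, 0.9)) {
    return 0;
  }
  return x;
}
```

The result is the same with or without the hint; only the branch probabilities seen by the optimizer differ.
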

Added: 


Modified: 
clang/test/CodeGen/builtin-expect-with-probability.cpp

Removed: 




diff --git a/clang/test/CodeGen/builtin-expect-with-probability.cpp b/clang/test/CodeGen/builtin-expect-with-probability.cpp
index e63f35be4dcc..ba1d71321a3f 100644
--- a/clang/test/CodeGen/builtin-expect-with-probability.cpp
+++ b/clang/test/CodeGen/builtin-expect-with-probability.cpp
@@ -1,14 +1,38 @@
-// RUN: %clang_cc1 -emit-llvm -o - %s -O1 | FileCheck %s
-// RUN: %clang_cc1 -emit-llvm -o - -fexperimental-new-pass-manager %s -O1 | FileCheck %s
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown -emit-llvm -o - %s -O1 -disable-llvm-passes | FileCheck %s --check-prefix=ALL --check-prefix=O1
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown -emit-llvm -o - %s -O0 | FileCheck %s --check-prefix=ALL --check-prefix=O0
 extern int global;
 
+int expect_taken(int x) {
+// ALL-LABEL: expect_taken
+// O1:call i64 @llvm.expect.with.probability.i64(i64 {{%.*}}, i64 1, double 9.00e-01)
+// O0-NOT:@llvm.expect.with.probability
+
+  if (__builtin_expect_with_probability(x == 100, 1, 0.9)) {
+return 0;
+  }
+  return x;
+}
+
+int expect_not_taken(int x) {
+// ALL-LABEL: expect_not_taken
+// O1:call i64 @llvm.expect.with.probability.i64(i64 {{%.*}}, i64 0, double 9.00e-01)
+// O0-NOT:@llvm.expect.with.probability
+
+  if (__builtin_expect_with_probability(x == 100, 0, 0.9)) {
+return 0;
+  }
+  return x;
+}
+
 struct S {
   static constexpr int prob = 1;
 };
 
 template
-int expect_taken(int x) {
-// CHECK: !{{[0-9]+}} = !{!"branch_weights", i32 2147483647, i32 1}
+int expect_taken_template(int x) {
+// ALL-LABEL: expect_taken_template
+// O1:call i64 @llvm.expect.with.probability.i64(i64 {{%.*}}, i64 1, double 1.00e+00)
+// O0-NOT:@llvm.expect.with.probability
 
if (__builtin_expect_with_probability (x == 100, 1, T::prob)) {
return 0;
@@ -17,20 +41,31 @@ int expect_taken(int x) {
 }
 
 int f() {
-  return expect_taken(global);
+  return expect_taken_template(global);
 }
 
-int expect_taken2(int x) {
-  // CHECK: !{{[0-9]+}} = !{!"branch_weights", i32 1932735283, i32 214748366}
+int x;
+extern "C" {
+  int y(void);
+}
+void foo();
 
-  if (__builtin_expect_with_probability(x == 100, 1, 0.9)) {
-return 0;
-  }
-  return x;
+void expect_value_side_effects() {
+// ALL-LABEL: expect_value_side_effects
+// ALL: [[CALL:%.*]] = call i32 @y
+// O1:  [[SEXT:%.*]] = sext i32 [[CALL]] to i64
+// O1:  call i64 @llvm.expect.with.probability.i64(i64 {{%.*}}, i64 [[SEXT]], double 6.00e-01)
+// O0-NOT: @llvm.expect.with.probability
+
+  if (__builtin_expect_with_probability(x, y(), 0.6))
+foo();
 }
 
-int expect_taken3(int x) {
-  // CHECK: !{{[0-9]+}} = !{!"branch_weights", i32 107374184, i32 107374184, i32 1717986918, i32 107374184, i32 107374184}
+int switch_cond(int x) {
+// ALL-LABEL: switch_cond
+// O1:call i64 @llvm.expect.with.probability.i64(i64 {{%.*}}, i64 1, double 8.00e-01)
+// O0-NOT:@llvm.expect.with.probability
+
   switch (__builtin_expect_with_probability(x, 1, 0.8)) {
   case 0:
 x = x + 0;
@@ -45,3 +80,22 @@ int expect_taken3(int x) {
   }
   return x;
 }
+
+constexpr double prob = 0.8;
+
+int variable_expected(int stuff) {
+// ALL-LABEL: variable_expected
+// O1: call i64 @llvm.expect.with.probability.i64(i64 {{%.*}}, i64 {{%.*}}, double 8.00e-01)
+// O0-NOT: @llvm.expect.with.probability
+
+  int res = 0;
+
+  switch(__builtin_expect_with_probability(stuff, stuff, prob)) {
+case 0:
+  res = 1;
+  break;
+default:
+  break;
+  }
+  return res;
+}





[PATCH] D81869: Modify FPFeatures to use delta not absolute settings to solve PCH compatibility problems

2020-06-23 Thread Melanie Blower via Phabricator via cfe-commits
mibintc updated this revision to Diff 272808.
mibintc added a comment.

The difference between this patch and the earlier one today is that I modified 
the new test case.

I misunderstood why the test case was failing. The BENIGN_LANGOPT is working as 
desired and those pch diagnostics are no longer emitted.  I didn't rerun but I 
believe all check-clang (and check-all) will now pass.

What else?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81869/new/

https://reviews.llvm.org/D81869

Files:
  clang/include/clang/AST/Expr.h
  clang/include/clang/AST/ExprCXX.h
  clang/include/clang/AST/Stmt.h
  clang/include/clang/Basic/FPOptions.def
  clang/include/clang/Basic/LangOptions.def
  clang/include/clang/Basic/LangOptions.h
  clang/include/clang/Sema/Sema.h
  clang/include/clang/Serialization/ASTWriter.h
  clang/include/clang/module.modulemap
  clang/lib/AST/ASTImporter.cpp
  clang/lib/AST/Expr.cpp
  clang/lib/AST/ExprCXX.cpp
  clang/lib/Analysis/BodyFarm.cpp
  clang/lib/Basic/LangOptions.cpp
  clang/lib/CodeGen/CGExprScalar.cpp
  clang/lib/CodeGen/CGObjC.cpp
  clang/lib/CodeGen/CGStmtOpenMP.cpp
  clang/lib/CodeGen/CodeGenFunction.cpp
  clang/lib/Frontend/Rewrite/RewriteModernObjC.cpp
  clang/lib/Frontend/Rewrite/RewriteObjC.cpp
  clang/lib/Parse/ParseDeclCXX.cpp
  clang/lib/Parse/ParsePragma.cpp
  clang/lib/Sema/Sema.cpp
  clang/lib/Sema/SemaAttr.cpp
  clang/lib/Sema/SemaDeclCXX.cpp
  clang/lib/Sema/SemaExpr.cpp
  clang/lib/Sema/SemaExprObjC.cpp
  clang/lib/Sema/SemaOverload.cpp
  clang/lib/Sema/SemaPseudoObject.cpp
  clang/lib/Sema/TreeTransform.h
  clang/lib/Serialization/ASTReader.cpp
  clang/lib/Serialization/ASTReaderStmt.cpp
  clang/lib/Serialization/ASTWriter.cpp
  clang/lib/Serialization/ASTWriterStmt.cpp
  clang/test/CodeGen/fp-floatcontrol-pragma.cpp
  clang/test/SemaOpenCL/fp-options.cl
  llvm/include/llvm/ADT/FloatingPointMode.h

Index: llvm/include/llvm/ADT/FloatingPointMode.h
===
--- llvm/include/llvm/ADT/FloatingPointMode.h
+++ llvm/include/llvm/ADT/FloatingPointMode.h
@@ -40,6 +40,7 @@
   NearestTiesToAway = 4,///< roundTiesToAway.
 
   // Special values.
+  Unset = 6,  ///< Denotes an unset value, (for clang, must fit in 3 bits)
   Dynamic = 7,///< Denotes mode unknown at compile time.
   Invalid = -1///< Denotes invalid value.
 };
Index: clang/test/SemaOpenCL/fp-options.cl
===
--- /dev/null
+++ clang/test/SemaOpenCL/fp-options.cl
@@ -0,0 +1,4 @@
+// RUN: %clang_cc1 %s -finclude-default-header -triple spir-unknown-unknown -emit-pch -o %t.pch
+// RUN: %clang_cc1 %s -finclude-default-header -cl-no-signed-zeros -triple spir-unknown-unknown -include-pch %t.pch -fsyntax-only -verify
+// expected-no-diagnostics
+
Index: clang/test/CodeGen/fp-floatcontrol-pragma.cpp
===
--- clang/test/CodeGen/fp-floatcontrol-pragma.cpp
+++ clang/test/CodeGen/fp-floatcontrol-pragma.cpp
@@ -119,6 +119,24 @@
   return x;
 }
 
+#pragma float_control(push)
+#pragma float_control(precise, on)
+struct Distance {};
+Distance operator+(Distance, Distance);
+
+template  T add(T lhs, T rhs) {
+#pragma float_control(except, on)
+  return lhs + rhs;
+}
+#pragma float_control(pop)
+
+float test_OperatorCall() {
+  return add(1.0f, 2.0f);
+//CHECK: llvm.experimental.constrained.fadd{{.*}}fpexcept.strict
+}
+// CHECK-LABEL define float  {{.*}}test_OperatorCall{{.*}}
+
+
 #if FENV_ON
 // expected-warning@+1{{pragma STDC FENV_ACCESS ON is not supported, ignoring pragma}}
 #pragma STDC FENV_ACCESS ON
Index: clang/lib/Serialization/ASTWriterStmt.cpp
===
--- clang/lib/Serialization/ASTWriterStmt.cpp
+++ clang/lib/Serialization/ASTWriterStmt.cpp
@@ -1545,8 +1545,8 @@
 void ASTStmtWriter::VisitCXXOperatorCallExpr(CXXOperatorCallExpr *E) {
   VisitCallExpr(E);
   Record.push_back(E->getOperator());
-  Record.push_back(E->getFPFeatures().getAsOpaqueInt());
   Record.AddSourceRange(E->Range);
+  Record.push_back(E->getFPFeatures().getAsOpaqueInt());
   Code = serialization::EXPR_CXX_OPERATOR_CALL;
 }
 
Index: clang/lib/Serialization/ASTWriter.cpp
===
--- clang/lib/Serialization/ASTWriter.cpp
+++ clang/lib/Serialization/ASTWriter.cpp
@@ -3960,7 +3960,7 @@
 }
 
 /// Write an FP_PRAGMA_OPTIONS block for the given FPOptions.
-void ASTWriter::WriteFPPragmaOptions(const FPOptions ) {
+void ASTWriter::WriteFPPragmaOptions(const FPOptionsOverride ) {
   RecordData::value_type Record[] = {Opts.getAsOpaqueInt()};
   Stream.EmitRecord(FP_PRAGMA_OPTIONS, Record);
 }
@@ -4790,7 +4790,7 @@
   WriteReferencedSelectorsPool(SemaRef);
   WriteLateParsedTemplates(SemaRef);
   WriteIdentifierTable(PP, SemaRef.IdResolver, isModule);
-  WriteFPPragmaOptions(SemaRef.getCurFPFeatures());
+ 

[PATCH] D81285: [builtins] Change si_int to int in some helper declarations

2020-06-23 Thread Eli Friedman via Phabricator via cfe-commits
efriedma accepted this revision.
efriedma added a comment.
This revision is now accepted and ready to land.

LGTM.  (Please wait a few days before merging to see if anyone else has 
comments.)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81285/new/

https://reviews.llvm.org/D81285





[PATCH] D81920: [clangd] Change FSProvider::getFileSystem to take CurrentWorkingDirectory

2020-06-23 Thread Kadir Cetinkaya via Phabricator via cfe-commits
kadircet added a comment.

thanks for the info @uabelho!

this looks like a dormant warning though, as StringRef is not implicitly 
convertible to NoneType (and vice-versa) hence anyone trying to make use of the 
hidden overload would get a hard compile error anyways.
Moreover this class is mostly accessed through a base pointer, hence name 
hiding in derived classes isn't really an issue (for most of the production 
code).

Also, the warning itself seems to be noisy: 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=20423. Interestingly, it seems to 
be enabled only for clang and nothing else; I wonder how that was decided.
Unfortunately, the history doesn't tell much: 
https://github.com/llvm/llvm-project/blame/master/clang/CMakeLists.txt#L396.

There are 5 derived classes (3 of them are in tests), so just putting a using 
declaration to un-hide the overload seems too disruptive.
Again renaming the endpoints (and possibly changing the signature) just to 
suppress this warning also doesn't seem so nice.

I would rather turn off this warning, at least for gcc, assuming this is 
not specific to that version. Can you check whether you see the warning with 
different versions of gcc?
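
For context, here is a minimal, generic C++ sketch of the name hiding that this kind of warning is about. The class names `Base`, `Derived`, and `DerivedUnhidden` and the method `get` are hypothetical stand-ins, not the clangd classes under review; `int` and `std::string` are used as two parameter types with no implicit conversion between them.

```cpp
#include <string>

struct Base {
  virtual ~Base() = default;
  virtual std::string get(int) const { return "Base::get(int)"; }
  virtual std::string get(const std::string &) const {
    return "Base::get(string)";
  }
};

// Overrides only one overload. Name lookup on a Derived object stops at
// Derived::get, so the std::string overload is hidden there. Calling
// Derived().get(std::string("x")) would fail to compile because std::string
// does not convert to int, so the hiding cannot silently pick a wrong overload.
struct Derived : Base {
  std::string get(int) const override { return "Derived::get(int)"; }
};

// The using-declaration brings every Base::get overload back into scope.
struct DerivedUnhidden : Base {
  using Base::get;
  std::string get(int) const override { return "DerivedUnhidden::get(int)"; }
};

// Access through the base type is unaffected by hiding in the derived class.
std::string callThroughBase(const Base &object) {
  return object.get(std::string("x"));
}
```

Calls made through a `Base` reference or pointer always see both overloads, which matches the observation that hiding matters little when the class is mostly used through its base.
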


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81920/new/

https://reviews.llvm.org/D81920





[PATCH] D82398: [MSAN] Handle x86 {round,min,max}sd intrinsics

2020-06-23 Thread Gui Andrade via Phabricator via cfe-commits
guiand marked an inline comment as done.
guiand added inline comments.



Comment at: llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp:3077
+Value *LowShadow = IRB.CreateOr(LowA, LowB);
+Value *Shadow = IRB.CreateInsertElement(Second, LowShadow, IRB.getInt32(0));
+

eugenis wrote:
> You probably want to insert in First, not Second.
> 
> Is the generated code any better if you OR the vectors, and then shuffle to 
> put the top element of First into the top element of the output? That's what 
> LLVM generates if I express this logic in C.
> 
> 
The codegen is basically identical either way, but if you'd like I can still 
upload a patch to change these into shufflevector instructions.
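
To make the two formulations concrete, here is a hedged plain C++ model of the shadow computation, not the MemorySanitizer code itself. It assumes a two-lane vector whose low lane takes the OR of both operands' low-lane shadows and whose high lane is taken from the first operand, as suggested in the quoted comment. The type `Shadow2` and both function names are hypothetical.

```cpp
#include <array>
#include <cstdint>

// Two 64-bit shadow lanes: index 0 is the low element, index 1 the high one.
using Shadow2 = std::array<std::uint64_t, 2>;

// Formulation 1: OR the low lanes, then insert the result into First.
Shadow2 shadowViaInsert(const Shadow2 &first, const Shadow2 &second) {
  Shadow2 result = first;
  result[0] = first[0] | second[0];
  return result;
}

// Formulation 2: OR the whole vectors, then "shuffle" so that the low lane
// comes from the OR and the high lane comes from First.
Shadow2 shadowViaOrAndShuffle(const Shadow2 &first, const Shadow2 &second) {
  Shadow2 ored = {first[0] | second[0], first[1] | second[1]};
  return {ored[0], first[1]};
}
```

Both functions return the same lanes for any inputs, which is consistent with the remark that the generated code is basically identical either way.
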


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82398/new/

https://reviews.llvm.org/D82398





[PATCH] D82314: [RFC][Coroutines] Optimize the lifespan of temporary co_await object

2020-06-23 Thread Richard Smith - zygoloid via Phabricator via cfe-commits
rsmith added a comment.

In D82314#2109728, @lxfind wrote:

> @rsmith Thanks. That's a good point. Do you know if there already exists 
> optimization passes in LLVM that attempts to shrink the range of lifetime 
> intrinsics? If so, I am curious why that does not help in this case. Or is it 
> generally unsafe to move the lifetime intrinsics, and we could only do it 
> here with specific context knowledge about coroutines.


I don't know for sure, but I would expect someone to have implemented such a 
pass already. Moving a lifetime start intrinsic later, past instructions that 
can't possibly reference the object in question, seems like it should always be 
safe and (presumably) should always be a good thing to do, and similarly for 
moving lifetime end markers earlier. It could be that such a pass exists but it 
is run too late in the pass pipeline, so the coroutine split pass doesn't get 
to take advantage of it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82314/new/

https://reviews.llvm.org/D82314





[PATCH] D82224: [OpenMP][NFC] Remove hard-coded line numbers from test

2020-06-23 Thread Johannes Doerfert via Phabricator via cfe-commits
jdoerfert accepted this revision.
jdoerfert added a comment.
This revision is now accepted and ready to land.

LGTM. Thx!


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82224/new/

https://reviews.llvm.org/D82224





[PATCH] D82403: fix test failure for clang/test/CodeGen/builtin-expect-with-probability.cpp

2020-06-23 Thread Zhi Zhuang via Phabricator via cfe-commits
LukeZhuang updated this revision to Diff 272805.
LukeZhuang added a comment.

Fixed. If it looks good to you, could you please help me commit it? Thank you 
very much!


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82403/new/

https://reviews.llvm.org/D82403

Files:
  clang/test/CodeGen/builtin-expect-with-probability.cpp

Index: clang/test/CodeGen/builtin-expect-with-probability.cpp
===
--- clang/test/CodeGen/builtin-expect-with-probability.cpp
+++ clang/test/CodeGen/builtin-expect-with-probability.cpp
@@ -1,14 +1,38 @@
-// RUN: %clang_cc1 -emit-llvm -o - %s -O1 | FileCheck %s
-// RUN: %clang_cc1 -emit-llvm -o - -fexperimental-new-pass-manager %s -O1 | FileCheck %s
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown -emit-llvm -o - %s -O1 -disable-llvm-passes | FileCheck %s --check-prefix=ALL --check-prefix=O1
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown -emit-llvm -o - %s -O0 | FileCheck %s --check-prefix=ALL --check-prefix=O0
 extern int global;
 
+int expect_taken(int x) {
+// ALL-LABEL: expect_taken
+// O1:call i64 @llvm.expect.with.probability.i64(i64 {{%.*}}, i64 1, double 9.00e-01)
+// O0-NOT:@llvm.expect.with.probability
+
+  if (__builtin_expect_with_probability(x == 100, 1, 0.9)) {
+return 0;
+  }
+  return x;
+}
+
+int expect_not_taken(int x) {
+// ALL-LABEL: expect_not_taken
+// O1:call i64 @llvm.expect.with.probability.i64(i64 {{%.*}}, i64 0, double 9.00e-01)
+// O0-NOT:@llvm.expect.with.probability
+
+  if (__builtin_expect_with_probability(x == 100, 0, 0.9)) {
+return 0;
+  }
+  return x;
+}
+
 struct S {
   static constexpr int prob = 1;
 };
 
 template
-int expect_taken(int x) {
-// CHECK: !{{[0-9]+}} = !{!"branch_weights", i32 2147483647, i32 1}
+int expect_taken_template(int x) {
+// ALL-LABEL: expect_taken_template
+// O1:call i64 @llvm.expect.with.probability.i64(i64 {{%.*}}, i64 1, double 1.00e+00)
+// O0-NOT:@llvm.expect.with.probability
 
 	if (__builtin_expect_with_probability (x == 100, 1, T::prob)) {
 		return 0;
@@ -17,20 +41,31 @@
 }
 
 int f() {
-  return expect_taken(global);
+  return expect_taken_template(global);
 }
 
-int expect_taken2(int x) {
-  // CHECK: !{{[0-9]+}} = !{!"branch_weights", i32 1932735283, i32 214748366}
+int x;
+extern "C" {
+  int y(void);
+}
+void foo();
 
-  if (__builtin_expect_with_probability(x == 100, 1, 0.9)) {
-return 0;
-  }
-  return x;
+void expect_value_side_effects() {
+// ALL-LABEL: expect_value_side_effects
+// ALL: [[CALL:%.*]] = call i32 @y
+// O1:  [[SEXT:%.*]] = sext i32 [[CALL]] to i64
+// O1:  call i64 @llvm.expect.with.probability.i64(i64 {{%.*}}, i64 [[SEXT]], double 6.00e-01)
+// O0-NOT: @llvm.expect.with.probability
+
+  if (__builtin_expect_with_probability(x, y(), 0.6))
+foo();
 }
 
-int expect_taken3(int x) {
-  // CHECK: !{{[0-9]+}} = !{!"branch_weights", i32 107374184, i32 107374184, i32 1717986918, i32 107374184, i32 107374184}
+int switch_cond(int x) {
+// ALL-LABEL: switch_cond
+// O1:call i64 @llvm.expect.with.probability.i64(i64 {{%.*}}, i64 1, double 8.00e-01)
+// O0-NOT:@llvm.expect.with.probability
+
   switch (__builtin_expect_with_probability(x, 1, 0.8)) {
   case 0:
 x = x + 0;
@@ -45,3 +80,22 @@
   }
   return x;
 }
+
+constexpr double prob = 0.8;
+
+int variable_expected(int stuff) {
+// ALL-LABEL: variable_expected
+// O1: call i64 @llvm.expect.with.probability.i64(i64 {{%.*}}, i64 {{%.*}}, double 8.00e-01)
+// O0-NOT: @llvm.expect.with.probability
+
+  int res = 0;
+
+  switch(__builtin_expect_with_probability(stuff, stuff, prob)) {
+case 0:
+  res = 1;
+  break;
+default:
+  break;
+  }
+  return res;
+}
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-23 Thread Eli Friedman via Phabricator via cfe-commits
efriedma added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:825
+  // The local stack holds all alloca instructions and all byval arguments.
+  AllocaDerivedValueTracker Tracker;
+  for (Argument &Arg : F.args()) {

avl wrote:
> efriedma wrote:
> > avl wrote:
> > > efriedma wrote:
> > > > avl wrote:
> > > > > efriedma wrote:
> > > > > > avl wrote:
> > > > > > > efriedma wrote:
> > > > > > > > Do you have to redo the AllocaDerivedValueTracker analysis?  Is 
> > > > > > > > it not enough that the call you're trying to TRE is marked 
> > > > > > > > "tail"?
> > > > > > > >Do you have to redo the AllocaDerivedValueTracker analysis?
> > > > > > > 
> > > > > > > AllocaDerivedValueTracker analysis(done in markTails) could be 
> > > > > > > reused here. 
> > > > > > > But marking, done in markTails(), looks like separate tasks. i.e. 
> > > > > > > it is better 
> > > > > > > to make TRE not depending on markTails(). There is a review for 
> > > > > > > this - https://reviews.llvm.org/D60031
> > > > > > > Thus such separation looks useful(To not reuse result of 
> > > > > > > markTails but have it computed inplace).
> > > > > > > 
> > > > > > > > Is it not enough that the call you're trying to TRE is marked 
> > > > > > > > "tail"?
> > > > > > > 
> > > > > > > It is not enough that call which is subject to TRE is marked 
> > > > > > > "Tail".
> > > > > > > It also should be checked that other calls does not capture 
> > > > > > > pointer to local stack: 
> > > > > > > 
> > > > > > > ```
> > > > > > > // do not do TRE if any pointer to local stack has escaped.
> > > > > > > if (!Tracker.EscapePoints.empty())
> > > > > > >return false;
> > > > > > > 
> > > > > > > ```
> > > > > > > 
> > > > > > > It is not enough that call which is subject to TRE is marked 
> > > > > > > "Tail". It also should be checked that other calls does not 
> > > > > > > capture pointer to local stack:
> > > > > > 
> > > > > > If there's an escaped pointer to the local stack, we wouldn't infer 
> > > > > > "tail" in the first place, would we?
> > > > > If function receives pointer to alloca then it would not be marked 
> > > > > with "Tail". Then we do not have a possibility to understand whether 
> > > > > this function receives pointer to alloca but does not capture it:
> > > > > 
> > > > > ```
> > > > > void test(int recurseCount)
> > > > > {
> > > > > if (recurseCount == 0) return;
> > > > > int temp = 10;
> > > > > globalIncrement(&temp);
> > > > > test(recurseCount - 1);
> > > > > }
> > > > > ```
> > > > > 
> > > > > test - marked with Tail.
> > > > > globalIncrement - not marked with Tail. But TRE could be done since 
> > > > > it does not capture pointer. But if it will capture the pointer then 
> > > > > we could not do TRE. So we need to check 
> > > > > !Tracker.EscapePoints.empty().
> > > > > 
> > > > > 
> > > > > 
> > > > > test - marked with Tail.
> > > > 
> > > > For the given code, TRE won't mark the recursive call "tail".  That 
> > > > transform isn't legal: the recursive call could access the caller's 
> > > > version of "temp".
> > > >For the given code, TRE won't mark the recursive call "tail". That 
> > > >transform isn't legal: the recursive call could access the caller's 
> > > >version of "temp".
> > > 
> > > it looks like recursive call could NOT access the caller's version of 
> > > "temp":
> > > 
> > > ```
> > > test(recurseCount - 1);
> > > ```
> > > 
> > > The caller's version of temp is accessed by the non-recursive call:
> > > 
> > > ```
> > > globalIncrement(&temp);
> > > ```
> > > 
> > > If globalIncrement does not capture "&temp", then TRE looks to be 
> > > legal for that case. 
> > > 
> > > globalIncrement() would not be marked with "Tail". test() would be marked 
> > > with Tail.
> > > 
> > > Thus the pre-requisite for TRE would be: tail-recursive call must not 
> > > receive pointer to local stack(Tail) and non-recursive calls must not 
> > > capture the pointer to local stack.
> > Can you give a complete IR example where we infer "tail", but TRE is 
> > illegal?
> > 
> > Can you give a complete IR example where we don't infer "tail", but we still 
> > do the TRE transform here?
> >Can you give a complete IR example where we infer "tail", but TRE is illegal?
> 
> There is no such example. Currently, all cases where we infer "tail" are 
> valid for TRE.
> 
> >Can you give a complete IR example where we don't infer "tail", but we still 
> >do the TRE transform here?
> 
> For the following example, the current code base would not infer "tail" for 
> _Z15globalIncrementPKi and, as a result, would not do TRE for _Z4testi.
> This patch changes that behavior: if _Z15globalIncrementPKi is not 
> marked with "tail" and does not capture its pointer argument, TRE is 
> allowed for _Z4testi.
> 
> 
> ```
> @count = dso_local local_unnamed_addr global i32 0, align 4
> 
> ; Function Attrs: nofree noinline norecurse nounwind uwtable
> 

[PATCH] D82403: fix test failure for clang/test/CodeGen/builtin-expect-with-probability.cpp

2020-06-23 Thread Erich Keane via Phabricator via cfe-commits
erichkeane accepted this revision.
erichkeane added a comment.
This revision is now accepted and ready to land.

1 suggestion to the test, but otherwise looks OK.  Let me know if you'd like 
this committed for you.




Comment at: clang/test/CodeGen/builtin-expect-with-probability.cpp:48
+int x;
+int y(void);
+void foo();

You might consider making this `extern "C"` so that you can do a more exact check 
on line 53.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82403/new/

https://reviews.llvm.org/D82403





[PATCH] D81869: Modify FPFeatures to use delta not absolute settings to solve PCH compatibility problems

2020-06-23 Thread Melanie Blower via Phabricator via cfe-commits
mibintc updated this revision to Diff 272798.
mibintc added a comment.

I responded to review from @riccibruno and @rjmccall : I put the dump() 
routines into the .cpp file and changed them to use the x-macros.  I moved 
CXXOperatorCall.FPOptionsOverride from the Expr bits into the OperatorCall 
itself. I added the operator call test case suggested by John. I must have been 
mistaken about the bug: the operator call addition does account for the 
FPFeatures in both the unmodified compiler and the patched compiler.

The new test case is failing. Need to dig into that. I get this error:

error: no expected directives found: consider use of 'expected-no-diagnostics'
error: 'error' diagnostics seen but not expected:

  (frontend): Include default header file for OpenCL was enabled in PCH file 
but is currently disabled

2 errors generated.

  Clang :: SemaOpenCL/fp-options.cl

For the same test case, the unmodified compiler shows this,
error: no expected directives found: consider use of 'expected-no-diagnostics'
error: 'error' diagnostics seen but not expected:

  (frontend): Permit Floating Point optimization without regard to signed zeros 
was disabled in PCH file but is currently enabled

2 errors generated

What else?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81869/new/

https://reviews.llvm.org/D81869

Files:
  clang/include/clang/AST/Expr.h
  clang/include/clang/AST/ExprCXX.h
  clang/include/clang/AST/Stmt.h
  clang/include/clang/Basic/FPOptions.def
  clang/include/clang/Basic/LangOptions.def
  clang/include/clang/Basic/LangOptions.h
  clang/include/clang/Sema/Sema.h
  clang/include/clang/Serialization/ASTWriter.h
  clang/include/clang/module.modulemap
  clang/lib/AST/ASTImporter.cpp
  clang/lib/AST/Expr.cpp
  clang/lib/AST/ExprCXX.cpp
  clang/lib/Analysis/BodyFarm.cpp
  clang/lib/Basic/LangOptions.cpp
  clang/lib/CodeGen/CGExprScalar.cpp
  clang/lib/CodeGen/CGObjC.cpp
  clang/lib/CodeGen/CGStmtOpenMP.cpp
  clang/lib/CodeGen/CodeGenFunction.cpp
  clang/lib/Frontend/Rewrite/RewriteModernObjC.cpp
  clang/lib/Frontend/Rewrite/RewriteObjC.cpp
  clang/lib/Parse/ParseDeclCXX.cpp
  clang/lib/Parse/ParsePragma.cpp
  clang/lib/Sema/Sema.cpp
  clang/lib/Sema/SemaAttr.cpp
  clang/lib/Sema/SemaDeclCXX.cpp
  clang/lib/Sema/SemaExpr.cpp
  clang/lib/Sema/SemaExprObjC.cpp
  clang/lib/Sema/SemaOverload.cpp
  clang/lib/Sema/SemaPseudoObject.cpp
  clang/lib/Sema/TreeTransform.h
  clang/lib/Serialization/ASTReader.cpp
  clang/lib/Serialization/ASTReaderStmt.cpp
  clang/lib/Serialization/ASTWriter.cpp
  clang/lib/Serialization/ASTWriterStmt.cpp
  clang/test/CodeGen/fp-floatcontrol-pragma.cpp
  clang/test/SemaOpenCL/fp-options.cl
  llvm/include/llvm/ADT/FloatingPointMode.h

Index: llvm/include/llvm/ADT/FloatingPointMode.h
===
--- llvm/include/llvm/ADT/FloatingPointMode.h
+++ llvm/include/llvm/ADT/FloatingPointMode.h
@@ -40,6 +40,7 @@
   NearestTiesToAway = 4,///< roundTiesToAway.
 
   // Special values.
+  Unset = 6,  ///< Denotes an unset value, (for clang, must fit in 3 bits)
   Dynamic = 7,///< Denotes mode unknown at compile time.
   Invalid = -1///< Denotes invalid value.
 };
Index: clang/test/SemaOpenCL/fp-options.cl
===
--- /dev/null
+++ clang/test/SemaOpenCL/fp-options.cl
@@ -0,0 +1,4 @@
+// RUN: %clang_cc1 %s -finclude-default-header -triple spir-unknown-unknown -emit-pch -o %t.pch
+// RUN: %clang_cc1 %s -cl-no-signed-zeros -triple spir-unknown-unknown -include-pch %t.pch -fsyntax-only -verify
+// expected-no-diagnostics
+
Index: clang/test/CodeGen/fp-floatcontrol-pragma.cpp
===
--- clang/test/CodeGen/fp-floatcontrol-pragma.cpp
+++ clang/test/CodeGen/fp-floatcontrol-pragma.cpp
@@ -119,6 +119,24 @@
   return x;
 }
 
+#pragma float_control(push)
+#pragma float_control(precise, on)
+struct Distance {};
+Distance operator+(Distance, Distance);
+
+template <typename T> T add(T lhs, T rhs) {
+#pragma float_control(except, on)
+  return lhs + rhs;
+}
+#pragma float_control(pop)
+
+float test_OperatorCall() {
+  return add(1.0f, 2.0f);
+//CHECK: llvm.experimental.constrained.fadd{{.*}}fpexcept.strict
+}
+// CHECK-LABEL define float  {{.*}}test_OperatorCall{{.*}}
+
+
 #if FENV_ON
 // expected-warning@+1{{pragma STDC FENV_ACCESS ON is not supported, ignoring pragma}}
 #pragma STDC FENV_ACCESS ON
Index: clang/lib/Serialization/ASTWriterStmt.cpp
===
--- clang/lib/Serialization/ASTWriterStmt.cpp
+++ clang/lib/Serialization/ASTWriterStmt.cpp
@@ -1545,8 +1545,8 @@
 void ASTStmtWriter::VisitCXXOperatorCallExpr(CXXOperatorCallExpr *E) {
   VisitCallExpr(E);
   Record.push_back(E->getOperator());
-  Record.push_back(E->getFPFeatures().getAsOpaqueInt());
   Record.AddSourceRange(E->Range);
+  

[PATCH] D82403: fix test failure for clang/test/CodeGen/builtin-expect-with-probability.cpp

2020-06-23 Thread Zhi Zhuang via Phabricator via cfe-commits
LukeZhuang created this revision.
LukeZhuang added a reviewer: erichkeane.
LukeZhuang added projects: LLVM, clang.
Herald added a subscriber: cfe-commits.
LukeZhuang edited the summary of this revision.

Fix the test case added by D79830.
Rewrite the test case to do what builtin-expect.c does: test the 
generated LLVM intrinsic instead of the branch weights. 
The test currently passes with the "-disable-llvm-passes" option.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D82403

Files:
  clang/test/CodeGen/builtin-expect-with-probability.cpp

Index: clang/test/CodeGen/builtin-expect-with-probability.cpp
===
--- clang/test/CodeGen/builtin-expect-with-probability.cpp
+++ clang/test/CodeGen/builtin-expect-with-probability.cpp
@@ -1,14 +1,38 @@
-// RUN: %clang_cc1 -emit-llvm -o - %s -O1 | FileCheck %s
-// RUN: %clang_cc1 -emit-llvm -o - -fexperimental-new-pass-manager %s -O1 | FileCheck %s
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown -emit-llvm -o - %s -O1 -disable-llvm-passes | FileCheck %s --check-prefix=ALL --check-prefix=O1
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown -emit-llvm -o - %s -O0 | FileCheck %s --check-prefix=ALL --check-prefix=O0
 extern int global;
 
+int expect_taken(int x) {
+// ALL-LABEL: expect_taken
+// O1:call i64 @llvm.expect.with.probability.i64(i64 {{%.*}}, i64 1, double 9.00e-01)
+// O0-NOT:@llvm.expect.with.probability
+
+  if (__builtin_expect_with_probability(x == 100, 1, 0.9)) {
+return 0;
+  }
+  return x;
+}
+
+int expect_not_taken(int x) {
+// ALL-LABEL: expect_not_taken
+// O1:call i64 @llvm.expect.with.probability.i64(i64 {{%.*}}, i64 0, double 9.00e-01)
+// O0-NOT:@llvm.expect.with.probability
+
+  if (__builtin_expect_with_probability(x == 100, 0, 0.9)) {
+return 0;
+  }
+  return x;
+}
+
 struct S {
   static constexpr int prob = 1;
 };
 
 template<typename T>
-int expect_taken(int x) {
-// CHECK: !{{[0-9]+}} = !{!"branch_weights", i32 2147483647, i32 1}
+int expect_taken_template(int x) {
+// ALL-LABEL: expect_taken_template
+// O1:call i64 @llvm.expect.with.probability.i64(i64 {{%.*}}, i64 1, double 1.00e+00)
+// O0-NOT:@llvm.expect.with.probability
 
 	if (__builtin_expect_with_probability (x == 100, 1, T::prob)) {
 		return 0;
@@ -17,20 +41,29 @@
 }
 
 int f() {
-  return expect_taken<S>(global);
+  return expect_taken_template<S>(global);
 }
 
-int expect_taken2(int x) {
-  // CHECK: !{{[0-9]+}} = !{!"branch_weights", i32 1932735283, i32 214748366}
+int x;
+int y(void);
+void foo();
 
-  if (__builtin_expect_with_probability(x == 100, 1, 0.9)) {
-return 0;
-  }
-  return x;
+void expect_value_side_effects() {
+// ALL-LABEL: expect_value_side_effects
+// ALL: [[CALL:%.*]] = call i32 @{{.*}}y
+// O1:  [[SEXT:%.*]] = sext i32 [[CALL]] to i64
+// O1:  call i64 @llvm.expect.with.probability.i64(i64 {{%.*}}, i64 [[SEXT]], double 6.00e-01)
+// O0-NOT: @llvm.expect.with.probability
+
+  if (__builtin_expect_with_probability(x, y(), 0.6))
+foo();
 }
 
-int expect_taken3(int x) {
-  // CHECK: !{{[0-9]+}} = !{!"branch_weights", i32 107374184, i32 107374184, i32 1717986918, i32 107374184, i32 107374184}
+int switch_cond(int x) {
+// ALL-LABEL: switch_cond
+// O1:call i64 @llvm.expect.with.probability.i64(i64 {{%.*}}, i64 1, double 8.00e-01)
+// O0-NOT:@llvm.expect.with.probability
+
   switch (__builtin_expect_with_probability(x, 1, 0.8)) {
   case 0:
 x = x + 0;
@@ -45,3 +78,22 @@
   }
   return x;
 }
+
+constexpr double prob = 0.8;
+
+int variable_expected(int stuff) {
+// ALL-LABEL: variable_expected
+// O1: call i64 @llvm.expect.with.probability.i64(i64 {{%.*}}, i64 {{%.*}}, double 8.00e-01)
+// O0-NOT: @llvm.expect.with.probability
+
+  int res = 0;
+
+  switch(__builtin_expect_with_probability(stuff, stuff, prob)) {
+case 0:
+  res = 1;
+  break;
+default:
+  break;
+  }
+  return res;
+}


[PATCH] D79830: Add support of __builtin_expect_with_probability

2020-06-23 Thread Zhi Zhuang via Phabricator via cfe-commits
LukeZhuang added a comment.

In D79830#2109324, @erichkeane wrote:

> @LukeZhuang : This patch causes the buildbots to fail, as O1 means something 
> slightly different with the new pass manager:
>  
> http://lab.llvm.org:8011/builders/clang-x86_64-debian-new-pass-manager-fast/builds/10542/steps/test-check-all/logs/FAIL%3A%20Clang%3A%3Abuiltin-expect-with-probability.cpp
>
> I'm sorry I didn't notice it during review, but the test that is failing is a 
> poorly written test.  The CFE tests shouldn't be written in a way that 
> depends on the actions of the optimizer, so testing the branch_weights is 
> incorrect.
>
> Please submit a new patch with a way to validate the clang codegen actions 
> without depending on the optimization (that is, would work with 
> -disable-llvm-passes), and if necessary, add a test to llvm to ensure the 
> proper result is validated.
>
> EDIT: To clarify, I've unblocked the buildbots by doing a temporary fix.  But 
> this test needs to be rewritten.


Thank you for pointing this out! I have fixed it by mimicking what the test case 
builtin-expect.c does. It makes sense to test the generated LLVM intrinsic instead 
of the branch weights here. The lowering to branch weights is already tested on the 
LLVM side.
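
For readers unfamiliar with the builtin, here is a minimal usage sketch. It is not taken from the patch: the `EXPECT_PROB` macro and the `classify` function are hypothetical names, and the fallback branch is an assumption added so the snippet also builds with compilers that lack the builtin. The hint only informs the optimizer; it does not change what the function returns.

```cpp
#include <cassert>

// Some compilers do not provide __has_builtin; treat every builtin as absent
// there (an assumption made purely for portability of this sketch).
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

#if __has_builtin(__builtin_expect_with_probability)
// Arguments: expression, expected value, probability (a constant in [0, 1]).
#define EXPECT_PROB(expr, expected, prob) \
  __builtin_expect_with_probability((expr), (expected), (prob))
#else
// Fallback: drop the hint and evaluate the expression only.
#define EXPECT_PROB(expr, expected, prob) (expr)
#endif

// Hypothetical helper mirroring the shape of the test functions above:
// the hint says `x == 100` is expected to be true with probability 0.9.
int classify(int x) {
  if (EXPECT_PROB(x == 100, 1, 0.9))
    return 0;
  return x;
}
```

Calling `classify(100)` takes the hinted branch and `classify(7)` takes the other one; both produce the same results as they would without the hint.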


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79830/new/

https://reviews.llvm.org/D79830





[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-23 Thread Alexey Lapshin via Phabricator via cfe-commits
avl marked an inline comment as done.
avl added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:825
+  // The local stack holds all alloca instructions and all byval arguments.
+  AllocaDerivedValueTracker Tracker;
+  for (Argument &Arg : F.args()) {

efriedma wrote:
> avl wrote:
> > efriedma wrote:
> > > avl wrote:
> > > > efriedma wrote:
> > > > > avl wrote:
> > > > > > efriedma wrote:
> > > > > > > Do you have to redo the AllocaDerivedValueTracker analysis?  Is 
> > > > > > > it not enough that the call you're trying to TRE is marked "tail"?
> > > > > > >Do you have to redo the AllocaDerivedValueTracker analysis?
> > > > > > 
> > > > > > AllocaDerivedValueTracker analysis(done in markTails) could be 
> > > > > > reused here. 
> > > > > > But marking, done in markTails(), looks like separate tasks. i.e. 
> > > > > > it is better 
> > > > > > to make TRE not depending on markTails(). There is a review for 
> > > > > > this - https://reviews.llvm.org/D60031
> > > > > > Thus such separation looks useful(To not reuse result of markTails 
> > > > > > but have it computed inplace).
> > > > > > 
> > > > > > > Is it not enough that the call you're trying to TRE is marked 
> > > > > > > "tail"?
> > > > > > 
> > > > > > It is not enough that call which is subject to TRE is marked "Tail".
> > > > > > It also should be checked that other calls does not capture pointer 
> > > > > > to local stack: 
> > > > > > 
> > > > > > ```
> > > > > > // do not do TRE if any pointer to local stack has escaped.
> > > > > > if (!Tracker.EscapePoints.empty())
> > > > > >return false;
> > > > > > 
> > > > > > ```
> > > > > > 
> > > > > > It is not enough that call which is subject to TRE is marked 
> > > > > > "Tail". It also should be checked that other calls does not capture 
> > > > > > pointer to local stack:
> > > > > 
> > > > > If there's an escaped pointer to the local stack, we wouldn't infer 
> > > > > "tail" in the first place, would we?
> > > > If function receives pointer to alloca then it would not be marked with 
> > > > "Tail". Then we do not have a possibility to understand whether this 
> > > > function receives pointer to alloca but does not capture it:
> > > > 
> > > > ```
> > > > void test(int recurseCount)
> > > > {
> > > > if (recurseCount == 0) return;
> > > > int temp = 10;
> > > > globalIncrement(&temp);
> > > > test(recurseCount - 1);
> > > > }
> > > > ```
> > > > 
> > > > test - marked with Tail.
> > > > globalIncrement - not marked with Tail. But TRE could be done since it 
> > > > does not capture pointer. But if it will capture the pointer then we 
> > > > could not do TRE. So we need to check !Tracker.EscapePoints.empty().
> > > > 
> > > > 
> > > > 
> > > > test - marked with Tail.
> > > 
> > > For the given code, TRE won't mark the recursive call "tail".  That 
> > > transform isn't legal: the recursive call could access the caller's 
> > > version of "temp".
> > >For the given code, TRE won't mark the recursive call "tail". That 
> > >transform isn't legal: the recursive call could access the caller's 
> > >version of "temp".
> > 
> > it looks like recursive call could NOT access the caller's version of 
> > "temp":
> > 
> > ```
> > test(recurseCount - 1);
> > ```
> > 
> > The caller's version of temp is accessed by the non-recursive call:
> > 
> > ```
> > globalIncrement(&temp);
> > ```
> > 
> > If globalIncrement does not capture "&temp", then TRE looks to be legal 
> > for that case. 
> > 
> > globalIncrement() would not be marked with "Tail". test() would be marked 
> > with Tail.
> > 
> > Thus the pre-requisite for TRE would be: tail-recursive call must not 
> > receive pointer to local stack(Tail) and non-recursive calls must not 
> > capture the pointer to local stack.
> Can you give a complete IR example where we infer "tail", but TRE is illegal?
> 
> Can you give a complete IR example where we don't infer "tail", but we still do 
> the TRE transform here?
>Can you give a complete IR example where we infer "tail", but TRE is illegal?

There is no such example. Currently, all cases where we infer "tail" are 
valid for TRE.

>Can you give a complete IR example where we don't infer "tail", but we still do 
>the TRE transform here?

For the following example, the current code base would not infer "tail" for 
_Z15globalIncrementPKi and, as a result, would not do TRE for _Z4testi.
This patch changes that behavior: if _Z15globalIncrementPKi is not 
marked with "tail" and does not capture its pointer argument, TRE is 
allowed for _Z4testi.
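
As a reading aid, a source-level sketch of that case might look like the following. This is an assumption-laden reconstruction, not the author's source: the `const int *` parameter is inferred from the mangled name `_Z15globalIncrementPKi`, and the function body is inferred from the loads, add and store in the IR.

```cpp
#include <cassert>

int count = 0;

// Reads through the pointer but never stores it anywhere, so the pointer
// argument is not captured (matching the `nocapture readonly` parameter).
void globalIncrement(const int *param) {
  count += *param;
}

void test(int recurseCount) {
  if (recurseCount == 0)
    return;
  int temp = 10;
  globalIncrement(&temp);   // non-recursive call: receives a pointer to the local stack
  test(recurseCount - 1);   // recursive call in tail position: receives no such pointer
}
```

With this shape, `test(3)` adds 10 to `count` three times. The recursive call never sees `&temp`, and `globalIncrement` does not retain it, which is the situation the patch wants TRE to accept.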


```
@count = dso_local local_unnamed_addr global i32 0, align 4

; Function Attrs: nofree noinline norecurse nounwind uwtable
define dso_local void @_Z15globalIncrementPKi(i32* nocapture readonly %param) 
local_unnamed_addr #0 {
entry:
  %0 = load i32, i32* %param, align 4
  %1 = load i32, i32* @count, align 4
  %add = add nsw i32 %1, %0
  store i32 %add, i32* @count, align 4
  ret 

[PATCH] D82314: [RFC][Coroutines] Optimize the lifespan of temporary co_await object

2020-06-23 Thread Xun Li via Phabricator via cfe-commits
lxfind added a comment.

@rsmith Thanks. That's a good point. Do you know if there already exist 
optimization passes in LLVM that attempt to shrink the range of lifetime 
intrinsics? If so, I am curious why they do not help in this case. Or is it 
generally unsafe to move the lifetime intrinsics, so that we can only do it here 
with specific context knowledge about coroutines?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82314/new/

https://reviews.llvm.org/D82314





[PATCH] D81176: [HIP] Add default header and include path

2020-06-23 Thread Han Zhu via Phabricator via cfe-commits
zhuhan0 added a comment.

In D81176#2106382, @yaxunl wrote:

> In D81176#2105944, @zhuhan0 wrote:
>
> > This broke a test `clang/test/Tooling/clang-check-offload.cpp` for a 
> > critical Linux distro at Facebook. With this change, the test adds a 
> > `-include __clang_hip_runtime_wrapper` argument. The wrapper includes some 
> > standard c++ headers, but our distro doesn't have those headers in the 
> > default include paths, thus causing a break.
> >
> > I notice this behavior doesn't happen for CUDA tests, which also rely on a 
> > similar `__clang_cuda_runtime_wrapper`. I think what's causing the 
> > difference is the different handling of `nogpuinc/nogpulib` option. My 
> > knowledge on this area is limited, so correct me if I'm wrong. CUDA seems 
> > to respect `nogpuinc` and doesn't include its wrapper if the flag is 
> > provided: 
> > https://github.com/llvm/llvm-project/blob/master/clang/lib/Driver/ToolChains/Cuda.cpp#L255.
> >  But based on this change, HIP does things differently: 
> > https://github.com/llvm/llvm-project/blob/master/clang/lib/Driver/ToolChains/AMDGPU.cpp#L226.
> >
> > If I modify `RocmInstallationDetector::AddHIPIncludeArgs` to also respect 
> > `nogpuinc/nogpulib`, the test will pass for us. Is it a mistake for HIP to 
> > always include the wrapper file? Could you provide a fix for this issue? 
> > Thanks!
>
>
> Thanks for investigating the issue. It makes sense to respect nogpuinc and 
> nogpulib. fixed by 2580635bd2f3c0527353e4d7823326cd9f92ff7c 
> 


It works! Thanks for the quick fix.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81176/new/

https://reviews.llvm.org/D81176





[PATCH] D82314: [RFC][Coroutines] Optimize the lifespan of temporary co_await object

2020-06-23 Thread Richard Smith - zygoloid via Phabricator via cfe-commits
rsmith added a comment.

In D82314#2109437, @lxfind wrote:

> In D82314#2107910, @junparser wrote:
>
> > Rather than doing it here, can we build the await_resume call expression with 
> > MaterializeTemporaryExpr when expanding the co_await expression? That's how 
> > gcc does it.
>
>
> There doesn't appear to be a way to do that in Clang. It goes from the AST to 
> IR directly, and there needs to be a MaterializedTemporaryExpr to wrap the 
> result of co_await. Could you elaborate on how this might be done in Clang?


For a call such as:

  coro f() { awaitable a; (co_await a).g(); }

we produce an AST like:

  |   |   `-CXXMemberCallExpr  'void'
  |   | `-MemberExpr  '' .g 0x55d38948ca98
  |   |   `-MaterializeTemporaryExpr  'huge' xvalue
  |   | `-ParenExpr  'huge'
  |   |   `-CoawaitExpr  'huge'
  |   | |-DeclRefExpr  'awaitable' lvalue Var 0x55d38948d4c8 'a' 'awaitable'
  |   | |-CXXMemberCallExpr  'bool'
  |   | | `-MemberExpr  '' .await_ready 0x55d38948cee8
  |   | |   `-OpaqueValueExpr  'awaitable' lvalue
  |   | | `-DeclRefExpr  'awaitable' lvalue Var 0x55d38948d4c8 'a' 'awaitable'
  |   | |-CXXMemberCallExpr  'void'
  |   | | |-MemberExpr  '' .await_suspend 0x55d38948d298
  |   | | | `-OpaqueValueExpr  'awaitable' lvalue
  |   | | |   `-DeclRefExpr  'awaitable' lvalue Var 0x55d38948d4c8 'a' 'awaitable'
  |   | | `-CXXConstructExpr  'std::coroutine_handle<>':'std::experimental::coroutines_v1::coroutine_handle' 'void (std::experimental::coroutines_v1::coroutine_handle &&) noexcept'
  |   | |   `-ImplicitCastExpr  'std::experimental::coroutines_v1::coroutine_handle' xvalue
  |   | | `-MaterializeTemporaryExpr  'std::experimental::coroutines_v1::coroutine_handle' xvalue
  |   | |   `-CallExpr  'std::experimental::coroutines_v1::coroutine_handle'
  |   | | |-ImplicitCastExpr  'std::experimental::coroutines_v1::coroutine_handle (*)(void *) noexcept'
  |   | | | `-DeclRefExpr  'std::experimental::coroutines_v1::coroutine_handle (void *) noexcept' lvalue CXXMethod 0x55d38948adb0 'from_address' 'std::experimental::coroutines_v1::coroutine_handle (void *) noexcept'
  |   | | `-CallExpr  'void *'
  |   | |   `-ImplicitCastExpr  'void *(*)() noexcept'
  |   | | `-DeclRefExpr  'void *() noexcept' lvalue Function 0x55d38948ef38 '__builtin_coro_frame' 'void *() noexcept'
  |   | `-CXXBindTemporaryExpr  'huge' (CXXTemporary 0x55d38948ffb8)
  |   |   `-CXXMemberCallExpr  'huge'
  |   | `-MemberExpr  '' .await_resume 0x55d38948cff8
  |   |   `-OpaqueValueExpr  'awaitable' lvalue
  |   | `-DeclRefExpr  'awaitable' lvalue Var 0x55d38948d4c8 'a' 'awaitable'

See https://godbolt.org/z/hVn2u9.

I think the suggestion is to move the `MaterializeTemporaryExpr` from being 
wrapped around the  `CoawaitExpr` to being wrapped around the `.await_resume` 
call within it.

Unfortunately, that's not a correct change. The language rules require us to 
delay materializing the temporary until we see how it is used. For example, in 
a case such as

  g(co_await a);

... no temporary is materialized at all, and the `await_resume` call instead 
directly initializes the parameter slot for the function.
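
The same rule can be observed without coroutines. The sketch below is only an analogy under C++17 rules, with hypothetical names (`Tracked`, `make`, `byValue`): a prvalue returned from a function directly initializes a by-value parameter, whereas calling a member function on the prvalue requires a temporary object to be materialized first. Neither path goes through a copy or move constructor.

```cpp
#include <cassert>

// Hypothetical type that counts copy/move constructions.
struct Tracked {
  static int copiesOrMoves;
  int value;
  explicit Tracked(int v) : value(v) {}
  Tracked(const Tracked &other) : value(other.value) { ++copiesOrMoves; }
  Tracked(Tracked &&other) noexcept : value(other.value) { ++copiesOrMoves; }
  int get() const { return value; }
};
int Tracked::copiesOrMoves = 0;

// Returns a prvalue, playing the role of `co_await a` in the text above.
Tracked make() { return Tracked(42); }

int byValue(Tracked t) { return t.value; }

// The prvalue initializes the parameter `t` directly; no separate temporary
// is created and then copied into the parameter.
int directInit() { return byValue(make()); }

// A member call needs an object, so a temporary is materialized here.
int viaTemporary() { return make().get(); }
```

Both functions return 42 and `Tracked::copiesOrMoves` stays at zero. The difference lies only in where the object lives, which is why the materialization point cannot be fixed before the use of the expression is known.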



It seems to me that the same issue (of making large objects unnecessarily live 
across suspend points) can arise for other similar cases too. For example, 
consider:

  huge(co_await a).g(); // suppose `co_await a` returns something small

Here again we will start the lifetime of the `huge` temporary before the 
suspend point, and only initialize it after the resume. Your change won't help 
here, because it's not the lifetime markers of the value produced by `co_await` 
that are the problem.

As a result of the above, I think this is the wrong level at which to perform 
this optimization. Instead, I think you should consider whether we can move 
lifetime start markers later (and end markers earlier, for unescaped locals) as 
part of the coroutine splitting pass.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82314/new/

https://reviews.llvm.org/D82314





[PATCH] D82332: [Coroutines] Handle dependent promise types for final_suspend non-throw check

2020-06-23 Thread Xun Li via Phabricator via cfe-commits
lxfind created this revision.
Herald added subscribers: cfe-commits, modocache.
Herald added a project: clang.
lxfind updated this revision to Diff 272553.
lxfind added a comment.
lxfind added reviewers: Benabik, lewissbaker, junparser.
lxfind updated this revision to Diff 272786.
lxfind edited the summary of this revision.
lxfind requested review of this revision.

Simplify tests


lxfind added a comment.

Rebase


After dependent types have been resolved, check again that 
co_await promise.final_suspend() does not potentially throw.
This takes care of the cases where promise types are templated.
Added test cases for this scenario and confirmed that the checks now happen.
Also ran the libcxx tests locally to make sure all tests pass.
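
To illustrate why such a check has to wait for instantiation, here is a small standalone sketch. It is not Clang's implementation and it only inspects the `final_suspend()` call itself, not the full `co_await` expansion; `Awaiter`, `ThrowingPromise`, `NoThrowPromise` and `finalSuspendIsNoThrow` are hypothetical names. Whether the call can throw depends on the promise type, so the answer is only available once the template argument is known.

```cpp
#include <cassert>
#include <utility>

struct Awaiter {};

// Hypothetical promise types: one declares final_suspend() noexcept, one does not.
struct ThrowingPromise {
  Awaiter final_suspend();
};
struct NoThrowPromise {
  Awaiter final_suspend() noexcept;
};

// The operand of noexcept is type-dependent, so it is evaluated per
// instantiation rather than when the template is first parsed.
template <typename Promise>
constexpr bool finalSuspendIsNoThrow() {
  return noexcept(std::declval<Promise &>().final_suspend());
}

static_assert(finalSuspendIsNoThrow<NoThrowPromise>(),
              "noexcept final_suspend is accepted");
static_assert(!finalSuspendIsNoThrow<ThrowingPromise>(),
              "potentially throwing final_suspend is detected");
```

The two `static_assert`s give different answers for the same template, which mirrors the situation in `f_dep`: the diagnostic can only be issued when the specialization is instantiated.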


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D82332

Files:
  clang/include/clang/Sema/Sema.h
  clang/lib/Sema/SemaCoroutine.cpp
  clang/lib/Sema/TreeTransform.h
  clang/test/SemaCXX/coroutine-final-suspend-noexcept.cpp

Index: clang/test/SemaCXX/coroutine-final-suspend-noexcept.cpp
===
--- clang/test/SemaCXX/coroutine-final-suspend-noexcept.cpp
+++ clang/test/SemaCXX/coroutine-final-suspend-noexcept.cpp
@@ -11,27 +11,27 @@
 
 template 
 struct coroutine_handle {
-  static coroutine_handle from_address(void *); // expected-note {{must be declared with 'noexcept'}}
+  static coroutine_handle from_address(void *); // expected-note 2 {{must be declared with 'noexcept'}}
 };
 template <>
 struct coroutine_handle {
   template 
-  coroutine_handle(coroutine_handle); // expected-note {{must be declared with 'noexcept'}}
+  coroutine_handle(coroutine_handle); // expected-note 2 {{must be declared with 'noexcept'}}
 };
 
 struct suspend_never {
-  bool await_ready() { return true; }   // expected-note {{must be declared with 'noexcept'}}
-  void await_suspend(coroutine_handle<>) {} // expected-note {{must be declared with 'noexcept'}}
-  void await_resume() {}// expected-note {{must be declared with 'noexcept'}}
-  ~suspend_never() noexcept(false); // expected-note {{must be declared with 'noexcept'}}
+  bool await_ready() { return true; }   // expected-note 2 {{must be declared with 'noexcept'}}
+  void await_suspend(coroutine_handle<>) {} // expected-note 2 {{must be declared with 'noexcept'}}
+  void await_resume() {}// expected-note 2 {{must be declared with 'noexcept'}}
+  ~suspend_never() noexcept(false); // expected-note 2 {{must be declared with 'noexcept'}}
 };
 
 struct suspend_always {
   bool await_ready() { return false; }
   void await_suspend(coroutine_handle<>) {}
   void await_resume() {}
-  suspend_never operator co_await(); // expected-note {{must be declared with 'noexcept'}}
-  ~suspend_always() noexcept(false); // expected-note {{must be declared with 'noexcept'}}
+  suspend_never operator co_await(); // expected-note 2 {{must be declared with 'noexcept'}}
+  ~suspend_always() noexcept(false); // expected-note 2 {{must be declared with 'noexcept'}}
 };
 
 } // namespace experimental
@@ -50,7 +50,7 @@
   struct promise_type {
 coro_t get_return_object();
 suspend_never initial_suspend();
-suspend_always final_suspend(); // expected-note {{must be declared with 'noexcept'}}
+suspend_always final_suspend(); // expected-note 2 {{must be declared with 'noexcept'}}
 void return_void();
 static void unhandled_exception();
   };
@@ -60,3 +60,13 @@
   A a{};
   co_await a;
 }
+
+template 
+coro_t f_dep(T n) { // expected-error {{the expression 'co_await __promise.final_suspend()' is required to be non-throwing}}
+  A a{};
+  co_await a;
+}
+
+void foo() {
+  f_dep(5); // expected-note {{in instantiation of function template specialization 'f_dep' requested here}}
+}
Index: clang/lib/Sema/TreeTransform.h
===
--- clang/lib/Sema/TreeTransform.h
+++ clang/lib/Sema/TreeTransform.h
@@ -7630,7 +7630,8 @@
 return StmtError();
   StmtResult FinalSuspend =
   getDerived().TransformStmt(S->getFinalSuspendStmt());
-  if (FinalSuspend.isInvalid())
+  if (FinalSuspend.isInvalid() ||
+  !SemaRef.checkFinalSuspendNoThrow(FinalSuspend.get()))
 return StmtError();
   ScopeInfo->setCoroutineSuspends(InitSuspend.get(), FinalSuspend.get());
   assert(isa(InitSuspend.get()) && isa(FinalSuspend.get()));
Index: clang/lib/Sema/SemaCoroutine.cpp
===
--- clang/lib/Sema/SemaCoroutine.cpp
+++ clang/lib/Sema/SemaCoroutine.cpp
@@ -631,7 +631,6 @@
   } else if (SC == Expr::CallExprClass || SC == Expr::CXXMemberCallExprClass ||
  SC == Expr::CXXOperatorCallExprClass) {
 if (!cast(E)->isTypeDependent()) {
-  // FIXME: Handle dependent types.
   checkDeclNoexcept(cast(E)->getCalleeDecl());
   auto ReturnType = cast(E)->getCallReturnType(S.getASTContext());
   // 

[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-23 Thread Eli Friedman via Phabricator via cfe-commits
efriedma added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:825
+  // The local stack holds all alloca instructions and all byval arguments.
+  AllocaDerivedValueTracker Tracker;
+  for (Argument  : F.args()) {

avl wrote:
> efriedma wrote:
> > avl wrote:
> > > efriedma wrote:
> > > > avl wrote:
> > > > > efriedma wrote:
> > > > > > Do you have to redo the AllocaDerivedValueTracker analysis?  Is it 
> > > > > > not enough that the call you're trying to TRE is marked "tail"?
> > > > > >Do you have to redo the AllocaDerivedValueTracker analysis?
> > > > > 
> > > > > > The AllocaDerivedValueTracker analysis (done in markTails) could be 
> > > > > > reused here, but the marking done in markTails() looks like a 
> > > > > > separate task, i.e. it is better to make TRE not depend on 
> > > > > > markTails(). There is a review for this: 
> > > > > > https://reviews.llvm.org/D60031
> > > > > > Computing the analysis in place, rather than reusing the result of 
> > > > > > markTails, therefore looks like a useful separation.
> > > > > 
> > > > > > Is it not enough that the call you're trying to TRE is marked 
> > > > > > "tail"?
> > > > > 
> > > > > > It is not enough that the call which is subject to TRE is marked 
> > > > > > "Tail". We also need to check that the other calls do not capture a 
> > > > > > pointer to the local stack: 
> > > > > 
> > > > > ```
> > > > > // do not do TRE if any pointer to local stack has escaped.
> > > > > if (!Tracker.EscapePoints.empty())
> > > > >return false;
> > > > > 
> > > > > ```
> > > > > 
> > > > > > It is not enough that the call which is subject to TRE is marked 
> > > > > > "Tail". We also need to check that the other calls do not capture a 
> > > > > > pointer to the local stack:
> > > > 
> > > > If there's an escaped pointer to the local stack, we wouldn't infer 
> > > > "tail" in the first place, would we?
> > > > If a function receives a pointer to an alloca, the call is not marked 
> > > > "Tail". From the marking alone we then cannot tell whether that 
> > > > function merely receives the pointer or also captures it:
> > > 
> > > ```
> > > > void test(int recurseCount)
> > > > {
> > > >     if (recurseCount == 0) return;
> > > >     int temp = 10;
> > > >     globalIncrement(&temp);
> > > >     test(recurseCount - 1);
> > > > }
> > > ```
> > > 
> > > > test - marked with Tail.
> > > > globalIncrement - not marked with Tail. TRE could still be done as long 
> > > > as globalIncrement does not capture the pointer; if it does capture the 
> > > > pointer, we cannot do TRE. So we need to check 
> > > > !Tracker.EscapePoints.empty().
> > > 
> > > 
> > > 
> > > test - marked with Tail.
> > 
> > For the given code, TRE won't mark the recursive call "tail".  That 
> > transform isn't legal: the recursive call could access the caller's version 
> > of "temp".
> >For the given code, TRE won't mark the recursive call "tail". That transform 
> >isn't legal: the recursive call could access the caller's version of "temp".
> 
> it looks like recursive call could NOT access the caller's version of "temp":
> 
> ```
> test(recurseCount - 1);
> ```
> 
> The caller's version of temp is accessed by the non-recursive call:
> 
> ```
> globalIncrement(&temp);
> ```
> 
> If globalIncrement does not capture "&temp", then TRE looks to be legal 
> for that case. 
> 
> globalIncrement() would not be marked with "Tail". test() would be marked 
> with Tail.
> 
> Thus the prerequisite for TRE would be: the tail-recursive call must not 
> receive a pointer to the local stack (Tail), and the non-recursive calls 
> must not capture a pointer to the local stack.
Can you give a complete IR example where we infer "tail", but TRE is illegal?

Can you give a complete IR example where we don't infer "tail", but we still do 
the TRE transform here?
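
To make the scenario concrete, here is a minimal C++ sketch (source level, not IR) of the case being argued about. It assumes a hypothetical `globalIncrement` that reads through the pointer it receives but never stores it, so no pointer to the local stack escapes. `testAsLoop` is a hand-written picture of what eliminating the recursive call would amount to; it is not output from the pass.

```cpp
// Running total updated by the non-recursive callee.
int g_total = 0;

// Hypothetical callee: it receives a pointer to the caller's local but only
// reads through it. The pointer is not captured (not stored anywhere).
void globalIncrement(const int *value) { g_total += *value; }

// The recursive call is in tail position and receives no pointer to the
// local stack. The call to globalIncrement does receive such a pointer.
void test(int recurseCount) {
  if (recurseCount == 0) return;
  int temp = 10;
  globalIncrement(&temp);
  test(recurseCount - 1);
}

// Hand-written loop form of the same function. Reusing one frame for every
// iteration is only safe because nothing retains the address of temp across
// iterations.
void testAsLoop(int recurseCount) {
  while (recurseCount != 0) {
    int temp = 10;
    globalIncrement(&temp);
    --recurseCount;
  }
}
```

If `globalIncrement` stored the pointer in a global instead, a later iteration could observe an earlier iteration's `temp`, and the loop form would no longer be equivalent to the recursion.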


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82398: [MSAN] Handle x86 {round,min,max}sd intrinsics

2020-06-23 Thread Evgenii Stepanov via Phabricator via cfe-commits
eugenis added a comment.

The test should be in LLVM, under test/Instrumentation/MemorySanitizer.




Comment at: llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp:3077
+Value *LowShadow = IRB.CreateOr(LowA, LowB);
+Value *Shadow = IRB.CreateInsertElement(Second, LowShadow, IRB.getInt32(0));
+

You probably want to insert in First, not Second.

Is the generated code any better if you OR the vectors, and then shuffle to put 
the top element of First into the top element of the output? That's what LLVM 
generates if I express this logic in C.
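
As a rough illustration of the two formulations, the sketch below models a two-lane shadow as a plain array rather than using the LLVM `IRBuilder` API. The names `Shadow2`, `shadowByExtractInsert` and `shadowByOrThenShuffle` are made up for this example. Both functions assume the upper lane of the result comes from the first operand, which is what the suggestion to insert into First implies.

```cpp
#include <array>
#include <cstdint>

// Two 64-bit shadow lanes; a set bit means "uninitialized".
using Shadow2 = std::array<std::uint64_t, 2>;

// Formulation 1: extract the low lanes, OR them, insert into First.
Shadow2 shadowByExtractInsert(const Shadow2 &first, const Shadow2 &second) {
  Shadow2 out = first;            // upper lane taken from First
  out[0] = first[0] | second[0];  // low lane depends on both operands
  return out;
}

// Formulation 2: OR the whole vectors, then shuffle First's upper lane back
// into the result.
Shadow2 shadowByOrThenShuffle(const Shadow2 &first, const Shadow2 &second) {
  Shadow2 ored;
  ored[0] = first[0] | second[0];
  ored[1] = first[1] | second[1];
  Shadow2 out;
  out[0] = ored[0];   // low lane from the OR
  out[1] = first[1];  // upper lane from First only
  return out;
}
```

In this model the two formulations produce the same shadow for every input, so the choice between them is purely about which one lowers to better machine code.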




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82398/new/

https://reviews.llvm.org/D82398





[PATCH] D79895: Add a new warning to warn when passing uninitialized variables as const reference parameters to a function

2020-06-23 Thread Hans Wennborg via Phabricator via cfe-commits
hans added a comment.

In D79895#2109414, @nick wrote:

> > I feel like doing interprocedural analysis for this is overkill. What is 
> > the benefit of boost::ignore_unused(foo); rather than the more common 
> > (void) foo;? Any examples?
>
>
>
> > I haven't seen boost::ignore_unused before. In my experience, the idiomatic 
> > way of ignoring an unused variable in C/C++ is to cast it to void, as 
> > Arthur said.
>
> This is a weak argument to have false positives, don't you agree? You may 
> have not seen it, but it exists and is used: 
> https://github.com/search?q=%22boost%3A%3Aignore_unused%22+NOT+%22Boost+Software+License%22=Code


There are plenty of warnings which have false positives on non-idiomatic code 
though. The question is how common this pattern of using a function to ignore 
an unused variable is. We didn't see it in the code bases I work with, so is 
boost a special case, or an example of a common practice? If it's just boost, 
fixing the code seems better (it will compile faster too).
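
For illustration, a minimal sketch of the two idioms being compared. The `sketch::ignore_unused` template below is a stand-in written for this example; it is not Boost's implementation, although it takes its arguments by const reference in the same way, which is why a warning about passing uninitialized variables as const reference parameters would fire on it.

```cpp
namespace sketch {
// Stand-in for a function-based "ignore" helper: takes everything by const
// reference and does nothing with it.
template <typename... Ts>
inline void ignore_unused(const Ts &...) {}
} // namespace sketch

int compute(int input) {
  int viaFunction; // deliberately left uninitialized
  // Binds a const reference to an uninitialized variable. This is the
  // pattern a const-reference warning would flag.
  sketch::ignore_unused(viaFunction);

  int viaCast; // deliberately left uninitialized
  // The cast-to-void idiom: no function call and no reference binding.
  (void)viaCast;

  return input * 2;
}
```

The cast form also avoids instantiating a template for every combination of argument types.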


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79895/new/

https://reviews.llvm.org/D79895





[PATCH] D82182: [AArch64][SVE] Add bfloat16 support to perm and select intrinsics

2020-06-23 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli added inline comments.



Comment at: clang/include/clang/Basic/arm_sve.td:1115
 
+let ArchGuard = "defined(__ARM_FEATURE_SVE_BF16)" in {
+def SVREV_BF16: SInst<"svrev[_{d}]","dd",   "b", MergeNone, "aarch64_sve_rev">;

c-rhodes wrote:
> c-rhodes wrote:
> > fpetrogalli wrote:
> > > nit: could create a multiclass here like @sdesmalen have done in 
> > > https://reviews.llvm.org/D82187, seems quite a nice way to keep the 
> > > definition of the intrinsics together (look for `multiclass StructLoad`, 
> > > for example)
> > it might be a bit tedious having separate multiclasses, what do you think 
> > about:
> > ```multiclass SInstBF16 > i = "",
> >  list ft = [], list ch = []> {
> >   def : SInst;
> >   let ArchGuard = "defined(__ARM_FEATURE_SVE_BF16)" in {
> > def : SInst;
> >   }
> > }
> > 
> > defm SVREV: SInstBF16<"svrev[_{d}]","dd",   "csilUcUsUiUlhfd", 
> > MergeNone, "aarch64_sve_rev">;
> > defm SVSEL: SInstBF16<"svsel[_{d}]","dPdd", "csilUcUsUiUlhfd", 
> > MergeNone, "aarch64_sve_sel">;
> > defm SVSPLICE : SInstBF16<"svsplice[_{d}]", "dPdd", "csilUcUsUiUlhfd", 
> > MergeNone, "aarch64_sve_splice">;
> > defm SVTRN1   : SInstBF16<"svtrn1[_{d}]",   "ddd",  "csilUcUsUiUlhfd", 
> > MergeNone, "aarch64_sve_trn1">;
> > defm SVTRN2   : SInstBF16<"svtrn2[_{d}]",   "ddd",  "csilUcUsUiUlhfd", 
> > MergeNone, "aarch64_sve_trn2">;
> > defm SVUZP1   : SInstBF16<"svuzp1[_{d}]",   "ddd",  "csilUcUsUiUlhfd", 
> > MergeNone, "aarch64_sve_uzp1">;
> > defm SVUZP2   : SInstBF16<"svuzp2[_{d}]",   "ddd",  "csilUcUsUiUlhfd", 
> > MergeNone, "aarch64_sve_uzp2">;
> > defm SVZIP1   : SInstBF16<"svzip1[_{d}]",   "ddd",  "csilUcUsUiUlhfd", 
> > MergeNone, "aarch64_sve_zip1">;
> > defm SVZIP2   : SInstBF16<"svzip2[_{d}]",   "ddd",  "csilUcUsUiUlhfd", 
> > MergeNone, "aarch64_sve_zip2">;```
> > 
> > ?
> I've played around with this and it works great for instructions guarded on a 
> single feature flag but falls apart for the .Q forms that also require 
> `__ARM_FEATURE_SVE_MATMUL_FP64`. I suspect there's a nice way of handling it 
> in tablegen by passing the features as a list of strings and joining them but 
> I spent long enough trying to get that to work so I'm going to keep it simple 
> for now.
> it might be a bit tedious having separate multiclasses, what do you think 
> about:

Sorry, I think I misunderstood you when we last discussed this. I didn't mean 
to write a multiclass that would work for ALL intrinsics that use regular types 
and bfloats; I just meant to merge together the ones that use the same 
ArchGuard and that you are adding in this patch.

I think you could keep both macros in a single ArchGuard string:

```
multiclass SInstPerm {
  def : SInst;
  let ArchGuard = "defined(__ARM_FEATURE_SVE_BF16)" in {
def : SInst;
  }
}

defm SVREV: SInstPerm<"svrev[_{d}]","dd",MergeNone, 
"aarch64_sve_rev">;
...

multiclass SInstPermMatmul {
  def : SInst;
  let ArchGuard = "defined(__ARM_FEATURE_SVE_BF16) && defined(__ARM_FEATURE_SVE_MATMUL_FP64)" in {
def : SInst;
  }
}

def SVTRN1Q : SInstPermMatmul ...
...
```


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82182/new/

https://reviews.llvm.org/D82182





[PATCH] D82398: [MSAN] Handle x86 {round,min,max}sd intrinsics

2020-06-23 Thread Gui Andrade via Phabricator via cfe-commits
guiand updated this revision to Diff 272776.
guiand added a comment.

Ran clang-format


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82398/new/

https://reviews.llvm.org/D82398

Files:
  clang/test/CodeGen/msan-intrinsics.c
  llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp


Index: llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
===
--- llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
+++ llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
@@ -3054,6 +3054,32 @@
 SOC.Done();
   }
 
+  // Instrument _mm_*_sd intrinsics
+  void handleUnarySdIntrinsic(IntrinsicInst &I) {
+    IRBuilder<> IRB(&I);
+    Value *First = getShadow(&I, 0);
+    Value *Second = getShadow(&I, 1);
+    Value *LowShadow =
+        IRB.CreateExtractElement(Second, IRB.getInt32(0), "_lo_a");
+    Value *Shadow = IRB.CreateInsertElement(First, LowShadow, IRB.getInt32(0));
+
+    setShadow(&I, Shadow);
+    setOriginForNaryOp(I);
+  }
+
+  void handleBinarySdIntrinsic(IntrinsicInst &I) {
+    IRBuilder<> IRB(&I);
+    Value *First = getShadow(&I, 0);
+    Value *Second = getShadow(&I, 1);
+    Value *LowA = IRB.CreateExtractElement(First, IRB.getInt32(0), "_lo_a");
+    Value *LowB = IRB.CreateExtractElement(Second, IRB.getInt32(0), "_lo_b");
+    Value *LowShadow = IRB.CreateOr(LowA, LowB);
+    Value *Shadow = IRB.CreateInsertElement(Second, LowShadow, IRB.getInt32(0));
+
+    setShadow(&I, Shadow);
+    setOriginForNaryOp(I);
+  }
+
   void visitIntrinsicInst(IntrinsicInst &I) {
 switch (I.getIntrinsicID()) {
 case Intrinsic::lifetime_start:
@@ -3293,6 +3319,13 @@
   handlePclmulIntrinsic(I);
   break;
 
+case Intrinsic::x86_sse41_round_sd:
+  handleUnarySdIntrinsic(I);
+  break;
+case Intrinsic::x86_sse2_max_sd:
+case Intrinsic::x86_sse2_min_sd:
+  handleBinarySdIntrinsic(I);
+  break;
 case Intrinsic::is_constant:
   // The result of llvm.is.constant() is always defined.
   setShadow(, getCleanShadow());
Index: clang/test/CodeGen/msan-intrinsics.c
===
--- /dev/null
+++ clang/test/CodeGen/msan-intrinsics.c
@@ -0,0 +1,38 @@
+// RUN: %clang_cc1 -fsanitize=memory -triple x86_64-linux-gnu -emit-llvm %s -O3 -o - | FileCheck %s
+
+typedef double double2 __attribute__((vector_size(16)));
+
+__attribute__((target("sse4.1")))
+double2
+RoundSD(double theta, double top) {
+  // CHECK: [[IN_SHADOW1:%[0-9]+]] = load {{.*}} @__msan_param_tls
+  // CHECK: [[IN_SHADOW2:%[0-9]+]] = load {{.*}} @__msan_param_tls
+  double2 vec;
+  double2 unused;
+  unused[1] = top;
+  vec[0] = theta;
+  return __builtin_ia32_roundsd(unused, vec, 1);
+  // CHECK: [[TMP_SHADOW:%.+]] = insertelement <2 x i64> undef, i64 [[IN_SHADOW2]], i32 1
+  // CHECK: [[OUT_SHADOW:%.+]] = insertelement <2 x i64> [[TMP_SHADOW]], i64 [[IN_SHADOW1]], i32 0
+
+  // CHECK-NOT: call void @msan_warning
+  // CHECK: call <2 x double> @llvm.x86.sse41.round.sd
+  // CHECK: store <2 x i64> [[OUT_SHADOW]], {{.*}} @__msan_retval_tls
+  // CHECK: ret <2 x double>
+}
+
+__attribute__((target("sse2"))) double MinSD(double t1, double t2) {
+  // CHECK: [[IN_SHADOW1:%[0-9]+]] = load {{.*}} @__msan_param_tls
+  // CHECK: [[IN_SHADOW2:%[0-9]+]] = load {{.*}} @__msan_param_tls
+  // CHECK: [[COMB_SHADOW:%[0-9]+]] = or i64 [[IN_SHADOW2]], [[IN_SHADOW1]]
+  double2 first;
+  double2 second;
+  first[0] = t1;
+  second[0] = t2;
+  double min = __builtin_ia32_minsd(first, second)[0];
+  // CHECK-NOT: call void @msan_warning
+  // CHECK: call <2 x double> @llvm.x86.sse2.min.sd
+  return min;
+  // CHECK: store i64 [[COMB_SHADOW]], {{.*}} @__msan_retval_tls
+  // CHECK: ret double
+}
\ No newline at end of file



[PATCH] D82399: [AArch64][SVE2] Add bfloat16 support to whilerw/whilewr intrinsics

2020-06-23 Thread Cullen Rhodes via Phabricator via cfe-commits
c-rhodes created this revision.
c-rhodes added reviewers: sdesmalen, efriedma, kmclaughlin, david-arm, 
fpetrogalli, stuij.
Herald added subscribers: danielkiss, kristof.beyls, tschuett.
Herald added projects: clang, LLVM.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D82399

Files:
  clang/include/clang/Basic/arm_sve.td
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-sve2-intrinsics/acle_sve2_whilerw-bfloat.c
  clang/test/CodeGen/aarch64-sve2-intrinsics/acle_sve2_whilewr-bfloat.c
  llvm/test/CodeGen/AArch64/sve2-intrinsics-contiguous-conflict-detection.ll

Index: llvm/test/CodeGen/AArch64/sve2-intrinsics-contiguous-conflict-detection.ll
===
--- llvm/test/CodeGen/AArch64/sve2-intrinsics-contiguous-conflict-detection.ll
+++ llvm/test/CodeGen/AArch64/sve2-intrinsics-contiguous-conflict-detection.ll
@@ -36,6 +36,14 @@
   ret  %out
 }
 
+define <vscale x 8 x i1> @whilerw_bfloat(bfloat* %a, bfloat* %b) {
+; CHECK-LABEL: whilerw_bfloat:
+; CHECK: whilerw  p0.h, x0, x1
+; CHECK-NEXT: ret
+  %out = call <vscale x 8 x i1> @llvm.aarch64.sve.whilerw.h.nx8i1.bf16.bf16(bfloat* %a, bfloat* %b)
+  ret <vscale x 8 x i1> %out
+}
+
 define <vscale x 8 x i1> @whilerw_half(half* %a, half* %b) {
 ; CHECK-LABEL: whilerw_half:
 ; CHECK: whilerw  p0.h, x0, x1
@@ -96,6 +104,14 @@
   ret  %out
 }
 
+define <vscale x 8 x i1> @whilewr_bfloat(bfloat* %a, bfloat* %b) {
+; CHECK-LABEL: whilewr_bfloat:
+; CHECK: whilewr  p0.h, x0, x1
+; CHECK-NEXT: ret
+  %out = call <vscale x 8 x i1> @llvm.aarch64.sve.whilewr.h.nx8i1.bf16.bf16(bfloat* %a, bfloat* %b)
+  ret <vscale x 8 x i1> %out
+}
+
 define <vscale x 8 x i1> @whilewr_half(half* %a, half* %b) {
 ; CHECK-LABEL: whilewr_half:
 ; CHECK: whilewr  p0.h, x0, x1
@@ -125,6 +141,7 @@
 declare <vscale x 4 x i1> @llvm.aarch64.sve.whilerw.s.nx4i1(i32* %a, i32* %b)
 declare <vscale x 2 x i1> @llvm.aarch64.sve.whilerw.d.nx2i1(i64* %a, i64* %b)
 
+declare <vscale x 8 x i1> @llvm.aarch64.sve.whilerw.h.nx8i1.bf16.bf16(bfloat* %a, bfloat* %b)
 declare <vscale x 8 x i1> @llvm.aarch64.sve.whilerw.h.nx8i1.f16.f16(half* %a, half* %b)
 declare <vscale x 4 x i1> @llvm.aarch64.sve.whilerw.s.nx4i1.f32.f32(float* %a, float* %b)
 declare <vscale x 2 x i1> @llvm.aarch64.sve.whilerw.d.nx2i1.f64.f64(double* %a, double* %b)
@@ -134,6 +151,7 @@
 declare <vscale x 4 x i1> @llvm.aarch64.sve.whilewr.s.nx4i1(i32* %a, i32* %b)
 declare <vscale x 2 x i1> @llvm.aarch64.sve.whilewr.d.nx2i1(i64* %a, i64* %b)
 
+declare <vscale x 8 x i1> @llvm.aarch64.sve.whilewr.h.nx8i1.bf16.bf16(bfloat* %a, bfloat* %b)
 declare <vscale x 8 x i1> @llvm.aarch64.sve.whilewr.h.nx8i1.f16.f16(half* %a, half* %b)
 declare <vscale x 4 x i1> @llvm.aarch64.sve.whilewr.s.nx4i1.f32.f32(float* %a, float* %b)
 declare <vscale x 2 x i1> @llvm.aarch64.sve.whilewr.d.nx2i1.f64.f64(double* %a, double* %b)
Index: clang/test/CodeGen/aarch64-sve2-intrinsics/acle_sve2_whilewr-bfloat.c
===
--- /dev/null
+++ clang/test/CodeGen/aarch64-sve2-intrinsics/acle_sve2_whilewr-bfloat.c
@@ -0,0 +1,36 @@
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_SVE2 -D__ARM_FEATURE_BF16_SCALAR_ARITHMETIC -D__ARM_FEATURE_SVE_BF16 -triple aarch64-none-linux-gnu -target-feature +sve2 -target-feature +bf16 -fallow-half-arguments-and-returns -S -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_SVE2 -D__ARM_FEATURE_BF16_SCALAR_ARITHMETIC -D__ARM_FEATURE_SVE_BF16 -DSVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu -target-feature +sve2 -target-feature +bf16 -fallow-half-arguments-and-returns -S -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+
+// Test expected warnings for implicit declaration when +sve2 is missing
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_BF16_SCALAR_ARITHMETIC -D__ARM_FEATURE_SVE_BF16 -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 -fallow-half-arguments-and-returns -fsyntax-only -verify -verify-ignore-unexpected=error %s
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_BF16_SCALAR_ARITHMETIC -D__ARM_FEATURE_SVE_BF16 -DSVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 -fallow-half-arguments-and-returns -fsyntax-only -verify=overload -verify-ignore-unexpected=error %s
+
+// Test expected warnings for implicit declaration when +bf16 is missing
+// NOTE: +bf16 doesn't currently imply __ARM_FEATURE_SVE_BF16, once the
+// implementation is complete it will, at which point -target-feature +bf16
+// should be removed.
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_SVE2 -D__ARM_FEATURE_BF16_SCALAR_ARITHMETIC -triple aarch64-none-linux-gnu -target-feature +sve2 -target-feature +bf16 -fallow-half-arguments-and-returns -fsyntax-only -verify -verify-ignore-unexpected=error %s
+
+// Test expected ambiguous call error for overloaded form when +bf16 is missing
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_SVE2 -D__ARM_FEATURE_BF16_SCALAR_ARITHMETIC -DSVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu -target-feature +sve2 -target-feature +bf16 -fallow-half-arguments-and-returns -fsyntax-only -verify=overload-bf16 -verify-ignore-unexpected=note %s
+
+#include <arm_sve.h>
+
+#ifdef 

[PATCH] D79719: [AIX] Implement AIX special alignment rule about double/long double

2020-06-23 Thread Hubert Tong via Phabricator via cfe-commits
hubert.reinterpretcast added inline comments.



Comment at: clang/lib/AST/ASTContext.cpp:2424
+  (T->isSpecificBuiltinType(BuiltinType::LongDouble) &&
+   Target->supportsAIXPowerAlignment()))
 // Don't increase the alignment if an alignment attribute was specified on a

Xiangling_L wrote:
> hubert.reinterpretcast wrote:
> > Does `supportsAIXPowerAlignment` express the condition we want to check 
> > here? That might be true for an implementation operating with `mac68k` 
> > alignment rules.
> Yeah, `supportsAIXPowerAlignment` cannot separate the preferred alignment of 
> double, long double between `power/natural` and `mac68k` alignment rules. But 
> I noticed that currently, AIX target on wyvern or XL don't support `mac68k` , 
> so maybe we should leave further changes to the patch which is gonna 
> implement `mac68k` alignment rules? The possible solution I am thinking is we 
> can add checking if the decl has `AlignMac68kAttr` into query to separate 
> things out.
> 
> Another thing is that once we start supporting mac68k alignment rule(if we 
> will), should we also change the ABI align values as well? (e.g. for double, 
> it should be 2 instead)
If the "base state" is AIX `power` alignment for a platform, I suggest that the 
name be `defaultsToAIXPowerAlignment`.



Comment at: clang/lib/AST/RecordLayoutBuilder.cpp:660
+  /// When there are OverlappingEmptyFields existing in the aggregate, the
+  /// flag shows if the following first non-overlappingEmptyField has been
+  /// handled, if any.

Xiangling_L wrote:
> hubert.reinterpretcast wrote:
> > I suggest to replace (if correct) "non-overlappingEmptyField" with 
> > "non-empty or empty-but-non-overlapping field".
> > 
> Thanks for your suggestion. But I kinda prefer using 
> `NonOverlappingEmptyField`, it is more consistent with 
> `IsOverlappingEmptyField`. And also the equivalent name to 
> `NonOverlappingEmptyField` is `NonOverlappingAndNonEmptyField` which is 
> tedious I think.
I am suggesting a change to the comment, not to the name. If both the comment 
and the name use the same (possibly confusing) form to express the concept, 
then the comment does not aid comprehension of the code.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79719/new/

https://reviews.llvm.org/D79719





[PATCH] D82398: [MSAN] Handle x86 {round,min,max}sd intrinsics

2020-06-23 Thread Gui Andrade via Phabricator via cfe-commits
guiand created this revision.
guiand added a reviewer: eugenis.
Herald added subscribers: llvm-commits, cfe-commits, hiraditya.
Herald added projects: clang, LLVM.
guiand updated this revision to Diff 272776.
guiand added a comment.

Ran clang-format


These need special handling over the simple vector intrinsics as they behave 
more like a shuffle operation: taking the top half of the vector from one 
input, and the bottom half separately. Previously, these were being handled as 
though all bits of all operands were combined.
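
A small model of that difference, using plain arrays for the two-lane shadow instead of LLVM IR. It is only a sketch: `combinedShadow` is a simplified picture of the earlier "combine everything" treatment (an assumption for this example, not the exact code MSan ran), and `roundSdShadow` mirrors the shuffle-like rule for the round case, with the low lane taken from the second operand and the upper lane from the first. All names are hypothetical.

```cpp
#include <array>
#include <cstdint>

// Two 64-bit shadow lanes; a set bit means "uninitialized".
using Shadow2 = std::array<std::uint64_t, 2>;

// Simplified model of the earlier treatment: if any bit of any operand is
// poisoned, every lane of the result is treated as poisoned.
Shadow2 combinedShadow(const Shadow2 &first, const Shadow2 &second) {
  const std::uint64_t any = first[0] | first[1] | second[0] | second[1];
  const std::uint64_t lane = any ? ~std::uint64_t{0} : std::uint64_t{0};
  Shadow2 out;
  out.fill(lane);
  return out;
}

// Shuffle-like rule for the round case: the low lane of the result depends
// only on the low lane of the second operand, and the upper lane is copied
// from the first operand.
Shadow2 roundSdShadow(const Shadow2 &first, const Shadow2 &second) {
  Shadow2 out;
  out[0] = second[0];
  out[1] = first[1];
  return out;
}
```

With a first operand whose low lane is uninitialized and a second operand whose upper lane is uninitialized (the shape used by `RoundSD` in the test below), the shuffle-like rule yields a fully initialized result, while the combined model reports both lanes as poisoned.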


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D82398

Files:
  clang/test/CodeGen/msan-intrinsics.c
  llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp


Index: llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
===
--- llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
+++ llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
@@ -3054,6 +3054,32 @@
 SOC.Done();
   }
 
+  // Instrument _mm_*_sd intrinsics
+  void handleUnarySdIntrinsic(IntrinsicInst &I) {
+    IRBuilder<> IRB(&I);
+    Value *First = getShadow(&I, 0);
+    Value *Second = getShadow(&I, 1);
+    Value *LowShadow =
+        IRB.CreateExtractElement(Second, IRB.getInt32(0), "_lo_a");
+    Value *Shadow = IRB.CreateInsertElement(First, LowShadow, IRB.getInt32(0));
+
+    setShadow(&I, Shadow);
+    setOriginForNaryOp(I);
+  }
+
+  void handleBinarySdIntrinsic(IntrinsicInst &I) {
+    IRBuilder<> IRB(&I);
+    Value *First = getShadow(&I, 0);
+    Value *Second = getShadow(&I, 1);
+    Value *LowA = IRB.CreateExtractElement(First, IRB.getInt32(0), "_lo_a");
+    Value *LowB = IRB.CreateExtractElement(Second, IRB.getInt32(0), "_lo_b");
+    Value *LowShadow = IRB.CreateOr(LowA, LowB);
+    Value *Shadow = IRB.CreateInsertElement(Second, LowShadow, IRB.getInt32(0));
+
+    setShadow(&I, Shadow);
+    setOriginForNaryOp(I);
+  }
+
   void visitIntrinsicInst(IntrinsicInst &I) {
 switch (I.getIntrinsicID()) {
 case Intrinsic::lifetime_start:
@@ -3293,6 +3319,13 @@
   handlePclmulIntrinsic(I);
   break;
 
+case Intrinsic::x86_sse41_round_sd:
+  handleUnarySdIntrinsic(I);
+  break;
+case Intrinsic::x86_sse2_max_sd:
+case Intrinsic::x86_sse2_min_sd:
+  handleBinarySdIntrinsic(I);
+  break;
 case Intrinsic::is_constant:
   // The result of llvm.is.constant() is always defined.
   setShadow(, getCleanShadow());
Index: clang/test/CodeGen/msan-intrinsics.c
===
--- /dev/null
+++ clang/test/CodeGen/msan-intrinsics.c
@@ -0,0 +1,38 @@
+// RUN: %clang_cc1 -fsanitize=memory -triple x86_64-linux-gnu -emit-llvm %s -O3 -o - | FileCheck %s
+
+typedef double double2 __attribute__((vector_size(16)));
+
+__attribute__((target("sse4.1")))
+double2
+RoundSD(double theta, double top) {
+  // CHECK: [[IN_SHADOW1:%[0-9]+]] = load {{.*}} @__msan_param_tls
+  // CHECK: [[IN_SHADOW2:%[0-9]+]] = load {{.*}} @__msan_param_tls
+  double2 vec;
+  double2 unused;
+  unused[1] = top;
+  vec[0] = theta;
+  return __builtin_ia32_roundsd(unused, vec, 1);
+  // CHECK: [[TMP_SHADOW:%.+]] = insertelement <2 x i64> undef, i64 [[IN_SHADOW2]], i32 1
+  // CHECK: [[OUT_SHADOW:%.+]] = insertelement <2 x i64> [[TMP_SHADOW]], i64 [[IN_SHADOW1]], i32 0
+
+  // CHECK-NOT: call void @msan_warning
+  // CHECK: call <2 x double> @llvm.x86.sse41.round.sd
+  // CHECK: store <2 x i64> [[OUT_SHADOW]], {{.*}} @__msan_retval_tls
+  // CHECK: ret <2 x double>
+}
+
+__attribute__((target("sse2"))) double MinSD(double t1, double t2) {
+  // CHECK: [[IN_SHADOW1:%[0-9]+]] = load {{.*}} @__msan_param_tls
+  // CHECK: [[IN_SHADOW2:%[0-9]+]] = load {{.*}} @__msan_param_tls
+  // CHECK: [[COMB_SHADOW:%[0-9]+]] = or i64 [[IN_SHADOW2]], [[IN_SHADOW1]]
+  double2 first;
+  double2 second;
+  first[0] = t1;
+  second[0] = t2;
+  double min = __builtin_ia32_minsd(first, second)[0];
+  // CHECK-NOT: call void @msan_warning
+  // CHECK: call <2 x double> @llvm.x86.sse2.min.sd
+  return min;
+  // CHECK: store i64 [[COMB_SHADOW]], {{.*}} @__msan_retval_tls
+  // CHECK: ret double
+}
\ No newline at end of file



[PATCH] D82298: [AArch64][SVE] Add bfloat16 support to load intrinsics

2020-06-23 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli accepted this revision.
fpetrogalli added a comment.
This revision is now accepted and ready to land.

LGTM, thanks!


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82298/new/

https://reviews.llvm.org/D82298





[PATCH] D81736: [openmp] Base of tablegen generated OpenMP common declaration

2020-06-23 Thread Valentin Clement via Phabricator via cfe-commits
clementval marked 3 inline comments as done.
clementval added inline comments.



Comment at: llvm/include/llvm/Frontend/OpenMP/CMakeLists.txt:2
+set(LLVM_TARGET_DEFINITIONS OMP.td)
+tablegen(LLVM OMP.h.inc --gen-directive-decls)
+add_public_tablegen_target(omp_gen)

thakis wrote:
> jdoerfert wrote:
> > clementval wrote:
> > > thakis wrote:
> > > > All other tblgen outputs are called .inc, not .h.inc. Any reason this 
> > > > one's different?
> > > There is a `.cpp.inc` coming in a following patch. 
> > @clementval ^ 
> ...why would you want to include a cpp file?
> 
> If it's for definitions of generated functions, I think the usual pattern is 
> to put that in the .inc too behind a define and define that in one cpp file 
> that includes the .inc. (Examples: GET_DAGISEL_BODY, GET_INSTRINFO_MC_DESC, 
> PRINT_ALIAS_INSTR etc -- `rg -B5 '#include.*\.inc' clang llvm` shows many 
> examples).
Yeah, this was the idea. I was following the same pattern as MLIR's use of 
TableGen. I'm happy with a single `.inc` file guarded by a define. If you don't 
mind, I'll update the filename in the next patch since this one has already 
landed.

FYI MLIR TableGen .cpp.inc -> 
https://github.com/llvm/llvm-project/blob/master/mlir/lib/Dialect/GPU/IR/GPUDialect.cpp#L42


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81736/new/

https://reviews.llvm.org/D81736





[PATCH] D81672: [Driver] When forcing a crash print the bug report message

2020-06-23 Thread Fangrui Song via Phabricator via cfe-commits
MaskRay added inline comments.



Comment at: clang/tools/driver/driver.cpp:515
+
+  llvm::dbgs() << llvm::getBugReportMsg();
 }

Why `llvm::dbgs() << llvm::getBugReportMsg();` when -gen-reproducer is 
specified? The user asked for a reproducer file to be generated, but that does 
not suggest that clang has a bug. Dumping a URL is not very appropriate.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81672/new/

https://reviews.llvm.org/D81672





[PATCH] D79796: Sketch support for generating CC1 command line from CompilerInvocation

2020-06-23 Thread Michael Spencer via Phabricator via cfe-commits
Bigcheese accepted this revision.
Bigcheese added a comment.
This revision is now accepted and ready to land.

LGTM with `KeyPathPrefix` moved to the patch that actually uses it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79796/new/

https://reviews.llvm.org/D79796





[PATCH] D81736: [openmp] Base of tablegen generated OpenMP common declaration

2020-06-23 Thread Nico Weber via Phabricator via cfe-commits
thakis added inline comments.



Comment at: llvm/include/llvm/Frontend/OpenMP/CMakeLists.txt:2
+set(LLVM_TARGET_DEFINITIONS OMP.td)
+tablegen(LLVM OMP.h.inc --gen-directive-decls)
+add_public_tablegen_target(omp_gen)

jdoerfert wrote:
> clementval wrote:
> > thakis wrote:
> > > All other tblgen outputs are called .inc, not .h.inc. Any reason this 
> > > one's different?
> > There is a `.cpp.inc` coming in a following patch. 
> @clementval ^ 
...why would you want to include a cpp file?

If it's for definitions of generated functions, I think the usual pattern is to 
put that in the .inc too behind a define and define that in one cpp file that 
includes the .inc. (Examples: GET_DAGISEL_BODY, GET_INSTRINFO_MC_DESC, 
PRINT_ALIAS_INSTR etc -- `rg -B5 '#include.*\.inc' clang llvm` shows many 
examples).
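
A minimal sketch of the guarded-section pattern described above, assuming hypothetical guard names (`GEN_DIRECTIVE_DECLS`, `GEN_DIRECTIVE_IMPL`) and a hypothetical `Directive` enum; none of these come from the patch. In a real tree both guarded regions would sit in one generated `.inc` file and each consumer would `#define` its guard before the `#include`; they are inlined here only so the sketch compiles as a single file.

```cpp
#include <string>

// Consumer A (for instance a public header) asks for declarations only.
#define GEN_DIRECTIVE_DECLS // hypothetical guard name
// `#include "OMP.inc"` would go here; the region it selects is inlined below.
#ifdef GEN_DIRECTIVE_DECLS
#undef GEN_DIRECTIVE_DECLS
enum class Directive { Parallel, For, Unknown }; // hypothetical enum
std::string getDirectiveName(Directive D);       // hypothetical function
#endif

// Consumer B (exactly one .cpp file) asks for the function bodies.
#define GEN_DIRECTIVE_IMPL // hypothetical guard name
// The same `#include "OMP.inc"` would go here, selecting this region instead.
#ifdef GEN_DIRECTIVE_IMPL
#undef GEN_DIRECTIVE_IMPL
std::string getDirectiveName(Directive D) {
  switch (D) {
  case Directive::Parallel:
    return "parallel";
  case Directive::For:
    return "for";
  case Directive::Unknown:
    break;
  }
  return "unknown";
}
#endif
```

Because each region undefines its own guard, a consumer that forgets the `#define` simply gets nothing from the include, and the definitions end up in exactly one translation unit.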


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81736/new/

https://reviews.llvm.org/D81736





[PATCH] D78655: [CUDA][HIP] Let non-caputuring lambda be host device

2020-06-23 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment.

In D78655#2108047, @pfultz2 wrote:

> > Could you give an example to demonstrate current use and how it will break?
>
> Here is place where it would break:
>
> https://github.com/ROCmSoftwarePlatform/AMDMIGraphX/blob/develop/src/targets/gpu/device/include/migraphx/gpu/device/multi_index.hpp#L129
>
> This change was already included in a fork of llvm in rocm 3.5 and 3.6 
> releases which is why this compiles. This also compiles using the hcc-based 
> hip compilers which is what previous rocm versions used. It would be best if 
> this can be upstreamed, so we dont have to hold on to these extra changes in 
> a fork.


It may be OK to require updated software in order to switch to a new compiler. 
E.g. it would be unreasonable for clang to compile all existing HCC code. Nor 
did we promise to compile all existing CUDA code when it was at the point in 
time where HIP is now -- new compiler emerging in an ecosystem with existing 
code which compiles and works fine with the incumbent compiler, but needs some 
tweaks to compile/work with clang. There will be some back and forth before we 
reach the equilibrium where most things compile and work.

You may need to make some portability tweaks to your code to make it work with 
upstream and internal clang versions + hcc. This is roughly what's been done to 
existing CUDA code -- pretty much all major libraries that use CUDA 
(tensorflow, Thrust, cutlass, cub, pytorch) needed minor tweaks to make them 
portable to clang.

Now, back to the specifics of your example. I'm still not 100% sure I 
understand what the problem is. Can you boil down the use case to an example on 
godbolt? Not just the lambda itself, but also the way it's intended to be used. 
It does not need to compile; I just need it to understand your use case and the 
problem.
I can imagine passing a lambda type as a template parameter, which would make it 
hard to predict/control where/how it will finally be instantiated or used, but 
it would be great to have a practical example.

> Part of the motivation for this change was that it wasn't always clear in 
> code where the `__device__` attribute is needed with lambdas sometimes. It 
> also makes it more consistent with `constexpr` lambdas and hcc-based hip 
> compiler. Including this for capturing lambdas will make this simpler and 
> easier to understand.
> 
> If there are concerns about making it default for capturing lambdas, then can 
> we at least just have a flag to enable this for capturing lambdas?

I've just pointed out that the assumption that having a capture implies having 
an enclosing function is invalid. We've already decided to proceed with promotion 
of all lambdas in general, so it's mostly a matter of taking care of 
implementation details.
Dealing with capturing lambdas in a separate patch is one option. IMO it makes 
sense in general, as capturing lambdas do have their own distinct quirks, while 
promotion of non-capturing lambdas is relatively uncontroversial.
If Sam decides to incorporate support for capturing lambdas in this patch, we 
could still do it by restricting the capturing lambda promotion to the ones 
within a function scope only. I.e. lambdas created in global scope would still 
be host.
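
To illustrate the point about captures and enclosing functions, here is a small sketch in plain C++14. The `__host__`/`__device__` attributes are deliberately left out so it builds with any host compiler, and the names `scaleByThree` and `scaleLocally` are made up for the example. It only shows that a closure with a capture can exist at namespace scope; it says nothing about how the patch treats either case.

```cpp
// A capturing lambda at namespace scope: there is no enclosing function,
// yet the closure carries state through an init-capture.
auto scaleByThree = [factor = 3](int x) { return factor * x; };

// A capturing lambda inside a function: the capture refers to a local
// variable of the enclosing function.
int scaleLocally(int x) {
  int factor = 3;
  auto f = [factor](int v) { return factor * v; };
  return f(x);
}
```

A capture-default such as `[=]` is not allowed at namespace scope, but an init-capture is, which is why "has a capture" cannot be used as a stand-in for "is inside a function".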


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78655/new/

https://reviews.llvm.org/D78655





[PATCH] D82298: [AArch64][SVE] Add bfloat16 support to load intrinsics

2020-06-23 Thread Kerry McLaughlin via Phabricator via cfe-commits
kmclaughlin updated this revision to Diff 272759.
kmclaughlin added a comment.

- Moved bfloat tests into separate files
- Added checks to the bfloat test files which test the warnings given when 
ARM_FEATURE_SVE_BF16 is omitted in the RUN line


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82298/new/

https://reviews.llvm.org/D82298

Files:
  clang/include/clang/Basic/arm_sve.td
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_ld1-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_ld1rq-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_ldff1-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_ldnf1-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_ldnt1-bfloat.c
  llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
  llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
  llvm/test/CodeGen/AArch64/sve-intrinsics-ld1-addressing-mode-reg-imm.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-ld1-addressing-mode-reg-reg.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-ld1.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-loads-ff.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-loads-nf.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-loads.ll
  llvm/test/CodeGen/AArch64/sve-masked-ldst-nonext.ll

Index: llvm/test/CodeGen/AArch64/sve-masked-ldst-nonext.ll
===
--- llvm/test/CodeGen/AArch64/sve-masked-ldst-nonext.ll
+++ llvm/test/CodeGen/AArch64/sve-masked-ldst-nonext.ll
@@ -87,6 +87,14 @@
   ret  %load
 }
 
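+define <vscale x 8 x bfloat> @masked_load_nxv8bf16(<vscale x 8 x bfloat> *%a, <vscale x 8 x i1> %mask) nounwind {
+; CHECK-LABEL: masked_load_nxv8bf16:
+; CHECK-NEXT: ld1h { z0.h }, p0/z, [x0]
+; CHECK-NEXT: ret
+  %load = call <vscale x 8 x bfloat> @llvm.masked.load.nxv8bf16(<vscale x 8 x bfloat> *%a, i32 2, <vscale x 8 x i1> %mask, <vscale x 8 x bfloat> undef)
+  ret <vscale x 8 x bfloat> %load
+}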
+
 ;
 ; Masked Stores
 ;
@@ -182,6 +190,7 @@
 declare  @llvm.masked.load.nxv4f32(*, i32, , )
 declare  @llvm.masked.load.nxv4f16(*, i32, , )
 declare  @llvm.masked.load.nxv8f16(*, i32, , )
+declare <vscale x 8 x bfloat> @llvm.masked.load.nxv8bf16(<vscale x 8 x bfloat>*, i32, <vscale x 8 x i1>, <vscale x 8 x bfloat>)
 
 declare void @llvm.masked.store.nxv2i64(, *, i32, )
 declare void @llvm.masked.store.nxv4i32(, *, i32, )
Index: llvm/test/CodeGen/AArch64/sve-intrinsics-loads.ll
===
--- llvm/test/CodeGen/AArch64/sve-intrinsics-loads.ll
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-loads.ll
@@ -97,6 +97,23 @@
   ret  %res
 }
 
+define <vscale x 8 x bfloat> @ld1rqh_bf16(<vscale x 8 x i1> %pred, bfloat* %addr) {
+; CHECK-LABEL: ld1rqh_bf16:
+; CHECK: ld1rqh { z0.h }, p0/z, [x0]
+; CHECK-NEXT: ret
+  %res = call <vscale x 8 x bfloat> @llvm.aarch64.sve.ld1rq.nxv8bf16(<vscale x 8 x i1> %pred, bfloat* %addr)
+  ret <vscale x 8 x bfloat> %res
+}
+
+define <vscale x 8 x bfloat> @ld1rqh_bf16_imm(<vscale x 8 x i1> %pred, bfloat* %addr) {
+; CHECK-LABEL: ld1rqh_bf16_imm:
+; CHECK: ld1rqh { z0.h }, p0/z, [x0, #-16]
+; CHECK-NEXT: ret
+  %ptr = getelementptr inbounds bfloat, bfloat* %addr, i16 -8
+  %res = call <vscale x 8 x bfloat> @llvm.aarch64.sve.ld1rq.nxv8bf16(<vscale x 8 x i1> %pred, bfloat* %ptr)
+  ret <vscale x 8 x bfloat> %res
+}
+
 ;
 ; LD1RQW
 ;
@@ -208,6 +225,15 @@
   ret  %res
 }
 
+define <vscale x 8 x bfloat> @ldnt1h_bf16(<vscale x 8 x i1> %pred, bfloat* %addr) {
+; CHECK-LABEL: ldnt1h_bf16:
+; CHECK: ldnt1h { z0.h }, p0/z, [x0]
+; CHECK-NEXT: ret
+  %res = call <vscale x 8 x bfloat> @llvm.aarch64.sve.ldnt1.nxv8bf16(<vscale x 8 x i1> %pred,
+ bfloat* %addr)
+  ret <vscale x 8 x bfloat> %res
+}
+
 ;
 ; LDNT1W
 ;
@@ -498,6 +524,7 @@
 declare  @llvm.aarch64.sve.ld1rq.nxv4i32(, i32*)
 declare  @llvm.aarch64.sve.ld1rq.nxv2i64(, i64*)
 declare  @llvm.aarch64.sve.ld1rq.nxv8f16(, half*)
+declare <vscale x 8 x bfloat> @llvm.aarch64.sve.ld1rq.nxv8bf16(<vscale x 8 x i1>, bfloat*)
 declare  @llvm.aarch64.sve.ld1rq.nxv4f32(, float*)
 declare  @llvm.aarch64.sve.ld1rq.nxv2f64(, double*)
 
@@ -506,6 +533,7 @@
 declare  @llvm.aarch64.sve.ldnt1.nxv4i32(, i32*)
 declare  @llvm.aarch64.sve.ldnt1.nxv2i64(, i64*)
 declare  @llvm.aarch64.sve.ldnt1.nxv8f16(, half*)
+declare <vscale x 8 x bfloat> @llvm.aarch64.sve.ldnt1.nxv8bf16(<vscale x 8 x i1>, bfloat*)
 declare  @llvm.aarch64.sve.ldnt1.nxv4f32(, float*)
 declare  @llvm.aarch64.sve.ldnt1.nxv2f64(, double*)
 
Index: llvm/test/CodeGen/AArch64/sve-intrinsics-loads-nf.ll
===
--- llvm/test/CodeGen/AArch64/sve-intrinsics-loads-nf.ll
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-loads-nf.ll
@@ -140,6 +140,14 @@
   ret  %load
 }
 
+define <vscale x 8 x bfloat> @ldnf1h_bf16(<vscale x 8 x i1> %pg, bfloat* %a) {
+; CHECK-LABEL: ldnf1h_bf16:
+; CHECK: ldnf1h { z0.h }, p0/z, [x0]
+; CHECK-NEXT: ret
+  %load = call <vscale x 8 x bfloat> @llvm.aarch64.sve.ldnf1.nxv8bf16(<vscale x 8 x i1> %pg, bfloat* %a)
+  ret <vscale x 8 x bfloat> %load
+}
+
 define  @ldnf1h_f16_inbound( %pg, half* %a) {
 ; CHECK-LABEL: ldnf1h_f16_inbound:
 ; CHECK: ldnf1h { z0.h }, p0/z, [x0, #1, mul vl]
@@ -151,6 +159,17 @@
   ret  %load
 }
 
+define <vscale x 8 x bfloat> @ldnf1h_bf16_inbound(<vscale x 8 x i1> %pg, bfloat* %a) {
+; CHECK-LABEL: ldnf1h_bf16_inbound:
+; CHECK: ldnf1h { z0.h }, p0/z, [x0, #1, mul vl]
+; CHECK-NEXT: ret
+  %base_scalable = bitcast bfloat* %a to <vscale x 8 x bfloat>*
+  %base = getelementptr <vscale x 8 x bfloat>, <vscale x 8 x bfloat>* %base_scalable, i64 1
+  %base_scalar = bitcast <vscale x 8 x bfloat>* %base to bfloat*
+  %load = call <vscale x 8 x bfloat> @llvm.aarch64.sve.ldnf1.nxv8bf16(<vscale x 8 x i1> %pg, bfloat* %base_scalar)
+  ret <vscale x 8 x bfloat> %load
+}
+

[PATCH] D82392: [CodeGen] Add public function to emit C++ destructor call.

2020-06-23 Thread Zoe Carver via Phabricator via cfe-commits
zoecarver created this revision.
zoecarver added reviewers: rjmccall, mboehme.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Adds `CodeGen::emitCXXDestructorCall`, a function that creates a 
CodeGenFunction using the arguments provided, then invokes 
CodeGenFunction::EmitCXXDestructorCall.

This will allow other frontends (Swift, for example) to easily emit calls to 
object destructors with correct ABI semantics and calling conventions.

This is needed for Swift C++ interop. Here's the corresponding Swift change: 
https://github.com/apple/swift/pull/32291


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D82392

Files:
  clang/include/clang/CodeGen/CodeGenABITypes.h
  clang/lib/CodeGen/ABIInfo.h
  clang/lib/CodeGen/CGCall.cpp
  clang/lib/CodeGen/CodeGenABITypes.cpp

Index: clang/lib/CodeGen/CodeGenABITypes.cpp
===
--- clang/lib/CodeGen/CodeGenABITypes.cpp
+++ clang/lib/CodeGen/CodeGenABITypes.cpp
@@ -115,3 +115,20 @@
  const FieldDecl *FD) {
   return CGM.getTypes().getCGRecordLayout(RD).getLLVMFieldNo(FD);
 }
+
+void CodeGen::emitCXXDestructorCall(CodeGenModule &CGM,
+llvm::BasicBlock *InsertBlock,
+llvm::BasicBlock::iterator InsertPoint,
+const CXXDestructorDecl *D,
+CXXDtorType Type, bool ForVirtualBase,
+bool Delegating, llvm::Value *This,
+CharUnits ThisAlign, QualType ThisTy) {
+  Address ThisAddr(This, ThisAlign);
+  CodeGenFunction CGF(CGM);
+  CGF.CurCodeDecl = D;
+  CGF.CurFuncDecl = D;
+  CGF.CurFn = InsertBlock->getParent();
+  CGF.Builder.SetInsertPoint(InsertBlock, InsertPoint);
+  CGF.EmitCXXDestructorCall(D, Type, ForVirtualBase, Delegating, ThisAddr,
+ThisTy);
+}
Index: clang/lib/CodeGen/CGCall.cpp
===
--- clang/lib/CodeGen/CGCall.cpp
+++ clang/lib/CodeGen/CGCall.cpp
@@ -1462,9 +1462,10 @@
 case ABIArgInfo::Extend:
 case ABIArgInfo::Direct: {
   // FIXME: handle sseregparm someday...
-  llvm::StructType *STy = dyn_cast<llvm::StructType>(AI.getCoerceToType());
-  if (AI.isDirect() && AI.getCanBeFlattened() && STy) {
-IRArgs.NumberOfArgs = STy->getNumElements();
+  llvm::Type *Ty = AI.getCoerceToType();
+  if (AI.isDirect() && AI.getCanBeFlattened() &&
+  isa<llvm::StructType>(Ty)) {
+IRArgs.NumberOfArgs = cast<llvm::StructType>(Ty)->getNumElements();
   } else {
 IRArgs.NumberOfArgs = 1;
   }
@@ -1644,8 +1645,9 @@
   // Fast-isel and the optimizer generally like scalar values better than
   // FCAs, so we flatten them if this is safe to do for this argument.
   llvm::Type *argType = ArgInfo.getCoerceToType();
-  llvm::StructType *st = dyn_cast<llvm::StructType>(argType);
-  if (st && ArgInfo.isDirect() && ArgInfo.getCanBeFlattened()) {
+  if (ArgInfo.isDirect() && ArgInfo.getCanBeFlattened() &&
+  isa<llvm::StructType>(argType)) {
+llvm::StructType *st = cast<llvm::StructType>(argType);
 assert(NumIRArgs == st->getNumElements());
 for (unsigned i = 0, e = st->getNumElements(); i != e; ++i)
   ArgTypes[FirstIRArg + i] = st->getElementType(i);
Index: clang/lib/CodeGen/ABIInfo.h
===
--- clang/lib/CodeGen/ABIInfo.h
+++ clang/lib/CodeGen/ABIInfo.h
@@ -28,7 +28,6 @@
 
 namespace CodeGen {
   class ABIArgInfo;
-  class Address;
   class CGCXXABI;
   class CGFunctionInfo;
   class CodeGenFunction;
Index: clang/include/clang/CodeGen/CodeGenABITypes.h
===
--- clang/include/clang/CodeGen/CodeGenABITypes.h
+++ clang/include/clang/CodeGen/CodeGenABITypes.h
@@ -25,7 +25,9 @@
 
 #include "clang/AST/CanonicalType.h"
 #include "clang/AST/Type.h"
+#include "clang/Basic/ABI.h"
 #include "clang/CodeGen/CGFunctionInfo.h"
+#include "llvm/IR/BasicBlock.h"
 
 namespace llvm {
 class AttrBuilder;
@@ -40,6 +42,7 @@
 namespace clang {
 class ASTContext;
 class CXXConstructorDecl;
+class CXXDestructorDecl;
 class CXXRecordDecl;
 class CXXMethodDecl;
 class CodeGenOptions;
@@ -49,6 +52,7 @@
 class ObjCMethodDecl;
 class ObjCProtocolDecl;
 class PreprocessorOptions;
+class QualType;
 
 namespace CodeGen {
 class CGFunctionInfo;
@@ -90,6 +94,13 @@
 ImplicitCXXConstructorArgs
 getImplicitCXXConstructorArgs(CodeGenModule &CGM, const CXXConstructorDecl *D);
 
+void emitCXXDestructorCall(CodeGenModule &CGM, llvm::BasicBlock *InsertBlock,
+   llvm::BasicBlock::iterator InsertPoint,
+   const CXXDestructorDecl *D, CXXDtorType Type,
+   bool ForVirtualBase, bool Delegating,
+   llvm::Value *This, CharUnits ThisAlign,
+   QualType 

[PATCH] D82391: [AArch64][SVE] Add bfloat16 support to svext intrinsic

2020-06-23 Thread Cullen Rhodes via Phabricator via cfe-commits
c-rhodes created this revision.
c-rhodes added reviewers: sdesmalen, kmclaughlin, efriedma, david-arm, 
fpetrogalli.
Herald added subscribers: danielkiss, psnobl, rkruppe, hiraditya, 
kristof.beyls, tschuett.
Herald added projects: clang, LLVM.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D82391

Files:
  clang/include/clang/Basic/arm_sve.td
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_ext-bfloat.c
  clang/utils/TableGen/SveEmitter.cpp
  llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
  llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll

Index: llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
===
--- llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
@@ -516,6 +516,16 @@
   ret  %out
 }
 
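+define <vscale x 8 x bfloat> @ext_bf16(<vscale x 8 x bfloat> %a, <vscale x 8 x bfloat> %b) {
+; CHECK-LABEL: ext_bf16:
+; CHECK: ext z0.b, z0.b, z1.b, #6
+; CHECK-NEXT: ret
+  %out = call <vscale x 8 x bfloat> @llvm.aarch64.sve.ext.nxv8bf16(<vscale x 8 x bfloat> %a,
+<vscale x 8 x bfloat> %b,
+   i32 3)
+  ret <vscale x 8 x bfloat> %out
+}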
+
 define  @ext_f16( %a,  %b) {
 ; CHECK-LABEL: ext_f16:
 ; CHECK: ext z0.b, z0.b, z1.b, #6
@@ -1876,6 +1886,7 @@
 declare  @llvm.aarch64.sve.ext.nxv8i16(, , i32)
 declare  @llvm.aarch64.sve.ext.nxv4i32(, , i32)
 declare  @llvm.aarch64.sve.ext.nxv2i64(, , i32)
+declare <vscale x 8 x bfloat> @llvm.aarch64.sve.ext.nxv8bf16(<vscale x 8 x bfloat>, <vscale x 8 x bfloat>, i32)
 declare  @llvm.aarch64.sve.ext.nxv8f16(, , i32)
 declare  @llvm.aarch64.sve.ext.nxv4f32(, , i32)
 declare  @llvm.aarch64.sve.ext.nxv2f64(, , i32)
Index: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
===
--- llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
+++ llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
@@ -1405,12 +1405,13 @@
   // constraint that none of the bits change when stored to memory as one
   // type, and and reloaded as another type.
   let Predicates = [IsLE] in {
-def : Pat<(nxv16i8 (bitconvert (nxv8i16 ZPR:$src))), (nxv16i8 ZPR:$src)>;
-def : Pat<(nxv16i8 (bitconvert (nxv4i32 ZPR:$src))), (nxv16i8 ZPR:$src)>;
-def : Pat<(nxv16i8 (bitconvert (nxv2i64 ZPR:$src))), (nxv16i8 ZPR:$src)>;
-def : Pat<(nxv16i8 (bitconvert (nxv8f16 ZPR:$src))), (nxv16i8 ZPR:$src)>;
-def : Pat<(nxv16i8 (bitconvert (nxv4f32 ZPR:$src))), (nxv16i8 ZPR:$src)>;
-def : Pat<(nxv16i8 (bitconvert (nxv2f64 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+def : Pat<(nxv16i8 (bitconvert (nxv8i16  ZPR:$src))), (nxv16i8 ZPR:$src)>;
+def : Pat<(nxv16i8 (bitconvert (nxv4i32  ZPR:$src))), (nxv16i8 ZPR:$src)>;
+def : Pat<(nxv16i8 (bitconvert (nxv2i64  ZPR:$src))), (nxv16i8 ZPR:$src)>;
+def : Pat<(nxv16i8 (bitconvert (nxv8f16  ZPR:$src))), (nxv16i8 ZPR:$src)>;
+def : Pat<(nxv16i8 (bitconvert (nxv4f32  ZPR:$src))), (nxv16i8 ZPR:$src)>;
+def : Pat<(nxv16i8 (bitconvert (nxv2f64  ZPR:$src))), (nxv16i8 ZPR:$src)>;
+def : Pat<(nxv16i8 (bitconvert (nxv8bf16 ZPR:$src))), (nxv16i8 ZPR:$src)>;
 
 def : Pat<(nxv8i16 (bitconvert (nxv16i8 ZPR:$src))), (nxv8i16 ZPR:$src)>;
 def : Pat<(nxv8i16 (bitconvert (nxv4i32 ZPR:$src))), (nxv8i16 ZPR:$src)>;
@@ -1435,7 +1436,6 @@
 
 def : Pat<(nxv8f16 (bitconvert (nxv16i8 ZPR:$src))), (nxv8f16 ZPR:$src)>;
 def : Pat<(nxv8f16 (bitconvert (nxv8i16 ZPR:$src))), (nxv8f16 ZPR:$src)>;
-def : Pat<(nxv8bf16 (bitconvert (nxv8i16 ZPR:$src))), (nxv8bf16 ZPR:$src)>;
 def : Pat<(nxv8f16 (bitconvert (nxv4i32 ZPR:$src))), (nxv8f16 ZPR:$src)>;
 def : Pat<(nxv8f16 (bitconvert (nxv2i64 ZPR:$src))), (nxv8f16 ZPR:$src)>;
 def : Pat<(nxv8f16 (bitconvert (nxv4f32 ZPR:$src))), (nxv8f16 ZPR:$src)>;
@@ -1454,6 +1454,9 @@
 def : Pat<(nxv2f64 (bitconvert (nxv2i64 ZPR:$src))), (nxv2f64 ZPR:$src)>;
 def : Pat<(nxv2f64 (bitconvert (nxv8f16 ZPR:$src))), (nxv2f64 ZPR:$src)>;
 def : Pat<(nxv2f64 (bitconvert (nxv4f32 ZPR:$src))), (nxv2f64 ZPR:$src)>;
+
+def : Pat<(nxv8bf16 (bitconvert (nxv16i8 ZPR:$src))), (nxv8bf16 ZPR:$src)>;
+def : Pat<(nxv8bf16 (bitconvert (nxv8i16 ZPR:$src))), (nxv8bf16 ZPR:$src)>;
   }
 
   def : Pat<(nxv16i1 (reinterpret_cast (nxv16i1 PPR:$src))), (COPY_TO_REGCLASS PPR:$src, PPR)>;
Index: clang/utils/TableGen/SveEmitter.cpp
===
--- clang/utils/TableGen/SveEmitter.cpp
+++ clang/utils/TableGen/SveEmitter.cpp
@@ -596,6 +596,7 @@
   case 'i':
 Predicate = false;
 Float = false;
+BFloat = false;
 ElementBitwidth = Bitwidth = 64;
 NumVectors = 0;
 Signed = false;
Index: clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_ext-bfloat.c
===
--- /dev/null
+++ clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_ext-bfloat.c
@@ -0,0 +1,26 @@
+// REQUIRES: aarch64-registered-target
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_BF16_SCALAR_ARITHMETIC 

[PATCH] D82314: [RFC][Coroutines] Optimize the lifespan of temporary co_await object

2020-06-23 Thread Xun Li via Phabricator via cfe-commits
lxfind added a comment.

In D82314#2107910, @junparser wrote:

> Rather than doing it here, can we build await_resume call expression with 
> MaterializedTemporaryExpr when expand the coawait expression. That's how gcc 
> does.


There doesn't appear to be a way to do that in Clang. It goes from the AST to 
IR directly, and there needs to be a MaterializedTemporaryExpr to wrap the 
result of co_await. Could you elaborate on how this might be done in Clang?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82314/new/

https://reviews.llvm.org/D82314




