[clang] 7063ac1 - [HIP] Allow target addr space in target builtins

2021-08-19 Thread Anshil Gandhi via cfe-commits

Author: Anshil Gandhi
Date: 2021-08-19T23:51:58-06:00
New Revision: 7063ac1afa656bdbb851c8ef120ff699c2e98483

URL: https://github.com/llvm/llvm-project/commit/7063ac1afa656bdbb851c8ef120ff699c2e98483
DIFF: https://github.com/llvm/llvm-project/commit/7063ac1afa656bdbb851c8ef120ff699c2e98483.diff

LOG: [HIP] Allow target addr space in target builtins

This patch allows target specific addr space in target builtins for HIP. It
inserts implicit addr space cast for non-generic pointer to generic pointer in
general, and inserts implicit addr space cast for generic to non-generic for
target builtin arguments only.

It is NFC for non-HIP languages.
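
The two rules in the log above can be sketched outside the compiler. The following is a minimal illustration in plain Python, not Clang's API: the address space names mirror the LangAS enumerators that appear in this patch, while the helper names, the string encoding, and the fallback label are assumptions made only for this example.

```python
# Sketch only: names mirror the LangAS enumerators used in this patch,
# but this is plain Python, not Clang's API.
AMDGPU_BUILTIN_ADDRESS_SPACES = {
    0: "Default",
    1: "cuda_device",
    3: "cuda_shared",
    4: "cuda_constant",
}

CUDA_ADDRESS_SPACES = {"cuda_device", "cuda_shared", "cuda_constant"}


def builtin_address_space(target_as):
    """Map a target address space number to a language address space name."""
    # Hypothetical fallback label; the patch calls getLangASFromTargetAS(AS) here.
    return AMDGPU_BUILTIN_ADDRESS_SPACES.get(target_as, "target_as_%d" % target_as)


def implicit_cast_allowed(arg_as, param_as, is_target_builtin_arg):
    """Decide whether an implicit address space cast is permitted (HIP device code)."""
    if arg_as == param_as:
        return True  # same address space, no cast needed
    if param_as == "Default" and arg_as in CUDA_ADDRESS_SPACES:
        return True  # non-generic to generic: allowed in general
    if arg_as == "Default" and param_as != "Default":
        return is_target_builtin_arg  # generic to non-generic: builtin arguments only
    return False
```

For example, a generic pointer passed where a `cuda_shared` pointer is expected is accepted only when the callee is a target builtin.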

Differential Revision: https://reviews.llvm.org/D102405

Added: 


Modified: 
clang/include/clang/AST/Type.h
clang/lib/Basic/Targets/AMDGPU.h
clang/lib/Sema/SemaExpr.cpp
clang/test/CodeGenCUDA/builtins-amdgcn.cu

Removed: 




diff  --git a/clang/include/clang/AST/Type.h b/clang/include/clang/AST/Type.h
index 09e9705bd86b8..fc83c895afa2e 100644
--- a/clang/include/clang/AST/Type.h
+++ b/clang/include/clang/AST/Type.h
@@ -495,7 +495,12 @@ class Qualifiers {
(A == LangAS::Default &&
 (B == LangAS::sycl_private || B == LangAS::sycl_local ||
  B == LangAS::sycl_global || B == LangAS::sycl_global_device ||
- B == LangAS::sycl_global_host));
+ B == LangAS::sycl_global_host)) ||
+   // In HIP device compilation, any cuda address space is allowed
+   // to implicitly cast into the default address space.
+   (A == LangAS::Default &&
+(B == LangAS::cuda_constant || B == LangAS::cuda_device ||
+ B == LangAS::cuda_shared));
   }
 
   /// Returns true if the address space in these qualifiers is equal to or

diff  --git a/clang/lib/Basic/Targets/AMDGPU.h b/clang/lib/Basic/Targets/AMDGPU.h
index 2e580ecf24259..77c2c5fd50145 100644
--- a/clang/lib/Basic/Targets/AMDGPU.h
+++ b/clang/lib/Basic/Targets/AMDGPU.h
@@ -352,7 +352,18 @@ class LLVM_LIBRARY_VISIBILITY AMDGPUTargetInfo final : public TargetInfo {
   }
 
   LangAS getCUDABuiltinAddressSpace(unsigned AS) const override {
-return LangAS::Default;
+switch (AS) {
+case 0:
+  return LangAS::Default;
+case 1:
+  return LangAS::cuda_device;
+case 3:
+  return LangAS::cuda_shared;
+case 4:
+  return LangAS::cuda_constant;
+default:
+  return getLangASFromTargetAS(AS);
+}
   }
 
   llvm::Optional<LangAS> getConstantAddressSpace() const override {

diff  --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp
index 8ef4a9d96320b..5bde87d02877e 100644
--- a/clang/lib/Sema/SemaExpr.cpp
+++ b/clang/lib/Sema/SemaExpr.cpp
@@ -6572,6 +6572,53 @@ ExprResult Sema::BuildCallExpr(Scope *Scope, Expr *Fn, SourceLocation LParenLoc,
   return ExprError();
 
 checkDirectCallValidity(*this, Fn, FD, ArgExprs);
+
+// If this expression is a call to a builtin function in HIP device
+// compilation, allow a pointer-type argument to default address space to be
+// passed as a pointer-type parameter to a non-default address space.
+// If Arg is declared in the default address space and Param is declared
+// in a non-default address space, perform an implicit address space cast to
+// the parameter type.
+if (getLangOpts().HIP && getLangOpts().CUDAIsDevice && FD &&
+FD->getBuiltinID()) {
+  for (unsigned Idx = 0; Idx < FD->param_size(); ++Idx) {
+ParmVarDecl *Param = FD->getParamDecl(Idx);
+if (!ArgExprs[Idx] || !Param || !Param->getType()->isPointerType() ||
+!ArgExprs[Idx]->getType()->isPointerType())
+  continue;
+
+auto ParamAS = Param->getType()->getPointeeType().getAddressSpace();
+auto ArgTy = ArgExprs[Idx]->getType();
+auto ArgPtTy = ArgTy->getPointeeType();
+auto ArgAS = ArgPtTy.getAddressSpace();
+
+// Only allow implicit casting from a non-default address space pointee
+// type to a default address space pointee type
+if (ArgAS != LangAS::Default || ParamAS == LangAS::Default)
+  continue;
+
+// First, ensure that the Arg is an RValue.
+if (ArgExprs[Idx]->isGLValue()) {
+  ArgExprs[Idx] = ImplicitCastExpr::Create(
+  Context, ArgExprs[Idx]->getType(), CK_NoOp, ArgExprs[Idx],
+  nullptr, VK_PRValue, FPOptionsOverride());
+}
+
+// Construct a new arg type with address space of Param
+Qualifiers ArgPtQuals = ArgPtTy.getQualifiers();
+ArgPtQuals.setAddressSpace(ParamAS);
+auto NewArgPtTy =
+Context.getQualifiedType(ArgPtTy.getUnqualifiedType(), ArgPtQuals);
+auto NewArgTy =
+Context.getQualifiedType(Context.getPointerType(NewArgPtTy),
+ ArgTy.getQualifiers());
+
+// Finally 

[libclc] 59510c4 - libclc: Fix rounding during type conversion

2021-08-19 Thread Tom Stellard via cfe-commits

Author: Daniel Stone
Date: 2021-08-19T22:24:19-07:00
New Revision: 59510c421208e178de63b3640787d02ad56deb37

URL: https://github.com/llvm/llvm-project/commit/59510c421208e178de63b3640787d02ad56deb37
DIFF: https://github.com/llvm/llvm-project/commit/59510c421208e178de63b3640787d02ad56deb37.diff

LOG: libclc: Fix rounding during type conversion

The rounding during type conversion uses multiple conversions, selecting
between them to try to discover if rounding occurred. This appears to
not have been tested, since it would generate code of the form:
float convert_float_rtp(char x)
{
  float r = convert_float(x);
  char y = convert_char(y);
  [...]
}

which will access uninitialised data. The idea appears to have been to
have done a char -> float -> char roundtrip in order to discover the
rounding, so do this.
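
The roundtrip idea can be tried out in isolation. The sketch below is an assumption-laden illustration in Python rather than the generated OpenCL C: it uses a 32-bit integer source instead of char (every char value is exactly representable in float, so no rounding would show up), and the helper names `to_float32` and `rounding_occurred` are hypothetical.

```python
import struct


def to_float32(x):
    """Round a Python number to the nearest IEEE-754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def rounding_occurred(x):
    """Return True if converting integer x to float32 changed its value."""
    r = to_float32(x)  # src -> dst, like `r = convert_float(x)`
    y = int(r)         # dst -> src, converting r (the buggy code converted y itself)
    return y != x
```

Comparing `y` against `x` only tells us anything because `y` is derived from `r`; with the original `convert_char(y)` the comparison would read an uninitialised value.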

Discovered by inspection.

Signed-off-by: Daniel Stone 

Reviewed By: jvesely

Differential Revision: https://reviews.llvm.org/D81999

Added: 


Modified: 
libclc/generic/lib/gen_convert.py

Removed: 




diff  --git a/libclc/generic/lib/gen_convert.py b/libclc/generic/lib/gen_convert.py
index 7e649faa7dfcb..469244047de96 100644
--- a/libclc/generic/lib/gen_convert.py
+++ b/libclc/generic/lib/gen_convert.py
@@ -355,7 +355,7 @@ def generate_float_conversion(src, dst, size, mode, sat):
 print("  return convert_{DST}{N}(x);".format(DST=dst, N=size))
   else:
 print("  {DST}{N} r = convert_{DST}{N}(x);".format(DST=dst, N=size))
-print("  {SRC}{N} y = convert_{SRC}{N}(y);".format(SRC=src, N=size))
+print("  {SRC}{N} y = convert_{SRC}{N}(r);".format(SRC=src, N=size))
 if mode == '_rtz':
   if src in int_types:
 print("  {USRC}{N} abs_x = abs(x);".format(USRC=unsigned_type[src], N=size))



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D106620: [Clang][LLVM] generate btf_tag annotations for func parameters

2021-08-19 Thread Yonghong Song via Phabricator via cfe-commits
yonghong-song updated this revision to Diff 367711.
yonghong-song added a comment.

- fix llvm/unittests failure


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106620/new/

https://reviews.llvm.org/D106620

Files:
  clang/lib/CodeGen/CGDebugInfo.cpp
  clang/test/CodeGen/attr-btf_tag-parameter.c
  llvm/include/llvm/IR/DIBuilder.h
  llvm/include/llvm/IR/DebugInfoMetadata.h
  llvm/lib/AsmParser/LLParser.cpp
  llvm/lib/Bitcode/Reader/MetadataLoader.cpp
  llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
  llvm/lib/IR/AsmWriter.cpp
  llvm/lib/IR/DIBuilder.cpp
  llvm/lib/IR/DebugInfoMetadata.cpp
  llvm/lib/IR/LLVMContextImpl.h
  llvm/test/Bitcode/attr-btf_tag-parameter.ll
  llvm/unittests/IR/MetadataTest.cpp

Index: llvm/unittests/IR/MetadataTest.cpp
===
--- llvm/unittests/IR/MetadataTest.cpp
+++ llvm/unittests/IR/MetadataTest.cpp
@@ -1171,7 +1171,7 @@
   DIType *Type = getDerivedType();
   DINode::DIFlags Flags = static_cast<DINode::DIFlags>(7);
   auto *VlaExpr = DILocalVariable::get(Context, Scope, "vla_expr", File, 8,
-   Type, 2, Flags, 8);
+   Type, 2, Flags, 8, nullptr);
 
   auto *N = DISubrange::get(Context, VlaExpr, 0);
   auto Count = N->getCount();
@@ -1199,7 +1199,7 @@
   auto *UIother = ConstantAsMetadata::get(
   ConstantInt::getSigned(Type::getInt64Ty(Context), 20));
   auto *UVother = DILocalVariable::get(Context, Scope, "ubother", File, 8, Type,
-   2, Flags, 8);
+   2, Flags, 8, nullptr);
   auto *UEother = DIExpression::get(Context, {5, 6});
   auto *LIZero = ConstantAsMetadata::get(
   ConstantInt::getSigned(Type::getInt64Ty(Context), 0));
@@ -1242,13 +1242,16 @@
   DIType *Type = getDerivedType();
   DINode::DIFlags Flags = static_cast<DINode::DIFlags>(7);
   auto *LV =
-  DILocalVariable::get(Context, Scope, "lb", File, 8, Type, 2, Flags, 8);
+  DILocalVariable::get(Context, Scope, "lb", File, 8, Type, 2, Flags, 8,
+   nullptr);
   auto *UV =
-  DILocalVariable::get(Context, Scope, "ub", File, 8, Type, 2, Flags, 8);
+  DILocalVariable::get(Context, Scope, "ub", File, 8, Type, 2, Flags, 8,
+   nullptr);
   auto *SV =
-  DILocalVariable::get(Context, Scope, "st", File, 8, Type, 2, Flags, 8);
+  DILocalVariable::get(Context, Scope, "st", File, 8, Type, 2, Flags, 8,
+   nullptr);
   auto *SVother = DILocalVariable::get(Context, Scope, "stother", File, 8, Type,
-   2, Flags, 8);
+   2, Flags, 8, nullptr);
   auto *SIother = ConstantAsMetadata::get(
   ConstantInt::getSigned(Type::getInt64Ty(Context), 20));
   auto *SEother = DIExpression::get(Context, {5, 6});
@@ -1289,7 +1292,7 @@
   auto *LIother = ConstantAsMetadata::get(
   ConstantInt::getSigned(Type::getInt64Ty(Context), 20));
   auto *LVother = DILocalVariable::get(Context, Scope, "lbother", File, 8, Type,
-   2, Flags, 8);
+   2, Flags, 8, nullptr);
 
   auto *N = DISubrange::get(Context, nullptr, LE, UE, SE);
 
@@ -1328,7 +1331,7 @@
   auto *SI = DIExpression::get(Context, {dwarf::DW_OP_consts, 4});
   auto *UIother = DIExpression::get(Context, {dwarf::DW_OP_consts, 20});
   auto *UVother = DILocalVariable::get(Context, Scope, "ubother", File, 8, Type,
-   2, Flags, 8);
+   2, Flags, 8, nullptr);
   auto *UEother = DIExpression::get(Context, {5, 6});
   auto *LIZero = DIExpression::get(Context, {dwarf::DW_OP_consts, 0});
   auto *UIZero = DIExpression::get(Context, {dwarf::DW_OP_consts, 0});
@@ -1371,13 +1374,16 @@
   DIType *Type = getDerivedType();
   DINode::DIFlags Flags = static_cast<DINode::DIFlags>(7);
   auto *LV =
-  DILocalVariable::get(Context, Scope, "lb", File, 8, Type, 2, Flags, 8);
+  DILocalVariable::get(Context, Scope, "lb", File, 8, Type, 2, Flags, 8,
+   nullptr);
   auto *UV =
-  DILocalVariable::get(Context, Scope, "ub", File, 8, Type, 2, Flags, 8);
+  DILocalVariable::get(Context, Scope, "ub", File, 8, Type, 2, Flags, 8,
+   nullptr);
   auto *SV =
-  DILocalVariable::get(Context, Scope, "st", File, 8, Type, 2, Flags, 8);
+  DILocalVariable::get(Context, Scope, "st", File, 8, Type, 2, Flags, 8,
+   nullptr);
   auto *SVother = DILocalVariable::get(Context, Scope, "stother", File, 8, Type,
-   2, Flags, 8);
+   2, Flags, 8, nullptr);
   auto *SIother = DIExpression::get(
   Context, {dwarf::DW_OP_consts, static_cast<uint64_t>(-1)});
   auto *SEother = DIExpression::get(Context, {5, 6});
@@ -1412,12 +1418,12 @@
   DIType *Type = 

[PATCH] D106619: [Clang][LLVM] generate btf_tag annotations for DIGlobalVariable

2021-08-19 Thread Yonghong Song via Phabricator via cfe-commits
yonghong-song updated this revision to Diff 367710.
yonghong-song added a comment.

- fix llvm/unittest failures.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106619/new/

https://reviews.llvm.org/D106619

Files:
  clang/lib/CodeGen/CGDebugInfo.cpp
  clang/test/CodeGen/attr-btf_tag-diglobalvariable.c
  llvm/include/llvm/IR/DIBuilder.h
  llvm/include/llvm/IR/DebugInfoMetadata.h
  llvm/lib/AsmParser/LLParser.cpp
  llvm/lib/Bitcode/Reader/MetadataLoader.cpp
  llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
  llvm/lib/IR/AsmWriter.cpp
  llvm/lib/IR/DIBuilder.cpp
  llvm/lib/IR/DebugInfoMetadata.cpp
  llvm/lib/IR/LLVMContextImpl.h
  llvm/test/Bitcode/attr-btf_tag-diglobalvariable.ll
  llvm/unittests/IR/MetadataTest.cpp

Index: llvm/unittests/IR/MetadataTest.cpp
===
--- llvm/unittests/IR/MetadataTest.cpp
+++ llvm/unittests/IR/MetadataTest.cpp
@@ -2574,7 +2574,8 @@
 
   auto *N = DIGlobalVariable::get(
   Context, Scope, Name, LinkageName, File, Line, Type, IsLocalToUnit,
-  IsDefinition, StaticDataMemberDeclaration, templateParams, AlignInBits);
+  IsDefinition, StaticDataMemberDeclaration, templateParams, AlignInBits,
+  nullptr);
 
   EXPECT_EQ(dwarf::DW_TAG_variable, N->getTag());
   EXPECT_EQ(Scope, N->getScope());
@@ -2591,52 +2592,54 @@
   EXPECT_EQ(N, DIGlobalVariable::get(Context, Scope, Name, LinkageName, File,
  Line, Type, IsLocalToUnit, IsDefinition,
  StaticDataMemberDeclaration,
- templateParams, AlignInBits));
+ templateParams, AlignInBits, nullptr));
 
   EXPECT_NE(N, DIGlobalVariable::get(
Context, getSubprogram(), Name, LinkageName, File, Line,
Type, IsLocalToUnit, IsDefinition,
-   StaticDataMemberDeclaration, templateParams, AlignInBits));
+   StaticDataMemberDeclaration, templateParams, AlignInBits,
+   nullptr));
   EXPECT_NE(N, DIGlobalVariable::get(Context, Scope, "other", LinkageName, File,
  Line, Type, IsLocalToUnit, IsDefinition,
  StaticDataMemberDeclaration,
- templateParams, AlignInBits));
+ templateParams, AlignInBits, nullptr));
   EXPECT_NE(N, DIGlobalVariable::get(Context, Scope, Name, "other", File, Line,
  Type, IsLocalToUnit, IsDefinition,
  StaticDataMemberDeclaration,
- templateParams, AlignInBits));
+ templateParams, AlignInBits, nullptr));
   EXPECT_NE(N, DIGlobalVariable::get(Context, Scope, Name, LinkageName,
  getFile(), Line, Type, IsLocalToUnit,
  IsDefinition, StaticDataMemberDeclaration,
- templateParams, AlignInBits));
+ templateParams, AlignInBits, nullptr));
   EXPECT_NE(N, DIGlobalVariable::get(Context, Scope, Name, LinkageName, File,
  Line + 1, Type, IsLocalToUnit,
  IsDefinition, StaticDataMemberDeclaration,
- templateParams, AlignInBits));
+ templateParams, AlignInBits, nullptr));
   EXPECT_NE(N, DIGlobalVariable::get(Context, Scope, Name, LinkageName, File,
  Line, getDerivedType(), IsLocalToUnit,
  IsDefinition, StaticDataMemberDeclaration,
- templateParams, AlignInBits));
+ templateParams, AlignInBits, nullptr));
   EXPECT_NE(N, DIGlobalVariable::get(Context, Scope, Name, LinkageName, File,
  Line, Type, !IsLocalToUnit, IsDefinition,
  StaticDataMemberDeclaration,
- templateParams, AlignInBits));
+ templateParams, AlignInBits, nullptr));
   EXPECT_NE(N, DIGlobalVariable::get(Context, Scope, Name, LinkageName, File,
  Line, Type, IsLocalToUnit, !IsDefinition,
  StaticDataMemberDeclaration,
- templateParams, AlignInBits));
+ templateParams, AlignInBits, nullptr));
   EXPECT_NE(N, DIGlobalVariable::get(Context, Scope, Name, LinkageName, File,
  Line, Type, IsLocalToUnit, IsDefinition,
  cast<DIDerivedType>(getDerivedType()),
- 

[PATCH] D106616: [Clang][LLVM] generate btf_tag annotations for DIDerived types

2021-08-19 Thread Yonghong Song via Phabricator via cfe-commits
yonghong-song updated this revision to Diff 367709.
yonghong-song added a comment.

- use creation-parameters for annotations instead of replacement-style APIs.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106616/new/

https://reviews.llvm.org/D106616

Files:
  clang/lib/CodeGen/CGDebugInfo.cpp
  clang/lib/CodeGen/CGDebugInfo.h
  clang/test/CodeGen/attr-btf_tag-field.c
  llvm/include/llvm/IR/DIBuilder.h
  llvm/include/llvm/IR/DebugInfoMetadata.h
  llvm/lib/AsmParser/LLParser.cpp
  llvm/lib/Bitcode/Reader/MetadataLoader.cpp
  llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
  llvm/lib/IR/AsmWriter.cpp
  llvm/lib/IR/DIBuilder.cpp
  llvm/lib/IR/DebugInfoMetadata.cpp
  llvm/lib/IR/LLVMContextImpl.h
  llvm/test/Bitcode/attr-btf_tag-field.ll

Index: llvm/test/Bitcode/attr-btf_tag-field.ll
===
--- /dev/null
+++ llvm/test/Bitcode/attr-btf_tag-field.ll
@@ -0,0 +1,91 @@
+; REQUIRES: x86-registered-target
+; RUN: llvm-as < %s | llvm-dis | FileCheck %s
+
+%struct.t1 = type { i32 }
+%struct.t2 = type { i8, [3 x i8] }
+
+; Function Attrs: noinline nounwind optnone uwtable
+define dso_local i32 @foo(%struct.t1* %arg) #0 !dbg !9 {
+entry:
+  %arg.addr = alloca %struct.t1*, align 8
+  store %struct.t1* %arg, %struct.t1** %arg.addr, align 8
+  call void @llvm.dbg.declare(metadata %struct.t1** %arg.addr, metadata !20, metadata !DIExpression()), !dbg !21
+  %0 = load %struct.t1*, %struct.t1** %arg.addr, align 8, !dbg !22
+  %a = getelementptr inbounds %struct.t1, %struct.t1* %0, i32 0, i32 0, !dbg !23
+  %1 = load i32, i32* %a, align 4, !dbg !23
+  ret i32 %1, !dbg !24
+}
+
+; Function Attrs: nofree nosync nounwind readnone speculatable willreturn
+declare void @llvm.dbg.declare(metadata, metadata, metadata) #1
+
+; Function Attrs: noinline nounwind optnone uwtable
+define dso_local i32 @foo2(%struct.t2* %arg) #0 !dbg !25 {
+entry:
+  %arg.addr = alloca %struct.t2*, align 8
+  store %struct.t2* %arg, %struct.t2** %arg.addr, align 8
+  call void @llvm.dbg.declare(metadata %struct.t2** %arg.addr, metadata !32, metadata !DIExpression()), !dbg !33
+  %0 = load %struct.t2*, %struct.t2** %arg.addr, align 8, !dbg !34
+  %1 = bitcast %struct.t2* %0 to i8*, !dbg !35
+  %bf.load = load i8, i8* %1, align 4, !dbg !35
+  %bf.shl = shl i8 %bf.load, 7, !dbg !35
+  %bf.ashr = ashr i8 %bf.shl, 7, !dbg !35
+  %bf.cast = sext i8 %bf.ashr to i32, !dbg !35
+  ret i32 %bf.cast, !dbg !36
+}
+
+attributes #0 = { noinline nounwind optnone uwtable "frame-pointer"="all" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
+attributes #1 = { nofree nosync nounwind readnone speculatable willreturn }
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!3, !4, !5, !6, !7}
+!llvm.ident = !{!8}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 14.0.0 (https://github.com/llvm/llvm-project.git 4cbaee98885ead226304e8836090069db6596965)", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, splitDebugInlining: false, nameTableKind: None)
+!1 = !DIFile(filename: "attr-btf_tag-field.c", directory: "/home/yhs/work/tests/llvm/btf_tag")
+!2 = !{}
+!3 = !{i32 7, !"Dwarf Version", i32 4}
+!4 = !{i32 2, !"Debug Info Version", i32 3}
+!5 = !{i32 1, !"wchar_size", i32 4}
+!6 = !{i32 7, !"uwtable", i32 1}
+!7 = !{i32 7, !"frame-pointer", i32 2}
+!8 = !{!"clang version 14.0.0 (https://github.com/llvm/llvm-project.git 4cbaee98885ead226304e8836090069db6596965)"}
+!9 = distinct !DISubprogram(name: "foo", scope: !1, file: !1, line: 11, type: !10, scopeLine: 11, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition, unit: !0, retainedNodes: !2)
+!10 = !DISubroutineType(types: !11)
+!11 = !{!12, !13}
+!12 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+!13 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !14, size: 64)
+!14 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "t1", file: !1, line: 7, size: 32, elements: !15)
+!15 = !{!16}
+!16 = !DIDerivedType(tag: DW_TAG_member, name: "a", scope: !14, file: !1, line: 8, baseType: !12, size: 32, annotations: !17)
+!17 = !{!18, !19}
+!18 = !{!"btf_tag", !"tag1"}
+!19 = !{!"btf_tag", !"tag2"}
+
+; CHECK:!DIDerivedType(tag: DW_TAG_member, name: "a"
+; CHECK-SAME:   annotations: ![[ANNOT:[0-9]+]]
+; CHECK:![[ANNOT]] = !{![[TAG1:[0-9]+]], ![[TAG2:[0-9]+]]}
+; CHECK:![[TAG1]] = !{!"btf_tag", !"tag1"}
+; CHECK:![[TAG2]] = !{!"btf_tag", !"tag2"}
+
+!20 = !DILocalVariable(name: "arg", arg: 1, scope: !9, file: !1, line: 11, type: !13)
+!21 = !DILocation(line: 11, column: 20, scope: !9)
+!22 = !DILocation(line: 12, column: 10, scope: !9)
+!23 = !DILocation(line: 12, column: 15, scope: !9)
+!24 = !DILocation(line: 12, column: 3, scope: !9)
+!25 = distinct !DISubprogram(name: "foo2", scope: 

[PATCH] D105168: [RISCV] Unify the arch string parsing logic to RISCVISAInfo.

2021-08-19 Thread Kito Cheng via Phabricator via cfe-commits
kito-cheng added a comment.

@asb Thanks for reporting that. There is no warning or build error when I build
with GCC 7 (which is the default compiler in Ubuntu 18.04), but I can reproduce
it with clang 13. Seems like I should switch my default compiler :P


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105168/new/

https://reviews.llvm.org/D105168



[PATCH] D105168: [RISCV] Unify the arch string parsing logic to RISCVISAInfo.

2021-08-19 Thread Kito Cheng via Phabricator via cfe-commits
kito-cheng updated this revision to Diff 367704.
kito-cheng added a comment.

Changes:

- Fix build warning and build error with clang


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105168/new/

https://reviews.llvm.org/D105168

Files:
  clang/lib/Basic/Targets/RISCV.cpp
  clang/lib/Basic/Targets/RISCV.h
  clang/lib/Driver/ToolChains/Arch/RISCV.cpp
  clang/test/Driver/riscv-abi.c
  clang/test/Driver/riscv-arch.c
  llvm/include/llvm/Support/RISCVISAInfo.h
  llvm/lib/Support/CMakeLists.txt
  llvm/lib/Support/RISCVISAInfo.cpp
  llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
  llvm/lib/Target/RISCV/MCTargetDesc/RISCVBaseInfo.cpp
  llvm/lib/Target/RISCV/MCTargetDesc/RISCVBaseInfo.h
  llvm/lib/Target/RISCV/MCTargetDesc/RISCVTargetStreamer.cpp
  llvm/test/MC/RISCV/attribute-arch.s
  llvm/test/MC/RISCV/attribute-with-insts.s
  llvm/test/MC/RISCV/invalid-attribute.s

Index: llvm/test/MC/RISCV/invalid-attribute.s
===
--- llvm/test/MC/RISCV/invalid-attribute.s
+++ llvm/test/MC/RISCV/invalid-attribute.s
@@ -7,10 +7,10 @@
 # RUN: not llvm-mc %s -triple=riscv64 -filetype=asm 2>&1 | FileCheck %s
 
 .attribute arch, "foo"
-# CHECK: [[@LINE-1]]:18: error: bad arch string foo
+# CHECK: [[@LINE-1]]:18: error: invalid arch name 'foo', string must begin with rv32{i,e,g} or rv64{i,g}
 
 .attribute arch, "rv32i2p0_y2p0"
-# CHECK: [[@LINE-1]]:18: error: bad arch string y2p0
+# CHECK: [[@LINE-1]]:18: error: invalid arch name 'rv32i2p0_y2p0', invalid standard user-level extension 'y'
 
 .attribute stack_align, "16"
 # CHECK: [[@LINE-1]]:25: error: expected numeric constant
Index: llvm/test/MC/RISCV/attribute-with-insts.s
===
--- llvm/test/MC/RISCV/attribute-with-insts.s
+++ llvm/test/MC/RISCV/attribute-with-insts.s
@@ -10,7 +10,7 @@
 # RUN:   | llvm-objdump --triple=riscv64 -d -M no-aliases - \
 # RUN:   | FileCheck -check-prefix=CHECK-INST %s
 
-.attribute arch, "rv64i2p0_m2p0_a2p0_d2p0_c2p0"
+.attribute arch, "rv64i2p0_m2p0_a2p0_f2p0_d2p0_c2p0"
 
 # CHECK-INST: lr.w t0, (t1)
 lr.w t0, (t1)
Index: llvm/test/MC/RISCV/attribute-arch.s
===
--- llvm/test/MC/RISCV/attribute-arch.s
+++ llvm/test/MC/RISCV/attribute-arch.s
@@ -9,9 +9,6 @@
 .attribute arch, "rv32i2"
 # CHECK: attribute  5, "rv32i2p0"
 
-.attribute arch, "rv32i2p"
-# CHECK: attribute  5, "rv32i2p0"
-
 .attribute arch, "rv32i2p0"
 # CHECK: attribute  5, "rv32i2p0"
 
@@ -33,14 +30,14 @@
 .attribute arch, "rv32ima2p0_fdc"
 # CHECK: attribute  5, "rv32i2p0_m2p0_a2p0_f2p0_d2p0_c2p0"
 
-.attribute arch, "rv32ima2p_fdc"
+.attribute arch, "rv32ima2p0_fdc"
 # CHECK: attribute  5, "rv32i2p0_m2p0_a2p0_f2p0_d2p0_c2p0"
 
 .attribute arch, "rv32ib"
 # CHECK: attribute  5, "rv32i2p0_b0p93_zba0p93_zbb0p93_zbc0p93_zbe0p93_zbf0p93_zbm0p93_zbp0p93_zbr0p93_zbs0p93_zbt0p93"
 
 .attribute arch, "rv32iv"
-# CHECK: attribute  5, "rv32i2p0_v0p10"
+# CHECK: attribute  5, "rv32i2p0_v0p10_zvlsseg0p10"
 
 .attribute arch, "rv32izba"
 # CHECK: attribute  5, "rv32i2p0_zba0p93"
Index: llvm/lib/Target/RISCV/MCTargetDesc/RISCVTargetStreamer.cpp
===
--- llvm/lib/Target/RISCV/MCTargetDesc/RISCVTargetStreamer.cpp
+++ llvm/lib/Target/RISCV/MCTargetDesc/RISCVTargetStreamer.cpp
@@ -11,9 +11,11 @@
 //===--===//
 
 #include "RISCVTargetStreamer.h"
+#include "RISCVBaseInfo.h"
 #include "RISCVMCTargetDesc.h"
 #include "llvm/Support/FormattedStream.h"
 #include "llvm/Support/RISCVAttributes.h"
+#include "llvm/Support/RISCVISAInfo.h"
 
 using namespace llvm;
 
@@ -43,57 +45,13 @@
   else
 emitAttribute(RISCVAttrs::STACK_ALIGN, RISCVAttrs::ALIGN_16);
 
-  std::string Arch = "rv32";
-  if (STI.hasFeature(RISCV::Feature64Bit))
-Arch = "rv64";
-  if (STI.hasFeature(RISCV::FeatureRV32E))
-Arch += "e1p9";
-  else
-Arch += "i2p0";
-  if (STI.hasFeature(RISCV::FeatureStdExtM))
-Arch += "_m2p0";
-  if (STI.hasFeature(RISCV::FeatureStdExtA))
-Arch += "_a2p0";
-  if (STI.hasFeature(RISCV::FeatureStdExtF))
-Arch += "_f2p0";
-  if (STI.hasFeature(RISCV::FeatureStdExtD))
-Arch += "_d2p0";
-  if (STI.hasFeature(RISCV::FeatureStdExtC))
-Arch += "_c2p0";
-  if (STI.hasFeature(RISCV::FeatureStdExtB))
-Arch += "_b0p93";
-  if (STI.hasFeature(RISCV::FeatureStdExtV))
-Arch += "_v0p10";
-  if (STI.hasFeature(RISCV::FeatureExtZfh))
-Arch += "_zfh0p1";
-  if (STI.hasFeature(RISCV::FeatureExtZba))
-Arch += "_zba0p93";
-  if (STI.hasFeature(RISCV::FeatureExtZbb))
-Arch += "_zbb0p93";
-  if (STI.hasFeature(RISCV::FeatureExtZbc))
-Arch += "_zbc0p93";
-  if (STI.hasFeature(RISCV::FeatureExtZbe))
-Arch += "_zbe0p93";
-  if (STI.hasFeature(RISCV::FeatureExtZbf))
-Arch += 

[PATCH] D108441: [clang] Fix JSON AST output when a filter is used

2021-08-19 Thread William Woodruff via Phabricator via cfe-commits
woodruffw created this revision.
woodruffw added reviewers: klimek, rsmith.
woodruffw added a project: clang.
woodruffw requested review of this revision.
Herald added a subscriber: cfe-commits.

Without this, the combination of `-ast-dump=json` and `-ast-dump-filter FILTER`
produces invalid JSON: the first line is a string that says
`Dumping $SOME_DECL_NAME: `.
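
The failure mode described above is easy to reproduce without Clang. In the sketch below, the JSON payload is a hypothetical, heavily abbreviated stand-in for real `-ast-dump=json` output, and `is_valid_json` is a helper introduced only for this illustration.

```python
import json


def is_valid_json(text):
    """Return True if the whole text parses as a single JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Hypothetical, abbreviated stand-in for one filtered declaration.
ast_json = '{"kind": "FunctionDecl", "name": "foo"}'

# What a consumer sees when the banner line is printed before the JSON.
with_banner = "Dumping foo:\n" + ast_json
```

Here `is_valid_json(ast_json)` holds while `is_valid_json(with_banner)` does not, which is why the patch skips the banner unless the output format is the default one.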


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D108441

Files:
  clang/lib/Frontend/ASTConsumers.cpp


Index: clang/lib/Frontend/ASTConsumers.cpp
===
--- clang/lib/Frontend/ASTConsumers.cpp
+++ clang/lib/Frontend/ASTConsumers.cpp
@@ -57,8 +57,11 @@
 bool ShowColors = Out.has_colors();
 if (ShowColors)
   Out.changeColor(raw_ostream::BLUE);
-Out << (OutputKind != Print ? "Dumping " : "Printing ") << getName(D)
-<< ":\n";
+
+if (OutputFormat == ADOF_Default)
+  Out << (OutputKind != Print ? "Dumping " : "Printing ") << getName(D)
+  << ":\n";
+
 if (ShowColors)
   Out.resetColor();
 print(D);




[PATCH] D108421: Mark openmp internal global dso_local

2021-08-19 Thread kamlesh kumar via Phabricator via cfe-commits
kamleshbhalui updated this revision to Diff 367700.
kamleshbhalui added a comment.

clang formatted


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108421/new/

https://reviews.llvm.org/D108421

Files:
  clang/lib/CodeGen/CGOpenMPRuntime.cpp


Index: clang/lib/CodeGen/CGOpenMPRuntime.cpp
===
--- clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -2179,11 +2179,14 @@
 return &*Elem.second;
   }
 
-  return Elem.second = new llvm::GlobalVariable(
- CGM.getModule(), Ty, /*IsConstant*/ false,
- llvm::GlobalValue::CommonLinkage, llvm::Constant::getNullValue(Ty),
- Elem.first(), /*InsertBefore=*/nullptr,
- llvm::GlobalValue::NotThreadLocal, AddressSpace);
+  llvm::GlobalVariable *GV = new llvm::GlobalVariable(
+  CGM.getModule(), Ty, /*IsConstant*/ false,
+  llvm::GlobalValue::CommonLinkage, llvm::Constant::getNullValue(Ty),
+  Elem.first(), /*InsertBefore=*/nullptr, llvm::GlobalValue::NotThreadLocal,
+  AddressSpace);
+  GV->setDSOLocal(true);
+  Elem.second = GV;
+  return Elem.second;
 }
 
 llvm::Value *CGOpenMPRuntime::getCriticalRegionLock(StringRef CriticalName) {




[clang] 508b066 - [Remarks] [AMDGPU] Emit optimization remarks for atomics generating hardware instructions

2021-08-19 Thread Anshil Gandhi via cfe-commits

Author: Anshil Gandhi
Date: 2021-08-19T20:51:19-06:00
New Revision: 508b06699a396cc6f2f2602dab350860cb69f087

URL: https://github.com/llvm/llvm-project/commit/508b06699a396cc6f2f2602dab350860cb69f087
DIFF: https://github.com/llvm/llvm-project/commit/508b06699a396cc6f2f2602dab350860cb69f087.diff

LOG: [Remarks] [AMDGPU] Emit optimization remarks for atomics generating hardware instructions

Produce remarks when atomic instructions are expanded into hardware instructions
in SIISelLowering.cpp. Currently, these remarks are only emitted for atomic fadd
instructions.

Differential Revision: https://reviews.llvm.org/D108150

Added: 
clang/test/CodeGenOpenCL/atomics-cas-remarks-gfx90a.cl
clang/test/CodeGenOpenCL/atomics-unsafe-hw-remarks-gfx90a.cl
llvm/test/CodeGen/AMDGPU/atomics-cas-remarks-gfx90a.ll
llvm/test/CodeGen/AMDGPU/atomics-hw-remarks-gfx90a.ll

Modified: 
llvm/lib/CodeGen/AtomicExpandPass.cpp
llvm/lib/Target/AMDGPU/SIISelLowering.cpp

Removed: 
clang/test/CodeGenOpenCL/atomics-remarks-gfx90a.cl
llvm/test/CodeGen/AMDGPU/atomics-remarks-gfx90a.ll



diff  --git a/clang/test/CodeGenOpenCL/atomics-remarks-gfx90a.cl b/clang/test/CodeGenOpenCL/atomics-cas-remarks-gfx90a.cl
similarity index 100%
rename from clang/test/CodeGenOpenCL/atomics-remarks-gfx90a.cl
rename to clang/test/CodeGenOpenCL/atomics-cas-remarks-gfx90a.cl

diff  --git a/clang/test/CodeGenOpenCL/atomics-unsafe-hw-remarks-gfx90a.cl b/clang/test/CodeGenOpenCL/atomics-unsafe-hw-remarks-gfx90a.cl
new file mode 100644
index 0..ea3324126c209
--- /dev/null
+++ b/clang/test/CodeGenOpenCL/atomics-unsafe-hw-remarks-gfx90a.cl
@@ -0,0 +1,44 @@
+// RUN: %clang_cc1 -cl-std=CL2.0 -O0 -triple=amdgcn-amd-amdhsa -target-cpu gfx90a \
+// RUN: -Rpass=si-lower -munsafe-fp-atomics %s -S -emit-llvm -o - 2>&1 | \
+// RUN: FileCheck %s --check-prefix=GFX90A-HW
+
+// RUN: %clang_cc1 -cl-std=CL2.0 -O0 -triple=amdgcn-amd-amdhsa -target-cpu gfx90a \
+// RUN: -Rpass=si-lower -munsafe-fp-atomics %s -S -o - 2>&1 | \
+// RUN: FileCheck %s --check-prefix=GFX90A-HW-REMARK
+
+
+// REQUIRES: amdgpu-registered-target
+
+typedef enum memory_order {
+  memory_order_relaxed = __ATOMIC_RELAXED,
+  memory_order_acquire = __ATOMIC_ACQUIRE,
+  memory_order_release = __ATOMIC_RELEASE,
+  memory_order_acq_rel = __ATOMIC_ACQ_REL,
+  memory_order_seq_cst = __ATOMIC_SEQ_CST
+} memory_order;
+
+typedef enum memory_scope {
+  memory_scope_work_item = __OPENCL_MEMORY_SCOPE_WORK_ITEM,
+  memory_scope_work_group = __OPENCL_MEMORY_SCOPE_WORK_GROUP,
+  memory_scope_device = __OPENCL_MEMORY_SCOPE_DEVICE,
+  memory_scope_all_svm_devices = __OPENCL_MEMORY_SCOPE_ALL_SVM_DEVICES,
+#if defined(cl_intel_subgroups) || defined(cl_khr_subgroups)
+  memory_scope_sub_group = __OPENCL_MEMORY_SCOPE_SUB_GROUP
+#endif
+} memory_scope;
+
+// GFX90A-HW-REMARK: Hardware instruction generated for atomic fadd operation at memory scope workgroup-one-as due to an unsafe request. [-Rpass=si-lower]
+// GFX90A-HW-REMARK: Hardware instruction generated for atomic fadd operation at memory scope agent-one-as due to an unsafe request. [-Rpass=si-lower]
+// GFX90A-HW-REMARK: Hardware instruction generated for atomic fadd operation at memory scope wavefront-one-as due to an unsafe request. [-Rpass=si-lower]
+// GFX90A-HW-REMARK: global_atomic_add_f32 v0, v[0:1], v2, off glc
+// GFX90A-HW-REMARK: global_atomic_add_f32 v0, v[0:1], v2, off glc
+// GFX90A-HW-REMARK: global_atomic_add_f32 v0, v[0:1], v2, off glc
+// GFX90A-HW-LABEL: @atomic_unsafe_hw
+// GFX90A-HW:   atomicrmw fadd float addrspace(1)* %{{.*}}, float %{{.*}} syncscope("workgroup-one-as") monotonic, align 4
+// GFX90A-HW:   atomicrmw fadd float addrspace(1)* %{{.*}}, float %{{.*}} syncscope("agent-one-as") monotonic, align 4
+// GFX90A-HW:   atomicrmw fadd float addrspace(1)* %{{.*}}, float %{{.*}} syncscope("wavefront-one-as") monotonic, align 4
+void atomic_unsafe_hw(__global atomic_float *d, float a) {
+  float ret1 = __opencl_atomic_fetch_add(d, a, memory_order_relaxed, memory_scope_work_group);
+  float ret2 = __opencl_atomic_fetch_add(d, a, memory_order_relaxed, memory_scope_device);
+  float ret3 = __opencl_atomic_fetch_add(d, a, memory_order_relaxed, memory_scope_sub_group);
+}

diff  --git a/llvm/lib/CodeGen/AtomicExpandPass.cpp 
b/llvm/lib/CodeGen/AtomicExpandPass.cpp
index 47cdd222702f2..1297f99698d8b 100644
--- a/llvm/lib/CodeGen/AtomicExpandPass.cpp
+++ b/llvm/lib/CodeGen/AtomicExpandPass.cpp
@@ -610,7 +610,7 @@ bool AtomicExpand::tryExpandAtomicRMW(AtomicRMWInst *AI) {
   : SSNs[AI->getSyncScopeID()];
   OptimizationRemarkEmitter ORE(AI->getFunction());
   ORE.emit([&]() {
-return OptimizationRemark(DEBUG_TYPE, "Passed", AI->getFunction())
+return OptimizationRemark(DEBUG_TYPE, "Passed", AI)
<< "A compare and swap loop was generated for an 

[PATCH] D107138: [PowerPC] Implement cmplxl builtins

2021-08-19 Thread Albion Fung via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG9d4faa8ac3e7: [PowerPC] Implement cmplxl builtins (authored 
by Conanap).

Changed prior to commit:
  https://reviews.llvm.org/D107138?vs=365596&id=367695#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107138/new/

https://reviews.llvm.org/D107138

Files:
  clang/lib/Basic/Targets/PPC.cpp
  clang/test/CodeGen/builtins-ppc-xlcompat-cmplx.c

Index: clang/test/CodeGen/builtins-ppc-xlcompat-cmplx.c
===
--- clang/test/CodeGen/builtins-ppc-xlcompat-cmplx.c
+++ clang/test/CodeGen/builtins-ppc-xlcompat-cmplx.c
@@ -226,3 +226,115 @@
 float _Complex testcmplxf(float real, float imag) {
   return __cmplxf(real, imag);
 }
+
+// 64BIT-LABEL: @test_xl_cmplxl(
+// 64BIT-NEXT:  entry:
+// 64BIT-NEXT:[[RETVAL:%.*]] = alloca { ppc_fp128, ppc_fp128 }, align 16
+// 64BIT-NEXT:[[LDA_ADDR:%.*]] = alloca ppc_fp128, align 16
+// 64BIT-NEXT:[[LDB_ADDR:%.*]] = alloca ppc_fp128, align 16
+// 64BIT-NEXT:store ppc_fp128 [[LDA:%.*]], ppc_fp128* [[LDA_ADDR]], align 16
+// 64BIT-NEXT:store ppc_fp128 [[LDB:%.*]], ppc_fp128* [[LDB_ADDR]], align 16
+// 64BIT-NEXT:[[TMP0:%.*]] = load ppc_fp128, ppc_fp128* [[LDA_ADDR]], align 16
+// 64BIT-NEXT:[[TMP1:%.*]] = load ppc_fp128, ppc_fp128* [[LDB_ADDR]], align 16
+// 64BIT-NEXT:[[RETVAL_REALP:%.*]] = getelementptr inbounds { ppc_fp128, ppc_fp128 }, { ppc_fp128, ppc_fp128 }* [[RETVAL]], i32 0, i32 0
+// 64BIT-NEXT:[[RETVAL_IMAGP:%.*]] = getelementptr inbounds { ppc_fp128, ppc_fp128 }, { ppc_fp128, ppc_fp128 }* [[RETVAL]], i32 0, i32 1
+// 64BIT-NEXT:store ppc_fp128 [[TMP0]], ppc_fp128* [[RETVAL_REALP]], align 16
+// 64BIT-NEXT:store ppc_fp128 [[TMP1]], ppc_fp128* [[RETVAL_IMAGP]], align 16
+// 64BIT-NEXT:[[TMP2:%.*]] = load { ppc_fp128, ppc_fp128 }, { ppc_fp128, ppc_fp128 }* [[RETVAL]], align 16
+// 64BIT-NEXT:ret { ppc_fp128, ppc_fp128 } [[TMP2]]
+//
+// 64BITLE-LABEL: @test_xl_cmplxl(
+// 64BITLE-NEXT:  entry:
+// 64BITLE-NEXT:[[RETVAL:%.*]] = alloca { ppc_fp128, ppc_fp128 }, align 16
+// 64BITLE-NEXT:[[LDA_ADDR:%.*]] = alloca ppc_fp128, align 16
+// 64BITLE-NEXT:[[LDB_ADDR:%.*]] = alloca ppc_fp128, align 16
+// 64BITLE-NEXT:store ppc_fp128 [[LDA:%.*]], ppc_fp128* [[LDA_ADDR]], align 16
+// 64BITLE-NEXT:store ppc_fp128 [[LDB:%.*]], ppc_fp128* [[LDB_ADDR]], align 16
+// 64BITLE-NEXT:[[TMP0:%.*]] = load ppc_fp128, ppc_fp128* [[LDA_ADDR]], align 16
+// 64BITLE-NEXT:[[TMP1:%.*]] = load ppc_fp128, ppc_fp128* [[LDB_ADDR]], align 16
+// 64BITLE-NEXT:[[RETVAL_REALP:%.*]] = getelementptr inbounds { ppc_fp128, ppc_fp128 }, { ppc_fp128, ppc_fp128 }* [[RETVAL]], i32 0, i32 0
+// 64BITLE-NEXT:[[RETVAL_IMAGP:%.*]] = getelementptr inbounds { ppc_fp128, ppc_fp128 }, { ppc_fp128, ppc_fp128 }* [[RETVAL]], i32 0, i32 1
+// 64BITLE-NEXT:store ppc_fp128 [[TMP0]], ppc_fp128* [[RETVAL_REALP]], align 16
+// 64BITLE-NEXT:store ppc_fp128 [[TMP1]], ppc_fp128* [[RETVAL_IMAGP]], align 16
+// 64BITLE-NEXT:[[TMP2:%.*]] = load { ppc_fp128, ppc_fp128 }, { ppc_fp128, ppc_fp128 }* [[RETVAL]], align 16
+// 64BITLE-NEXT:ret { ppc_fp128, ppc_fp128 } [[TMP2]]
+//
+// 64BITAIX-LABEL: @test_xl_cmplxl(
+// 64BITAIX-NEXT:  entry:
+// 64BITAIX-NEXT:[[RETVAL:%.*]] = alloca { double, double }, align 4
+// 64BITAIX-NEXT:[[LDA_ADDR:%.*]] = alloca double, align 8
+// 64BITAIX-NEXT:[[LDB_ADDR:%.*]] = alloca double, align 8
+// 64BITAIX-NEXT:store double [[LDA:%.*]], double* [[LDA_ADDR]], align 8
+// 64BITAIX-NEXT:store double [[LDB:%.*]], double* [[LDB_ADDR]], align 8
+// 64BITAIX-NEXT:[[TMP0:%.*]] = load double, double* [[LDA_ADDR]], align 8
+// 64BITAIX-NEXT:[[TMP1:%.*]] = load double, double* [[LDB_ADDR]], align 8
+// 64BITAIX-NEXT:[[RETVAL_REALP:%.*]] = getelementptr inbounds { double, double }, { double, double }* [[RETVAL]], i32 0, i32 0
+// 64BITAIX-NEXT:[[RETVAL_IMAGP:%.*]] = getelementptr inbounds { double, double }, { double, double }* [[RETVAL]], i32 0, i32 1
+// 64BITAIX-NEXT:store double [[TMP0]], double* [[RETVAL_REALP]], align 4
+// 64BITAIX-NEXT:store double [[TMP1]], double* [[RETVAL_IMAGP]], align 4
+// 64BITAIX-NEXT:[[TMP2:%.*]] = load { double, double }, { double, double }* [[RETVAL]], align 4
+// 64BITAIX-NEXT:ret { double, double } [[TMP2]]
+//
+// 32BIT-LABEL: @test_xl_cmplxl(
+// 32BIT-NEXT:  entry:
+// 32BIT-NEXT:[[LDA_ADDR:%.*]] = alloca ppc_fp128, align 16
+// 32BIT-NEXT:[[LDB_ADDR:%.*]] = alloca ppc_fp128, align 16
+// 32BIT-NEXT:store ppc_fp128 [[LDA:%.*]], ppc_fp128* [[LDA_ADDR]], align 16
+// 32BIT-NEXT:store ppc_fp128 [[LDB:%.*]], ppc_fp128* [[LDB_ADDR]], align 16
+// 32BIT-NEXT:[[TMP0:%.*]] = load ppc_fp128, ppc_fp128* [[LDA_ADDR]], align 16
+// 32BIT-NEXT:[[TMP1:%.*]] = load 

[clang] 9d4faa8 - [PowerPC] Implement cmplxl builtins

2021-08-19 Thread Albion Fung via cfe-commits

Author: Albion Fung
Date: 2021-08-19T21:36:43-05:00
New Revision: 9d4faa8ac3e7f98b1ca09951d4d3a015aeedab38

URL: 
https://github.com/llvm/llvm-project/commit/9d4faa8ac3e7f98b1ca09951d4d3a015aeedab38
DIFF: 
https://github.com/llvm/llvm-project/commit/9d4faa8ac3e7f98b1ca09951d4d3a015aeedab38.diff

LOG: [PowerPC] Implement cmplxl builtins

This patch implements the builtins for cmplxl by utilising
__builtin_complex. This builtin is implemented to match XL
functionality.
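
As a rough illustration of what this macro mapping amounts to, the sketch below builds a `long double _Complex` value with `__builtin_complex`, which Clang and GCC provide in C mode. The helper names `make_cmplxl`, `real_part` and `imag_part` are hypothetical and exist only for this example; the patch itself only adds the `__cmplxl` macro in `defineXLCompatMacros`.

```c
/* Hypothetical helper, not part of the patch: it calls the same builtin
 * that the patch maps __cmplxl to. */
static long double _Complex make_cmplxl(long double re, long double im) {
  return __builtin_complex(re, im);
}

/* Read the parts back through the array layout that C guarantees for
 * complex types: element 0 is the real part, element 1 the imaginary. */
static long double real_part(long double _Complex z) {
  return ((long double *)&z)[0];
}

static long double imag_part(long double _Complex z) {
  return ((long double *)&z)[1];
}
```

Unlike writing `re + im * I`, the builtin places both operands into the result directly, with no arithmetic on the parts.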

Differential revision: https://reviews.llvm.org/D107138

Added: 


Modified: 
clang/lib/Basic/Targets/PPC.cpp
clang/test/CodeGen/builtins-ppc-xlcompat-cmplx.c

Removed: 




diff  --git a/clang/lib/Basic/Targets/PPC.cpp b/clang/lib/Basic/Targets/PPC.cpp
index c15d2df33f9f..c8afb71e7dfd 100644
--- a/clang/lib/Basic/Targets/PPC.cpp
+++ b/clang/lib/Basic/Targets/PPC.cpp
@@ -237,6 +237,7 @@ static void defineXLCompatMacros(MacroBuilder &Builder) {
   Builder.defineMacro("__fsqrt", "__builtin_ppc_fsqrt");
   Builder.defineMacro("__fsqrts", "__builtin_ppc_fsqrts");
   Builder.defineMacro("__addex", "__builtin_ppc_addex");
+  Builder.defineMacro("__cmplxl", "__builtin_complex");
 }
 
 /// PPCTargetInfo::getTargetDefines - Return a set of the PowerPC-specific

diff  --git a/clang/test/CodeGen/builtins-ppc-xlcompat-cmplx.c 
b/clang/test/CodeGen/builtins-ppc-xlcompat-cmplx.c
index f3274fe19c1f..5e1f6a60bc2c 100644
--- a/clang/test/CodeGen/builtins-ppc-xlcompat-cmplx.c
+++ b/clang/test/CodeGen/builtins-ppc-xlcompat-cmplx.c
@@ -226,3 +226,115 @@ double _Complex testcmplx(double real, double imag) {
 float _Complex testcmplxf(float real, float imag) {
   return __cmplxf(real, imag);
 }
+
+// 64BIT-LABEL: @test_xl_cmplxl(
+// 64BIT-NEXT:  entry:
+// 64BIT-NEXT:[[RETVAL:%.*]] = alloca { ppc_fp128, ppc_fp128 }, align 16
+// 64BIT-NEXT:[[LDA_ADDR:%.*]] = alloca ppc_fp128, align 16
+// 64BIT-NEXT:[[LDB_ADDR:%.*]] = alloca ppc_fp128, align 16
+// 64BIT-NEXT:store ppc_fp128 [[LDA:%.*]], ppc_fp128* [[LDA_ADDR]], align 
16
+// 64BIT-NEXT:store ppc_fp128 [[LDB:%.*]], ppc_fp128* [[LDB_ADDR]], align 
16
+// 64BIT-NEXT:[[TMP0:%.*]] = load ppc_fp128, ppc_fp128* [[LDA_ADDR]], 
align 16
+// 64BIT-NEXT:[[TMP1:%.*]] = load ppc_fp128, ppc_fp128* [[LDB_ADDR]], 
align 16
+// 64BIT-NEXT:[[RETVAL_REALP:%.*]] = getelementptr inbounds { ppc_fp128, 
ppc_fp128 }, { ppc_fp128, ppc_fp128 }* [[RETVAL]], i32 0, i32 0
+// 64BIT-NEXT:[[RETVAL_IMAGP:%.*]] = getelementptr inbounds { ppc_fp128, 
ppc_fp128 }, { ppc_fp128, ppc_fp128 }* [[RETVAL]], i32 0, i32 1
+// 64BIT-NEXT:store ppc_fp128 [[TMP0]], ppc_fp128* [[RETVAL_REALP]], align 
16
+// 64BIT-NEXT:store ppc_fp128 [[TMP1]], ppc_fp128* [[RETVAL_IMAGP]], align 
16
+// 64BIT-NEXT:[[TMP2:%.*]] = load { ppc_fp128, ppc_fp128 }, { ppc_fp128, 
ppc_fp128 }* [[RETVAL]], align 16
+// 64BIT-NEXT:ret { ppc_fp128, ppc_fp128 } [[TMP2]]
+//
+// 64BITLE-LABEL: @test_xl_cmplxl(
+// 64BITLE-NEXT:  entry:
+// 64BITLE-NEXT:[[RETVAL:%.*]] = alloca { ppc_fp128, ppc_fp128 }, align 16
+// 64BITLE-NEXT:[[LDA_ADDR:%.*]] = alloca ppc_fp128, align 16
+// 64BITLE-NEXT:[[LDB_ADDR:%.*]] = alloca ppc_fp128, align 16
+// 64BITLE-NEXT:store ppc_fp128 [[LDA:%.*]], ppc_fp128* [[LDA_ADDR]], 
align 16
+// 64BITLE-NEXT:store ppc_fp128 [[LDB:%.*]], ppc_fp128* [[LDB_ADDR]], 
align 16
+// 64BITLE-NEXT:[[TMP0:%.*]] = load ppc_fp128, ppc_fp128* [[LDA_ADDR]], 
align 16
+// 64BITLE-NEXT:[[TMP1:%.*]] = load ppc_fp128, ppc_fp128* [[LDB_ADDR]], 
align 16
+// 64BITLE-NEXT:[[RETVAL_REALP:%.*]] = getelementptr inbounds { ppc_fp128, 
ppc_fp128 }, { ppc_fp128, ppc_fp128 }* [[RETVAL]], i32 0, i32 0
+// 64BITLE-NEXT:[[RETVAL_IMAGP:%.*]] = getelementptr inbounds { ppc_fp128, 
ppc_fp128 }, { ppc_fp128, ppc_fp128 }* [[RETVAL]], i32 0, i32 1
+// 64BITLE-NEXT:store ppc_fp128 [[TMP0]], ppc_fp128* [[RETVAL_REALP]], 
align 16
+// 64BITLE-NEXT:store ppc_fp128 [[TMP1]], ppc_fp128* [[RETVAL_IMAGP]], 
align 16
+// 64BITLE-NEXT:[[TMP2:%.*]] = load { ppc_fp128, ppc_fp128 }, { ppc_fp128, 
ppc_fp128 }* [[RETVAL]], align 16
+// 64BITLE-NEXT:ret { ppc_fp128, ppc_fp128 } [[TMP2]]
+//
+// 64BITAIX-LABEL: @test_xl_cmplxl(
+// 64BITAIX-NEXT:  entry:
+// 64BITAIX-NEXT:[[RETVAL:%.*]] = alloca { double, double }, align 4
+// 64BITAIX-NEXT:[[LDA_ADDR:%.*]] = alloca double, align 8
+// 64BITAIX-NEXT:[[LDB_ADDR:%.*]] = alloca double, align 8
+// 64BITAIX-NEXT:store double [[LDA:%.*]], double* [[LDA_ADDR]], align 8
+// 64BITAIX-NEXT:store double [[LDB:%.*]], double* [[LDB_ADDR]], align 8
+// 64BITAIX-NEXT:[[TMP0:%.*]] = load double, double* [[LDA_ADDR]], align 8
+// 64BITAIX-NEXT:[[TMP1:%.*]] = load double, double* [[LDB_ADDR]], align 8
+// 64BITAIX-NEXT:[[RETVAL_REALP:%.*]] = getelementptr inbounds { double, 
double }, { double, double }* [[RETVAL]], i32 0, i32 0
+// 64BITAIX-NEXT:

[PATCH] D105267: [X86] AVX512FP16 instructions enabling 4/6

2021-08-19 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision.
LuoYuanke added a comment.
This revision is now accepted and ready to land.

LGTM, thanks. Please wait 1 or 2 days for comments from others.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105267/new/

https://reviews.llvm.org/D105267

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D105265: [X86] AVX512FP16 instructions enabling 3/6

2021-08-19 Thread Pengfei Wang via Phabricator via cfe-commits
pengfei added a comment.

In D105265#2955329, @vitalybuka wrote:

> I suspect this error from this or D105331 
> https://lab.llvm.org/buildbot/#/builders/85/builds/6132

Thanks @vitalybuka for the information. I didn't receive this buildbot failure 
notice. I found the latest build has turned green, but I didn't find which 
commit fixed it. I'll keep watching it for a while.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105265/new/

https://reviews.llvm.org/D105265



[clang] cab12fc - [DebugInfo] convert btf_tag attrs to annotations for DIComposite types

2021-08-19 Thread Yonghong Song via cfe-commits

Author: Yonghong Song
Date: 2021-08-19T18:01:29-07:00
New Revision: cab12fc28c75ea82b747d636a9d20f0840777299

URL: 
https://github.com/llvm/llvm-project/commit/cab12fc28c75ea82b747d636a9d20f0840777299
DIFF: 
https://github.com/llvm/llvm-project/commit/cab12fc28c75ea82b747d636a9d20f0840777299.diff

LOG: [DebugInfo] convert btf_tag attrs to annotations for DIComposite types

Clang patch D106614 added attribute btf_tag support. This patch
generates btf_tag annotations for DIComposite types.
Each btf_tag annotation is represented as a 2D array of
meta strings. Each record may have more than one
btf_tag annotation.

Differential Revision: https://reviews.llvm.org/D106615

Added: 
clang/test/CodeGen/attr-btf_tag-dicomposite-2.c
clang/test/CodeGen/attr-btf_tag-dicomposite.c

Modified: 
clang/lib/CodeGen/CGDebugInfo.cpp
clang/lib/CodeGen/CGDebugInfo.h

Removed: 




diff  --git a/clang/lib/CodeGen/CGDebugInfo.cpp 
b/clang/lib/CodeGen/CGDebugInfo.cpp
index 81c910f40bf8..967fa7f1493a 100644
--- a/clang/lib/CodeGen/CGDebugInfo.cpp
+++ b/clang/lib/CodeGen/CGDebugInfo.cpp
@@ -2063,6 +2063,17 @@ llvm::DINodeArray CGDebugInfo::CollectCXXTemplateParams(
   return CollectTemplateParams(TPList, TAList.asArray(), Unit);
 }
 
+llvm::DINodeArray CGDebugInfo::CollectBTFTagAnnotations(const Decl *D) {
+  SmallVector<llvm::Metadata *, 4> Annotations;
+  for (const auto *I : D->specific_attrs<BTFTagAttr>()) {
+llvm::Metadata *Ops[2] = {
+llvm::MDString::get(CGM.getLLVMContext(), StringRef("btf_tag")),
+llvm::MDString::get(CGM.getLLVMContext(), I->getBTFTag())};
+Annotations.push_back(llvm::MDNode::get(CGM.getLLVMContext(), Ops));
+  }
+  return DBuilder.getOrCreateArray(Annotations);
+}
+
 llvm::DIType *CGDebugInfo::getOrCreateVTablePtrType(llvm::DIFile *Unit) {
   if (VTablePtrType)
 return VTablePtrType;
@@ -3435,9 +3446,13 @@ llvm::DICompositeType 
*CGDebugInfo::CreateLimitedType(const RecordType *Ty) {
 Flags |= llvm::DINode::FlagExportSymbols;
   }
 
+  llvm::DINodeArray Annotations = nullptr;
+  if (D->hasAttr<BTFTagAttr>())
+Annotations = CollectBTFTagAnnotations(D);
+
   llvm::DICompositeType *RealDecl = DBuilder.createReplaceableCompositeType(
   getTagForRecord(RD), RDName, RDContext, DefUnit, Line, 0, Size, Align,
-  Flags, Identifier);
+  Flags, Identifier, Annotations);
 
   // Elements of composite types usually have back to the type, creating
   // uniquing cycles.  Distinct nodes are more efficient.

diff  --git a/clang/lib/CodeGen/CGDebugInfo.h b/clang/lib/CodeGen/CGDebugInfo.h
index b01165f85a6c..c0674b4511c7 100644
--- a/clang/lib/CodeGen/CGDebugInfo.h
+++ b/clang/lib/CodeGen/CGDebugInfo.h
@@ -292,6 +292,9 @@ class CGDebugInfo {
   CollectCXXTemplateParams(const ClassTemplateSpecializationDecl *TS,
llvm::DIFile *F);
 
+  /// A helper function to collect debug info for btf_tag annotations.
+  llvm::DINodeArray CollectBTFTagAnnotations(const Decl *D);
+
   llvm::DIType *createFieldType(StringRef name, QualType type,
 SourceLocation loc, AccessSpecifier AS,
 uint64_t offsetInBits, uint32_t AlignInBits,

diff  --git a/clang/test/CodeGen/attr-btf_tag-dicomposite-2.c 
b/clang/test/CodeGen/attr-btf_tag-dicomposite-2.c
new file mode 100644
index ..ed937ec28c37
--- /dev/null
+++ b/clang/test/CodeGen/attr-btf_tag-dicomposite-2.c
@@ -0,0 +1,14 @@
+// REQUIRES: x86-registered-target
+// RUN: %clang -target x86_64 -g -S -emit-llvm -o - %s | FileCheck %s
+
+#define __tag1 __attribute__((btf_tag("tag1")))
+#define __tag2 __attribute__((btf_tag("tag2")))
+
+struct __tag1 __tag2 t1;
+
+int foo(struct t1 *arg) {
+  return (int)(long)arg;
+}
+
+// CHECK: define dso_local i32 @foo(
+// CHECK-NOT: annotations

diff  --git a/clang/test/CodeGen/attr-btf_tag-dicomposite.c 
b/clang/test/CodeGen/attr-btf_tag-dicomposite.c
new file mode 100644
index ..514dc4e0ccc1
--- /dev/null
+++ b/clang/test/CodeGen/attr-btf_tag-dicomposite.c
@@ -0,0 +1,52 @@
+// REQUIRES: x86-registered-target
+// RUN: %clang -target x86_64 -g -S -emit-llvm -o - %s | FileCheck %s
+
+#define __tag1 __attribute__((btf_tag("tag1")))
+#define __tag2 __attribute__((btf_tag("tag2")))
+
+struct __tag1 __tag2 t1;
+struct t1 {
+  int a;
+};
+
+int foo(struct t1 *arg) {
+  return arg->a;
+}
+
+// CHECK: distinct !DICompositeType(tag: DW_TAG_structure_type, name: "t1", 
file: ![[#]], line: [[#]], size: 32, elements: ![[#]], annotations: 
![[ANNOT:[0-9]+]])
+// CHECK: ![[ANNOT]] = !{![[TAG1:[0-9]+]], ![[TAG2:[0-9]+]]}
+// CHECK: ![[TAG1]] = !{!"btf_tag", !"tag1"}
+// CHECK: ![[TAG2]] = !{!"btf_tag", !"tag2"}
+
+struct __tag1 t2;
+struct __tag2 t2 {
+  int a;
+};
+
+int foo2(struct t2 *arg) {
+  return arg->a;
+}
+
+// CHECK: distinct !DICompositeType(tag: DW_TAG_structure_type, name: "t2", 
file: ![[#]], line: [[#]], size: 32, elements: ![[#]], annotations: 

[PATCH] D108320: Add semantic token modifier for non-const reference parameter

2021-08-19 Thread Nathan Ridge via Phabricator via cfe-commits
nridge added inline comments.



Comment at: clang-tools-extra/clangd/SemanticHighlighting.cpp:538
+for (size_t I = 0; I < FD->getNumParams(); ++I) {
+  if (const auto *Param = FD->getParamDecl(I)) {
+auto T = Param->getType();

tom-anders wrote:
> sammccall wrote:
> > I feel like you'd be better off using the FunctionProtoType and iterating 
> > over argument types, rather than the argument declarations on a particular 
> > declaration of the function.
> > 
> > e.g. this code is legal in C:
> > ```
> > int x(); // i suspect this is the canonical decl
> > int x(int); // but this one provides the type
> > ```
> > We don't have references in C, of course, but maybe similar issues are 
> > lurking...
> I'm not really sure how to get from the CallExpr to the FunctionProtoType, 
> can you give me a hint? 
I think `FD->getType()->getAs<FunctionProtoType>()` should do it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108320/new/

https://reviews.llvm.org/D108320



[PATCH] D108003: [Clang] Extend -Wbool-operation to warn about bitwise and of bools with side effects

2021-08-19 Thread Dávid Bolvanský via Phabricator via cfe-commits
xbolva00 added a comment.

>> and it would be more of an optimization than correctness issue as far as I 
>> understand

Yeah, this is right, indeed.

Maybe @rpbeltran has some ideas or motivating cases for the OR pattern?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108003/new/

https://reviews.llvm.org/D108003



[PATCH] D108003: [Clang] Extend -Wbool-operation to warn about bitwise and of bools with side effects

2021-08-19 Thread Nathan Chancellor via Phabricator via cfe-commits
nathanchance added a comment.

I am still running a series of builds against the Linux kernel but I already 
see one instance of this warning where the suggestion would change the meaning 
of the code in an incorrect way:

  drivers/input/touchscreen.c:81:17: warning: use of bitwise '|' with boolean 
operands [-Wbool-operation]
  data_present = touchscreen_get_prop_u32(dev, "touchscreen-min-x",
 ^~
  drivers/input/touchscreen.c:81:17: warning: use of bitwise '|' with boolean 
operands [-Wbool-operation]
  data_present = touchscreen_get_prop_u32(dev, "touchscreen-min-x",
 ^~
  drivers/input/touchscreen.c:94:17: warning: use of bitwise '|' with boolean 
operands [-Wbool-operation]
  data_present = touchscreen_get_prop_u32(dev, "touchscreen-min-y",
 ^~
  drivers/input/touchscreen.c:94:17: warning: use of bitwise '|' with boolean 
operands [-Wbool-operation]
  data_present = touchscreen_get_prop_u32(dev, "touchscreen-min-y",
 ^~
  drivers/input/touchscreen.c:108:17: warning: use of bitwise '|' with boolean 
operands [-Wbool-operation]
  data_present = touchscreen_get_prop_u32(dev,
 ^
  5 warnings generated.

Which corresponds to this file: 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/input/touchscreen.c?h=v5.14-rc6

If the calls to `touchscreen_get_prop_u32` short circuit, we could use 
`maximum` or `fuzz` uninitialized. There might be a cleaner way to rewrite that 
block to avoid the warning but based on the other instances of this warning I 
see, I am not sure `|` vs. `||` is worth warning about (but I am happy to hear 
examples of how it could be a bug). Most people realize `&&` short circuits (as 
`if (a && foo(a->...))` is relatively common) but most probably are not 
thinking about `||` short circuiting (and it would be more of an optimization 
than correctness issue as far as I understand it).
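
To make the concern concrete, here is a minimal sketch of the pattern, assuming a hypothetical `get_prop` helper that stands in for `touchscreen_get_prop_u32` (it is not the kernel function, and the values are made up). With `|` both calls always run and both outputs are written; with `||` the second call is skipped once the first returns true.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a property reader: it always writes *out
 * and reports whether the property was present. */
static bool get_prop(bool present, unsigned value, unsigned *out) {
  *out = present ? value : 0;
  return present;
}

/* Bitwise '|': both calls always run, so both outputs are written. */
static bool read_both_bitwise(unsigned *a, unsigned *b) {
  return get_prop(true, 10, a) | get_prop(true, 20, b);
}

/* Logical '||': the second call is skipped once the first returns true,
 * so *b keeps whatever value it had before the call. */
static bool read_both_logical(unsigned *a, unsigned *b) {
  return get_prop(true, 10, a) || get_prop(true, 20, b);
}
```

If the caller had left the second output uninitialized, the `||` version would hand back an indeterminate value, which is the hazard described above.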

Additionally, I have not caught any instances of `&` being used instead of 
`&&`, including the ones I noted previously; those were caught because only 
the right side had side effects. As was pointed out here and on the mailing 
list, the `lib/zstd/` warning is probably a bug, as the short circuit should 
happen if `offset_1` is zero, otherwise there is unnecessary work done.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108003/new/

https://reviews.llvm.org/D108003



[PATCH] D105268: [X86] AVX512FP16 instructions enabling 5/6

2021-08-19 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment.

I understand now. Thanks, Craig.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105268/new/

https://reviews.llvm.org/D105268



[PATCH] D108401: [WebAssembly] Make bitmask instructions return unsigned ints

2021-08-19 Thread Thomas Lively via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGfd3bd63df26a: [WebAssembly] Make bitmask instructions return 
unsigned ints (authored by tlively).

Changed prior to commit:
  https://reviews.llvm.org/D108401?vs=367591&id=367653#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108401/new/

https://reviews.llvm.org/D108401

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/Headers/wasm_simd128.h
  clang/test/Headers/wasm.c


Index: clang/test/Headers/wasm.c
===
--- clang/test/Headers/wasm.c
+++ clang/test/Headers/wasm.c
@@ -1345,7 +1345,7 @@
 // CHECK-NEXT:[[TMP1:%.*]] = tail call i32 @llvm.wasm.bitmask.v16i8(<16 x 
i8> [[TMP0]]) #[[ATTR6]]
 // CHECK-NEXT:ret i32 [[TMP1]]
 //
-int32_t test_i8x16_bitmask(v128_t a) {
+uint32_t test_i8x16_bitmask(v128_t a) {
   return wasm_i8x16_bitmask(a);
 }
 
@@ -1577,7 +1577,7 @@
 // CHECK-NEXT:[[TMP1:%.*]] = tail call i32 @llvm.wasm.bitmask.v8i16(<8 x 
i16> [[TMP0]]) #[[ATTR6]]
 // CHECK-NEXT:ret i32 [[TMP1]]
 //
-int32_t test_i16x8_bitmask(v128_t a) {
+uint32_t test_i16x8_bitmask(v128_t a) {
   return wasm_i16x8_bitmask(a);
 }
 
@@ -1804,7 +1804,7 @@
 // CHECK-NEXT:[[TMP0:%.*]] = tail call i32 @llvm.wasm.bitmask.v4i32(<4 x 
i32> [[A:%.*]]) #[[ATTR6]]
 // CHECK-NEXT:ret i32 [[TMP0]]
 //
-int32_t test_i32x4_bitmask(v128_t a) {
+uint32_t test_i32x4_bitmask(v128_t a) {
   return wasm_i32x4_bitmask(a);
 }
 
@@ -1958,7 +1958,7 @@
 // CHECK-NEXT:[[TMP1:%.*]] = tail call i32 @llvm.wasm.bitmask.v2i64(<2 x 
i64> [[TMP0]]) #[[ATTR6]]
 // CHECK-NEXT:ret i32 [[TMP1]]
 //
-int32_t test_i64x2_bitmask(v128_t a) {
+uint32_t test_i64x2_bitmask(v128_t a) {
   return wasm_i64x2_bitmask(a);
 }
 
Index: clang/lib/Headers/wasm_simd128.h
===
--- clang/lib/Headers/wasm_simd128.h
+++ clang/lib/Headers/wasm_simd128.h
@@ -804,7 +804,7 @@
   return __builtin_wasm_all_true_i8x16((__i8x16)__a);
 }
 
-static __inline__ int32_t __DEFAULT_FN_ATTRS wasm_i8x16_bitmask(v128_t __a) {
+static __inline__ uint32_t __DEFAULT_FN_ATTRS wasm_i8x16_bitmask(v128_t __a) {
   return __builtin_wasm_bitmask_i8x16((__i8x16)__a);
 }
 
@@ -894,7 +894,7 @@
   return __builtin_wasm_all_true_i16x8((__i16x8)__a);
 }
 
-static __inline__ int32_t __DEFAULT_FN_ATTRS wasm_i16x8_bitmask(v128_t __a) {
+static __inline__ uint32_t __DEFAULT_FN_ATTRS wasm_i16x8_bitmask(v128_t __a) {
   return __builtin_wasm_bitmask_i16x8((__i16x8)__a);
 }
 
@@ -985,7 +985,7 @@
   return __builtin_wasm_all_true_i32x4((__i32x4)__a);
 }
 
-static __inline__ int32_t __DEFAULT_FN_ATTRS wasm_i32x4_bitmask(v128_t __a) {
+static __inline__ uint32_t __DEFAULT_FN_ATTRS wasm_i32x4_bitmask(v128_t __a) {
   return __builtin_wasm_bitmask_i32x4((__i32x4)__a);
 }
 
@@ -1056,7 +1056,7 @@
   return __builtin_wasm_all_true_i64x2((__i64x2)__a);
 }
 
-static __inline__ int32_t __DEFAULT_FN_ATTRS wasm_i64x2_bitmask(v128_t __a) {
+static __inline__ uint32_t __DEFAULT_FN_ATTRS wasm_i64x2_bitmask(v128_t __a) {
   return __builtin_wasm_bitmask_i64x2((__i64x2)__a);
 }
 
Index: clang/include/clang/Basic/BuiltinsWebAssembly.def
===
--- clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -119,10 +119,10 @@
 TARGET_BUILTIN(__builtin_wasm_all_true_i32x4, "iV4i", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_all_true_i64x2, "iV2LLi", "nc", "simd128")
 
-TARGET_BUILTIN(__builtin_wasm_bitmask_i8x16, "iV16Sc", "nc", "simd128")
-TARGET_BUILTIN(__builtin_wasm_bitmask_i16x8, "iV8s", "nc", "simd128")
-TARGET_BUILTIN(__builtin_wasm_bitmask_i32x4, "iV4i", "nc", "simd128")
-TARGET_BUILTIN(__builtin_wasm_bitmask_i64x2, "iV2LLi", "nc", "simd128")
+TARGET_BUILTIN(__builtin_wasm_bitmask_i8x16, "UiV16Sc", "nc", "simd128")
+TARGET_BUILTIN(__builtin_wasm_bitmask_i16x8, "UiV8s", "nc", "simd128")
+TARGET_BUILTIN(__builtin_wasm_bitmask_i32x4, "UiV4i", "nc", "simd128")
+TARGET_BUILTIN(__builtin_wasm_bitmask_i64x2, "UiV2LLi", "nc", "simd128")
 
 TARGET_BUILTIN(__builtin_wasm_abs_f32x4, "V4fV4f", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_abs_f64x2, "V2dV2d", "nc", "simd128")



[clang] fd3bd63 - [WebAssembly] Make bitmask instructions return unsigned ints

2021-08-19 Thread Thomas Lively via cfe-commits

Author: Thomas Lively
Date: 2021-08-19T16:23:47-07:00
New Revision: fd3bd63df26ad0a3865fd1fcbdbbd0063f2b0761

URL: 
https://github.com/llvm/llvm-project/commit/fd3bd63df26ad0a3865fd1fcbdbbd0063f2b0761
DIFF: 
https://github.com/llvm/llvm-project/commit/fd3bd63df26ad0a3865fd1fcbdbbd0063f2b0761.diff

LOG: [WebAssembly] Make bitmask instructions return unsigned ints

Since they are bitmasks, it will be more common for them to be used and
potentially extended to 64-bit integers as unsigned values rather than signed
values.
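
The difference shows up when a 32-bit mask is widened to 64 bits. The sketch below is only an illustration in plain C; `widen_signed` and `widen_unsigned` are hypothetical names and are not part of `wasm_simd128.h`.

```c
#include <stdint.h>

/* Widening a signed 32-bit mask sign-extends: the top bit is copied
 * into the upper 32 bits of the result. */
static uint64_t widen_signed(int32_t mask) {
  return (uint64_t)(int64_t)mask;
}

/* Widening an unsigned 32-bit mask zero-extends: the upper 32 bits
 * of the result stay clear. */
static uint64_t widen_unsigned(uint32_t mask) {
  return (uint64_t)mask;
}
```

With an unsigned return type, callers that store the mask in a 64-bit variable get the zero-extending behaviour without an explicit cast.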

Differential Revision: https://reviews.llvm.org/D108401

Added: 


Modified: 
clang/include/clang/Basic/BuiltinsWebAssembly.def
clang/lib/Headers/wasm_simd128.h
clang/test/Headers/wasm.c

Removed: 




diff  --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def 
b/clang/include/clang/Basic/BuiltinsWebAssembly.def
index 04ec45aa3b747..51a819efdee07 100644
--- a/clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -119,10 +119,10 @@ TARGET_BUILTIN(__builtin_wasm_all_true_i16x8, "iV8s", 
"nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_all_true_i32x4, "iV4i", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_all_true_i64x2, "iV2LLi", "nc", "simd128")
 
-TARGET_BUILTIN(__builtin_wasm_bitmask_i8x16, "iV16Sc", "nc", "simd128")
-TARGET_BUILTIN(__builtin_wasm_bitmask_i16x8, "iV8s", "nc", "simd128")
-TARGET_BUILTIN(__builtin_wasm_bitmask_i32x4, "iV4i", "nc", "simd128")
-TARGET_BUILTIN(__builtin_wasm_bitmask_i64x2, "iV2LLi", "nc", "simd128")
+TARGET_BUILTIN(__builtin_wasm_bitmask_i8x16, "UiV16Sc", "nc", "simd128")
+TARGET_BUILTIN(__builtin_wasm_bitmask_i16x8, "UiV8s", "nc", "simd128")
+TARGET_BUILTIN(__builtin_wasm_bitmask_i32x4, "UiV4i", "nc", "simd128")
+TARGET_BUILTIN(__builtin_wasm_bitmask_i64x2, "UiV2LLi", "nc", "simd128")
 
 TARGET_BUILTIN(__builtin_wasm_abs_f32x4, "V4fV4f", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_abs_f64x2, "V2dV2d", "nc", "simd128")

diff  --git a/clang/lib/Headers/wasm_simd128.h 
b/clang/lib/Headers/wasm_simd128.h
index 712fa03780986..e43c31a36e776 100644
--- a/clang/lib/Headers/wasm_simd128.h
+++ b/clang/lib/Headers/wasm_simd128.h
@@ -804,7 +804,7 @@ static __inline__ bool __DEFAULT_FN_ATTRS 
wasm_i8x16_all_true(v128_t __a) {
   return __builtin_wasm_all_true_i8x16((__i8x16)__a);
 }
 
-static __inline__ int32_t __DEFAULT_FN_ATTRS wasm_i8x16_bitmask(v128_t __a) {
+static __inline__ uint32_t __DEFAULT_FN_ATTRS wasm_i8x16_bitmask(v128_t __a) {
   return __builtin_wasm_bitmask_i8x16((__i8x16)__a);
 }
 
@@ -894,7 +894,7 @@ static __inline__ bool __DEFAULT_FN_ATTRS 
wasm_i16x8_all_true(v128_t __a) {
   return __builtin_wasm_all_true_i16x8((__i16x8)__a);
 }
 
-static __inline__ int32_t __DEFAULT_FN_ATTRS wasm_i16x8_bitmask(v128_t __a) {
+static __inline__ uint32_t __DEFAULT_FN_ATTRS wasm_i16x8_bitmask(v128_t __a) {
   return __builtin_wasm_bitmask_i16x8((__i16x8)__a);
 }
 
@@ -985,7 +985,7 @@ static __inline__ bool __DEFAULT_FN_ATTRS 
wasm_i32x4_all_true(v128_t __a) {
   return __builtin_wasm_all_true_i32x4((__i32x4)__a);
 }
 
-static __inline__ int32_t __DEFAULT_FN_ATTRS wasm_i32x4_bitmask(v128_t __a) {
+static __inline__ uint32_t __DEFAULT_FN_ATTRS wasm_i32x4_bitmask(v128_t __a) {
   return __builtin_wasm_bitmask_i32x4((__i32x4)__a);
 }
 
@@ -1056,7 +1056,7 @@ static __inline__ bool __DEFAULT_FN_ATTRS 
wasm_i64x2_all_true(v128_t __a) {
   return __builtin_wasm_all_true_i64x2((__i64x2)__a);
 }
 
-static __inline__ int32_t __DEFAULT_FN_ATTRS wasm_i64x2_bitmask(v128_t __a) {
+static __inline__ uint32_t __DEFAULT_FN_ATTRS wasm_i64x2_bitmask(v128_t __a) {
   return __builtin_wasm_bitmask_i64x2((__i64x2)__a);
 }
 

diff  --git a/clang/test/Headers/wasm.c b/clang/test/Headers/wasm.c
index f51f005974f23..c2f412c445199 100644
--- a/clang/test/Headers/wasm.c
+++ b/clang/test/Headers/wasm.c
@@ -1345,7 +1345,7 @@ bool test_i8x16_all_true(v128_t a) {
 // CHECK-NEXT:[[TMP1:%.*]] = tail call i32 @llvm.wasm.bitmask.v16i8(<16 x 
i8> [[TMP0]]) #[[ATTR6]]
 // CHECK-NEXT:ret i32 [[TMP1]]
 //
-int32_t test_i8x16_bitmask(v128_t a) {
+uint32_t test_i8x16_bitmask(v128_t a) {
   return wasm_i8x16_bitmask(a);
 }
 
@@ -1577,7 +1577,7 @@ bool test_i16x8_all_true(v128_t a) {
 // CHECK-NEXT:[[TMP1:%.*]] = tail call i32 @llvm.wasm.bitmask.v8i16(<8 x 
i16> [[TMP0]]) #[[ATTR6]]
 // CHECK-NEXT:ret i32 [[TMP1]]
 //
-int32_t test_i16x8_bitmask(v128_t a) {
+uint32_t test_i16x8_bitmask(v128_t a) {
   return wasm_i16x8_bitmask(a);
 }
 
@@ -1804,7 +1804,7 @@ bool test_i32x4_all_true(v128_t a) {
 // CHECK-NEXT:[[TMP0:%.*]] = tail call i32 @llvm.wasm.bitmask.v4i32(<4 x 
i32> [[A:%.*]]) #[[ATTR6]]
 // CHECK-NEXT:ret i32 [[TMP0]]
 //
-int32_t test_i32x4_bitmask(v128_t a) {
+uint32_t test_i32x4_bitmask(v128_t a) {
   return wasm_i32x4_bitmask(a);
 }
 
@@ -1958,7 +1958,7 @@ bool test_i64x2_all_true(v128_t a) {
 // CHECK-NEXT:

[PATCH] D54943: [clang-tidy] implement const-transformation for cppcoreguidelines-const-correctness

2021-08-19 Thread Tiago Macarios via Phabricator via cfe-commits
tiagoma added a comment.

I am getting false positives with

  struct S{};
  
  void f(__unaligned S*);
  
  void scope()
  {
    S s;
    f(&s);
  }


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D54943/new/

https://reviews.llvm.org/D54943

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D108424: [NFC][clang] Move multiversion resolver code generation to llvm/ subdirectory

2021-08-19 Thread Andrei Elovikov via Phabricator via cfe-commits
a.elovikov created this revision.
a.elovikov added reviewers: erichkeane, craig.topper.
Herald added subscribers: pengfei, hiraditya, tpr, mgorny.
a.elovikov requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

Part of the target multiversioning support already resides in
llvm/lib/Support/X86TargetParser.cpp. However, the IR generation could not be
put there because of component dependencies.

I think Transforms/Utils is a good place for this kind of utility, similar to
the AMDGPUEmitPrintf functionality already there. The change allows the
functionality to be used outside clang, as it isn't C/C++-specific.
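
To illustrate the dispatch order the resolver encodes, here is a minimal
host-side sketch in plain C++. It is not the code from this patch:
ResolverOption, resolve, and the feature bit positions are hypothetical
stand-ins, and the descending sort mirrors what clang does for cpu_dispatch
option lists rather than anything the resolver itself performs. Candidates are
tried from the largest feature mask down, the default (mask 0) comes last, and
an empty result marks the point where the real resolver emits a trap.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for MultiVersionResolverOption: a candidate function
// name plus the feature bits it requires (0 means the 'default' version).
struct ResolverOption {
  std::string Name;
  std::uint64_t RequiredFeatures;
};

// Returns the name of the first option whose required features are all
// present in HostFeatures. Options are tried in descending mask order, so the
// default (mask 0) is always tried last. Returns an empty string when nothing
// matches.
inline std::string resolve(std::vector<ResolverOption> Options,
                           std::uint64_t HostFeatures) {
  std::stable_sort(Options.begin(), Options.end(),
                   [](const ResolverOption &LHS, const ResolverOption &RHS) {
                     return LHS.RequiredFeatures > RHS.RequiredFeatures;
                   });
  for (const ResolverOption &RO : Options)
    if ((HostFeatures & RO.RequiredFeatures) == RO.RequiredFeatures)
      return RO.Name;
  return std::string();
}
```

Calling resolve with a host mask that covers several candidates returns the
one with the numerically largest mask, which is why the default has to sit at
the end of the list.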


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D108424

Files:
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/CodeGen/CodeGenFunction.cpp
  clang/lib/CodeGen/CodeGenFunction.h
  clang/lib/CodeGen/CodeGenModule.cpp
  llvm/include/llvm/Transforms/Utils/X86EmitMultiVersionResolver.h
  llvm/lib/Transforms/Utils/CMakeLists.txt
  llvm/lib/Transforms/Utils/X86EmitMultiVersionResolver.cpp

Index: llvm/lib/Transforms/Utils/X86EmitMultiVersionResolver.cpp
===
--- /dev/null
+++ llvm/lib/Transforms/Utils/X86EmitMultiVersionResolver.cpp
@@ -0,0 +1,224 @@
+//===-- X86EmitMultiVersionResolver -*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This file implements utitlities to generate code used for CPU dispatch code.
+//
+//===--===//
+
+#include "llvm/Transforms/Utils/X86EmitMultiVersionResolver.h"
+#include "llvm/ADT/StringSwitch.h"
+#include "llvm/IR/Function.h"
+#include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/Type.h"
+#include "llvm/Support/X86TargetParser.h"
+
+using namespace llvm;
+using namespace llvm::X86;
+
+Value *llvm::formResolverCondition(IRBuilderBase &Builder,
+const MultiVersionResolverOption &RO) {
+  llvm::Value *Condition = nullptr;
+
+  if (!RO.Conditions.Architecture.empty())
+Condition = llvm::X86::emitCpuIs(Builder, RO.Conditions.Architecture);
+  if (!RO.Conditions.Features.empty()) {
+llvm::Value *FeatureCond =
+llvm::X86::emitCpuSupports(Builder, RO.Conditions.Features);
+Condition =
+Condition ? Builder.CreateAnd(Condition, FeatureCond) : FeatureCond;
+  }
+  return Condition;
+}
+
+static void CreateMultiVersionResolverReturn(Function *Resolver,
+ IRBuilderBase &Builder,
+ Function *FuncToReturn,
+ bool UseIFunc) {
+  if (UseIFunc) {
+Builder.CreateRet(FuncToReturn);
+return;
+  }
+
+  SmallVector Args;
+  for_each(Resolver->args(), [&](Argument ) { Args.push_back(); });
+
+  CallInst *Result = Builder.CreateCall(FuncToReturn, Args);
+  Result->setTailCallKind(CallInst::TCK_MustTail);
+
+  if (Resolver->getReturnType()->isVoidTy())
+Builder.CreateRetVoid();
+  else
+Builder.CreateRet(Result);
+}
+
+void llvm::emitMultiVersionResolver(
+Function *Resolver, ArrayRef<MultiVersionResolverOption> Options,
+bool UseIFunc) {
+  assert(Triple(Resolver->getParent()->getTargetTriple()).isX86() &&
+ "Only implemented for x86 targets");
+
+  auto &Ctx = Resolver->getContext();
+  // Main function's basic block.
+  BasicBlock *CurBlock = BasicBlock::Create(Ctx, "resolver_entry", Resolver);
+
+  IRBuilder<> Builder(CurBlock, CurBlock->begin());
+  llvm::X86::emitCPUInit(Builder);
+
+  for (const MultiVersionResolverOption &RO : Options) {
+Builder.SetInsertPoint(CurBlock);
+llvm::Value *Condition = formResolverCondition(Builder, RO);
+
+// The 'default' or 'generic' case.
+if (!Condition) {
+  assert(&RO == Options.end() - 1 &&
+ "Default or Generic case must be last");
+  CreateMultiVersionResolverReturn(Resolver, Builder, RO.Fn, UseIFunc);
+  return;
+}
+
+llvm::BasicBlock *RetBlock =
+BasicBlock::Create(Ctx, "resolver_return", Resolver);
+Builder.SetInsertPoint(RetBlock);
+CreateMultiVersionResolverReturn(Resolver, Builder, RO.Fn, UseIFunc);
+CurBlock = BasicBlock::Create(Ctx, "resolver_else", Resolver);
+Builder.CreateCondBr(Condition, RetBlock, CurBlock);
+  }
+
+  // If no generic/default, emit an unreachable.
+  Builder.SetInsertPoint(CurBlock);
+  CallInst *TrapCall = Builder.CreateIntrinsic(Intrinsic::trap, {}, {});
+  TrapCall->setDoesNotReturn();
+  TrapCall->setDoesNotThrow();
+  Builder.CreateUnreachable();
+}
+
+static Type *getCpuModelType(IRBuilderBase &Builder) {
+  Type *Int32Ty = Builder.getInt32Ty();
+
+  // Matching the struct layout from 

[PATCH] D108423: [NFC][clang] Move IR-independent parts of target MV support to X86TargetParser.cpp

2021-08-19 Thread Andrei Elovikov via Phabricator via cfe-commits
a.elovikov created this revision.
a.elovikov added reviewers: erichkeane, craig.topper.
Herald added a subscriber: hiraditya.
a.elovikov requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

...that is located under llvm/lib/Support/.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D108423

Files:
  clang/lib/Basic/Targets/X86.cpp
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/CodeGen/CodeGenFunction.h
  clang/lib/CodeGen/CodeGenModule.cpp
  llvm/include/llvm/Support/X86TargetParser.h
  llvm/lib/Support/X86TargetParser.cpp

Index: llvm/lib/Support/X86TargetParser.cpp
===
--- llvm/lib/Support/X86TargetParser.cpp
+++ llvm/lib/Support/X86TargetParser.cpp
@@ -11,7 +11,9 @@
 //===--===//
 
 #include "llvm/Support/X86TargetParser.h"
+#include "llvm/ADT/StringSwitch.h"
 #include "llvm/ADT/Triple.h"
+#include 
 
 using namespace llvm;
 using namespace llvm::X86;
@@ -662,3 +664,43 @@
 if (ImpliedBits[i] && !FeatureInfos[i].Name.empty())
   Features[FeatureInfos[i].Name] = Enabled;
 }
+
+uint64_t llvm::X86::getCpuSupportsMask(ArrayRef<StringRef> FeatureStrs) {
+  // Processor features and mapping to processor feature value.
+  uint64_t FeaturesMask = 0;
+  for (const StringRef &FeatureStr : FeatureStrs) {
+unsigned Feature = StringSwitch<unsigned>(FeatureStr)
+#define X86_FEATURE_COMPAT(ENUM, STR, PRIORITY)\
+  .Case(STR, llvm::X86::FEATURE_##ENUM)
+#include "llvm/Support/X86TargetParser.def"
+;
+FeaturesMask |= (1ULL << Feature);
+  }
+  return FeaturesMask;
+}
+
+unsigned llvm::X86::getFeaturePriority(ProcessorFeatures Feat) {
+#ifndef NDEBUG
+  // Check that priorities are set properly in the .def file, i.e.
+#define X86_FEATURE_COMPAT(ENUM, STR, PRIORITY) PRIORITY,
+  unsigned Priorities[] = {
+#include "llvm/Support/X86TargetParser.def"
+  std::numeric_limits<unsigned>::max() // Need to consume last comma.
+  };
+  std::array HelperList;
+  std::iota(HelperList.begin(), HelperList.end(), 0);
+  assert(std::is_permutation(HelperList.begin(), HelperList.end(),
+ std::begin(Priorities),
+ std::prev(std::end(Priorities))) &&
+ "Priorites don't form consecutive range!");
+#endif
+
+  switch (Feat) {
+#define X86_FEATURE_COMPAT(ENUM, STR, PRIORITY)\
+  case llvm::X86::FEATURE_##ENUM:  \
+return PRIORITY;
+#include "llvm/Support/X86TargetParser.def"
+  default:
+llvm_unreachable("No Feature Priority for non-CPUSupports Features");
+  }
+}
Index: llvm/include/llvm/Support/X86TargetParser.h
===
--- llvm/include/llvm/Support/X86TargetParser.h
+++ llvm/include/llvm/Support/X86TargetParser.h
@@ -13,6 +13,7 @@
 #ifndef LLVM_SUPPORT_X86TARGETPARSER_H
 #define LLVM_SUPPORT_X86TARGETPARSER_H
 
+#include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringMap.h"
 
@@ -154,6 +155,9 @@
 void updateImpliedFeatures(StringRef Feature, bool Enabled,
StringMap );
 
+uint64_t getCpuSupportsMask(ArrayRef<StringRef> FeatureStrs);
+unsigned getFeaturePriority(ProcessorFeatures Feat);
+
 } // namespace X86
 } // namespace llvm
 
Index: clang/lib/CodeGen/CodeGenModule.cpp
===
--- clang/lib/CodeGen/CodeGenModule.cpp
+++ clang/lib/CodeGen/CodeGenModule.cpp
@@ -63,6 +63,7 @@
 #include "llvm/Support/ErrorHandling.h"
 #include "llvm/Support/MD5.h"
 #include "llvm/Support/TimeProfiler.h"
+#include "llvm/Support/X86TargetParser.h"
 
 using namespace clang;
 using namespace CodeGen;
@@ -3397,8 +3398,8 @@
   llvm::stable_sort(
   Options, [](const CodeGenFunction::MultiVersionResolverOption &LHS,
   const CodeGenFunction::MultiVersionResolverOption &RHS) {
-return CodeGenFunction::GetX86CpuSupportsMask(LHS.Conditions.Features) >
-   CodeGenFunction::GetX86CpuSupportsMask(RHS.Conditions.Features);
+return llvm::X86::getCpuSupportsMask(LHS.Conditions.Features) >
+   llvm::X86::getCpuSupportsMask(RHS.Conditions.Features);
   });
 
   // If the list contains multiple 'default' versions, such as when it contains
@@ -3406,7 +3407,7 @@
   // always run on at least a 'pentium'). We do this by deleting the 'least
   // advanced' (read, lowest mangling letter).
   while (Options.size() > 1 &&
- CodeGenFunction::GetX86CpuSupportsMask(
+ llvm::X86::getCpuSupportsMask(
  (Options.end() - 2)->Conditions.Features) == 0) {
 StringRef LHSName = (Options.end() - 2)->Function->getName();
 StringRef RHSName = (Options.end() - 1)->Function->getName();
Index: clang/lib/CodeGen/CodeGenFunction.h

[PATCH] D108422: [NFC][clang] Move remaining part of X86Target.def to llvm/Support/X86TargetParser.def

2021-08-19 Thread Andrei Elovikov via Phabricator via cfe-commits
a.elovikov created this revision.
a.elovikov added reviewers: erichkeane, craig.topper.
a.elovikov requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D108422

Files:
  clang/include/clang/Basic/X86Target.def
  clang/lib/Basic/Targets/X86.cpp
  llvm/include/llvm/Support/X86TargetParser.def

Index: llvm/include/llvm/Support/X86TargetParser.def
===
--- llvm/include/llvm/Support/X86TargetParser.def
+++ llvm/include/llvm/Support/X86TargetParser.def
@@ -208,3 +208,50 @@
 X86_FEATURE   (LVI_LOAD_HARDENING,  "lvi-load-hardening")
 #undef X86_FEATURE_COMPAT
 #undef X86_FEATURE
+
+#ifndef CPU_SPECIFIC
+#define CPU_SPECIFIC(NAME, MANGLING, FEATURES)
+#endif
+
+#ifndef CPU_SPECIFIC_ALIAS
+#define CPU_SPECIFIC_ALIAS(NEW_NAME, NAME)
+#endif
+
+// FIXME: When commented out features are supported in LLVM, enable them here.
+CPU_SPECIFIC("generic", 'A', "")
+CPU_SPECIFIC("pentium", 'B', "")
+CPU_SPECIFIC("pentium_pro", 'C', "+cmov")
+CPU_SPECIFIC("pentium_mmx", 'D', "+mmx")
+CPU_SPECIFIC("pentium_ii", 'E', "+cmov,+mmx")
+CPU_SPECIFIC("pentium_iii", 'H', "+cmov,+mmx,+sse")
+CPU_SPECIFIC_ALIAS("pentium_iii_no_xmm_regs", "pentium_iii")
+CPU_SPECIFIC("pentium_4", 'J', "+cmov,+mmx,+sse,+sse2")
+CPU_SPECIFIC("pentium_m", 'K', "+cmov,+mmx,+sse,+sse2")
+CPU_SPECIFIC("pentium_4_sse3", 'L', "+cmov,+mmx,+sse,+sse2,+sse3")
+CPU_SPECIFIC("core_2_duo_ssse3", 'M', "+cmov,+mmx,+sse,+sse2,+sse3,+ssse3")
+CPU_SPECIFIC("core_2_duo_sse4_1", 'N', "+cmov,+mmx,+sse,+sse2,+sse3,+ssse3,+sse4.1")
+CPU_SPECIFIC("atom", 'O', "+cmov,+mmx,+sse,+sse2,+sse3,+ssse3,+movbe")
+CPU_SPECIFIC("atom_sse4_2", 'c', "+cmov,+mmx,+sse,+sse2,+sse3,+ssse3,+sse4.1,+sse4.2,+popcnt")
+CPU_SPECIFIC("core_i7_sse4_2", 'P', "+cmov,+mmx,+sse,+sse2,+sse3,+ssse3,+sse4.1,+sse4.2,+popcnt")
+CPU_SPECIFIC("core_aes_pclmulqdq", 'Q', "+cmov,+mmx,+sse,+sse2,+sse3,+ssse3,+sse4.1,+sse4.2,+popcnt")
+CPU_SPECIFIC("atom_sse4_2_movbe", 'd', "+cmov,+mmx,+sse,+sse2,+sse3,+ssse3,+sse4.1,+sse4.2,+movbe,+popcnt")
+CPU_SPECIFIC("goldmont", 'i', "+cmov,+mmx,+sse,+sse2,+sse3,+ssse3,+sse4.1,+sse4.2,+movbe,+popcnt")
+CPU_SPECIFIC("sandybridge", 'R', "+cmov,+mmx,+sse,+sse2,+sse3,+ssse3,+sse4.1,+sse4.2,+popcnt,+avx")
+CPU_SPECIFIC_ALIAS("core_2nd_gen_avx", "sandybridge")
+CPU_SPECIFIC("ivybridge", 'S', "+cmov,+mmx,+sse,+sse2,+sse3,+ssse3,+sse4.1,+sse4.2,+popcnt,+f16c,+avx")
+CPU_SPECIFIC_ALIAS("core_3rd_gen_avx", "ivybridge")
+CPU_SPECIFIC("haswell", 'V', "+cmov,+mmx,+sse,+sse2,+sse3,+ssse3,+sse4.1,+sse4.2,+movbe,+popcnt,+f16c,+avx,+fma,+bmi,+lzcnt,+avx2")
+CPU_SPECIFIC_ALIAS("core_4th_gen_avx", "haswell")
+CPU_SPECIFIC("core_4th_gen_avx_tsx", 'W', "+cmov,+mmx,+sse,+sse2,+sse3,+ssse3,+sse4.1,+sse4.2,+movbe,+popcnt,+f16c,+avx,+fma,+bmi,+lzcnt,+avx2")
+CPU_SPECIFIC("broadwell", 'X', "+cmov,+mmx,+sse,+sse2,+sse3,+ssse3,+sse4.1,+sse4.2,+movbe,+popcnt,+f16c,+avx,+fma,+bmi,+lzcnt,+avx2,+adx")
+CPU_SPECIFIC_ALIAS("core_5th_gen_avx", "broadwell")
+CPU_SPECIFIC("core_5th_gen_avx_tsx", 'Y', "+cmov,+mmx,+sse,+sse2,+sse3,+ssse3,+sse4.1,+sse4.2,+movbe,+popcnt,+f16c,+avx,+fma,+bmi,+lzcnt,+avx2,+adx")
+CPU_SPECIFIC("knl", 'Z', "+cmov,+mmx,+sse,+sse2,+sse3,+ssse3,+sse4.1,+sse4.2,+movbe,+popcnt,+f16c,+avx,+fma,+bmi,+lzcnt,+avx2,+avx512f,+adx,+avx512er,+avx512pf,+avx512cd")
+CPU_SPECIFIC_ALIAS("mic_avx512", "knl")
+CPU_SPECIFIC("skylake", 'b', "+cmov,+mmx,+sse,+sse2,+sse3,+ssse3,+sse4.1,+sse4.2,+movbe,+popcnt,+f16c,+avx,+fma,+bmi,+lzcnt,+avx2,+adx,+mpx")
+CPU_SPECIFIC( "skylake_avx512", 'a', "+cmov,+mmx,+sse,+sse2,+sse3,+ssse3,+sse4.1,+sse4.2,+movbe,+popcnt,+f16c,+avx,+fma,+bmi,+lzcnt,+avx2,+avx512dq,+avx512f,+adx,+avx512cd,+avx512bw,+avx512vl,+clwb")
+CPU_SPECIFIC("cannonlake", 'e', "+cmov,+mmx,+sse,+sse2,+sse3,+ssse3,+sse4.1,+sse4.2,+movbe,+popcnt,+f16c,+avx,+fma,+bmi,+lzcnt,+avx2,+avx512dq,+avx512f,+adx,+avx512ifma,+avx512cd,+avx512bw,+avx512vl,+avx512vbmi")
+CPU_SPECIFIC("knm", 'j', "+cmov,+mmx,+sse,+sse2,+sse3,+ssse3,+sse4.1,+sse4.2,+movbe,+popcnt,+f16c,+avx,+fma,+bmi,+lzcnt,+avx2,+avx512f,+adx,+avx512er,+avx512pf,+avx512cd,+avx5124fmaps,+avx5124vnniw,+avx512vpopcntdq")
+
+#undef CPU_SPECIFIC_ALIAS
+#undef CPU_SPECIFIC
Index: clang/lib/Basic/Targets/X86.cpp
===
--- clang/lib/Basic/Targets/X86.cpp
+++ clang/lib/Basic/Targets/X86.cpp
@@ -1103,21 +1103,21 @@
   return llvm::StringSwitch(Name)
 #define CPU_SPECIFIC(NAME, MANGLING, FEATURES) .Case(NAME, true)
 #define CPU_SPECIFIC_ALIAS(NEW_NAME, NAME) .Case(NEW_NAME, true)
-#include "clang/Basic/X86Target.def"
+#include "llvm/Support/X86TargetParser.def"
   .Default(false);
 }
 
 static StringRef CPUSpecificCPUDispatchNameDealias(StringRef Name) {
   return llvm::StringSwitch(Name)
 #define CPU_SPECIFIC_ALIAS(NEW_NAME, NAME) .Case(NEW_NAME, NAME)
-#include "clang/Basic/X86Target.def"

[PATCH] D108421: Mark openmp internal global dso_local

2021-08-19 Thread kamlesh kumar via Phabricator via cfe-commits
kamleshbhalui created this revision.
kamleshbhalui added reviewers: MaskRay, pengfei.
kamleshbhalui added a project: OpenMP.
Herald added subscribers: guansong, yaxunl.
kamleshbhalui requested review of this revision.
Herald added a reviewer: jdoerfert.
Herald added subscribers: cfe-commits, sstefan1.
Herald added a project: clang.

Starting with clang-12, OpenMP generates internal global variables with GOT
relocations even when the static relocation model is enabled.
In clang-11, shouldAssumeDSOLocal assumed these globals were dso_local based on
the static relocation model.
Since shouldAssumeDSOLocal has been cleaned up to respect the dso_local
generated by the frontend, mark OpenMP internal globals as dso_local.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D108421

Files:
  clang/lib/CodeGen/CGOpenMPRuntime.cpp


Index: clang/lib/CodeGen/CGOpenMPRuntime.cpp
===
--- clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -2179,11 +2179,14 @@
 return &*Elem.second;
   }
 
-  return Elem.second = new llvm::GlobalVariable(
+  llvm::GlobalVariable *GV = new llvm::GlobalVariable(
  CGM.getModule(), Ty, /*IsConstant*/ false,
  llvm::GlobalValue::CommonLinkage, llvm::Constant::getNullValue(Ty),
  Elem.first(), /*InsertBefore=*/nullptr,
  llvm::GlobalValue::NotThreadLocal, AddressSpace);
+  GV->setDSOLocal(true);
+  Elem.second = GV;
+  return Elem.second;
 }
 
 llvm::Value *CGOpenMPRuntime::getCriticalRegionLock(StringRef CriticalName) {




[PATCH] D108401: [WebAssembly] Make bitmask instructions return unsigned ints

2021-08-19 Thread Thomas Lively via Phabricator via cfe-commits
tlively added a comment.

Thanks! Will move the relevant test changes up from 
https://reviews.llvm.org/D108412 to here before landing.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108401/new/

https://reviews.llvm.org/D108401



[PATCH] D108151: [NFC][clang] Use X86 Features declaration from X86TargetParser

2021-08-19 Thread Andrei Elovikov via Phabricator via cfe-commits
a.elovikov marked 3 inline comments as done.
a.elovikov added inline comments.



Comment at: clang/lib/Basic/Targets/X86.cpp:1071
+assert(llvm::is_contained(Priorities, Priority) &&
+   "Priorites don't form consequtive range!");
+  }

erichkeane wrote:
> erichkeane wrote:
> > craig.topper wrote:
> > > erichkeane wrote:
> > > > If all you care about is whether they are a consecutive range, why not 
> > > > just use `std::is_sorted`?
> > > The Priorities array isn't sorted. It's just whatever order the 
> > > X86_FEATURE_COMPAT lists them.
> > > 
> > > The values need to be unique and in a contiguous range.
> > Then I'd suggest something like: `llvm::sort`, then `assert *(end - 1) - 
> > *begin == std::distance(begin, end) && llvm::adjacent_find` or something.
> > 
> > I definitely didn't get that point out of this odd for-loop and 
> > is_contained.  There is perhaps a trick with std::min and std::max too.  
> > Though, it looks like this is perhaps trying to prove that the range is 0 
> > to the array size, right?  In that case, perhaps there is something 
> > easier.
> > 
> > Also a nit, it is `consecutive` in that case.
> Actually...
> 
> std::array HelperList;
> std::iota(HelperList.begin(), HelperList.end());
> std::is_permutation(HelperList.begin(), HelperList.end(),  
> std::begin(Priorities), std::end(Priorities));
I thought about std::sort + std::iota + std::equal but wasn't sure how readable 
that would be with the extra helper object (range version isn't available in 
C++14) and hoped my version would be more compact.

Since it wasn't obvious what it does I shamelessly copied your suggestion (and 
learnt about std::is_permutation and array_lengthof when doing it).

Thanks!
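
For reference, the std::iota + std::is_permutation check discussed above can
be sketched in isolation as below. This is a standalone illustration, not the
code in the patch: formsConsecutiveRange is a hypothetical helper name, and it
takes a plain array instead of the list expanded from X86TargetParser.def. It
returns true only when the array holds every value in [0, N) exactly once, in
any order, which is the "unique and contiguous" property the assert wants.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <iterator>
#include <numeric>

// Returns true when Priorities holds each value in [0, N) exactly once,
// regardless of order.
template <std::size_t N>
bool formsConsecutiveRange(const unsigned (&Priorities)[N]) {
  // Build the reference sequence 0, 1, ..., N-1.
  std::array<unsigned, N> Expected;
  std::iota(Expected.begin(), Expected.end(), 0u);
  // The four-iterator overload (C++14) also compares the range lengths.
  return std::is_permutation(Expected.begin(), Expected.end(),
                             std::begin(Priorities), std::end(Priorities));
}
```

No sorting is needed, so the input can stay in whatever order the .def file
lists it; a duplicate or a gap makes the check fail.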


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108151/new/

https://reviews.llvm.org/D108151



[PATCH] D108151: [NFC][clang] Use X86 Features declaration from X86TargetParser

2021-08-19 Thread Andrei Elovikov via Phabricator via cfe-commits
a.elovikov updated this revision to Diff 367640.
a.elovikov added a comment.

Apply reviewers' suggestions. Thanks!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108151/new/

https://reviews.llvm.org/D108151

Files:
  clang/include/clang/Basic/X86Target.def
  clang/lib/Basic/Targets/X86.cpp
  clang/lib/CodeGen/CGBuiltin.cpp
  llvm/include/llvm/Support/X86TargetParser.def

Index: llvm/include/llvm/Support/X86TargetParser.def
===
--- llvm/include/llvm/Support/X86TargetParser.def
+++ llvm/include/llvm/Support/X86TargetParser.def
@@ -91,54 +91,59 @@
 X86_CPU_SUBTYPE(INTEL_COREI7_ROCKETLAKE, "rocketlake")
 #undef X86_CPU_SUBTYPE
 
-
-// This macro is used for cpu types present in compiler-rt/libgcc.
+// This macro is used for cpu types present in compiler-rt/libgcc. The third
+// parameter PRIORITY is as required by the attribute 'target' checking. Note
+// that not all are supported/prioritized by GCC, so synchronization with GCC's
+// implementation may require changing some existing values.
+//
+// We cannot just re-sort the list though because its order is dictated by the
+// order of bits in CodeGenFunction::GetX86CpuSupportsMask.
 #ifndef X86_FEATURE_COMPAT
-#define X86_FEATURE_COMPAT(ENUM, STR) X86_FEATURE(ENUM, STR)
+#define X86_FEATURE_COMPAT(ENUM, STR, PRIORITY) X86_FEATURE(ENUM, STR)
 #endif
 
 #ifndef X86_FEATURE
 #define X86_FEATURE(ENUM, STR)
 #endif
 
-X86_FEATURE_COMPAT(CMOV,"cmov")
-X86_FEATURE_COMPAT(MMX, "mmx")
-X86_FEATURE_COMPAT(POPCNT,  "popcnt")
-X86_FEATURE_COMPAT(SSE, "sse")
-X86_FEATURE_COMPAT(SSE2,"sse2")
-X86_FEATURE_COMPAT(SSE3,"sse3")
-X86_FEATURE_COMPAT(SSSE3,   "ssse3")
-X86_FEATURE_COMPAT(SSE4_1,  "sse4.1")
-X86_FEATURE_COMPAT(SSE4_2,  "sse4.2")
-X86_FEATURE_COMPAT(AVX, "avx")
-X86_FEATURE_COMPAT(AVX2,"avx2")
-X86_FEATURE_COMPAT(SSE4_A,  "sse4a")
-X86_FEATURE_COMPAT(FMA4,"fma4")
-X86_FEATURE_COMPAT(XOP, "xop")
-X86_FEATURE_COMPAT(FMA, "fma")
-X86_FEATURE_COMPAT(AVX512F, "avx512f")
-X86_FEATURE_COMPAT(BMI, "bmi")
-X86_FEATURE_COMPAT(BMI2,"bmi2")
-X86_FEATURE_COMPAT(AES, "aes")
-X86_FEATURE_COMPAT(PCLMUL,  "pclmul")
-X86_FEATURE_COMPAT(AVX512VL,"avx512vl")
-X86_FEATURE_COMPAT(AVX512BW,"avx512bw")
-X86_FEATURE_COMPAT(AVX512DQ,"avx512dq")
-X86_FEATURE_COMPAT(AVX512CD,"avx512cd")
-X86_FEATURE_COMPAT(AVX512ER,"avx512er")
-X86_FEATURE_COMPAT(AVX512PF,"avx512pf")
-X86_FEATURE_COMPAT(AVX512VBMI,  "avx512vbmi")
-X86_FEATURE_COMPAT(AVX512IFMA,  "avx512ifma")
-X86_FEATURE_COMPAT(AVX5124VNNIW,"avx5124vnniw")
-X86_FEATURE_COMPAT(AVX5124FMAPS,"avx5124fmaps")
-X86_FEATURE_COMPAT(AVX512VPOPCNTDQ, "avx512vpopcntdq")
-X86_FEATURE_COMPAT(AVX512VBMI2, "avx512vbmi2")
-X86_FEATURE_COMPAT(GFNI,"gfni")
-X86_FEATURE_COMPAT(VPCLMULQDQ,  "vpclmulqdq")
-X86_FEATURE_COMPAT(AVX512VNNI,  "avx512vnni")
-X86_FEATURE_COMPAT(AVX512BITALG,"avx512bitalg")
-X86_FEATURE_COMPAT(AVX512BF16,  "avx512bf16")
-X86_FEATURE_COMPAT(AVX512VP2INTERSECT, "avx512vp2intersect")
+X86_FEATURE_COMPAT(CMOV,"cmov",  0)
+X86_FEATURE_COMPAT(MMX, "mmx",   1)
+X86_FEATURE_COMPAT(POPCNT,  "popcnt",9)
+X86_FEATURE_COMPAT(SSE, "sse",   2)
+X86_FEATURE_COMPAT(SSE2,"sse2",  3)
+X86_FEATURE_COMPAT(SSE3,"sse3",  4)
+X86_FEATURE_COMPAT(SSSE3,   "ssse3", 5)
+X86_FEATURE_COMPAT(SSE4_1,  "sse4.1",7)
+X86_FEATURE_COMPAT(SSE4_2,  "sse4.2",8)
+X86_FEATURE_COMPAT(AVX, "avx",   12)
+X86_FEATURE_COMPAT(AVX2,"avx2",  18)
+X86_FEATURE_COMPAT(SSE4_A,  "sse4a", 6)
+X86_FEATURE_COMPAT(FMA4,"fma4",  14)
+X86_FEATURE_COMPAT(XOP, "xop",   15)
+X86_FEATURE_COMPAT(FMA, "fma",   16)
+X86_FEATURE_COMPAT(AVX512F, "avx512f",   19)
+X86_FEATURE_COMPAT(BMI, "bmi",   13)
+X86_FEATURE_COMPAT(BMI2,"bmi2",  17)
+X86_FEATURE_COMPAT(AES, "aes",   10)
+X86_FEATURE_COMPAT(PCLMUL,  "pclmul",11)
+X86_FEATURE_COMPAT(AVX512VL,"avx512vl",  20)
+X86_FEATURE_COMPAT(AVX512BW,"avx512bw",  21)
+X86_FEATURE_COMPAT(AVX512DQ,"avx512dq",  22)
+X86_FEATURE_COMPAT(AVX512CD,"avx512cd",  23)
+X86_FEATURE_COMPAT(AVX512ER,"avx512er",  24)

[PATCH] D106615: [Clang][LLVM] generate btf_tag annotations for DIComposite types

2021-08-19 Thread Yonghong Song via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG2fded193e7a8: [DebugInfo] generate btf_tag annotations for 
DIComposite types (authored by yonghong-song).

Changed prior to commit:
  https://reviews.llvm.org/D106615?vs=367109&id=367637#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106615/new/

https://reviews.llvm.org/D106615

Files:
  llvm/include/llvm/IR/DIBuilder.h
  llvm/include/llvm/IR/DebugInfoMetadata.h
  llvm/lib/AsmParser/LLParser.cpp
  llvm/lib/Bitcode/Reader/MetadataLoader.cpp
  llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
  llvm/lib/IR/AsmWriter.cpp
  llvm/lib/IR/DIBuilder.cpp
  llvm/lib/IR/DebugInfoMetadata.cpp
  llvm/lib/IR/LLVMContextImpl.h
  llvm/test/Bitcode/attr-btf_tag-dicomposite.ll

Index: llvm/test/Bitcode/attr-btf_tag-dicomposite.ll
===
--- /dev/null
+++ llvm/test/Bitcode/attr-btf_tag-dicomposite.ll
@@ -0,0 +1,36 @@
+; REQUIRES: x86-registered-target
+; RUN: llvm-as < %s | llvm-dis | FileCheck %s
+
+%struct.t = type { i32 }
+
+@g = dso_local global %struct.t zeroinitializer, align 4, !dbg !0
+
+!llvm.dbg.cu = !{!2}
+!llvm.module.flags = !{!13, !14, !15, !16, !17}
+!llvm.ident = !{!18}
+
+!0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
+!1 = distinct !DIGlobalVariable(name: "g", scope: !2, file: !3, line: 2, type: !6, isLocal: false, isDefinition: true)
+!2 = distinct !DICompileUnit(language: DW_LANG_C99, file: !3, producer: "clang version 13.0.0 (https://github.com/llvm/llvm-project.git a20bed0ba269a4f9b67e58093c50af9ef0730fd1)", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !4, globals: !5, splitDebugInlining: false, nameTableKind: None)
+!3 = !DIFile(filename: "struct.c", directory: "/home/yhs/work/tests/llvm/btf_tag")
+!4 = !{}
+!5 = !{!0}
+!6 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "t", file: !3, line: 1, size: 32, elements: !7, annotations: !10)
+!7 = !{!8}
+!8 = !DIDerivedType(tag: DW_TAG_member, name: "a", scope: !6, file: !3, line: 1, baseType: !9, size: 32)
+!9 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+!10 = !{!11, !12}
+!11 = !{!"btf_tag", !"a"}
+!12 = !{!"btf_tag", !"b"}
+
+; CHECK:distinct !DICompositeType(tag: DW_TAG_structure_type, name: "t"
+; CHECK-SAME:   annotations: ![[ANNOT:[0-9]+]]
+; CHECK:![[ANNOT]] = !{![[TAG1:[0-9]+]], ![[TAG2:[0-9]+]]}
+; CHECK:![[TAG1]] = !{!"btf_tag", !"a"}
+; CHECK:![[TAG2]] = !{!"btf_tag", !"b"}
+!13 = !{i32 7, !"Dwarf Version", i32 4}
+!14 = !{i32 2, !"Debug Info Version", i32 3}
+!15 = !{i32 1, !"wchar_size", i32 4}
+!16 = !{i32 7, !"uwtable", i32 1}
+!17 = !{i32 7, !"frame-pointer", i32 2}
+!18 = !{!"clang version 13.0.0 (https://github.com/llvm/llvm-project.git a20bed0ba269a4f9b67e58093c50af9ef0730fd1)"}
Index: llvm/lib/IR/LLVMContextImpl.h
===
--- llvm/lib/IR/LLVMContextImpl.h
+++ llvm/lib/IR/LLVMContextImpl.h
@@ -569,6 +569,7 @@
   Metadata *Associated;
   Metadata *Allocated;
   Metadata *Rank;
+  Metadata *Annotations;
 
   MDNodeKeyImpl(unsigned Tag, MDString *Name, Metadata *File, unsigned Line,
 Metadata *Scope, Metadata *BaseType, uint64_t SizeInBits,
@@ -577,14 +578,15 @@
 Metadata *VTableHolder, Metadata *TemplateParams,
 MDString *Identifier, Metadata *Discriminator,
 Metadata *DataLocation, Metadata *Associated,
-Metadata *Allocated, Metadata *Rank)
+Metadata *Allocated, Metadata *Rank, Metadata *Annotations)
   : Tag(Tag), Name(Name), File(File), Line(Line), Scope(Scope),
 BaseType(BaseType), SizeInBits(SizeInBits), OffsetInBits(OffsetInBits),
 AlignInBits(AlignInBits), Flags(Flags), Elements(Elements),
 RuntimeLang(RuntimeLang), VTableHolder(VTableHolder),
 TemplateParams(TemplateParams), Identifier(Identifier),
 Discriminator(Discriminator), DataLocation(DataLocation),
-Associated(Associated), Allocated(Allocated), Rank(Rank) {}
+Associated(Associated), Allocated(Allocated), Rank(Rank),
+Annotations(Annotations) {}
   MDNodeKeyImpl(const DICompositeType *N)
   : Tag(N->getTag()), Name(N->getRawName()), File(N->getRawFile()),
 Line(N->getLine()), Scope(N->getRawScope()),
@@ -597,7 +599,7 @@
 Discriminator(N->getRawDiscriminator()),
 DataLocation(N->getRawDataLocation()),
 Associated(N->getRawAssociated()), Allocated(N->getRawAllocated()),
-Rank(N->getRawRank()) {}
+Rank(N->getRawRank()), Annotations(N->getRawAnnotations()) {}
 
   bool isKeyOf(const DICompositeType *RHS) const {
 return Tag == RHS->getTag() && Name == RHS->getRawName() &&
@@ -614,7 +616,8 @@
Discriminator == RHS->getRawDiscriminator() &&
  

[PATCH] D108412: [WebAssembly] Add SIMD intrinsics using unsigned integers

2021-08-19 Thread Thomas Lively via Phabricator via cfe-commits
tlively added a comment.

In D108412#2955996, @craig.topper wrote:

> Did you read this twitter thread too or just coincidence? 
> https://twitter.com/rygorous/status/1428207170403725316?s=20

Yes I did :D


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108412/new/

https://reviews.llvm.org/D108412



[PATCH] D108407: [CodeGen][WIP] Avoid generating Record layouts for pointee types

2021-08-19 Thread David Blaikie via Phabricator via cfe-commits
dblaikie added a comment.

The notion seems plausible, though if there's some way to refactor so there's
less need for manual insertion/maintenance of calls to `ConvertTypeForMem`,
that'd be good/important. I don't think there'd be anything fundamentally wrong
with this approach, though checking some workloads to see if you can get
bit-identical results (e.g., do some interesting binaries, including a clang
selfhost, built with/without this patch compile to exactly the same file?)
would probably be a good place to start to check the soundness.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108407/new/

https://reviews.llvm.org/D108407



[PATCH] D108412: [WebAssembly] Add SIMD intrinsics using unsigned integers

2021-08-19 Thread Craig Topper via Phabricator via cfe-commits
craig.topper added a comment.

Did you read this twitter thread too or just coincidence? 
https://twitter.com/rygorous/status/1428207170403725316?s=20


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108412/new/

https://reviews.llvm.org/D108412



[PATCH] D108003: [Clang] Extend -Wbool-operation to warn about bitwise and of bools with side effects

2021-08-19 Thread Ryan Beltran via Phabricator via cfe-commits
rpbeltran added a comment.

Thanks! And sorry for missing that point in your first comment.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108003/new/

https://reviews.llvm.org/D108003



[PATCH] D108415: [WebAssembly] Make shift values unsigned in wasm_simd128.h

2021-08-19 Thread Thomas Lively via Phabricator via cfe-commits
tlively created this revision.
tlively added reviewers: aheejin, dschuff.
Herald added subscribers: wingo, ecnelises, sunfish, jgravelle-google, sbc100.
tlively requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

On some platforms, negative shift values mean to shift in the opposite
direction, but this is not true with WebAssembly. To avoid confusion, make the
shift values in the shift intrinsics unsigned.
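
To illustrate the point about negative shift values, here is a minimal scalar
sketch of a single 32-bit lane. It assumes the WebAssembly rule that the shift
count is taken modulo the lane width; `lane_shl32` and `shl_by_minus_one` are
hypothetical helpers for illustration, not part of wasm_simd128.h.

```c
#include <stdint.h>

/* Hypothetical scalar model of one 32-bit lane of a WebAssembly SIMD
 * left shift. Assumption: the count is reduced modulo the lane width,
 * so only its low 5 bits matter. */
static uint32_t lane_shl32(uint32_t lane, uint32_t count) {
    return lane << (count & 31u);
}

/* What a caller gets when a count of -1 arrives through a signed value:
 * it converts to 0xFFFFFFFF, which is masked to 31. The result is a
 * left shift by 31, not a right shift by 1. */
static uint32_t shl_by_minus_one(uint32_t lane) {
    int32_t count = -1;
    return lane_shl32(lane, (uint32_t)count);
}
```

With an unsigned parameter type the count a caller writes is the count that
gets masked, which is the reduced confusion the summary is after.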


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D108415

Files:
  clang/lib/Headers/wasm_simd128.h
  clang/test/Headers/wasm.c

Index: clang/test/Headers/wasm.c
===
--- clang/test/Headers/wasm.c
+++ clang/test/Headers/wasm.c
@@ -1603,7 +1603,7 @@
 // CHECK-NEXT:[[TMP3:%.*]] = bitcast <16 x i8> [[SHL_I]] to <4 x i32>
 // CHECK-NEXT:ret <4 x i32> [[TMP3]]
 //
-v128_t test_i8x16_shl(v128_t a, int32_t b) {
+v128_t test_i8x16_shl(v128_t a, uint32_t b) {
   return wasm_i8x16_shl(a, b);
 }
 
@@ -1617,7 +1617,7 @@
 // CHECK-NEXT:[[TMP3:%.*]] = bitcast <16 x i8> [[SHR_I]] to <4 x i32>
 // CHECK-NEXT:ret <4 x i32> [[TMP3]]
 //
-v128_t test_i8x16_shr(v128_t a, int32_t b) {
+v128_t test_i8x16_shr(v128_t a, uint32_t b) {
   return wasm_i8x16_shr(a, b);
 }
 
@@ -1631,7 +1631,7 @@
 // CHECK-NEXT:[[TMP3:%.*]] = bitcast <16 x i8> [[SHR_I]] to <4 x i32>
 // CHECK-NEXT:ret <4 x i32> [[TMP3]]
 //
-v128_t test_u8x16_shr(v128_t a, int32_t b) {
+v128_t test_u8x16_shr(v128_t a, uint32_t b) {
   return wasm_u8x16_shr(a, b);
 }
 
@@ -1824,7 +1824,7 @@
 // CHECK-NEXT:[[TMP3:%.*]] = bitcast <8 x i16> [[SHL_I]] to <4 x i32>
 // CHECK-NEXT:ret <4 x i32> [[TMP3]]
 //
-v128_t test_i16x8_shl(v128_t a, int32_t b) {
+v128_t test_i16x8_shl(v128_t a, uint32_t b) {
   return wasm_i16x8_shl(a, b);
 }
 
@@ -1838,7 +1838,7 @@
 // CHECK-NEXT:[[TMP3:%.*]] = bitcast <8 x i16> [[SHR_I]] to <4 x i32>
 // CHECK-NEXT:ret <4 x i32> [[TMP3]]
 //
-v128_t test_i16x8_shr(v128_t a, int32_t b) {
+v128_t test_i16x8_shr(v128_t a, uint32_t b) {
   return wasm_i16x8_shr(a, b);
 }
 
@@ -1852,7 +1852,7 @@
 // CHECK-NEXT:[[TMP3:%.*]] = bitcast <8 x i16> [[SHR_I]] to <4 x i32>
 // CHECK-NEXT:ret <4 x i32> [[TMP3]]
 //
-v128_t test_u16x8_shr(v128_t a, int32_t b) {
+v128_t test_u16x8_shr(v128_t a, uint32_t b) {
   return wasm_u16x8_shr(a, b);
 }
 
@@ -2048,7 +2048,7 @@
 // CHECK-NEXT:[[SHL_I:%.*]] = shl <4 x i32> [[A:%.*]], [[SPLAT_SPLAT_I]]
 // CHECK-NEXT:ret <4 x i32> [[SHL_I]]
 //
-v128_t test_i32x4_shl(v128_t a, int32_t b) {
+v128_t test_i32x4_shl(v128_t a, uint32_t b) {
   return wasm_i32x4_shl(a, b);
 }
 
@@ -2059,7 +2059,7 @@
 // CHECK-NEXT:[[SHR_I:%.*]] = ashr <4 x i32> [[A:%.*]], [[SPLAT_SPLAT_I]]
 // CHECK-NEXT:ret <4 x i32> [[SHR_I]]
 //
-v128_t test_i32x4_shr(v128_t a, int32_t b) {
+v128_t test_i32x4_shr(v128_t a, uint32_t b) {
   return wasm_i32x4_shr(a, b);
 }
 
@@ -2070,7 +2070,7 @@
 // CHECK-NEXT:[[SHR_I:%.*]] = lshr <4 x i32> [[A:%.*]], [[SPLAT_SPLAT_I]]
 // CHECK-NEXT:ret <4 x i32> [[SHR_I]]
 //
-v128_t test_u32x4_shr(v128_t a, int32_t b) {
+v128_t test_u32x4_shr(v128_t a, uint32_t b) {
   return wasm_u32x4_shr(a, b);
 }
 
@@ -2198,42 +2198,42 @@
 // CHECK-LABEL: @test_i64x2_shl(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x i32> [[A:%.*]] to <2 x i64>
-// CHECK-NEXT:[[CONV_I:%.*]] = sext i32 [[B:%.*]] to i64
+// CHECK-NEXT:[[CONV_I:%.*]] = zext i32 [[B:%.*]] to i64
 // CHECK-NEXT:[[SPLAT_SPLATINSERT_I:%.*]] = insertelement <2 x i64> poison, i64 [[CONV_I]], i32 0
 // CHECK-NEXT:[[SPLAT_SPLAT_I:%.*]] = shufflevector <2 x i64> [[SPLAT_SPLATINSERT_I]], <2 x i64> poison, <2 x i32> zeroinitializer
 // CHECK-NEXT:[[SHL_I:%.*]] = shl <2 x i64> [[TMP0]], [[SPLAT_SPLAT_I]]
 // CHECK-NEXT:[[TMP1:%.*]] = bitcast <2 x i64> [[SHL_I]] to <4 x i32>
 // CHECK-NEXT:ret <4 x i32> [[TMP1]]
 //
-v128_t test_i64x2_shl(v128_t a, int32_t b) {
+v128_t test_i64x2_shl(v128_t a, uint32_t b) {
   return wasm_i64x2_shl(a, b);
 }
 
 // CHECK-LABEL: @test_i64x2_shr(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x i32> [[A:%.*]] to <2 x i64>
-// CHECK-NEXT:[[CONV_I:%.*]] = sext i32 [[B:%.*]] to i64
+// CHECK-NEXT:[[CONV_I:%.*]] = zext i32 [[B:%.*]] to i64
 // CHECK-NEXT:[[SPLAT_SPLATINSERT_I:%.*]] = insertelement <2 x i64> poison, i64 [[CONV_I]], i32 0
 // CHECK-NEXT:[[SPLAT_SPLAT_I:%.*]] = shufflevector <2 x i64> [[SPLAT_SPLATINSERT_I]], <2 x i64> poison, <2 x i32> zeroinitializer
 // CHECK-NEXT:[[SHR_I:%.*]] = ashr <2 x i64> [[TMP0]], [[SPLAT_SPLAT_I]]
 // CHECK-NEXT:[[TMP1:%.*]] = bitcast <2 x i64> [[SHR_I]] to <4 x i32>
 // CHECK-NEXT:ret <4 x i32> [[TMP1]]
 //
-v128_t test_i64x2_shr(v128_t a, int32_t b) {
+v128_t test_i64x2_shr(v128_t a, uint32_t b) {
   return wasm_i64x2_shr(a, b);
 }
 
 // CHECK-LABEL: @test_u64x2_shr(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:

[PATCH] D108412: [WebAssembly] Add SIMD intrinsics using unsigned integers

2021-08-19 Thread Thomas Lively via Phabricator via cfe-commits
tlively created this revision.
tlively added reviewers: aheejin, dschuff.
Herald added subscribers: wingo, ecnelises, sunfish, jgravelle-google, sbc100.
tlively requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

For each SIMD intrinsic function that takes or returns a scalar signed integer
value, ensure there is a corresponding intrinsic that takes or returns an
unsigned value. This is a convenience for users who use -Wsign-conversion so
they don't have to insert explicit casts, especially when the intrinsic
arguments are integer literals that fit into the unsigned integer type but not
the signed type.
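
A small sketch of the situation described above, using hypothetical stand-ins
(`lane_from_i8`, `lane_from_u8`) that only model the parameter types of a
signed/unsigned intrinsic pair; they are not the real wasm_simd128.h functions.

```c
#include <stdint.h>

/* Hypothetical stand-ins for a signed/unsigned intrinsic pair. */
static uint8_t lane_from_i8(int8_t v) { return (uint8_t)v; }
static uint8_t lane_from_u8(uint8_t v) { return v; }

/* 0xF0 does not fit in int8_t, so the signed variant needs an explicit
 * cast to stay quiet when conversion warnings are enabled. */
static uint8_t with_signed_variant(void) {
    return lane_from_i8((int8_t)0xF0);
}

/* The unsigned variant accepts the same literal without a cast. */
static uint8_t with_unsigned_variant(void) {
    return lane_from_u8(0xF0);
}
```

Both calls produce the same bit pattern; only the amount of casting at the call
site differs.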


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D108412

Files:
  clang/lib/Headers/wasm_simd128.h
  clang/test/Headers/wasm.c

Index: clang/test/Headers/wasm.c
===
--- clang/test/Headers/wasm.c
+++ clang/test/Headers/wasm.c
@@ -3,7 +3,7 @@
 
 // FIXME: This should not be using -O2 and implicitly testing the entire IR opt pipeline.
 
-// RUN: %clang %s -O2 -emit-llvm -S -o - -target wasm32-unknown-unknown -msimd128 -Wcast-qual -fno-lax-vector-conversions -Werror | FileCheck %s
+// RUN: %clang %s -O2 -emit-llvm -S -o - -target wasm32-unknown-unknown -msimd128 -Wall -Weverything -Wno-missing-prototypes -fno-lax-vector-conversions -Werror | FileCheck %s
 
 #include <wasm_simd128.h>
 
@@ -213,7 +213,7 @@
 // CHECK-NEXT:ret void
 //
 void test_v128_store(void *mem, v128_t a) {
-  return wasm_v128_store(mem, a);
+  wasm_v128_store(mem, a);
 }
 
 // CHECK-LABEL: @test_v128_store8_lane(
@@ -224,7 +224,7 @@
 // CHECK-NEXT:ret void
 //
 void test_v128_store8_lane(uint8_t *ptr, v128_t vec) {
-  return wasm_v128_store8_lane(ptr, vec, 15);
+  wasm_v128_store8_lane(ptr, vec, 15);
 }
 
 // CHECK-LABEL: @test_v128_store16_lane(
@@ -235,7 +235,7 @@
 // CHECK-NEXT:ret void
 //
 void test_v128_store16_lane(uint16_t *ptr, v128_t vec) {
-  return wasm_v128_store16_lane(ptr, vec, 7);
+  wasm_v128_store16_lane(ptr, vec, 7);
 }
 
 // CHECK-LABEL: @test_v128_store32_lane(
@@ -245,7 +245,7 @@
 // CHECK-NEXT:ret void
 //
 void test_v128_store32_lane(uint32_t *ptr, v128_t vec) {
-  return wasm_v128_store32_lane(ptr, vec, 3);
+  wasm_v128_store32_lane(ptr, vec, 3);
 }
 
 // CHECK-LABEL: @test_v128_store64_lane(
@@ -256,7 +256,7 @@
 // CHECK-NEXT:ret void
 //
 void test_v128_store64_lane(uint64_t *ptr, v128_t vec) {
-  return wasm_v128_store64_lane(ptr, vec, 1);
+  wasm_v128_store64_lane(ptr, vec, 1);
 }
 
 // CHECK-LABEL: @test_i8x16_make(
@@ -284,6 +284,31 @@
   return wasm_i8x16_make(c0, c1, c2, c3, c4, c5, c6, c7, c8, c9, c10, c11, c12, c13, c14, c15);
 }
 
+// CHECK-LABEL: @test_u8x16_make(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[VECINIT_I:%.*]] = insertelement <16 x i8> undef, i8 [[C0:%.*]], i32 0
+// CHECK-NEXT:[[VECINIT1_I:%.*]] = insertelement <16 x i8> [[VECINIT_I]], i8 [[C1:%.*]], i32 1
+// CHECK-NEXT:[[VECINIT2_I:%.*]] = insertelement <16 x i8> [[VECINIT1_I]], i8 [[C2:%.*]], i32 2
+// CHECK-NEXT:[[VECINIT3_I:%.*]] = insertelement <16 x i8> [[VECINIT2_I]], i8 [[C3:%.*]], i32 3
+// CHECK-NEXT:[[VECINIT4_I:%.*]] = insertelement <16 x i8> [[VECINIT3_I]], i8 [[C4:%.*]], i32 4
+// CHECK-NEXT:[[VECINIT5_I:%.*]] = insertelement <16 x i8> [[VECINIT4_I]], i8 [[C5:%.*]], i32 5
+// CHECK-NEXT:[[VECINIT6_I:%.*]] = insertelement <16 x i8> [[VECINIT5_I]], i8 [[C6:%.*]], i32 6
+// CHECK-NEXT:[[VECINIT7_I:%.*]] = insertelement <16 x i8> [[VECINIT6_I]], i8 [[C7:%.*]], i32 7
+// CHECK-NEXT:[[VECINIT8_I:%.*]] = insertelement <16 x i8> [[VECINIT7_I]], i8 [[C8:%.*]], i32 8
+// CHECK-NEXT:[[VECINIT9_I:%.*]] = insertelement <16 x i8> [[VECINIT8_I]], i8 [[C9:%.*]], i32 9
+// CHECK-NEXT:[[VECINIT10_I:%.*]] = insertelement <16 x i8> [[VECINIT9_I]], i8 [[C10:%.*]], i32 10
+// CHECK-NEXT:[[VECINIT11_I:%.*]] = insertelement <16 x i8> [[VECINIT10_I]], i8 [[C11:%.*]], i32 11
+// CHECK-NEXT:[[VECINIT12_I:%.*]] = insertelement <16 x i8> [[VECINIT11_I]], i8 [[C12:%.*]], i32 12
+// CHECK-NEXT:[[VECINIT13_I:%.*]] = insertelement <16 x i8> [[VECINIT12_I]], i8 [[C13:%.*]], i32 13
+// CHECK-NEXT:[[VECINIT14_I:%.*]] = insertelement <16 x i8> [[VECINIT13_I]], i8 [[C14:%.*]], i32 14
+// CHECK-NEXT:[[VECINIT15_I:%.*]] = insertelement <16 x i8> [[VECINIT14_I]], i8 [[C15:%.*]], i32 15
+// CHECK-NEXT:[[TMP0:%.*]] = bitcast <16 x i8> [[VECINIT15_I]] to <4 x i32>
+// CHECK-NEXT:ret <4 x i32> [[TMP0]]
+//
+v128_t test_u8x16_make(uint8_t c0, uint8_t c1, uint8_t c2, uint8_t c3, uint8_t c4, uint8_t c5, uint8_t c6, uint8_t c7, uint8_t c8, uint8_t c9, uint8_t c10, uint8_t c11, uint8_t c12, uint8_t c13, uint8_t c14, uint8_t c15) {
+  return wasm_u8x16_make(c0, c1, c2, c3, c4, c5, c6, c7, c8, c9, c10, c11, c12, c13, c14, c15);
+}
+
 // CHECK-LABEL: @test_i16x8_make(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[VECINIT_I:%.*]] = insertelement <8 x i16> undef, i16 [[C0:%.*]], 

[PATCH] D108003: [Clang] Extend -Wbool-operation to warn about bitwise and of bools with side effects

2021-08-19 Thread Dávid Bolvanský via Phabricator via cfe-commits
xbolva00 added a comment.

@rpbeltran please check new revision, I added support for bitwise OR.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108003/new/

https://reviews.llvm.org/D108003



[PATCH] D108003: [Clang] Extend -Wbool-operation to warn about bitwise and of bools with side effects

2021-08-19 Thread Dávid Bolvanský via Phabricator via cfe-commits
xbolva00 added a comment.

In D108003#2955009, @aeubanks wrote:

> In D108003#2944178, @aeubanks wrote:
>
>> I ran this over Chrome and ran into a use case that looks legitimate. It 
>> seems like the pattern in LLVM where we want to run a bunch of 
>> transformations and get if any of them changed anything.
>>
>> https://source.chromium.org/chromium/chromium/src/+/main:third_party/webrtc/api/audio/echo_canceller3_config.cc;drc=cbdbb8c1666fd08a094422905e73391706a05b03;l=111
>
> The remaining places where this fired looked like places where `&&` would be 
> better than `&`, except for one where the code was treating bools as one bit 
> integers and doing various bitwise operations on them
> https://source.chromium.org/chromium/chromium/src/+/main:third_party/distributed_point_functions/src/dpf/distributed_point_function.cc;drc=87b84b3834343e141ec94e3321f4d1c7be8a7a9d;l=230

The new revision should be more conservative, to avoid warning on the cases you
mentioned.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108003/new/

https://reviews.llvm.org/D108003



[PATCH] D108003: [Clang] Extend -Wbool-operation to warn about bitwise and of bools with side effects

2021-08-19 Thread Dávid Bolvanský via Phabricator via cfe-commits
xbolva00 updated this revision to Diff 367611.
xbolva00 edited the summary of this revision.
xbolva00 added a comment.

- Only warn when both sides have potential side effects (conservative, but
covers the motivating case and reduces the useless noise caused by this
warning, which may hide real bugs)
- Added support for bitwise | + tests.
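
The behavioural difference behind the warning can be sketched as follows. The
helper names (`step_a`, `step_b`, the `calls_with_*` functions) are
hypothetical; the sketch only counts how many operands are evaluated for each
spelling.

```c
#include <stdbool.h>

static int calls;                       /* counts how many operands ran */

static bool step_a(void) { ++calls; return true; }
static bool step_b(void) { ++calls; return true; }

/* Bitwise '|' evaluates both operands, so both side effects happen. */
static int calls_with_bitwise_or(void) {
    calls = 0;
    bool changed = step_a() | step_b();
    (void)changed;
    return calls;
}

/* Logical '||' short-circuits: step_b() is skipped once step_a() is true. */
static int calls_with_logical_or(void) {
    calls = 0;
    bool changed = step_a() || step_b();
    (void)changed;
    return calls;
}

/* Accumulator form: only the right operand has a side effect. This is the
 * shape the updated tests leave unwarned (e.g. `b = b | foo()`). */
static int calls_with_accumulator(void) {
    calls = 0;
    bool changed = false;
    changed = changed | step_a();
    changed = changed | step_b();
    (void)changed;
    return calls;
}
```

The first form runs two calls and the second only one, which is why swapping
`|` for `||` is not always a behaviour-preserving fix-it.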


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108003/new/

https://reviews.llvm.org/D108003

Files:
  clang/include/clang/Basic/DiagnosticSemaKinds.td
  clang/lib/Sema/SemaChecking.cpp
  clang/test/Sema/warn-bitwise-and-bool.c
  clang/test/Sema/warn-bitwise-or-bool.c

Index: clang/test/Sema/warn-bitwise-or-bool.c
===
--- /dev/null
+++ clang/test/Sema/warn-bitwise-or-bool.c
@@ -0,0 +1,50 @@
+// RUN: %clang_cc1 -x c -fsyntax-only -verify -Wbool-operation %s
+// RUN: %clang_cc1 -x c -fsyntax-only -verify -Wall %s
+// RUN: %clang_cc1 -x c -fsyntax-only -Wbool-operation -fdiagnostics-parseable-fixits %s 2>&1 | FileCheck %s
+// RUN: %clang_cc1 -x c++ -fsyntax-only -verify -Wbool-operation %s
+// RUN: %clang_cc1 -x c++ -fsyntax-only -verify -Wall %s
+// RUN: %clang_cc1 -x c++ -fsyntax-only -Wbool-operation -fdiagnostics-parseable-fixits %s 2>&1 | FileCheck %s
+
+#ifdef __cplusplus
+typedef bool boolean;
+#else
+typedef _Bool boolean;
+#endif
+
+boolean foo(void);
+boolean bar(void);
+boolean baz(void) __attribute__((const));
+void sink(boolean);
+
+#define FOO foo()
+
+void test(boolean a, boolean b, int *p, volatile int *q, int i) {
+  b = a | b;
+  b = foo() | a;
+  b = (p != 0) | (*p == 42);
+  b = foo() | (*q == 42); // expected-warning {{use of bitwise '|' with boolean operands}}
+  b = a | foo();
+  b = foo() | bar();  // expected-warning {{use of bitwise '|' with boolean operands}}
+  // CHECK: fix-it:"{{.*}}":{[[@LINE-1]]:13-[[@LINE-1]]:14}:"||"
+  b = foo() | !bar(); // expected-warning {{use of bitwise '|' with boolean operands}}
+  // CHECK: fix-it:"{{.*}}":{[[@LINE-1]]:13-[[@LINE-1]]:14}:"||"
+  b = a | baz();
+  b = bar() | FOO; // expected-warning {{use of bitwise '|' with boolean operands}}
+   // CHECK: fix-it:"{{.*}}":{[[@LINE-1]]:13-[[@LINE-1]]:14}:"||"
+  b = b | foo();
+  b = bar() | (i > 4);
+  b = (i == 7) | foo();
+#ifdef __cplusplus
+  b = foo() bitor bar(); // expected-warning {{use of bitwise '|' with boolean operands}}
+#endif
+
+  if (foo() | bar()) // expected-warning {{use of bitwise '|' with boolean operands}}
+;
+
+  sink(a | b);
+  sink(a | foo());
+  sink(foo() | bar()); // expected-warning {{use of bitwise '|' with boolean operands}}
+
+  int n = i + 10;
+  b = (n | (n - 1));
+}
Index: clang/test/Sema/warn-bitwise-and-bool.c
===
--- /dev/null
+++ clang/test/Sema/warn-bitwise-and-bool.c
@@ -0,0 +1,50 @@
+// RUN: %clang_cc1 -x c -fsyntax-only -verify -Wbool-operation %s
+// RUN: %clang_cc1 -x c -fsyntax-only -verify -Wall %s
+// RUN: %clang_cc1 -x c -fsyntax-only -Wbool-operation -fdiagnostics-parseable-fixits %s 2>&1 | FileCheck %s
+// RUN: %clang_cc1 -x c++ -fsyntax-only -verify -Wbool-operation %s
+// RUN: %clang_cc1 -x c++ -fsyntax-only -verify -Wall %s
+// RUN: %clang_cc1 -x c++ -fsyntax-only -Wbool-operation -fdiagnostics-parseable-fixits %s 2>&1 | FileCheck %s
+
+#ifdef __cplusplus
+typedef bool boolean;
+#else
+typedef _Bool boolean;
+#endif
+
+boolean foo(void);
+boolean bar(void);
+boolean baz(void) __attribute__((const));
+void sink(boolean);
+
+#define FOO foo()
+
+void test(boolean a, boolean b, int *p, volatile int *q, int i) {
+  b = a & b;
+  b = foo() & a;
+  b = (p != 0) & (*p == 42);
+  b = foo() & (*q == 42); // expected-warning {{use of bitwise '&' with boolean operands}}
+  b = a & foo();
+  b = foo() & bar();  // expected-warning {{use of bitwise '&' with boolean operands}}
+  // CHECK: fix-it:"{{.*}}":{[[@LINE-1]]:13-[[@LINE-1]]:14}:"&&"
+  b = foo() & !bar(); // expected-warning {{use of bitwise '&' with boolean operands}}
+  // CHECK: fix-it:"{{.*}}":{[[@LINE-1]]:13-[[@LINE-1]]:14}:"&&"
+  b = a & baz();
+  b = bar() & FOO; // expected-warning {{use of bitwise '&' with boolean operands}}
+   // CHECK: fix-it:"{{.*}}":{[[@LINE-1]]:13-[[@LINE-1]]:14}:"&&"
+  b = b & foo();
+  b = bar() & (i > 4);
+  b = (i == 7) & foo();
+#ifdef __cplusplus
+  b = foo() bitand bar(); // expected-warning {{use of bitwise '&' with boolean operands}}
+#endif
+
+  if (foo() & bar()) // expected-warning {{use of bitwise '&' with boolean operands}}
+;
+
+  sink(a & b);
+  sink(a & foo());
+  sink(foo() & bar()); // expected-warning {{use of bitwise '&' with boolean operands}}
+
+  int n = i + 10;
+  b = (n & (n - 1));
+}
Index: clang/lib/Sema/SemaChecking.cpp
===
--- clang/lib/Sema/SemaChecking.cpp
+++ clang/lib/Sema/SemaChecking.cpp

[PATCH] D107850: [asan] Implemented intrinsic for the custom calling convention similar used by HWASan for X86.

2021-08-19 Thread Vitaly Buka via Phabricator via cfe-commits
vitalybuka added inline comments.



Comment at: llvm/test/CodeGen/X86/asan-check-memaccess-or.ll:47
+  %2 = bitcast i64* %x to i8*
+  call void @llvm.asan.check.memaccess(i8* %2, i64 2147450880, i32 0,
+   i32 3, i32 3, i32 1)

Is this test out of date? Code has fewer arguments now.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107850/new/

https://reviews.llvm.org/D107850



[PATCH] D108407: [CodeGen][WIP] Avoid generating Record layouts for pointee types

2021-08-19 Thread Raphael Isemann via Phabricator via cfe-commits
teemperor added a comment.

I'm mostly putting this up to get some early feedback if anyone sees a problem 
with using opaque types here (e.g. it breaks some optimizations, etc.). If it 
does, it would still be nice if we could at least make this happen on some 
opt-in basis as it would be very beneficial for improving the performance of 
LLDB.




Comment at: llvm/include/llvm/IR/Instructions.h:1176
  ->isOpaqueOrPointeeTypeMatches(ResultElementType));
+  assert(PointeeType->isSized());
   init(Ptr, IdxList, NameStr);

(This change and the one below slipped in by accident; they were more of a
debugging aid that I wanted to put up as a separate patch.)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108407/new/

https://reviews.llvm.org/D108407



[PATCH] D107850: [asan] Implemented intrinsic for the custom calling convention similar used by HWASan for X86.

2021-08-19 Thread Vitaly Buka via Phabricator via cfe-commits
vitalybuka added inline comments.



Comment at: llvm/include/llvm/IR/Intrinsics.td:1640
 
+def int_asan_check_memaccess :
+  Intrinsic<[],[llvm_ptr_ty, llvm_i8_ty, llvm_i8_ty],

eugenis wrote:
> vitalybuka wrote:
> > pcc wrote:
> > > eugenis wrote:
> > > > vitalybuka wrote:
> > > > > kstoimenov wrote:
> > > > > > vitalybuka wrote:
> > > > > > > vitalybuka wrote:
> > > > > > > > PTAL at llvm.read_register.i32
> > > > > > > > 
> > > > > > > > How about:
> > > > > > > > 
> > > > > > > > llvm.asan.check.memaccess ->
> > > > > > > >   llvm.asan.check_read
> > > > > > > >   llvm.asan.check_write
> > > > > > > >   llvm.asan.kernel.check_read
> > > > > > > >   llvm.asan.kernel.check_write
> > > > > > > > 
> > > > > > > > Even better
> > > > > > > >   llvm.asan.check_read.{i8, i16, i32, ...}
> > > > > > > >   llvm.asan.check_write.{i8, i16, i32, ...}
> > > > > > > >   llvm.asan.kernel.check_read.{i8, i16, i32, ...}
> > > > > > > >   llvm.asan.kernel.check_write.{i8, i16, i32, ...}
> > > > > > > > 
> > > > > > > Looks like underscore is not used in intrinsic names, so 
> > > > > > > essentially the same with dots.
> > > > > > Sounds good to me. I do the full expansion so there will be 20 
> > > > > > intrinsics altogether. I will update the code and ping you when 
> > > > > > done. 
> > > > > @pcc @eugenis 
> > > > > WDYT, I think later we can do the same for HWASAN?
> > > > I don't see what these multiple intrinsics give us that a single 
> > > > memaccess one does not provide?
> > > > 
> > > > As long as access type and similar arguments are immediates.
> > > > 
> > > Agree with @eugenis - these sorts of intrinsic variants are typically 
> > > used for distinguishing different pointer element types and we're in the 
> > > process of getting rid of those anyway.
> > @pcc @eugenis Then do you prefer to encode is_write+size+kernel into a
> > non-human-readable AccessInfo, like hwasan, or separate 0/1 arguments?
> > I probably prefer AccessInfo, as they are both unreadable, but the hwasan
> > version is shorter.
> don't have a strong opinion, but sometimes I wish that hwasan outlined 
> function names were more readable. The magic number in the names takes effort 
> to decode.
> 
AccessInfo



Comment at: llvm/include/llvm/IR/Intrinsics.td:1640
 
+def int_asan_check_memaccess :
+  Intrinsic<[],[llvm_ptr_ty, llvm_i8_ty, llvm_i8_ty],

pcc wrote:
> kstoimenov wrote:
> > vitalybuka wrote:
> > > eugenis wrote:
> > > > vitalybuka wrote:
> > > > > pcc wrote:
> > > > > > eugenis wrote:
> > > > > > > vitalybuka wrote:
> > > > > > > > kstoimenov wrote:
> > > > > > > > > vitalybuka wrote:
> > > > > > > > > > vitalybuka wrote:
> > > > > > > > > > > PTAL at llvm.read_register.i32
> > > > > > > > > > > 
> > > > > > > > > > > How about:
> > > > > > > > > > > 
> > > > > > > > > > > llvm.asan.check.memaccess ->
> > > > > > > > > > >   llvm.asan.check_read
> > > > > > > > > > >   llvm.asan.check_write
> > > > > > > > > > >   llvm.asan.kernel.check_read
> > > > > > > > > > >   llvm.asan.kernel.check_write
> > > > > > > > > > > 
> > > > > > > > > > > Even better
> > > > > > > > > > >   llvm.asan.check_read.{i8, i16, i32, ...}
> > > > > > > > > > >   llvm.asan.check_write.{i8, i16, i32, ...}
> > > > > > > > > > >   llvm.asan.kernel.check_read.{i8, i16, i32, ...}
> > > > > > > > > > >   llvm.asan.kernel.check_write.{i8, i16, i32, ...}
> > > > > > > > > > > 
> > > > > > > > > > Looks like underscore is not used in intrinsic names, so 
> > > > > > > > > > essentially the same with dots.
> > > > > > > > > Sounds good to me. I do the full expansion so there will be 
> > > > > > > > > 20 intrinsics altogether. I will update the code and ping you 
> > > > > > > > > when done. 
> > > > > > > > @pcc @eugenis 
> > > > > > > > WDYT, I think later we can do the same for HWASAN?
> > > > > > > I don't see what these multiple intrinsics give us that a single 
> > > > > > > memaccess one does not provide?
> > > > > > > 
> > > > > > > As long as access type and similar arguments are immediates.
> > > > > > > 
> > > > > > Agree with @eugenis - these sorts of intrinsic variants are 
> > > > > > typically used for distinguishing different pointer element types 
> > > > > > and we're in the process of getting rid of those anyway.
> > > > > @pcc @eugenis Then do you prefer to encode is_write+size+kernel into
> > > > > a non-human-readable AccessInfo, like hwasan, or separate 0/1
> > > > > arguments?
> > > > > I probably prefer AccessInfo, as they are both unreadable, but the
> > > > > hwasan version is shorter.
> > > > don't have a strong opinion, but sometimes I wish that hwasan outlined 
> > > > function names were more readable. The magic number in the names takes 
> > > > effort to decode.
> > > > 
> > > AccessInfo
> > I think I am gonna go with int_asan_check(?_kernel)_(load|store) and pass 
> > the size as parameter. What do you think? 
> You mean in a register? I think that could mean more register pressure 

[PATCH] D107850: [asan] Implemented intrinsic for the custom calling convention similar used by HWASan for X86.

2021-08-19 Thread Peter Collingbourne via Phabricator via cfe-commits
pcc added inline comments.



Comment at: llvm/include/llvm/IR/Intrinsics.td:1640
 
+def int_asan_check_memaccess :
+  Intrinsic<[],[llvm_ptr_ty, llvm_i8_ty, llvm_i8_ty],

kstoimenov wrote:
> eugenis wrote:
> > vitalybuka wrote:
> > > pcc wrote:
> > > > eugenis wrote:
> > > > > vitalybuka wrote:
> > > > > > kstoimenov wrote:
> > > > > > > vitalybuka wrote:
> > > > > > > > vitalybuka wrote:
> > > > > > > > > PTAL at llvm.read_register.i32
> > > > > > > > > 
> > > > > > > > > How about:
> > > > > > > > > 
> > > > > > > > > llvm.asan.check.memaccess ->
> > > > > > > > >   llvm.asan.check_read
> > > > > > > > >   llvm.asan.check_write
> > > > > > > > >   llvm.asan.kernel.check_read
> > > > > > > > >   llvm.asan.kernel.check_write
> > > > > > > > > 
> > > > > > > > > Even better
> > > > > > > > >   llvm.asan.check_read.{i8, i16, i32, ...}
> > > > > > > > >   llvm.asan.check_write.{i8, i16, i32, ...}
> > > > > > > > >   llvm.asan.kernel.check_read.{i8, i16, i32, ...}
> > > > > > > > >   llvm.asan.kernel.check_write.{i8, i16, i32, ...}
> > > > > > > > > 
> > > > > > > > Looks like underscore is not used in intrinsic names, so 
> > > > > > > > essentially the same with dots.
> > > > > > > Sounds good to me. I do the full expansion so there will be 20 
> > > > > > > intrinsics altogether. I will update the code and ping you when 
> > > > > > > done. 
> > > > > > @pcc @eugenis 
> > > > > > WDYT, I think later we can do the same for HWASAN?
> > > > > I don't see what these multiple intrinsics give us that a single 
> > > > > memaccess one does not provide?
> > > > > 
> > > > > As long as access type and similar arguments are immediates.
> > > > > 
> > > > Agree with @eugenis - these sorts of intrinsic variants are typically 
> > > > used for distinguishing different pointer element types and we're in 
> > > > the process of getting rid of those anyway.
> > > @pcc @eugenis Then do you prefer to encode is_write+size+kernel into a
> > > non-human-readable AccessInfo, like hwasan, or separate 0/1 arguments?
> > > I probably prefer AccessInfo, as they are both unreadable, but the hwasan
> > > version is shorter.
> > don't have a strong opinion, but sometimes I wish that hwasan outlined 
> > function names were more readable. The magic number in the names takes 
> > effort to decode.
> > 
> I think I am gonna go with int_asan_check(?_kernel)_(load|store) and pass the 
> size as parameter. What do you think? 
You mean in a register? I think that could mean more register pressure -> 
higher code size.

The magic numbers are unfortunate but they aren't that hard to decode (maybe we 
should be printing them as hex to make it a bit easier). I suppose we could 
pretty print the access info into the symbol name but only a few people will be 
looking at these so I'm not sure it's worth the effort.
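
For readers unfamiliar with the AccessInfo idea, here is a minimal sketch of
packing is_write, size and kernel into one immediate and decoding it again. The
bit layout and all names (`pack_access_info`, `access_size_bytes`, the shift
constants) are assumptions made for illustration; this is not the encoding used
by HWASan or by the patch under review.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit layout only (an assumption for this sketch). */
enum {
    SIZE_LOG2_SHIFT = 0,   /* bits 0-3: log2 of the access size in bytes */
    IS_WRITE_SHIFT  = 4,   /* bit 4: 1 for a store, 0 for a load */
    KERNEL_SHIFT    = 5    /* bit 5: 1 when compiling for the kernel */
};

/* Pack the three properties into a single immediate value. */
static uint32_t pack_access_info(bool kernel, bool is_write,
                                 uint32_t size_log2) {
    return ((uint32_t)kernel << KERNEL_SHIFT) |
           ((uint32_t)is_write << IS_WRITE_SHIFT) |
           ((size_log2 & 0xFu) << SIZE_LOG2_SHIFT);
}

/* Decoders: what a reader has to do by hand when faced with the number. */
static bool access_is_kernel(uint32_t info) {
    return (info >> KERNEL_SHIFT) & 1u;
}
static bool access_is_write(uint32_t info) {
    return (info >> IS_WRITE_SHIFT) & 1u;
}
static uint32_t access_size_bytes(uint32_t info) {
    return 1u << ((info >> SIZE_LOG2_SHIFT) & 0xFu);
}
```

Under this layout an 8-byte userspace store packs to 0x13, which reads more
easily in hex than as decimal 19, in line with the suggestion above.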


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107850/new/

https://reviews.llvm.org/D107850



[PATCH] D108407: [CodeGen][WIP] Avoid generating Record layouts for pointee types

2021-08-19 Thread Raphael Isemann via Phabricator via cfe-commits
teemperor created this revision.
teemperor added reviewers: dblaikie, rjmccall, rsmith, v.g.vassilev.
teemperor added a project: clang.
Herald added a subscriber: dexonsmith.
teemperor requested review of this revision.
Herald added subscribers: llvm-commits, cfe-commits.
Herald added a project: LLVM.

This is a WIP patch that tries to avoid creating a RecordLayout in Clang and
instead just emits an opaque structure type, as if we only had a forward
declaration. The main motivation for this patch is supporting a use case in
LLDB, where laying out types can be very expensive as it usually triggers
parsing of debug information.

The changes in this patch can be summarized as:

- `CodeGenTypes::ConvertRecordDeclType` (and related funcs) have a new 
parameter that tells us if we need the definition. It's currently only set to 
false for Clang pointer types.
- There are a few new places where I added (temporary) calls to 
`ConvertTypeForMem()` on some pointee types. The reason is that the code after 
is usually creating GEP instructions where we need a non-opaque source type. We 
can't do this automatically from the GEP factory methods as they would need to 
know the clang::Type to automatically do this (and they only have the 
llvm::Type that can't be mapped back to a clang::Type from what I can see, but 
that might be incorrect).
- A few tests needed to be adjusted as they relied on e.g. `Foo *x` being
enough to force `Foo` to be laid out/emitted.
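
The source-level analogue of the idea above can be sketched in plain C: a
pointer to a type needs no layout until something looks inside the pointee. The
names (`pass_through`, `read_b`) are hypothetical and the snippet says nothing
about how Clang's CodeGen is actually structured.

```c
struct Foo;                     /* forward declaration only: no layout yet */

/* Passing the pointer around needs no definition of struct Foo. */
static struct Foo *pass_through(struct Foo *x) { return x; }

struct Foo { int a; long b; };  /* from here on the layout is known */

/* Member access computes a field offset, so the complete type is needed.
 * This is the kind of place where a GEP with a non-opaque type arises. */
static long read_b(struct Foo *x) { return x->b; }
```

A function like `pass_through` corresponds to the case the patch wants to keep
cheap; `read_b` corresponds to the places where the pointee has to be laid out
first.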

There are still about a dozen more tests failing but from what I can see they 
all just need to be adjusted to force specific types to be emitted. I'll fix 
those up once there is consensus that this patch is going in the right 
direction.

Some benchmarks: I did a stage2 build of LLVM+Clang with my patch and those are 
the stats:

  current ToT Clang:
  2232421 - total amount of struct types created
    94911 - of which are opaque struct types

  with this patch:
  1715074 - total amount of struct types created (-23%)
   173127 - of which are opaque struct types (+82%)

I built a part of Clang (the last 300 source files in the 
compile_commands.json) and the average time on my 64 core machine changes like 
this (as per hyperfine):

  Benchmark #1: parallel --progress -j63 -a ToT-clang
    Time (mean ± σ):     27.703 s ±  0.168 s    [User: 1434.619 s, System: 66.687 s]
    Range (min … max):   27.459 s … 27.891 s    10 runs

  Benchmark #2: parallel --progress -j63 -a with-patch
    Time (mean ± σ):     27.439 s ±  0.111 s    [User: 1427.739 s, System: 66.220 s]
    Range (min … max):   27.300 s … 27.625 s    10 runs

  Summary
    'parallel --progress -j63 -a with-patch' ran
      1.01 ± 0.01 times faster than 'parallel --progress -j63 -a ToT-clang'
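
As a cross-check of the percentages and the speed-up factor quoted above, the
arithmetic can be reproduced with a few lines of C. The helper names are
hypothetical; the inputs are the numbers from this mail.

```c
/* Relative change from `before` to `after`, in percent. */
static double percent_change(double before, double after) {
    return (after - before) / before * 100.0;
}

/* Total struct types created: 2232421 -> 1715074. */
static double struct_types_change(void) {
    return percent_change(2232421.0, 1715074.0);
}

/* Opaque struct types created: 94911 -> 173127. */
static double opaque_types_change(void) {
    return percent_change(94911.0, 173127.0);
}

/* Ratio of the mean build times: ToT divided by with-patch. */
static double build_time_ratio(void) {
    return 27.703 / 27.439;
}
```

These come out at roughly -23.2%, +82.4% and 1.0096, consistent with the
rounded -23%, +82% and 1.01 reported above.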


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D108407

Files:
  clang/include/clang/AST/Type.h
  clang/lib/CodeGen/CGExpr.cpp
  clang/lib/CodeGen/CGExprCXX.cpp
  clang/lib/CodeGen/CGExprScalar.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.cpp
  clang/lib/CodeGen/CodeGenTypes.cpp
  clang/lib/CodeGen/CodeGenTypes.h
  clang/test/CodeGen/c11atomics.c
  clang/test/CodeGenCXX/class-layout.cpp
  clang/test/CodeGenCXX/pr18962.cpp
  clang/test/CodeGenCXX/warn-padded-packed.cpp
  llvm/include/llvm/IR/Instructions.h

Index: llvm/include/llvm/IR/Instructions.h
===
--- llvm/include/llvm/IR/Instructions.h
+++ llvm/include/llvm/IR/Instructions.h
@@ -1173,6 +1173,7 @@
   ResultElementType(getIndexedType(PointeeType, IdxList)) {
   assert(cast<PointerType>(getType()->getScalarType())
  ->isOpaqueOrPointeeTypeMatches(ResultElementType));
+  assert(PointeeType->isSized());
   init(Ptr, IdxList, NameStr);
 }
 
@@ -1187,6 +1188,7 @@
   ResultElementType(getIndexedType(PointeeType, IdxList)) {
   assert(cast<PointerType>(getType()->getScalarType())
  ->isOpaqueOrPointeeTypeMatches(ResultElementType));
+  assert(PointeeType->isSized());
   init(Ptr, IdxList, NameStr);
 }
 
Index: clang/test/CodeGenCXX/warn-padded-packed.cpp
===
--- clang/test/CodeGenCXX/warn-padded-packed.cpp
+++ clang/test/CodeGenCXX/warn-padded-packed.cpp
@@ -148,6 +148,6 @@
 
 
 // The warnings are emitted when the layout of the structs is computed, so we have to use them.
-void f(S1*, S2*, S3*, S4*, S5*, S6*, S7*, S8*, S9*, S10*, S11*, S12*, S13*,
-   S14*, S15*, S16*, S17*, S18*, S19*, S20*, S21*, S22*, S23*, S24*, S25*,
-   S26*, S27*){}
+void f(S1, S2, S3, S4, S5, S6, S7, S8, S9, S10, S11, S12, S13,
+   S14, S15, S16, S17, S18, S19, S20, S21, S22, S23, S24, S25,
+   S26, S27){}
Index: clang/test/CodeGenCXX/pr18962.cpp
===
--- clang/test/CodeGenCXX/pr18962.cpp
+++ clang/test/CodeGenCXX/pr18962.cpp
@@ -27,6 +27,5 @@
 // We end up using an opaque type for 'append' to avoid circular references.
 // CHECK: %class.A = 

[PATCH] D107850: [asan] Implemented intrinsic for the custom calling convention similar used by HWASan for X86.

2021-08-19 Thread Kirill Stoimenov via Phabricator via cfe-commits
kstoimenov marked an inline comment as done.
kstoimenov added inline comments.



Comment at: llvm/include/llvm/IR/Intrinsics.td:1640
 
+def int_asan_check_memaccess :
+  Intrinsic<[],[llvm_ptr_ty, llvm_i8_ty, llvm_i8_ty],

eugenis wrote:
> vitalybuka wrote:
> > pcc wrote:
> > > eugenis wrote:
> > > > vitalybuka wrote:
> > > > > kstoimenov wrote:
> > > > > > vitalybuka wrote:
> > > > > > > vitalybuka wrote:
> > > > > > > > PTAL at llvm.read_register.i32
> > > > > > > > 
> > > > > > > > How about:
> > > > > > > > 
> > > > > > > > llvm.asan.check.memaccess ->
> > > > > > > >   llvm.asan.check_read
> > > > > > > >   llvm.asan.check_write
> > > > > > > >   llvm.asan.kernel.check_read
> > > > > > > >   llvm.asan.kernel.check_write
> > > > > > > > 
> > > > > > > > Even better
> > > > > > > >   llvm.asan.check_read.{i8, i16, i32, ...}
> > > > > > > >   llvm.asan.check_write.{i8, i16, i32, ...}
> > > > > > > >   llvm.asan.kernel.check_read.{i8, i16, i32, ...}
> > > > > > > >   llvm.asan.kernel.check_write.{i8, i16, i32, ...}
> > > > > > > > 
> > > > > > > Looks like underscore is not used in intrinsic names, so 
> > > > > > > essentially the same with dots.
> > > > > > Sounds good to me. I do the full expansion so there will be 20 
> > > > > > intrinsics altogether. I will update the code and ping you when 
> > > > > > done. 
> > > > > @pcc @eugenis 
> > > > > WDYT, I think later we can do the same for HWASAN?
> > > > I don't see what these multiple intrinsics give us that a single 
> > > > memaccess one does not provide?
> > > > 
> > > > As long as access type and similar arguments are immediates.
> > > > 
> > > Agree with @eugenis - these sorts of intrinsic variants are typically 
> > > used for distinguishing different pointer element types and we're in the 
> > > process of getting rid of those anyway.
> > @pcc @eugenis Then do you prefer to encode is_write+size+kernel into a 
> > non-human-readable AccessInfo, like hwasan, or to use separate 0/1 arguments?
> > I probably prefer AccessInfo, as they are both unreadable, but the hwasan 
> > version is shorter.
> I don't have a strong opinion, but sometimes I wish that hwasan outlined 
> function names were more readable. The magic number in the names takes effort 
> to decode.
> 
I think I am going to go with int_asan_check(_kernel)?_(load|store) and pass the 
size as a parameter. What do you think? 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107850/new/

https://reviews.llvm.org/D107850

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D108377: [asan] Implemented flag to emit intrinsics to optimize ASan callbacks.

2021-08-19 Thread Vitaly Buka via Phabricator via cfe-commits
vitalybuka added inline comments.



Comment at: clang/test/CodeGen/asan-use-callbacks.cpp:9
+// RUN: -fsanitize=address %s -fsanitize-address-outline-instrumentation \
+// RUN: -mllvm -asan-optimize-callbacks \
+// RUN: | FileCheck %s --check-prefixes=CHECK-OPTIMIZED

vitalybuka wrote:
> for the -mllvm flag we need a test under llvm/test, not clang.
> 
> The llvm test needs to be more meaningful, e.g. check the precise value passed 
> into the intrinsic,
> and have 3 versions:
> ... -mllvm -asan-optimize-callbacks=0
> ... -mllvm -asan-optimize-callbacks=1
> ... default
To clarify: the goal of clang/test/CodeGen/asan-use-callbacks.cpp is to test 
-fsanitize-address-outline-instrumentation, which is a frontend (clang) flag.
-mllvm flags are not clang flags, even though clang knows how to forward them 
to llvm.

I believe we will not add a corresponding frontend clang flag for 
asan-optimize-callbacks. We will make it default ON after some testing and 
benchmarking.



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108377/new/

https://reviews.llvm.org/D108377



[PATCH] D108377: [asan] Implemented flag to emit intrinsics to optimize ASan callbacks.

2021-08-19 Thread Vitaly Buka via Phabricator via cfe-commits
vitalybuka added inline comments.



Comment at: clang/test/CodeGen/asan-use-callbacks.cpp:9
+// RUN: -fsanitize=address %s -fsanitize-address-outline-instrumentation \
+// RUN: -mllvm -asan-optimize-callbacks \
+// RUN: | FileCheck %s --check-prefixes=CHECK-OPTIMIZED

For the -mllvm flag we need a test under llvm/test, not clang.

The llvm test needs to be more meaningful, e.g. check the precise value passed 
into the intrinsic,
and have 3 versions:
... -mllvm -asan-optimize-callbacks=0
... -mllvm -asan-optimize-callbacks=1
... default


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108377/new/

https://reviews.llvm.org/D108377



[PATCH] D107850: [asan] Implemented intrinsic for the custom calling convention similar used by HWASan for X86.

2021-08-19 Thread Kirill Stoimenov via Phabricator via cfe-commits
kstoimenov updated this revision to Diff 367601.
kstoimenov retitled this revision from "[asan] Implemented intrinsic for 
the custom calling convention similar used by HWASan for X86." to "[asan] 
Implemented intrinsic for the custom calling convention similar used by HWASan 
for X86.".
kstoimenov added a comment.

Removed IntrInaccessibleMemOnly.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107850/new/

https://reviews.llvm.org/D107850

Files:
  llvm/include/llvm/IR/Intrinsics.td
  llvm/lib/Target/X86/X86AsmPrinter.cpp
  llvm/lib/Target/X86/X86AsmPrinter.h
  llvm/lib/Target/X86/X86InstrCompiler.td
  llvm/lib/Target/X86/X86MCInstLower.cpp
  llvm/lib/Target/X86/X86RegisterInfo.td
  llvm/test/CodeGen/X86/asan-check-memaccess-add.ll
  llvm/test/CodeGen/X86/asan-check-memaccess-or.ll

Index: llvm/test/CodeGen/X86/asan-check-memaccess-or.ll
===
--- /dev/null
+++ llvm/test/CodeGen/X86/asan-check-memaccess-or.ll
@@ -0,0 +1,234 @@
+; RUN: llc < %s | FileCheck %s
+
+target triple = "x86_64-unknown-linux-gnu"
+
+define void @load1(i8* nocapture readonly %x) {
+; CHECK:  callq   __asan_check_load1_rn[[RN1:.*]]
+; CHECK:  callq   __asan_check_store1_rn[[RN1]]
+; CHECK-NEXT: retq
+  call void @llvm.asan.check.memaccess(i8* %x, i64 2147450880, i32 0,
+   i32 0, i32 3, i32 1)
+  call void @llvm.asan.check.memaccess(i8* %x, i64 2147450880, i32 1,
+   i32 0, i32 3, i32 1)
+  ret void
+}
+
+define void @load2(i16* nocapture readonly %x) {
+; CHECK:  callq   __asan_check_load2_rn[[RN2:.*]]
+; CHECK:  callq   __asan_check_store2_rn[[RN2]]
+; CHECK-NEXT: retq
+  %1 = ptrtoint i16* %x to i64
+  %2 = bitcast i16* %x to i8*
+  call void @llvm.asan.check.memaccess(i8* %2, i64 2147450880, i32 0,
+   i32 1, i32 3, i32 1)
+  call void @llvm.asan.check.memaccess(i8* %2, i64 2147450880, i32 1,
+   i32 1, i32 3, i32 1)
+  ret void
+}
+
+define void @load4(i32* nocapture readonly %x) {
+; CHECK:  callq   __asan_check_load4_rn[[RN4:.*]]
+; CHECK:  callq   __asan_check_store4_rn[[RN4]]
+; CHECK-NEXT: retq
+  %1 = ptrtoint i32* %x to i64
+  %2 = bitcast i32* %x to i8*
+  call void @llvm.asan.check.memaccess(i8* %2, i64 2147450880, i32 0,
+   i32 2, i32 3, i32 1)
+  call void @llvm.asan.check.memaccess(i8* %2, i64 2147450880, i32 1,
+   i32 2, i32 3, i32 1)
+  ret void
+}
+define void @load8(i64* nocapture readonly %x) {
+; CHECK:  callq   __asan_check_load8_rn[[RN8:.*]]
+; CHECK:  callq   __asan_check_store8_rn[[RN8]]
+; CHECK-NEXT: retq
+  %1 = ptrtoint i64* %x to i64
+  %2 = bitcast i64* %x to i8*
+  call void @llvm.asan.check.memaccess(i8* %2, i64 2147450880, i32 0,
+   i32 3, i32 3, i32 1)
+  call void @llvm.asan.check.memaccess(i8* %2, i64 2147450880, i32 1,
+   i32 3, i32 3, i32 1)
+  ret void
+}
+
+define void @load16(i128* nocapture readonly %x) {
+; CHECK:  callq   __asan_check_load16_rn[[RN16:.*]]
+; CHECK:  callq   __asan_check_store16_rn[[RN16]]
+; CHECK-NEXT: retq
+  %1 = ptrtoint i128* %x to i64
+  %2 = bitcast i128* %x to i8*
+  call void @llvm.asan.check.memaccess(i8* %2, i64 2147450880, i32 0,
+   i32 4, i32 3, i32 1)
+  call void @llvm.asan.check.memaccess(i8* %2, i64 2147450880, i32 1,
+   i32 4, i32 3, i32 1)
+  ret void
+}
+
+; CHECK:  __asan_check_load1_rn[[RN1]]:
+; CHECK-NEXT: movq[[REG:.*]], %r8
+; CHECK-NEXT: shrq$3, %r8
+; CHECK-NEXT: orq $2147450880, %r8{{.*}}
+; CHECK-NEXT: movb(%r8), %r8b
+; CHECK-NEXT: testb   %r8b, %r8b
+; CHECK-NEXT: jne [[EXTRA:.*]]
+; CHECK-NEXT: [[RET:.*]]:
+; CHECK-NEXT: retq
+; CHECK-NEXT: [[EXTRA]]:
+; CHECK-NEXT: pushq   %rcx
+; CHECK-NEXT: movq[[REG]], %rcx
+; CHECK-NEXT: andl$7, %ecx
+; CHECK-NEXT: cmpl%r8d, %ecx
+; CHECK-NEXT: popq%rcx
+; CHECK-NEXT: jl  [[RET]]
+; CHECK-NEXT: movq[[REG:.*]], %rdi
+; CHECK-NEXT: jmp __asan_report_load1
+
+; CHECK:  __asan_check_load2_rn[[RN2]]:
+; CHECK-NEXT: movq[[REG:.*]], %r8
+; CHECK-NEXT: shrq$3, %r8
+; CHECK-NEXT: orq $2147450880, %r8{{.*}}
+; CHECK-NEXT: movb(%r8), %r8b
+; CHECK-NEXT: testb   %r8b, %r8b
+; CHECK-NEXT: jne [[EXTRA:.*]]
+; CHECK-NEXT: [[RET:.*]]:
+; CHECK-NEXT: retq
+; CHECK-NEXT: [[EXTRA]]:
+; CHECK-NEXT: pushq   %rcx
+; CHECK-NEXT: 
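
The CHECK lines above match the usual ASan shadow-memory check: shift the address right by 3, combine it with the shadow offset, load the shadow byte, and take a slow path when that byte is non-zero. The following is a minimal C++ sketch of that logic, not the code generated by the patch. The constants 3 and 2147450880 are taken from the test; the function names and the handling of access sizes other than 1 are assumptions based on how ASan checks are commonly described.

```cpp
#include <cstdint>

// Values taken from the test above: `shrq $3` and `orq $2147450880`.
constexpr uint64_t kShadowScale = 3;
constexpr uint64_t kShadowOffset = 2147450880; // 0x7fff8000

// Hypothetical helper: address of the shadow byte for `addr`, using the
// "or" flavour of the mapping that this test file exercises.
constexpr uint64_t shadowAddress(uint64_t addr) {
  return (addr >> kShadowScale) | kShadowOffset;
}

// Hypothetical helper: decide whether an access of `size` bytes (at most 8)
// is allowed, given the shadow byte covering it.
//   shadow == 0 -> the whole 8-byte granule is addressable (fast path)
//   shadow  > 0 -> only the first `shadow` bytes of the granule are addressable
//   shadow  < 0 -> the granule is poisoned
constexpr bool accessIsAddressable(uint64_t addr, unsigned size,
                                   int8_t shadow) {
  if (shadow == 0)
    return true;
  // Slow path: offset of the last accessed byte within the granule,
  // mirroring `andl $7, %ecx` / `cmpl %r8d, %ecx` / `jl` above.
  const int64_t last = static_cast<int64_t>((addr & 7) + size - 1);
  return last < shadow;
}
```

A failing check corresponds to the tail jump to `__asan_report_load1` in the expected output.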

[PATCH] D107850: [asan] Implemented intrinsic for the custom calling convention similar used by HWASan for X86.

2021-08-19 Thread Evgenii Stepanov via Phabricator via cfe-commits
eugenis added inline comments.



Comment at: llvm/include/llvm/IR/Intrinsics.td:1640
 
+def int_asan_check_memaccess :
+  Intrinsic<[],[llvm_ptr_ty, llvm_i8_ty, llvm_i8_ty],

vitalybuka wrote:
> pcc wrote:
> > eugenis wrote:
> > > vitalybuka wrote:
> > > > kstoimenov wrote:
> > > > > vitalybuka wrote:
> > > > > > vitalybuka wrote:
> > > > > > > PTAL at llvm.read_register.i32
> > > > > > > 
> > > > > > > How about:
> > > > > > > 
> > > > > > > llvm.asan.check.memaccess ->
> > > > > > >   llvm.asan.check_read
> > > > > > >   llvm.asan.check_write
> > > > > > >   llvm.asan.kernel.check_read
> > > > > > >   llvm.asan.kernel.check_write
> > > > > > > 
> > > > > > > Even better
> > > > > > >   llvm.asan.check_read.{i8, i16, i32, ...}
> > > > > > >   llvm.asan.check_write.{i8, i16, i32, ...}
> > > > > > >   llvm.asan.kernel.check_read.{i8, i16, i32, ...}
> > > > > > >   llvm.asan.kernel.check_write.{i8, i16, i32, ...}
> > > > > > > 
> > > > > > Looks like underscore is not used in intrinsic names, so 
> > > > > > essentially the same with dots.
> > > > > Sounds good to me. I do the full expansion so there will be 20 
> > > > > intrinsics altogether. I will update the code and ping you when done. 
> > > > @pcc @eugenis 
> > > > WDYT, I think later we can do the same for HWASAN?
> > > I don't see what these multiple intrinsics give us that a single 
> > > memaccess one does not provide?
> > > 
> > > As long as access type and similar arguments are immediates.
> > > 
> > Agree with @eugenis - these sorts of intrinsic variants are typically used 
> > for distinguishing different pointer element types and we're in the process 
> > of getting rid of those anyway.
> @pcc @eugenis Then do you prefer to encode is_write+size+kernel into a 
> non-human-readable AccessInfo, like hwasan, or to use separate 0/1 arguments?
> I probably prefer AccessInfo, as they are both unreadable, but the hwasan 
> version is shorter.
I don't have a strong opinion, but sometimes I wish that hwasan outlined function 
names were more readable. The magic number in the names takes effort to decode.



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107850/new/

https://reviews.llvm.org/D107850



[PATCH] D107850: [asan] Implemented intrinsic for the custom calling convention similar used by HWASan for X86.

2021-08-19 Thread Vitaly Buka via Phabricator via cfe-commits
vitalybuka added inline comments.



Comment at: llvm/include/llvm/IR/Intrinsics.td:1640
 
+def int_asan_check_memaccess :
+  Intrinsic<[],[llvm_ptr_ty, llvm_i8_ty, llvm_i8_ty],

pcc wrote:
> eugenis wrote:
> > vitalybuka wrote:
> > > kstoimenov wrote:
> > > > vitalybuka wrote:
> > > > > vitalybuka wrote:
> > > > > > PTAL at llvm.read_register.i32
> > > > > > 
> > > > > > How about:
> > > > > > 
> > > > > > llvm.asan.check.memaccess ->
> > > > > >   llvm.asan.check_read
> > > > > >   llvm.asan.check_write
> > > > > >   llvm.asan.kernel.check_read
> > > > > >   llvm.asan.kernel.check_write
> > > > > > 
> > > > > > Even better
> > > > > >   llvm.asan.check_read.{i8, i16, i32, ...}
> > > > > >   llvm.asan.check_write.{i8, i16, i32, ...}
> > > > > >   llvm.asan.kernel.check_read.{i8, i16, i32, ...}
> > > > > >   llvm.asan.kernel.check_write.{i8, i16, i32, ...}
> > > > > > 
> > > > > Looks like underscore is not used in intrinsic names, so essentially 
> > > > > the same with dots.
> > > > Sounds good to me. I do the full expansion so there will be 20 
> > > > intrinsics altogether. I will update the code and ping you when done. 
> > > @pcc @eugenis 
> > > WDYT, I think later we can do the same for HWASAN?
> > I don't see what these multiple intrinsics give us that a single memaccess 
> > one does not provide?
> > 
> > As long as access type and similar arguments are immediates.
> > 
> Agree with @eugenis - these sorts of intrinsic variants are typically used 
> for distinguishing different pointer element types and we're in the process 
> of getting rid of those anyway.
@pcc @eugenis Then do you prefer to encode is_write+size+kernel into a 
non-human-readable AccessInfo, like hwasan, or to use separate 0/1 arguments?
I probably prefer AccessInfo, as they are both unreadable, but the hwasan version 
is shorter.
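
For readers unfamiliar with the trade-off being discussed, here is a small C++ sketch of what packing is_write, kernel and size into one immediate could look like. The bit layout and all names below are invented for illustration only; this is not HWASan's actual AccessInfo encoding, nor what the patch ended up using.

```cpp
#include <cstdint>

// Hypothetical, illustration-only description of one memory access.
struct AccessInfo {
  bool isWrite;      // store (true) or load (false)
  bool isKernel;     // kernel-mode instrumentation
  unsigned sizeLog2; // 0..4 for accesses of 1, 2, 4, 8, 16 bytes
};

// Assumed layout: bit 0 = isWrite, bit 1 = isKernel, bits 2..4 = sizeLog2.
constexpr uint32_t packAccessInfo(AccessInfo a) {
  return (a.isWrite ? 1u : 0u) | (a.isKernel ? 2u : 0u) |
         ((a.sizeLog2 & 7u) << 2);
}

constexpr AccessInfo unpackAccessInfo(uint32_t bits) {
  return AccessInfo{(bits & 1u) != 0, (bits & 2u) != 0, (bits >> 2) & 7u};
}
```

A packed value such as 13 (a user-mode 8-byte store under this made-up layout) is compact but needs decoding by hand, which is the readability cost raised in the thread; separate 0/1 arguments are longer but self-describing.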


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107850/new/

https://reviews.llvm.org/D107850



[PATCH] D107850: [asan] Implemented intrinsic for the custom calling convention similar used by HWASan for X86.

2021-08-19 Thread Peter Collingbourne via Phabricator via cfe-commits
pcc added inline comments.



Comment at: llvm/include/llvm/IR/Intrinsics.td:1640
 
+def int_asan_check_memaccess :
+  Intrinsic<[],[llvm_ptr_ty, llvm_i8_ty, llvm_i8_ty],

eugenis wrote:
> vitalybuka wrote:
> > kstoimenov wrote:
> > > vitalybuka wrote:
> > > > vitalybuka wrote:
> > > > > PTAL at llvm.read_register.i32
> > > > > 
> > > > > How about:
> > > > > 
> > > > > llvm.asan.check.memaccess ->
> > > > >   llvm.asan.check_read
> > > > >   llvm.asan.check_write
> > > > >   llvm.asan.kernel.check_read
> > > > >   llvm.asan.kernel.check_write
> > > > > 
> > > > > Even better
> > > > >   llvm.asan.check_read.{i8, i16, i32, ...}
> > > > >   llvm.asan.check_write.{i8, i16, i32, ...}
> > > > >   llvm.asan.kernel.check_read.{i8, i16, i32, ...}
> > > > >   llvm.asan.kernel.check_write.{i8, i16, i32, ...}
> > > > > 
> > > > Looks like underscore is not used in intrinsic names, so essentially 
> > > > the same with dots.
> > > Sounds good to me. I do the full expansion so there will be 20 intrinsics 
> > > altogether. I will update the code and ping you when done. 
> > @pcc @eugenis 
> > WDYT, I think later we can do the same for HWASAN?
> I don't see what these multiple intrinsics give us that a single memaccess 
> one does not provide?
> 
> As long as access type and similar arguments are immediates.
> 
Agree with @eugenis - these sorts of intrinsic variants are typically used for 
distinguishing different pointer element types and we're in the process of 
getting rid of those anyway.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107850/new/

https://reviews.llvm.org/D107850



[PATCH] D108132: Add implicit map for a list item appears in a reduction clause.

2021-08-19 Thread Jennifer Yu via Phabricator via cfe-commits
jyu2 added a comment.

Thank you so much for Alex's review!!!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108132/new/

https://reviews.llvm.org/D108132



[PATCH] D108132: Add implicit map for a list item appears in a reduction clause.

2021-08-19 Thread Jennifer Yu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGc274b1986680: Add implicit map for a list item appears in a 
reduction clause. (authored by jyu2).

Changed prior to commit:
  https://reviews.llvm.org/D108132?vs=366991&id=367597#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108132/new/

https://reviews.llvm.org/D108132

Files:
  clang/include/clang/Sema/Sema.h
  clang/lib/Sema/SemaOpenMP.cpp
  clang/lib/Sema/TreeTransform.h
  clang/test/OpenMP/reduction_implicit_map.cpp
  openmp/libomptarget/test/mapping/reduction_implicit_map.cpp

Index: openmp/libomptarget/test/mapping/reduction_implicit_map.cpp
===
--- /dev/null
+++ openmp/libomptarget/test/mapping/reduction_implicit_map.cpp
@@ -0,0 +1,28 @@
+// RUN: %libomptarget-compilexx-run-and-check-generic
+
+// amdgcn does not have printf definition
+// UNSUPPORTED: amdgcn-amd-amdhsa
+
+#include <stdio.h>
+
+void sum(int* input, int size, int* output)
+{
+#pragma omp target teams distribute parallel for reduction(+:output[0]) \
+ map(to:input[0:size])
+  for (int i = 0; i < size; i++)
+output[0] += input[i];
+}
+int main()
+{
+  const int size = 100;
+  int *array = new int[size];
+  int result = 0;
+  for (int i = 0; i < size; i++)
+array[i] = i + 1;
+  sum(array, size, &result);
+  // CHECK: Result=5050
+  printf("Result=%d\n", result);
+  delete[] array;
+  return 0;
+}
+
Index: clang/test/OpenMP/reduction_implicit_map.cpp
===
--- /dev/null
+++ clang/test/OpenMP/reduction_implicit_map.cpp
@@ -0,0 +1,122 @@
+// RUN: %clang_cc1 -verify -fopenmp -fopenmp-cuda-mode -x c++ \
+// RUN:  -triple powerpc64le-unknown-unknown -DCUDA \
+// RUN:  -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm-bc %s -o \
+// RUN:  %t-ppc-host.bc
+
+// RUN: %clang_cc1 -verify -fopenmp -fopenmp-cuda-mode -x c++ \
+// RUN:  -triple nvptx64-unknown-unknown -DCUA \
+// RUN:  -fopenmp-targets=nvptx64-nvidia-cuda -DCUDA -emit-llvm %s \
+// RUN:  -fopenmp-is-device -fopenmp-host-ir-file-path %t-ppc-host.bc \
+// RUN:  -o - | FileCheck %s --check-prefix CHECK
+
+// RUN: %clang_cc1 -verify -fopenmp -x c++ \
+// RUN:   -triple powerpc64le-unknown-unknown -DDIAG\
+// RUN:   -fopenmp-targets=powerpc64le-ibm-linux-gnu -emit-llvm \
+// RUN:   %s -o - | FileCheck  %s \
+// RUN:   --check-prefix=CHECK1
+
+// RUN: %clang_cc1 -verify -fopenmp -x c++ \
+// RUN:   -triple i386-unknown-unknown \
+// RUN:   -fopenmp-targets=i386-pc-linux-gnu -emit-llvm \
+// RUN:   %s -o - | FileCheck  %s \
+// RUN:   --check-prefix=CHECK2
+
+
+#if defined(CUDA)
+// expected-no-diagnostics
+
+int foo(int n) {
+  double *e;
+  //no error and no implicit map generated for e[:1]
+  #pragma omp target parallel reduction(+: e[:1])
+*e=10;
+  ;
+  return 0;
+}
+// CHECK-NOT @.offload_maptypes
+// CHECK: call void @__kmpc_nvptx_end_reduce_nowait(
+#elif defined(DIAG)
+class S2 {
+  mutable int a;
+public:
+  S2():a(0) { }
+  S2(S2 &s2):a(s2.a) { }
+  S2 &operator +(S2 &s);
+};
+int bar() {
+ S2 o[5];
+  //warnig "copyable and not guaranteed to be mapped correctly" and
+  //implicit map generated.
+#pragma omp target parallel reduction(+:o[0]) //expected-warning {{Type 'S2' is not trivially copyable and not guaranteed to be mapped correctly}}
+  for (int i = 0; i < 10; i++);
+  double b[10][10][10];
+  //no error no implicit map generated, the map for b is generated but not
+  //for b[0:2][2:4][1].
+#pragma omp target parallel for reduction(task, +: b[0:2][2:4][1])
+  for (long long i = 0; i < 10; ++i);
+  return 0;
+}
+// map for variable o
+// CHECK1: offload_sizes = private unnamed_addr constant [1 x i64] [i64 4]
+// CHECK1: offload_maptypes = private unnamed_addr constant [1 x i64] [i64 547]
+// map for b:
+// CHECK1: @.offload_sizes{{.*}} = private unnamed_addr constant [1 x i64] [i64 8000]
+// CHECK1: @.offload_maptypes{{.*}} = private unnamed_addr constant [1 x i64] [i64 547]
+#else
+// expected-no-diagnostics
+
+// generate implicit map for array elements or array sections in reduction
+// clause. In following case: the implicit map is generate for output[0]
+// with map size 4 and output[:3] with map size 12.
+void sum(int* input, int size, int* output)
+{
+#pragma omp target teams distribute parallel for reduction(+: output[0]) \
+ map(to: input [0:size])
+  for (int i = 0; i < size; i++)
+output[0] += input[i];
+#pragma omp target teams distribute parallel for reduction(+: output[:3])  \
+ map(to: input [0:size])
+  for (int i = 0; i < size; i++)
+output[0] += input[i];
+  int a[10];
+#pragma omp target parallel reduction(+: a[:2])
+  for (int i = 0; i < size; i++)
+;
+#pragma omp target parallel reduction(+: a[3])
+  for (int i = 0; i < size; i++)
+;
+}
+//CHECK2: 

[clang] c274b19 - Add implicit map for a list item appears in a reduction clause.

2021-08-19 Thread Jennifer Yu via cfe-commits

Author: Jennifer Yu
Date: 2021-08-19T12:53:47-07:00
New Revision: c274b198668040acc239b349ef1f7820c91a95c8

URL: 
https://github.com/llvm/llvm-project/commit/c274b198668040acc239b349ef1f7820c91a95c8
DIFF: 
https://github.com/llvm/llvm-project/commit/c274b198668040acc239b349ef1f7820c91a95c8.diff

LOG: Add implicit map for a list item appears in a reduction clause.

A new rule is added in 5.0:
If a list item appears in a reduction, lastprivate or linear clause
on a combined target construct then it is treated as if it also appears
in a map clause with a map-type of tofrom.
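
As an illustration of the rule, the sketch below writes out by hand the map clause that the rule treats as implied for a reduction on an array element. It is modelled on the `sum` function from the tests added by this commit; the explicit `map(tofrom: output[0:1])` is added here only to show the effect of the rule and is not part of the patch. Without OpenMP enabled the pragma is ignored and the loop runs serially on the host.

```cpp
// Sums `size` integers from `input` into `output[0]`.
// Under the OpenMP 5.0 rule quoted above, `output[0]` in the reduction clause
// is treated as if `map(tofrom: output[0:1])` had also been written; it is
// spelled out explicitly here for illustration.
void sum(const int *input, int size, int *output) {
#pragma omp target teams distribute parallel for reduction(+ : output[0]) \
    map(to : input[0:size]) map(tofrom : output[0:1])
  for (int i = 0; i < size; i++)
    output[0] += input[i];
}
```

Calling `sum` on the integers 1 to 100 with `output[0]` initialised to 0 leaves 5050 in `output[0]`, the value the libomptarget test checks for.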

Currently, map clauses are added implicitly for all captured variables,
but not for list items that are array elements or array sections.

The change is to add an implicit map clause for array elements and array
sections used in a reduction clause. Adding the map clause is skipped if
the expression is not mappable.
Note: for linear and lastprivate, since only a variable name is
accepted, the map has already been added through the captured variables.

To do so:
During the mappable checking, if there is an error, suppress the
diagnostic and skip adding the implicit map clause.

The changes:
1> Add code to generate the implicit map in ActOnOpenMPExecutableDirective,
   for omp 5.0 and up.
2> Add an extra default parameter NoDiagnose to ActOnOpenMPMapClause:
Use it to suppress the error as well as to skip adding the implicit map
during the mappable checking.
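
The NoDiagnose idea can be sketched independently of Clang as a builder with a defaulted flag: explicit callers get an error recorded, while the implicit caller just gets no clause. Everything in this sketch (the function name, the use of strings for list items, an empty string standing in for "not mappable") is a hypothetical stand-in and not the Sema API.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a map-clause builder.
// Returns true and fills `clause` when every item is mappable.
// On an unmappable item it returns false; an error is recorded only when
// `noDiagnose` is false, so an implicit caller can fail silently.
bool buildMapClause(const std::vector<std::string> &items,
                    std::vector<std::string> &errors,
                    std::vector<std::string> &clause,
                    bool noDiagnose = false) {
  clause.clear();
  for (const std::string &item : items) {
    if (item.empty()) { // stand-in for "expression is not mappable"
      if (!noDiagnose)
        errors.push_back("expression is not mappable");
      clause.clear();
      return false;
    }
    clause.push_back(item);
  }
  return !clause.empty();
}
```

Keeping the flag defaulted to false means existing explicit call sites keep their behaviour, and only the new implicit path opts out of diagnostics.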

Note: there are only two places that need to check NoDiagnose. For the
rest, either the check is for < omp 5.0 or the error is already generated
for the reduction clause.

Differential Revision: https://reviews.llvm.org/D108132

Added: 
clang/test/OpenMP/reduction_implicit_map.cpp
openmp/libomptarget/test/mapping/reduction_implicit_map.cpp

Modified: 
clang/include/clang/Sema/Sema.h
clang/lib/Sema/SemaOpenMP.cpp
clang/lib/Sema/TreeTransform.h

Removed: 




diff  --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index e38f50733bebe..0205c28c48569 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -11246,15 +11246,14 @@ class Sema final {
  SourceLocation ModifierLoc,
  SourceLocation EndLoc);
   /// Called on well-formed 'map' clause.
-  OMPClause *
-  ActOnOpenMPMapClause(ArrayRef<OpenMPMapModifierKind> MapTypeModifiers,
-   ArrayRef<SourceLocation> MapTypeModifiersLoc,
-   CXXScopeSpec &MapperIdScopeSpec,
-   DeclarationNameInfo &MapperId,
-   OpenMPMapClauseKind MapType, bool IsMapTypeImplicit,
-   SourceLocation MapLoc, SourceLocation ColonLoc,
-   ArrayRef<Expr *> VarList, const OMPVarListLocTy &Locs,
-   ArrayRef<Expr *> UnresolvedMappers = llvm::None);
+  OMPClause *ActOnOpenMPMapClause(
+  ArrayRef<OpenMPMapModifierKind> MapTypeModifiers,
+  ArrayRef<SourceLocation> MapTypeModifiersLoc,
+  CXXScopeSpec &MapperIdScopeSpec, DeclarationNameInfo &MapperId,
+  OpenMPMapClauseKind MapType, bool IsMapTypeImplicit,
+  SourceLocation MapLoc, SourceLocation ColonLoc, ArrayRef<Expr *> VarList,
+  const OMPVarListLocTy &Locs, bool NoDiagnose = false,
+  ArrayRef<Expr *> UnresolvedMappers = llvm::None);
   /// Called on well-formed 'num_teams' clause.
   OMPClause *ActOnOpenMPNumTeamsClause(Expr *NumTeams, SourceLocation StartLoc,
SourceLocation LParenLoc,

diff  --git a/clang/lib/Sema/SemaOpenMP.cpp b/clang/lib/Sema/SemaOpenMP.cpp
index b09bda1138cc6..a64e5855441fc 100644
--- a/clang/lib/Sema/SemaOpenMP.cpp
+++ b/clang/lib/Sema/SemaOpenMP.cpp
@@ -5812,6 +5812,31 @@ StmtResult Sema::ActOnOpenMPExecutableDirective(
 ErrorFound = true;
   }
 }
+// OpenMP 5.0 [2.19.7]
+// If a list item appears in a reduction, lastprivate or linear
+// clause on a combined target construct then it is treated as
+// if it also appears in a map clause with a map-type of tofrom
+if (getLangOpts().OpenMP >= 50 && Kind != OMPD_target &&
+isOpenMPTargetExecutionDirective(Kind)) {
+  SmallVector<Expr *, 4> ImplicitExprs;
+  for (OMPClause *C : Clauses) {
+if (auto *RC = dyn_cast<OMPReductionClause>(C))
+  for (Expr *E : RC->varlists())
+if (!isa<DeclRefExpr>(E->IgnoreParenImpCasts()))
+  ImplicitExprs.emplace_back(E);
+  }
+  if (!ImplicitExprs.empty()) {
+ArrayRef<Expr *> Exprs = ImplicitExprs;
+CXXScopeSpec MapperIdScopeSpec;
+DeclarationNameInfo MapperId;
+if (OMPClause *Implicit = ActOnOpenMPMapClause(
+OMPC_MAP_MODIFIER_unknown, SourceLocation(), MapperIdScopeSpec,
+MapperId, OMPC_MAP_tofrom,
+/*IsMapTypeImplicit=*/true, SourceLocation(), SourceLocation(),
+Exprs, OMPVarListLocTy(), /*NoDiagnose=*/true))
+  ClausesWithImplicit.emplace_back(Implicit);
+  }
+}
 for (unsigned I = 0, E = DefaultmapKindNum; I < E; ++I) {
   int ClauseKindCnt = -1;
   for (ArrayRef 

[PATCH] D106994: [modules] Fix miscompilation when using two RecordDecl definitions with the same name.

2021-08-19 Thread Volodymyr Sapsai via Phabricator via cfe-commits
vsapsai added a comment.

Tested clang with this change on internal code and there were no regressions. 
I have also done limited testing of the runtime behavior of projects built with 
this clang and encountered no errors. So the testing so far hasn't found any issues.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106994/new/

https://reviews.llvm.org/D106994



[PATCH] D108403: Fix assertion when generating diagnostic for inline namespaces

2021-08-19 Thread Erich Keane via Phabricator via cfe-commits
erichkeane added a comment.

This whole function seems a little suspect, but I don't have a good example of 
a place it would break.  Are there no cases where a lookup could result in the 
same COUNT but a different declaration set? I guess it is more a question of 
whether a transparent context can 'lose' a name lookup (perhaps a case of 
conflicting names?), then have it added by the local namespace.




Comment at: clang/include/clang/AST/Decl.h:620
+const DeclContext *Parent = getParent();
+while (Parent->isTransparentContext())
+  Parent = Parent->getParent();

This loop seems useful enough to be its own function in DeclContext?  I think I 
remember seeing us do this for a different patch, right?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108403/new/

https://reviews.llvm.org/D108403



[PATCH] D106615: [Clang][LLVM] generate btf_tag annotations for DIComposite types

2021-08-19 Thread Yonghong Song via Phabricator via cfe-commits
yonghong-song added a comment.

In D106615#2955622, @dblaikie wrote:

> Looks alright - please commit the LLVM and Clang portions of this separately. 
> (LLVM first, shouldn't require any changes to clang/should be standalone, 
> then committing the clang change after that, using the new API surface area 
> to implement the desired functionality)

Thanks! Will do.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106615/new/

https://reviews.llvm.org/D106615



[PATCH] D108403: Fix assertion when generating diagnostic for inline namespaces

2021-08-19 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman created this revision.
aaron.ballman added reviewers: erichkeane, rsmith, rjmccall, jyknight.
aaron.ballman requested review of this revision.
Herald added a project: clang.

When calculating the name to display for inline namespaces, we have custom 
logic to try to hide redundant inline namespaces from the diagnostic. 
Calculating these redundancies requires performing a lookup in the parent 
declaration context, but that lookup should not try to look through transparent 
declaration contexts, like linkage specifications. Instead, loop up the 
declaration context chain until we find a non-transparent context and use that 
instead.

This fixes PR49954.
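
The walk described above can be modelled outside of Clang with a toy context type. This is only a sketch of the technique: `Context` and `nonTransparentParent` are hypothetical names, and the null check is an addition for the toy model (the patch itself loops without one).

```cpp
// Toy stand-in for a declaration context: a parent link plus a flag saying
// whether the context is transparent (e.g. a linkage specification).
struct Context {
  const Context *parent;
  bool transparent;
};

// Walk up the parent chain until the first non-transparent context is found.
// Returns nullptr only if the chain runs out, which the toy model allows.
const Context *nonTransparentParent(const Context &c) {
  const Context *p = c.parent;
  while (p && p->transparent)
    p = p->parent;
  return p;
}
```

For the `dont_crash` test case in this patch, the inline namespace's direct parent is the `extern "C++"` linkage specification, so the walk skips it and the lookup is performed in `dont_crash` instead.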


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D108403

Files:
  clang/include/clang/AST/Decl.h
  clang/test/Misc/diag-inline-namespace.cpp


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D106615: [Clang][LLVM] generate btf_tag annotations for DIComposite types

2021-08-19 Thread David Blaikie via Phabricator via cfe-commits
dblaikie accepted this revision.
dblaikie added a comment.
This revision is now accepted and ready to land.

Looks alright - please commit the LLVM and Clang portions of this separately. 
(LLVM first, shouldn't require any changes to clang/should be standalone, then 
committing the clang change after that, using the new API surface area to 
implement the desired functionality)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106615/new/

https://reviews.llvm.org/D106615



[PATCH] D107703: [AST][clangd] Expose documentation of Attrs on hover.

2021-08-19 Thread Simon Pilgrim via Phabricator via cfe-commits
RKSimon added inline comments.



Comment at: clang/utils/TableGen/ClangAttrEmitter.cpp:4231
+  // Only look at the first documentation if there are several.
+  // (As of now, only one attribute has multiple documentation entries).
+  break;

sammccall wrote:
> kadircet wrote:
> > not sure if this comment will stay useful.
> I want a comment to avoid a chesterton's fence:
>  - the motivation for doing something lazy is that this is really rare
>  - it's sensible to revisit this if it stops being rare
> 
> Reworded it to make this more explicit.
Coverity is complaining that the for loop will never execute more than once; 
would it be worth refactoring?
```
if (!Docs.empty()) {
  const auto *D = Docs[0];
  ...
}
```
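
To make the suggestion concrete, here is a standalone sketch of the two shapes. It is not the TableGen code: `Docs` is assumed to be a plain `std::vector<std::string>`, and both function names are made up for the illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Loop form: the unconditional break means the body runs at most once,
// which is the pattern a static analyzer tends to flag.
std::string firstDocLoop(const std::vector<std::string> &Docs) {
  std::string Result;
  for (const std::string &D : Docs) {
    Result = D;
    break;
  }
  return Result;
}

// Refactored form: an explicit emptiness check followed by front().
std::string firstDocCheck(const std::vector<std::string> &Docs) {
  if (!Docs.empty())
    return Docs.front();
  return {};
}
```

Both forms return the first entry or an empty string; the second states the "only the first one matters" intent without a loop.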


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107703/new/

https://reviews.llvm.org/D107703



[PATCH] D108401: [WebAssembly] Make bitmask instructions return unsigned ints

2021-08-19 Thread Thomas Lively via Phabricator via cfe-commits
tlively created this revision.
tlively added reviewers: aheejin, dschuff.
Herald added subscribers: wingo, ecnelises, sunfish, jgravelle-google, sbc100.
tlively requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Since they are bitmasks, it will be more common for them to be used and
potentially extended to 64-bit integers as unsigned values rather than signed
values.
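
As a generic illustration of why the signedness of a 32-bit mask matters once it is widened, consider the sketch below. It does not use the wasm intrinsics; the helper names are made up, and the input values are chosen to show the general hazard (the real bitmask results only set a few low bits).

```cpp
#include <cassert>
#include <cstdint>

// Widening a signed 32-bit value sign-extends: the top bit is copied
// into the upper 32 bits of the result.
constexpr std::uint64_t widenSigned(std::int32_t Mask) {
  return static_cast<std::uint64_t>(static_cast<std::int64_t>(Mask));
}

// Widening an unsigned 32-bit value zero-extends: the upper 32 bits
// of the result are always zero.
constexpr std::uint64_t widenUnsigned(std::uint32_t Mask) {
  return Mask;
}

static_assert(widenSigned(-1) == 0xFFFFFFFFFFFFFFFFull, "sign-extended");
static_assert(widenUnsigned(0xFFFFFFFFu) == 0x00000000FFFFFFFFull,
              "zero-extended");
```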


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D108401

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/Headers/wasm_simd128.h


Index: clang/lib/Headers/wasm_simd128.h
===
--- clang/lib/Headers/wasm_simd128.h
+++ clang/lib/Headers/wasm_simd128.h
@@ -804,7 +804,7 @@
   return __builtin_wasm_all_true_i8x16((__i8x16)__a);
 }
 
-static __inline__ int32_t __DEFAULT_FN_ATTRS wasm_i8x16_bitmask(v128_t __a) {
+static __inline__ uint32_t __DEFAULT_FN_ATTRS wasm_i8x16_bitmask(v128_t __a) {
   return __builtin_wasm_bitmask_i8x16((__i8x16)__a);
 }
 
@@ -894,7 +894,7 @@
   return __builtin_wasm_all_true_i16x8((__i16x8)__a);
 }
 
-static __inline__ int32_t __DEFAULT_FN_ATTRS wasm_i16x8_bitmask(v128_t __a) {
+static __inline__ uint32_t __DEFAULT_FN_ATTRS wasm_i16x8_bitmask(v128_t __a) {
   return __builtin_wasm_bitmask_i16x8((__i16x8)__a);
 }
 
@@ -985,7 +985,7 @@
   return __builtin_wasm_all_true_i32x4((__i32x4)__a);
 }
 
-static __inline__ int32_t __DEFAULT_FN_ATTRS wasm_i32x4_bitmask(v128_t __a) {
+static __inline__ uint32_t __DEFAULT_FN_ATTRS wasm_i32x4_bitmask(v128_t __a) {
   return __builtin_wasm_bitmask_i32x4((__i32x4)__a);
 }
 
@@ -1056,7 +1056,7 @@
   return __builtin_wasm_all_true_i64x2((__i64x2)__a);
 }
 
-static __inline__ int32_t __DEFAULT_FN_ATTRS wasm_i64x2_bitmask(v128_t __a) {
+static __inline__ uint32_t __DEFAULT_FN_ATTRS wasm_i64x2_bitmask(v128_t __a) {
   return __builtin_wasm_bitmask_i64x2((__i64x2)__a);
 }
 
Index: clang/include/clang/Basic/BuiltinsWebAssembly.def
===
--- clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -119,10 +119,10 @@
 TARGET_BUILTIN(__builtin_wasm_all_true_i32x4, "iV4i", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_all_true_i64x2, "iV2LLi", "nc", "simd128")
 
-TARGET_BUILTIN(__builtin_wasm_bitmask_i8x16, "iV16Sc", "nc", "simd128")
-TARGET_BUILTIN(__builtin_wasm_bitmask_i16x8, "iV8s", "nc", "simd128")
-TARGET_BUILTIN(__builtin_wasm_bitmask_i32x4, "iV4i", "nc", "simd128")
-TARGET_BUILTIN(__builtin_wasm_bitmask_i64x2, "iV2LLi", "nc", "simd128")
+TARGET_BUILTIN(__builtin_wasm_bitmask_i8x16, "UiV16Sc", "nc", "simd128")
+TARGET_BUILTIN(__builtin_wasm_bitmask_i16x8, "UiV8s", "nc", "simd128")
+TARGET_BUILTIN(__builtin_wasm_bitmask_i32x4, "UiV4i", "nc", "simd128")
+TARGET_BUILTIN(__builtin_wasm_bitmask_i64x2, "UiV2LLi", "nc", "simd128")
 
 TARGET_BUILTIN(__builtin_wasm_abs_f32x4, "V4fV4f", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_abs_f64x2, "V2dV2d", "nc", "simd128")



[PATCH] D108380: [openmp][nfc] Refactor GridValues

2021-08-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 367589.
JonChesterfield added a comment.

- whitespace, drop asserts


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108380/new/

https://reviews.llvm.org/D108380

Files:
  clang/include/clang/Basic/TargetInfo.h
  clang/lib/Basic/Targets/AMDGPU.cpp
  clang/lib/Basic/Targets/AMDGPU.h
  clang/lib/Basic/Targets/NVPTX.cpp
  clang/lib/Basic/Targets/NVPTX.h
  clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
  llvm/include/llvm/Frontend/OpenMP/OMPGridValues.h

Index: llvm/include/llvm/Frontend/OpenMP/OMPGridValues.h
===
--- llvm/include/llvm/Frontend/OpenMP/OMPGridValues.h
+++ llvm/include/llvm/Frontend/OpenMP/OMPGridValues.h
@@ -62,19 +62,13 @@
   const unsigned GV_Slot_Size;
   /// The default value of maximum number of threads in a worker warp.
   const unsigned GV_Warp_Size;
-  /// Alternate warp size for some AMDGCN architectures. Same as GV_Warp_Size
-  /// for NVPTX.
-  const unsigned GV_Warp_Size_32;
-  /// The number of bits required to represent the max number of threads in warp
-  const unsigned GV_Warp_Size_Log2;
-  /// GV_Warp_Size * GV_Slot_Size,
-  const unsigned GV_Warp_Slot_Size;
+
+  constexpr unsigned warpSlotSize() const {
+return GV_Warp_Size * GV_Slot_Size;
+  }
+
   /// the maximum number of teams.
   const unsigned GV_Max_Teams;
-  /// Global Memory Alignment
-  const unsigned GV_Mem_Align;
-  /// (~0u >> (GV_Warp_Size - GV_Warp_Size_Log2))
-  const unsigned GV_Warp_Size_Log2_Mask;
   // An alternative to the heavy data sharing infrastructure that uses global
   // memory is one that uses device __shared__ memory.  The amount of such space
   // (in bytes) reserved by the OpenMP runtime is noted here.
@@ -83,47 +77,32 @@
   const unsigned GV_Max_WG_Size;
   // The default maximum team size for a working group
   const unsigned GV_Default_WG_Size;
-  // This is GV_Max_WG_Size / GV_WarpSize. 32 for NVPTX and 16 for AMDGCN.
-  const unsigned GV_Max_Warp_Number;
-  /// The slot size that should be reserved for a working warp.
-  /// (~0u >> (GV_Warp_Size - GV_Warp_Size_Log2))
-  const unsigned GV_Warp_Size_Log2_MaskL;
+
+  constexpr unsigned maxWarpNumber() const {
+return GV_Max_WG_Size / GV_Warp_Size;
+  }
 };
 
 /// For AMDGPU GPUs
 static constexpr GV AMDGPUGridValues = {
-448,   // GV_Threads
-256,   // GV_Slot_Size
-64,// GV_Warp_Size
-32,// GV_Warp_Size_32
-6, // GV_Warp_Size_Log2
-64 * 256,  // GV_Warp_Slot_Size
-128,   // GV_Max_Teams
-256,   // GV_Mem_Align
-63,// GV_Warp_Size_Log2_Mask
-896,   // GV_SimpleBufferSize
-1024,  // GV_Max_WG_Size,
-256,   // GV_Defaut_WG_Size
-1024 / 64, // GV_Max_WG_Size / GV_WarpSize
-63 // GV_Warp_Size_Log2_MaskL
+448,  // GV_Threads
+256,  // GV_Slot_Size
+64,   // GV_Warp_Size
+128,  // GV_Max_Teams
+896,  // GV_SimpleBufferSize
+1024, // GV_Max_WG_Size,
+256,  // GV_Default_WG_Size
 };
 
 /// For Nvidia GPUs
 static constexpr GV NVPTXGridValues = {
-992,   // GV_Threads
-256,   // GV_Slot_Size
-32,// GV_Warp_Size
-32,// GV_Warp_Size_32
-5, // GV_Warp_Size_Log2
-32 * 256,  // GV_Warp_Slot_Size
-1024,  // GV_Max_Teams
-256,   // GV_Mem_Align
-(~0u >> (32 - 5)), // GV_Warp_Size_Log2_Mask
-896,   // GV_SimpleBufferSize
-1024,  // GV_Max_WG_Size
-128,   // GV_Defaut_WG_Size
-1024 / 32, // GV_Max_WG_Size / GV_WarpSize
-31 // GV_Warp_Size_Log2_MaskL
+992,  // GV_Threads
+256,  // GV_Slot_Size
+32,   // GV_Warp_Size
+1024, // GV_Max_Teams
+896,  // GV_SimpleBufferSize
+1024, // GV_Max_WG_Size
+128,  // GV_Default_WG_Size
 };
 
 } // namespace omp
Index: clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
===
--- clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
+++ clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
@@ -22,6 +22,7 @@
 #include "llvm/ADT/SmallPtrSet.h"
 #include "llvm/Frontend/OpenMP/OMPGridValues.h"
 #include "llvm/IR/IntrinsicsNVPTX.h"
+#include "llvm/Support/MathExtras.h"
 
 using namespace clang;
 using namespace CodeGen;
@@ -106,8 +107,7 @@
 /// is the same for all known NVPTX architectures.
 enum MachineConfiguration : unsigned {
   /// See "llvm/Frontend/OpenMP/OMPGridValues.h" for various related target
-  /// specific Grid Values like GV_Warp_Size, GV_Warp_Size_Log2,
-  /// and GV_Warp_Size_Log2_Mask.
+  /// specific Grid Values like GV_Warp_Size, GV_Slot_Size
 
   /// Global memory alignment for performance.
   GlobalMemoryAlignment = 128,
@@ -535,7 +535,8 @@
 /// on the NVPTX device, to generate more efficient code.
 static llvm::Value 

[PATCH] D108380: [openmp][nfc] Refactor GridValues

2021-08-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 367587.
JonChesterfield added a comment.

- delete log2 accessors per review comments


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108380/new/

https://reviews.llvm.org/D108380

Files:
  clang/include/clang/Basic/TargetInfo.h
  clang/lib/Basic/Targets/AMDGPU.cpp
  clang/lib/Basic/Targets/AMDGPU.h
  clang/lib/Basic/Targets/NVPTX.cpp
  clang/lib/Basic/Targets/NVPTX.h
  clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
  llvm/include/llvm/Frontend/OpenMP/OMPGridValues.h

Index: llvm/include/llvm/Frontend/OpenMP/OMPGridValues.h
===
--- llvm/include/llvm/Frontend/OpenMP/OMPGridValues.h
+++ llvm/include/llvm/Frontend/OpenMP/OMPGridValues.h
@@ -62,19 +62,14 @@
   const unsigned GV_Slot_Size;
   /// The default value of maximum number of threads in a worker warp.
   const unsigned GV_Warp_Size;
-  /// Alternate warp size for some AMDGCN architectures. Same as GV_Warp_Size
-  /// for NVPTX.
-  const unsigned GV_Warp_Size_32;
-  /// The number of bits required to represent the max number of threads in warp
-  const unsigned GV_Warp_Size_Log2;
-  /// GV_Warp_Size * GV_Slot_Size,
-  const unsigned GV_Warp_Slot_Size;
+
+  constexpr unsigned warpSlotSize() const {
+return GV_Warp_Size * GV_Slot_Size;
+  }
+
   /// the maximum number of teams.
   const unsigned GV_Max_Teams;
-  /// Global Memory Alignment
-  const unsigned GV_Mem_Align;
-  /// (~0u >> (GV_Warp_Size - GV_Warp_Size_Log2))
-  const unsigned GV_Warp_Size_Log2_Mask;
+
   // An alternative to the heavy data sharing infrastructure that uses global
   // memory is one that uses device __shared__ memory.  The amount of such space
   // (in bytes) reserved by the OpenMP runtime is noted here.
@@ -83,49 +78,40 @@
   const unsigned GV_Max_WG_Size;
   // The default maximum team size for a working group
   const unsigned GV_Default_WG_Size;
-  // This is GV_Max_WG_Size / GV_WarpSize. 32 for NVPTX and 16 for AMDGCN.
-  const unsigned GV_Max_Warp_Number;
-  /// The slot size that should be reserved for a working warp.
-  /// (~0u >> (GV_Warp_Size - GV_Warp_Size_Log2))
-  const unsigned GV_Warp_Size_Log2_MaskL;
+
+  constexpr unsigned maxWarpNumber() const {
+return GV_Max_WG_Size / GV_Warp_Size;
+  }
 };
 
 /// For AMDGPU GPUs
 static constexpr GV AMDGPUGridValues = {
-448,   // GV_Threads
-256,   // GV_Slot_Size
-64,// GV_Warp_Size
-32,// GV_Warp_Size_32
-6, // GV_Warp_Size_Log2
-64 * 256,  // GV_Warp_Slot_Size
-128,   // GV_Max_Teams
-256,   // GV_Mem_Align
-63,// GV_Warp_Size_Log2_Mask
-896,   // GV_SimpleBufferSize
-1024,  // GV_Max_WG_Size,
-256,   // GV_Defaut_WG_Size
-1024 / 64, // GV_Max_WG_Size / GV_WarpSize
-63 // GV_Warp_Size_Log2_MaskL
+448,  // GV_Threads
+256,  // GV_Slot_Size
+64,   // GV_Warp_Size
+128,  // GV_Max_Teams
+896,  // GV_SimpleBufferSize
+1024, // GV_Max_WG_Size,
+256,  // GV_Default_WG_Size
 };
 
+static_assert(64 * 256 == AMDGPUGridValues.warpSlotSize(), "");
+static_assert(1024 / 64 == AMDGPUGridValues.maxWarpNumber(), "");
+
 /// For Nvidia GPUs
 static constexpr GV NVPTXGridValues = {
-992,   // GV_Threads
-256,   // GV_Slot_Size
-32,// GV_Warp_Size
-32,// GV_Warp_Size_32
-5, // GV_Warp_Size_Log2
-32 * 256,  // GV_Warp_Slot_Size
-1024,  // GV_Max_Teams
-256,   // GV_Mem_Align
-(~0u >> (32 - 5)), // GV_Warp_Size_Log2_Mask
-896,   // GV_SimpleBufferSize
-1024,  // GV_Max_WG_Size
-128,   // GV_Defaut_WG_Size
-1024 / 32, // GV_Max_WG_Size / GV_WarpSize
-31 // GV_Warp_Size_Log2_MaskL
+992,  // GV_Threads
+256,  // GV_Slot_Size
+32,   // GV_Warp_Size
+1024, // GV_Max_Teams
+896,  // GV_SimpleBufferSize
+1024, // GV_Max_WG_Size
+128,  // GV_Default_WG_Size
 };
 
+static_assert(32 * 256 == NVPTXGridValues.warpSlotSize(), "");
+static_assert(1024 / 32 == NVPTXGridValues.maxWarpNumber(), "");
+
 } // namespace omp
 } // namespace llvm
 
Index: clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
===
--- clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
+++ clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
@@ -22,6 +22,7 @@
 #include "llvm/ADT/SmallPtrSet.h"
 #include "llvm/Frontend/OpenMP/OMPGridValues.h"
 #include "llvm/IR/IntrinsicsNVPTX.h"
+#include "llvm/Support/MathExtras.h"
 
 using namespace clang;
 using namespace CodeGen;
@@ -106,8 +107,7 @@
 /// is the same for all known NVPTX architectures.
 enum MachineConfiguration : unsigned {
   /// See "llvm/Frontend/OpenMP/OMPGridValues.h" for various related target
-  /// specific Grid Values like GV_Warp_Size, 

[PATCH] D108320: Add semantic token modifier for non-const reference parameter

2021-08-19 Thread Tom Praschan via Phabricator via cfe-commits
tom-anders added inline comments.



Comment at: clang-tools-extra/clangd/SemanticHighlighting.cpp:538
+for (size_t I = 0; I < FD->getNumParams(); ++I) {
+  if (const auto *Param = FD->getParamDecl(I)) {
+auto T = Param->getType();

sammccall wrote:
> I feel like you'd be better off using the FunctionProtoType and iterating 
> over argument types, rather than the argument declarations on a particular 
> declaration of the function.
> 
> e.g. this code is legal in C:
> ```
> int x(); // i suspect this is the canonical decl
> int x(int); // but this one provides the type
> ```
> We don't have references in C of course!, but maybe similar issues lurking...
I'm not really sure how to get from the CallExpr to the FunctionProtoType, can 
you give me a hint? 



Comment at: clang-tools-extra/clangd/SemanticHighlighting.cpp:547
+if (isa(Arg)) {
+  Location = Arg->getBeginLoc();
+} else if (auto *M = dyn_cast(Arg)) {

sammccall wrote:
> tom-anders wrote:
> > sammccall wrote:
> > > nridge wrote:
> > > > For a qualified name (e.g. `A::B`), I think this is going to return the 
> > > > beginning of the qualifier, whereas we only want to highlight the last 
> > > > name (otherwise there won't be a matching token from the first pass).
> > > > 
> > > > So I think we want `getLocation()` instead.
> > > > 
> > > > (Also makes a good test case.)
> > > And getLocation() will do the right thing for DeclRefExpr, MemberExpr, 
> > > and others, so this can just be `isa` with no 
> > > need for dyn_cast.
> > I'm not sure which getLocation() you're talking about here. There's 
> > DeclRefExpr::getLocation(), but neither Expr::getLocation() nor 
> > MemberExpr::getLocation(). Am I missing something?
> No, I think I'm just going mad (I was thinking of Decl::getLocation I guess).
> Never mind and sorry!
np :D 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108320/new/

https://reviews.llvm.org/D108320



[PATCH] D108377: [asan] Implemented flag to emit intrinsics to optimize ASan callbacks.

2021-08-19 Thread Kirill Stoimenov via Phabricator via cfe-commits
kstoimenov updated this revision to Diff 367585.
kstoimenov added a comment.

Removed unused encodeMemToShadowInfo.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108377/new/

https://reviews.llvm.org/D108377

Files:
  clang/test/CodeGen/asan-use-callbacks.cpp
  llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp


Index: llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp
===
--- llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp
+++ llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp
@@ -348,6 +348,10 @@
 static cl::opt<bool> ClOpt("asan-opt", cl::desc("Optimize instrumentation"),
cl::Hidden, cl::init(true));
 
+static cl::opt<bool> ClOptimizeCallbacks("asan-optimize-callbacks",
+ cl::desc("Optimize callbacks"),
+ cl::Hidden, cl::init(false));
+
 static cl::opt<bool> ClOptSameTemp(
 "asan-opt-same-temp", cl::desc("Instrument the same temp just once"),
 cl::Hidden, cl::init(true));
@@ -634,6 +638,7 @@
 C = &(M.getContext());
 LongSize = M.getDataLayout().getPointerSizeInBits();
 IntptrTy = Type::getIntNTy(*C, LongSize);
+Int8PtrTy = Type::getInt8PtrTy(*C);
 TargetTriple = Triple(M.getTargetTriple());
 
 Mapping = getShadowMapping(TargetTriple, LongSize, this->CompileKernel);
@@ -724,6 +729,7 @@
   bool UseAfterScope;
   AsanDetectStackUseAfterReturnMode UseAfterReturn;
   Type *IntptrTy;
+  Type *Int8PtrTy;
   ShadowMapping Mapping;
   FunctionCallee AsanHandleNoReturnFunc;
   FunctionCallee AsanPtrCmpFunction, AsanPtrSubFunction;
@@ -1753,12 +1759,21 @@
   size_t AccessSizeIndex = TypeSizeToSizeIndex(TypeSize);
 
   if (UseCalls) {
-if (Exp == 0)
-  IRB.CreateCall(AsanMemoryAccessCallback[IsWrite][0][AccessSizeIndex],
- AddrLong);
-else
-  IRB.CreateCall(AsanMemoryAccessCallback[IsWrite][1][AccessSizeIndex],
- {AddrLong, ConstantInt::get(IRB.getInt32Ty(), Exp)});
+if (ClOptimizeCallbacks) {
+  Value *Ptr8 = IRB.CreatePointerCast(Addr, Int8PtrTy);
+  Module *M = IRB.GetInsertBlock()->getParent()->getParent();
+  IRB.CreateCall(
+  Intrinsic::getDeclaration(M, Intrinsic::asan_check_memaccess),
+  {Ptr8, ConstantInt::get(IRB.getInt8Ty(), IsWrite),
+   ConstantInt::get(IRB.getInt8Ty(), AccessSizeIndex)});
+} else {
+  if (Exp == 0)
+IRB.CreateCall(AsanMemoryAccessCallback[IsWrite][0][AccessSizeIndex],
+   AddrLong);
+  else
+IRB.CreateCall(AsanMemoryAccessCallback[IsWrite][1][AccessSizeIndex],
+   {AddrLong, ConstantInt::get(IRB.getInt32Ty(), Exp)});
+}
 return;
   }
 
Index: clang/test/CodeGen/asan-use-callbacks.cpp
===
--- clang/test/CodeGen/asan-use-callbacks.cpp
+++ clang/test/CodeGen/asan-use-callbacks.cpp
@@ -1,12 +1,18 @@
-// RUN: %clang -target x86_64-linux-gnu -S -emit-llvm -fsanitize=address \
-// RUN: -o - %s \
+// RUN: %clang -target x86_64-linux-gnu -S -emit-llvm -o - \
+// RUN: -fsanitize=address %s \
 // RUN: | FileCheck %s --check-prefixes=CHECK-NO-OUTLINE
 // RUN: %clang -target x86_64-linux-gnu -S -emit-llvm -o - \
 // RUN: -fsanitize=address %s -fsanitize-address-outline-instrumentation \
 // RUN: | FileCheck %s --check-prefixes=CHECK-OUTLINE
+// RUN: %clang -target x86_64-linux-gnu -S -emit-llvm -o - \
+// RUN: -fsanitize=address %s -fsanitize-address-outline-instrumentation \
+// RUN: -mllvm -asan-optimize-callbacks \
+// RUN: | FileCheck %s --check-prefixes=CHECK-OPTIMIZED
+
 
 // CHECK-NO-OUTLINE-NOT: call{{.*}}@__asan_load4
 // CHECK-OUTLINE: call{{.*}}@__asan_load4
+// CHECK-OPTIMIZED: call{{.*}}@llvm.asan.check.memaccess(i8*{{.*}}, i64{{.*}}, i32{{.*}})
 
 int deref(int *p) {
   return *p;



[PATCH] D108003: [Clang] Extend -Wbool-operation to warn about bitwise and of bools with side effects

2021-08-19 Thread Dávid Bolvanský via Phabricator via cfe-commits
xbolva00 added a comment.

In D108003#2955058, @rpbeltran wrote:

> This patch seems like a great contribution! Really glad to see this being 
> added. I did have a question though on why this only appears to catch "&" vs 
> "&&" instead of doing the same for "|" vs "||". It seems like both operators 
> have roughly the same potential for confusion. Could we add support for 
> bitwise vs logical or in this?

Yeah, I mentioned this in the first comment as part of the “open questions”. 
Basically, we first need to find reasonable heuristics for when to warn; then 
support for | should be added.
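
For readers following along, the confusion under discussion can be reproduced with a small standalone sketch. The counter and function names are made up for the illustration; nothing here is taken from the patch, and compilers may warn on the bitwise form, which is the point of the patch.

```cpp
#include <cassert>

static int Calls = 0;

// A bool-returning function with a visible side effect.
bool check(bool Result) {
  ++Calls;
  return Result;
}

// Logical and short-circuits: the second call is skipped when the
// first operand is false.
int callsWithLogicalAnd() {
  Calls = 0;
  bool R = check(false) && check(true);
  (void)R;
  return Calls;
}

// Bitwise and evaluates both operands, so both side effects happen.
int callsWithBitwiseAnd() {
  Calls = 0;
  bool R = check(false) & check(true);
  (void)R;
  return Calls;
}
```

The same difference exists between `|` and `||`, which is why the question about extending the warning comes up.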


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108003/new/

https://reviews.llvm.org/D108003



[PATCH] D107850: [asan] Implemented custom calling convention similar to the one used by HWASan for X86.

2021-08-19 Thread Evgenii Stepanov via Phabricator via cfe-commits
eugenis added inline comments.



Comment at: llvm/include/llvm/IR/Intrinsics.td:1640
 
+def int_asan_check_memaccess :
+  Intrinsic<[],[llvm_ptr_ty, llvm_i8_ty, llvm_i8_ty],

vitalybuka wrote:
> kstoimenov wrote:
> > vitalybuka wrote:
> > > vitalybuka wrote:
> > > > PTAL at llvm.read_register.i32
> > > > 
> > > > How about:
> > > > 
> > > > llvm.asan.check.memaccess ->
> > > >   llvm.asan.check_read
> > > >   llvm.asan.check_write
> > > >   llvm.asan.kernel.check_read
> > > >   llvm.asan.kernel.check_write
> > > > 
> > > > Even better
> > > >   llvm.asan.check_read.{i8, i16, i32, ...}
> > > >   llvm.asan.check_write.{i8, i16, i32, ...}
> > > >   llvm.asan.kernel.check_read.{i8, i16, i32, ...}
> > > >   llvm.asan.kernel.check_write.{i8, i16, i32, ...}
> > > > 
> > > Looks like underscore is not used in intrinsic names, so essentially the 
> > > same with dots.
> > Sounds good to me. I do the full expansion so there will be 20 intrinsics 
> > altogether. I will update the code and ping you when done. 
> @pcc @eugenis 
> WDYT, I think later we can do the same for HWASAN?
I don't see what these multiple intrinsics give us that a single memaccess one 
does not provide, as long as the access type and similar arguments are 
immediates.




Comment at: llvm/include/llvm/IR/Intrinsics.td:1642
+  Intrinsic<[],[llvm_ptr_ty, llvm_i8_ty, llvm_i8_ty],
+[IntrInaccessibleMemOnly, ImmArg<ArgIndex<1>>, ImmArg<ArgIndex<2>>]>;
+

We've just removed IntrInaccessibleMemOnly from hwasan. This needs to alias 
shadow updates.



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107850/new/

https://reviews.llvm.org/D107850



[PATCH] D108380: [openmp][nfc] Refactor GridValues

2021-08-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments.



Comment at: llvm/include/llvm/Frontend/OpenMP/OMPGridValues.h:102
+return R;
+  }
 };

jdoerfert wrote:
> JonChesterfield wrote:
> > jdoerfert wrote:
> > > It should be in the device rtl then, no?
> > This header is currently used from clang and the (amdgpu, could also be 
> > cuda if we like) host plugin. Possibly also from llvm. As of D108391 it 
> > would be used from the devicertl.
> > 
> > The idea is to have a single source of truth for the various magic numbers 
> > that the pieces should agree on and llvm is the common point on the 
> > dependency tree. I'm currently interested in that because I want to change 
> > some of them for gfx10 and have that magically ripple through the 
> > components. I'm not totally confident that will work out nicely for the 
> > host plugin as it has to dynamically handle different architectures but I 
> > think it'll be good enough.
> > 
> > It's not totally ideal to hand spin a function that is in the math support 
> > header but I also don't want to try to make various llvm headers 
> > ffreestanding-safe.
> >  It's not totally ideal to hand spin a function that is in the math support 
> > header but I also don't want to try to make various llvm headers 
> > ffreestanding-safe.
> 
> The function is only needed in the device rtl. Put it in the device rtl.
This particular function is only called by warpSizeLog2 which is currently only 
used by CGOpenMPRuntimeGPU. The deviceRTL doesn't call the function. However if 
this header includes the rest of llvm support then it can't call any of the 
others either.

I'm going to drop the log2 accessor (and this function) in favour of two calls 
into math support from CGOpenMPRuntime.
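
A rough sketch of the resulting shape, assuming a trimmed-down struct: derived quantities become constexpr accessors on the grid-values struct, while log2 and the mask are computed at the call site. `GridValues`, `log2Floor` and `warpMask` are hypothetical names for this illustration; `log2Floor` merely stands in for whichever math-support helper the patch ends up calling.

```cpp
#include <cassert>

// Hypothetical, trimmed-down version of the grid-values struct.
struct GridValues {
  unsigned WarpSize;
  unsigned SlotSize;
  unsigned MaxWGSize;

  // Derived values are computed on demand instead of being stored.
  constexpr unsigned warpSlotSize() const { return WarpSize * SlotSize; }
  constexpr unsigned maxWarpNumber() const { return MaxWGSize / WarpSize; }
};

// Stand-in for a math-support log2 helper; lives with the caller,
// not in the header.
constexpr unsigned log2Floor(unsigned V) {
  unsigned R = 0;
  while (V >>= 1)
    ++R;
  return R;
}

// Caller-side replacement for the removed log2 mask field.
constexpr unsigned warpMask(const GridValues &GV) {
  return (1u << log2Floor(GV.WarpSize)) - 1;
}

constexpr GridValues SketchAMDGPU{64, 256, 1024};
constexpr GridValues SketchNVPTX{32, 256, 1024};

static_assert(SketchAMDGPU.warpSlotSize() == 64 * 256, "");
static_assert(SketchNVPTX.maxWarpNumber() == 1024 / 32, "");
```

With the warp sizes from the diff, the caller-side computation reproduces the values that were previously hard-coded: log2 of 6 and mask 63 for AMDGPU, log2 of 5 and mask 31 for NVPTX.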


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108380/new/

https://reviews.llvm.org/D108380



[PATCH] D108377: [asan] Implemented custom calling convention similar used by HWASan for X86.

2021-08-19 Thread Kirill Stoimenov via Phabricator via cfe-commits
kstoimenov updated this revision to Diff 367577.
kstoimenov added a comment.

Update before expanding intrinsics.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108377/new/

https://reviews.llvm.org/D108377

Files:
  clang/test/CodeGen/asan-use-callbacks.cpp
  llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp


Index: llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp
===
--- llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp
+++ llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp
@@ -348,6 +348,10 @@
 static cl::opt<bool> ClOpt("asan-opt", cl::desc("Optimize instrumentation"),
cl::Hidden, cl::init(true));
 
+static cl::opt<bool> ClOptimizeCallbacks("asan-optimize-callbacks",
+ cl::desc("Optimize callbacks"),
+ cl::Hidden, cl::init(false));
+
 static cl::opt<bool> ClOptSameTemp(
 "asan-opt-same-temp", cl::desc("Instrument the same temp just once"),
 cl::Hidden, cl::init(true));
@@ -634,6 +638,7 @@
 C = &(M.getContext());
 LongSize = M.getDataLayout().getPointerSizeInBits();
 IntptrTy = Type::getIntNTy(*C, LongSize);
+Int8PtrTy = Type::getInt8PtrTy(*C);
 TargetTriple = Triple(M.getTargetTriple());
 
 Mapping = getShadowMapping(TargetTriple, LongSize, this->CompileKernel);
@@ -684,6 +689,7 @@
  Value *SizeArgument, uint32_t Exp);
   void instrumentMemIntrinsic(MemIntrinsic *MI);
   Value *memToShadow(Value *Shadow, IRBuilder<> );
+  void encodeMemToShadowInfo(int64_t *AccessInfo);
   bool suppressInstrumentationSiteForDebug(int );
   bool instrumentFunction(Function , const TargetLibraryInfo *TLI);
   bool maybeInsertAsanInitAtFunctionEntry(Function );
@@ -724,6 +730,7 @@
   bool UseAfterScope;
   AsanDetectStackUseAfterReturnMode UseAfterReturn;
   Type *IntptrTy;
+  Type *Int8PtrTy;
   ShadowMapping Mapping;
   FunctionCallee AsanHandleNoReturnFunc;
   FunctionCallee AsanPtrCmpFunction, AsanPtrSubFunction;
@@ -1753,12 +1760,21 @@
   size_t AccessSizeIndex = TypeSizeToSizeIndex(TypeSize);
 
   if (UseCalls) {
-if (Exp == 0)
-  IRB.CreateCall(AsanMemoryAccessCallback[IsWrite][0][AccessSizeIndex],
- AddrLong);
-else
-  IRB.CreateCall(AsanMemoryAccessCallback[IsWrite][1][AccessSizeIndex],
- {AddrLong, ConstantInt::get(IRB.getInt32Ty(), Exp)});
+if (ClOptimizeCallbacks) {
+  Value *Ptr8 = IRB.CreatePointerCast(Addr, Int8PtrTy);
+  Module *M = IRB.GetInsertBlock()->getParent()->getParent();
+  IRB.CreateCall(
+  Intrinsic::getDeclaration(M, Intrinsic::asan_check_memaccess),
+  {Ptr8, ConstantInt::get(IRB.getInt8Ty(), IsWrite),
+   ConstantInt::get(IRB.getInt8Ty(), AccessSizeIndex)});
+} else {
+  if (Exp == 0)
+IRB.CreateCall(AsanMemoryAccessCallback[IsWrite][0][AccessSizeIndex],
+   AddrLong);
+  else
+IRB.CreateCall(AsanMemoryAccessCallback[IsWrite][1][AccessSizeIndex],
+   {AddrLong, ConstantInt::get(IRB.getInt32Ty(), Exp)});
+}
 return;
   }
 
Index: clang/test/CodeGen/asan-use-callbacks.cpp
===
--- clang/test/CodeGen/asan-use-callbacks.cpp
+++ clang/test/CodeGen/asan-use-callbacks.cpp
@@ -1,12 +1,18 @@
-// RUN: %clang -target x86_64-linux-gnu -S -emit-llvm -fsanitize=address \
-// RUN: -o - %s \
+// RUN: %clang -target x86_64-linux-gnu -S -emit-llvm -o - \
+// RUN: -fsanitize=address %s \
 // RUN: | FileCheck %s --check-prefixes=CHECK-NO-OUTLINE
 // RUN: %clang -target x86_64-linux-gnu -S -emit-llvm -o - \
 // RUN: -fsanitize=address %s -fsanitize-address-outline-instrumentation \
 // RUN: | FileCheck %s --check-prefixes=CHECK-OUTLINE
+// RUN: %clang -target x86_64-linux-gnu -S -emit-llvm -o - \
+// RUN: -fsanitize=address %s -fsanitize-address-outline-instrumentation \
+// RUN: -mllvm -asan-optimize-callbacks \
+// RUN: | FileCheck %s --check-prefixes=CHECK-OPTIMIZED
+
 
 // CHECK-NO-OUTLINE-NOT: call{{.*}}@__asan_load4
 // CHECK-OUTLINE: call{{.*}}@__asan_load4
+// CHECK-OPTIMIZED: call{{.*}}@llvm.asan.check.memaccess(i8*{{.*}}, i64{{.*}}, i32{{.*}})
 
 int deref(int *p) {
   return *p;


Index: llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp
===
--- llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp
+++ llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp
@@ -348,6 +348,10 @@
 static cl::opt<bool> ClOpt("asan-opt", cl::desc("Optimize instrumentation"),
                            cl::Hidden, cl::init(true));
 
+static cl::opt<bool> ClOptimizeCallbacks("asan-optimize-callbacks",
+                                         cl::desc("Optimize callbacks"),
+   

[PATCH] D107850: [asan] Implemented custom calling convention similar to the one used by HWASan for X86.

2021-08-19 Thread Kirill Stoimenov via Phabricator via cfe-commits
kstoimenov updated this revision to Diff 367575.
kstoimenov added a comment.

Update before expanding intrinsics.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107850/new/

https://reviews.llvm.org/D107850

Files:
  llvm/include/llvm/IR/Intrinsics.td
  llvm/lib/Target/X86/X86AsmPrinter.cpp
  llvm/lib/Target/X86/X86AsmPrinter.h
  llvm/lib/Target/X86/X86InstrCompiler.td
  llvm/lib/Target/X86/X86MCInstLower.cpp
  llvm/lib/Target/X86/X86RegisterInfo.td
  llvm/test/CodeGen/X86/asan-check-memaccess-add.ll
  llvm/test/CodeGen/X86/asan-check-memaccess-or.ll

Index: llvm/test/CodeGen/X86/asan-check-memaccess-or.ll
===
--- /dev/null
+++ llvm/test/CodeGen/X86/asan-check-memaccess-or.ll
@@ -0,0 +1,234 @@
+; RUN: llc < %s | FileCheck %s
+
+target triple = "x86_64-unknown-linux-gnu"
+
+define void @load1(i8* nocapture readonly %x) {
+; CHECK:  callq   __asan_check_load1_rn[[RN1:.*]]
+; CHECK:  callq   __asan_check_store1_rn[[RN1]]
+; CHECK-NEXT: retq
+  call void @llvm.asan.check.memaccess(i8* %x, i64 2147450880, i32 0,
+   i32 0, i32 3, i32 1)
+  call void @llvm.asan.check.memaccess(i8* %x, i64 2147450880, i32 1,
+   i32 0, i32 3, i32 1)
+  ret void
+}
+
+define void @load2(i16* nocapture readonly %x) {
+; CHECK:  callq   __asan_check_load2_rn[[RN2:.*]]
+; CHECK:  callq   __asan_check_store2_rn[[RN2]]
+; CHECK-NEXT: retq
+  %1 = ptrtoint i16* %x to i64
+  %2 = bitcast i16* %x to i8*
+  call void @llvm.asan.check.memaccess(i8* %2, i64 2147450880, i32 0,
+   i32 1, i32 3, i32 1)
+  call void @llvm.asan.check.memaccess(i8* %2, i64 2147450880, i32 1,
+   i32 1, i32 3, i32 1)
+  ret void
+}
+
+define void @load4(i32* nocapture readonly %x) {
+; CHECK:  callq   __asan_check_load4_rn[[RN4:.*]]
+; CHECK:  callq   __asan_check_store4_rn[[RN4]]
+; CHECK-NEXT: retq
+  %1 = ptrtoint i32* %x to i64
+  %2 = bitcast i32* %x to i8*
+  call void @llvm.asan.check.memaccess(i8* %2, i64 2147450880, i32 0,
+   i32 2, i32 3, i32 1)
+  call void @llvm.asan.check.memaccess(i8* %2, i64 2147450880, i32 1,
+   i32 2, i32 3, i32 1)
+  ret void
+}
+define void @load8(i64* nocapture readonly %x) {
+; CHECK:  callq   __asan_check_load8_rn[[RN8:.*]]
+; CHECK:  callq   __asan_check_store8_rn[[RN8]]
+; CHECK-NEXT: retq
+  %1 = ptrtoint i64* %x to i64
+  %2 = bitcast i64* %x to i8*
+  call void @llvm.asan.check.memaccess(i8* %2, i64 2147450880, i32 0,
+   i32 3, i32 3, i32 1)
+  call void @llvm.asan.check.memaccess(i8* %2, i64 2147450880, i32 1,
+   i32 3, i32 3, i32 1)
+  ret void
+}
+
+define void @load16(i128* nocapture readonly %x) {
+; CHECK:  callq   __asan_check_load16_rn[[RN16:.*]]
+; CHECK:  callq   __asan_check_store16_rn[[RN16]]
+; CHECK-NEXT: retq
+  %1 = ptrtoint i128* %x to i64
+  %2 = bitcast i128* %x to i8*
+  call void @llvm.asan.check.memaccess(i8* %2, i64 2147450880, i32 0,
+   i32 4, i32 3, i32 1)
+  call void @llvm.asan.check.memaccess(i8* %2, i64 2147450880, i32 1,
+   i32 4, i32 3, i32 1)
+  ret void
+}
+
+; CHECK:      __asan_check_load1_rn[[RN1]]:
+; CHECK-NEXT: movq [[REG:.*]], %r8
+; CHECK-NEXT: shrq $3, %r8
+; CHECK-NEXT: orq $2147450880, %r8{{.*}}
+; CHECK-NEXT: movb (%r8), %r8b
+; CHECK-NEXT: testb %r8b, %r8b
+; CHECK-NEXT: jne [[EXTRA:.*]]
+; CHECK-NEXT: [[RET:.*]]:
+; CHECK-NEXT: retq
+; CHECK-NEXT: [[EXTRA]]:
+; CHECK-NEXT: pushq %rcx
+; CHECK-NEXT: movq [[REG]], %rcx
+; CHECK-NEXT: andl $7, %ecx
+; CHECK-NEXT: cmpl %r8d, %ecx
+; CHECK-NEXT: popq %rcx
+; CHECK-NEXT: jl [[RET]]
+; CHECK-NEXT: movq [[REG:.*]], %rdi
+; CHECK-NEXT: jmp __asan_report_load1
+
+; CHECK:      __asan_check_load2_rn[[RN2]]:
+; CHECK-NEXT: movq [[REG:.*]], %r8
+; CHECK-NEXT: shrq $3, %r8
+; CHECK-NEXT: orq $2147450880, %r8{{.*}}
+; CHECK-NEXT: movb (%r8), %r8b
+; CHECK-NEXT: testb %r8b, %r8b
+; CHECK-NEXT: jne [[EXTRA:.*]]
+; CHECK-NEXT: [[RET:.*]]:
+; CHECK-NEXT: retq
+; CHECK-NEXT: [[EXTRA]]:
+; CHECK-NEXT: pushq %rcx
+; CHECK-NEXT: movq [[REG]], %rcx
+; CHECK-NEXT: andl $7, %ecx
+; CHECK-NEXT: addl $1, %ecx
+; CHECK-NEXT: cmpl %r8d, %ecx
+; CHECK-NEXT: popq %rcx
+; CHECK-NEXT: jl [[RET]]
+; CHECK-NEXT: 
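
For readers following the CHECK lines above, the emitted check sequence can be paraphrased in C++. This is only a sketch under stated assumptions: `shadowAddress` and `shouldReport` are hypothetical names, the constant is the one that appears in the test (2147450880), and the slow path generalizes the `addl $1` seen in the 2-byte case to `size - 1`, which is the usual ASan rule for accesses smaller than 8 bytes. Wider accesses are not modeled because their expected output is not visible here.

```cpp
#include <cstdint>

// Shadow offset used by the test above: 2147450880 == 0x7fff8000.
constexpr std::uint64_t kShadowOffset = 0x7fff8000ULL;

// Shadow address as the "or" variant of the test computes it:
// shift the address right by 3, then OR in the offset.
inline std::uint64_t shadowAddress(std::uint64_t Addr) {
  return (Addr >> 3) | kShadowOffset;
}

// Hypothetical helper: given the shadow byte that was loaded for Addr, decide
// whether an access of AccessSize bytes (1, 2 or 4) should be reported.
inline bool shouldReport(std::uint64_t Addr, std::int8_t ShadowByte,
                         unsigned AccessSize) {
  if (ShadowByte == 0)
    return false; // fast path (testb/jne not taken): whole granule is valid
  // Slow path: offset of the last accessed byte within the 8-byte granule.
  int LastByte = static_cast<int>(Addr & 7) + static_cast<int>(AccessSize) - 1;
  // Mirrors `cmpl` + `jl RET`: return quietly only when LastByte < ShadowByte.
  return LastByte >= ShadowByte;
}
```

A caller would load the byte at `shadowAddress(Addr)` and pass it to `shouldReport`; the load itself is left out so the sketch stays independent of any real shadow mapping.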

[PATCH] D107850: [asan] Implemented custom calling convention similar to the one used by HWASan for X86.

2021-08-19 Thread Vitaly Buka via Phabricator via cfe-commits
vitalybuka added a subscriber: eugenis.
vitalybuka added inline comments.



Comment at: llvm/include/llvm/IR/Intrinsics.td:1640
 
+def int_asan_check_memaccess :
+  Intrinsic<[],[llvm_ptr_ty, llvm_i8_ty, llvm_i8_ty],

kstoimenov wrote:
> vitalybuka wrote:
> > vitalybuka wrote:
> > > PTAL at llvm.read_register.i32
> > > 
> > > How about:
> > > 
> > > llvm.asan.check.memaccess ->
> > >   llvm.asan.check_read
> > >   llvm.asan.check_write
> > >   llvm.asan.kernel.check_read
> > >   llvm.asan.kernel.check_write
> > > 
> > > Even better
> > >   llvm.asan.check_read.{i8, i16, i32, ...}
> > >   llvm.asan.check_write.{i8, i16, i32, ...}
> > >   llvm.asan.kernel.check_read.{i8, i16, i32, ...}
> > >   llvm.asan.kernel.check_write.{i8, i16, i32, ...}
> > > 
> > Looks like underscore is not used in intrinsic names, so essentially the 
> > same with dots.
> Sounds good to me. I do the full expansion so there will be 20 intrinsics 
> altogether. I will update the code and ping you when done. 
@pcc @eugenis 
WDYT, I think later we can do the same for HWASAN?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107850/new/

https://reviews.llvm.org/D107850

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D108380: [openmp][nfc] Refactor GridValues

2021-08-19 Thread Johannes Doerfert via Phabricator via cfe-commits
jdoerfert added inline comments.



Comment at: llvm/include/llvm/Frontend/OpenMP/OMPGridValues.h:102
+return R;
+  }
 };

JonChesterfield wrote:
> jdoerfert wrote:
> > It should be in the device rtl then, no?
> This header is currently used from clang and the (amdgpu, could also be cuda 
> if we like) host plugin. Possibly also from llvm. As of D108391 it would be 
> used from the devicertl.
> 
> The idea is to have a single source of truth for the various magic numbers 
> that the pieces should agree on and llvm is the common point on the 
> dependency tree. I'm currently interested in that because I want to change 
> some of them for gfx10 and have that magically ripple through the components. 
> I'm not totally confident that will work out nicely for the host plugin as it 
> has to dynamically handle different architectures but I think it'll be good 
> enough.
> 
> It's not totally ideal to hand spin a function that is in the math support 
> header but I also don't want to try to make various llvm headers 
> ffreestanding-safe.
>  It's not totally ideal to hand spin a function that is in the math support 
> header but I also don't want to try to make various llvm headers 
> ffreestanding-safe.

The function is only needed in the device rtl. Put it in the device rtl.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108380/new/

https://reviews.llvm.org/D108380



[PATCH] D108320: Add semantic token modifier for non-const reference parameter

2021-08-19 Thread Sam McCall via Phabricator via cfe-commits
sammccall added inline comments.



Comment at: clang-tools-extra/clangd/SemanticHighlighting.cpp:547
+if (isa(Arg)) {
+  Location = Arg->getBeginLoc();
+} else if (auto *M = dyn_cast(Arg)) {

tom-anders wrote:
> sammccall wrote:
> > nridge wrote:
> > > For a qualified name (e.g. `A::B`), I think this is going to return the 
> > > beginning of the qualifier, whereas we only want to highlight the last 
> > > name (otherwise there won't be a matching token from the first pass).
> > > 
> > > So I think we want `getLocation()` instead.
> > > 
> > > (Also makes a good test case.)
> > And getLocation() will do the right thing for DeclRefExpr, MemberExpr, and 
> > others, so this can just be `isa` with no need for 
> > dyn_cast.
> I'm not sure which getLocation() you're talking about here. There's 
> DeclRefExpr::getLocation(), but neither Expr::getLocation() nor 
> MemberExpr::getLocation(). Am I missing something?
No, I think I'm just going mad (I was thinking of Decl::getLocation I guess).
Never mind and sorry!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108320/new/

https://reviews.llvm.org/D108320



[PATCH] D107850: [asan] Implemented custom calling convention similar to the one used by HWASan for X86.

2021-08-19 Thread Kirill Stoimenov via Phabricator via cfe-commits
kstoimenov marked an inline comment as done.
kstoimenov added inline comments.



Comment at: llvm/include/llvm/IR/Intrinsics.td:1640
 
+def int_asan_check_memaccess :
+  Intrinsic<[],[llvm_ptr_ty, llvm_i8_ty, llvm_i8_ty],

vitalybuka wrote:
> vitalybuka wrote:
> > PTAL at llvm.read_register.i32
> > 
> > How about:
> > 
> > llvm.asan.check.memaccess ->
> >   llvm.asan.check_read
> >   llvm.asan.check_write
> >   llvm.asan.kernel.check_read
> >   llvm.asan.kernel.check_write
> > 
> > Even better
> >   llvm.asan.check_read.{i8, i16, i32, ...}
> >   llvm.asan.check_write.{i8, i16, i32, ...}
> >   llvm.asan.kernel.check_read.{i8, i16, i32, ...}
> >   llvm.asan.kernel.check_write.{i8, i16, i32, ...}
> > 
> Looks like underscore is not used in intrinsic names, so essentially the same 
> with dots.
Sounds good to me. I do the full expansion so there will be 20 intrinsics 
altogether. I will update the code and ping you when done. 
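
One way to arrive at the count of 20 is 2 operations x 2 modes (kernel and non-kernel) x 5 access widths. The sketch below enumerates that product. It is an illustration only: the function name is hypothetical, the dotted spelling of the names is a guess based on the remark that underscores are not used in intrinsic names, and the five widths (i8 through i128) are assumed from the load1 through load16 functions in the test.

```cpp
#include <string>
#include <vector>

// Hypothetical enumeration of the fully expanded intrinsic names.
// Neither the spelling nor the set of widths is confirmed by the patch.
std::vector<std::string> expandedIntrinsicNames() {
  const char *Modes[] = {"", "kernel."};
  const char *Ops[] = {"read", "write"};
  const char *Widths[] = {"i8", "i16", "i32", "i64", "i128"};

  std::vector<std::string> Names;
  for (const char *Mode : Modes)
    for (const char *Op : Ops)
      for (const char *Width : Widths)
        Names.push_back(std::string("llvm.asan.") + Mode + "check." + Op +
                        "." + Width);
  return Names; // 2 * 2 * 5 == 20 entries
}
```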


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107850/new/

https://reviews.llvm.org/D107850



[PATCH] D80392: [mips][mc][clang] Use pc-relative relocations in .eh_frame

2021-08-19 Thread Fangrui Song via Phabricator via cfe-commits
MaskRay added a comment.

If you want to do this properly, you may look at how -fbinutils-version= was 
implemented.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80392/new/

https://reviews.llvm.org/D80392



[PATCH] D80392: [mips][mc][clang] Use pc-relative relocations in .eh_frame

2021-08-19 Thread Fangrui Song via Phabricator via cfe-commits
MaskRay requested changes to this revision.
MaskRay added a comment.
This revision now requires changes to proceed.

`MCTargetOptionsCommandFlags.cpp` may be a better place for the internal 
cl::opt option.




Comment at: llvm/lib/MC/MCObjectFileInfo.cpp:30
+cl::opt
+MipsPC64Relocation("mmips-pc64-rel", cl::init(true),
+   cl::desc("Use MIPS 64-bit PC-relative relocations"));

The -m prefix is for assembler and driver options.

Internal codegen options don't need the prefix.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80392/new/

https://reviews.llvm.org/D80392



[PATCH] D105265: [X86] AVX512FP16 instructions enabling 3/6

2021-08-19 Thread Vitaly Buka via Phabricator via cfe-commits
vitalybuka added a subscriber: kstoimenov.
vitalybuka added a comment.

FYI @kstoimenov


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105265/new/

https://reviews.llvm.org/D105265



[PATCH] D108320: Add semantic token modifier for non-const reference parameter

2021-08-19 Thread Tom Praschan via Phabricator via cfe-commits
tom-anders planned changes to this revision.
tom-anders added inline comments.



Comment at: clang-tools-extra/clangd/SemanticHighlighting.cpp:547
+if (isa(Arg)) {
+  Location = Arg->getBeginLoc();
+} else if (auto *M = dyn_cast(Arg)) {

sammccall wrote:
> nridge wrote:
> > For a qualified name (e.g. `A::B`), I think this is going to return the 
> > beginning of the qualifier, whereas we only want to highlight the last name 
> > (otherwise there won't be a matching token from the first pass).
> > 
> > So I think we want `getLocation()` instead.
> > 
> > (Also makes a good test case.)
> And getLocation() will do the right thing for DeclRefExpr, MemberExpr, and 
> others, so this can just be `isa` with no need for 
> dyn_cast.
I'm not sure which getLocation() you're talking about here. There's 
DeclRefExpr::getLocation(), but neither Expr::getLocation() nor 
MemberExpr::getLocation(). Am I missing something?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108320/new/

https://reviews.llvm.org/D108320



[PATCH] D105265: [X86] AVX512FP16 instructions enabling 3/6

2021-08-19 Thread Vitaly Buka via Phabricator via cfe-commits
vitalybuka added a comment.

I suspect this error comes from this patch or D105331:
https://lab.llvm.org/buildbot/#/builders/85/builds/6132


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105265/new/

https://reviews.llvm.org/D105265



[PATCH] D108366: [clang][deps] Deduce resource directory from the compiler path

2021-08-19 Thread Duncan P. N. Exon Smith via Phabricator via cfe-commits
dexonsmith added a comment.

The patch seems mostly straightforward, but can you clarify whether there was a 
functionality change for clang-scan-deps? My reading of the code suggests not, 
because it was using ResourceDirectoryCache before and still will... in which 
case, can you walk me through how the test covers this code change? (Maybe some 
comments in the test would help.)

In D108366#2954380, @jansvoboda11 wrote:

> Tagging @kousikk, since this is related to D69122 that introduced
> `ResourceDirectoryCache` to `clang-scan-deps`. When the compilation command
> doesn't have a `-resource-dir` argument, `ResourceDirectoryCache` invokes the
> specified compiler with `-print-resource-dir` and injects the result into the
> command-line as `-resource-dir`.
>
> This happens way before the dependency scanner worker is invoked, meaning the 
> logic this patch tweaks won't usually kick in. The test passes only because 
> the invocation of `/our/custom/bin/clang -print-resource-dir` made by 
> `ResourceDirectoryCache` silently fails (the binary doesn't exist), allowing 
> the worker to deduce the resource directory using regular driver logic.
>
> I think both `clang-scan-deps` and the downstream libclang API clearly head 
> towards only supporting compilers built the same way (same version, 
> architecture, etc.). The modular dependency scanner already returns 
> command-lines of cc1 arguments that are not stable across Clang versions.
>
> I wanted to see if we can reach consensus on removing 
> `ResourceDirectoryCache` entirely. It makes the resource directory deduction 
> much more lightweight and is in line with the direction we're already going 
> regarding compiler compatibility. It also allows `clang-scan-deps` and 
> libclang API to have the same behavior, which is a desirable property IMO. If 
> users really want the behavior of `ResourceDirectoryCache`, they can keep 
> using prior versions of `clang-scan-deps`.
>
> What do you all think?

Jan and I already talked offline, but FTR, I agree with the direction of 
expecting the "dependency scanning" APIs to run from a libclang/clang-scan-deps 
that's installed alongside clang itself. This is required for the dependency 
scanning to be correct anyway (since different compilers could have different 
pre-defined macros).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108366/new/

https://reviews.llvm.org/D108366



Re: [clang] f22e586 - [Sema] CheckObjCBridgeNSCast - fix dead code warning. NFCI.

2021-08-19 Thread Vitaly Buka via cfe-commits
Sorry, it's not.

On Thu, 19 Aug 2021 at 11:21, Vitaly Buka wrote:

> Could this be a crash from this patch
> https://lab.llvm.org/buildbot/#/builders/85/builds/6135/steps/10/logs/stdio
> ?
>
> On Wed, 18 Aug 2021 at 03:53, Simon Pilgrim via cfe-commits <
> cfe-commits@lists.llvm.org> wrote:
>
>>
>> Author: Simon Pilgrim
>> Date: 2021-08-18T11:53:27+01:00
>> New Revision: f22e5869a012af09e130d804bba441eff261b1fe
>>
>> URL:
>> https://github.com/llvm/llvm-project/commit/f22e5869a012af09e130d804bba441eff261b1fe
>> DIFF:
>> https://github.com/llvm/llvm-project/commit/f22e5869a012af09e130d804bba441eff261b1fe.diff
>>
>> LOG: [Sema] CheckObjCBridgeNSCast - fix dead code warning. NFCI.
>>
>> Target is only ever non-null when we find an existing type, so move its
>> declaration inside that case, and remove the dead code where Target was
>> always null.
>>
>> Added:
>>
>>
>> Modified:
>> clang/lib/Sema/SemaExprObjC.cpp
>>
>> Removed:
>>
>>
>>
>>
>> 
>> diff  --git a/clang/lib/Sema/SemaExprObjC.cpp
>> b/clang/lib/Sema/SemaExprObjC.cpp
>> index 8a9c933fc93f..9e46801ea508 100644
>> --- a/clang/lib/Sema/SemaExprObjC.cpp
>> +++ b/clang/lib/Sema/SemaExprObjC.cpp
>> @@ -4015,12 +4015,11 @@ static bool CheckObjCBridgeNSCast(Sema &S,
>> QualType castType, Expr *castExpr,
>>  if (Parm->isStr("id"))
>>return true;
>>
>> -NamedDecl *Target = nullptr;
>>  // Check for an existing type with this name.
>>  LookupResult R(S, DeclarationName(Parm), SourceLocation(),
>> Sema::LookupOrdinaryName);
>>  if (S.LookupName(R, S.TUScope)) {
>> -  Target = R.getFoundDecl();
>> +  NamedDecl *Target = R.getFoundDecl();
>>if (Target && isa<ObjCInterfaceDecl>(Target)) {
>>  ObjCInterfaceDecl *ExprClass =
>> cast<ObjCInterfaceDecl>(Target);
>>  if (const ObjCObjectPointerType *InterfacePointerType =
>> @@ -4056,8 +4055,6 @@ static bool CheckObjCBridgeNSCast(Sema &S, QualType
>> castType, Expr *castExpr,
>>   diag::err_objc_cf_bridged_not_interface)
>><< castExpr->getType() << Parm;
>>S.Diag(TDNDecl->getBeginLoc(), diag::note_declared_at);
>> -  if (Target)
>> -S.Diag(Target->getBeginLoc(), diag::note_declared_at);
>>  }
>>  return true;
>>}
>>
>>
>>
>> ___
>> cfe-commits mailing list
>> cfe-commits@lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
>>
>


Re: [clang] f22e586 - [Sema] CheckObjCBridgeNSCast - fix dead code warning. NFCI.

2021-08-19 Thread Vitaly Buka via cfe-commits
Could this be a crash from this patch
https://lab.llvm.org/buildbot/#/builders/85/builds/6135/steps/10/logs/stdio
?

On Wed, 18 Aug 2021 at 03:53, Simon Pilgrim via cfe-commits <
cfe-commits@lists.llvm.org> wrote:

>
> Author: Simon Pilgrim
> Date: 2021-08-18T11:53:27+01:00
> New Revision: f22e5869a012af09e130d804bba441eff261b1fe
>
> URL:
> https://github.com/llvm/llvm-project/commit/f22e5869a012af09e130d804bba441eff261b1fe
> DIFF:
> https://github.com/llvm/llvm-project/commit/f22e5869a012af09e130d804bba441eff261b1fe.diff
>
> LOG: [Sema] CheckObjCBridgeNSCast - fix dead code warning. NFCI.
>
> Target is only ever non-null when we find an existing type, so move its
> declaration inside that case, and remove the dead code where Target was
> always null.
>
> Added:
>
>
> Modified:
> clang/lib/Sema/SemaExprObjC.cpp
>
> Removed:
>
>
>
>
> 
> diff  --git a/clang/lib/Sema/SemaExprObjC.cpp
> b/clang/lib/Sema/SemaExprObjC.cpp
> index 8a9c933fc93f..9e46801ea508 100644
> --- a/clang/lib/Sema/SemaExprObjC.cpp
> +++ b/clang/lib/Sema/SemaExprObjC.cpp
> @@ -4015,12 +4015,11 @@ static bool CheckObjCBridgeNSCast(Sema &S,
> QualType castType, Expr *castExpr,
>  if (Parm->isStr("id"))
>return true;
>
> -NamedDecl *Target = nullptr;
>  // Check for an existing type with this name.
>  LookupResult R(S, DeclarationName(Parm), SourceLocation(),
> Sema::LookupOrdinaryName);
>  if (S.LookupName(R, S.TUScope)) {
> -  Target = R.getFoundDecl();
> +  NamedDecl *Target = R.getFoundDecl();
>if (Target && isa<ObjCInterfaceDecl>(Target)) {
>  ObjCInterfaceDecl *ExprClass =
> cast<ObjCInterfaceDecl>(Target);
>  if (const ObjCObjectPointerType *InterfacePointerType =
> @@ -4056,8 +4055,6 @@ static bool CheckObjCBridgeNSCast(Sema &S, QualType
> castType, Expr *castExpr,
>   diag::err_objc_cf_bridged_not_interface)
><< castExpr->getType() << Parm;
>S.Diag(TDNDecl->getBeginLoc(), diag::note_declared_at);
> -  if (Target)
> -S.Diag(Target->getBeginLoc(), diag::note_declared_at);
>  }
>  return true;
>}
>
>
>
> ___
> cfe-commits mailing list
> cfe-commits@lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
>


[PATCH] D108380: [openmp][nfc] Refactor GridValues

2021-08-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments.



Comment at: llvm/include/llvm/Frontend/OpenMP/OMPGridValues.h:102
+return R;
+  }
 };

jdoerfert wrote:
> It should be in the device rtl then, no?
This header is currently used from clang and the (amdgpu, could also be cuda if 
we like) host plugin. Possibly also from llvm. As of D108391 it would be used 
from the devicertl.

The idea is to have a single source of truth for the various magic numbers that 
the pieces should agree on and llvm is the common point on the dependency tree. 
I'm currently interested in that because I want to change some of them for 
gfx10 and have that magically ripple through the components. I'm not totally 
confident that will work out nicely for the host plugin as it has to 
dynamically handle different architectures but I think it'll be good enough.

It's not totally ideal to hand spin a function that is in the math support 
header but I also don't want to try to make various llvm headers 
ffreestanding-safe.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108380/new/

https://reviews.llvm.org/D108380



[PATCH] D108377: [asan] Implemented custom calling convention similar used by HWASan for X86.

2021-08-19 Thread Vitaly Buka via Phabricator via cfe-commits
vitalybuka added inline comments.



Comment at: llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp:571
+  getShadowMapping(TargetTriple, M.getDataLayout().getPointerSizeInBits(),
+   ClEnableKasan.getNumOccurrences() > 0);
+  *ShadowBase = Mapping.Offset;

Unfortunately we need Kernel as an argument, similar to RecoverShift in HWASan.
Think of CL flags as internal per-process constants: if a value is controlled 
by them, it's OK to assume the same value everywhere.
The Kernel value, however, can be set by the frontend (e.g. from clang) and 
need not match ClEnableKasan.

I propose a separate set of intrinsics for kernel/non-kernel to avoid fancy 
bit packing or unreadable 1/0 arguments.
Later we can clean up HWASan as well.



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108377/new/

https://reviews.llvm.org/D108377



[PATCH] D107850: [asan] Implemented custom calling convention similar to the one used by HWASan for X86.

2021-08-19 Thread Vitaly Buka via Phabricator via cfe-commits
vitalybuka added inline comments.



Comment at: llvm/include/llvm/IR/Intrinsics.td:1640
 
+def int_asan_check_memaccess :
+  Intrinsic<[],[llvm_ptr_ty, llvm_i8_ty, llvm_i8_ty],

vitalybuka wrote:
> PTAL at llvm.read_register.i32
> 
> How about:
> 
> llvm.asan.check.memaccess ->
>   llvm.asan.check_read
>   llvm.asan.check_write
>   llvm.asan.kernel.check_read
>   llvm.asan.kernel.check_write
> 
> Even better
>   llvm.asan.check_read.{i8, i16, i32, ...}
>   llvm.asan.check_write.{i8, i16, i32, ...}
>   llvm.asan.kernel.check_read.{i8, i16, i32, ...}
>   llvm.asan.kernel.check_write.{i8, i16, i32, ...}
> 
Looks like underscore is not used in intrinsic names, so essentially the same 
with dots.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107850/new/

https://reviews.llvm.org/D107850



[PATCH] D105177: [clangd] Implemented indexing of standard library

2021-08-19 Thread Nathan Ridge via Phabricator via cfe-commits
nridge added inline comments.



Comment at: clang-tools-extra/clangd/unittests/StdLibIndexTests.cpp:51
+  Req.AnyScope = true;
+  EXPECT_THAT(match(*Index, Req),
+  UnorderedElementsAre(llvm::StringRef("myfunc"),

kuhnel wrote:
> @sammccall I seem to be running into a use-after-free problem here. Debugging 
> the whole thing shows that `Index` is pointing to an invalid address. So the 
> problem is somewhere between returning the `unique_ptr` from 
> `indexUmbrellaHeaders(...)` and assigning it to the `Index` variable.
> 
> Can you please take a look and give me a hint how to fix this?
I think your issue may be that `Dex` doesn't actually take ownership of the 
slabs that get passed to it; the slabs 
[need to outlive it](https://searchfox.org/llvm/rev/cab7c52acdf508f73186dfe49b8cb012bb9129b2/clang-tools-extra/clangd/index/dex/Dex.h#39).

`Dex` has another constructor which allows it to also take ownership, and a 
[Dex::build()](https://searchfox.org/llvm/rev/cab7c52acdf508f73186dfe49b8cb012bb9129b2/clang-tools-extra/clangd/index/dex/Dex.cpp#26) 
helper function to call it -- you probably want to be using that.
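
To illustrate the lifetime issue in isolation, here is a minimal sketch that does not use clangd's real classes. `Slab`, `ViewIndex`, `OwningIndex` and `makeIndex` are hypothetical stand-ins: the first index type only points at data owned elsewhere, while the second takes ownership through a `build()` helper, which is the pattern suggested above.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in for a symbol slab; not the real clangd type.
using Slab = std::vector<std::string>;

// Non-owning index: valid only while the slab it points at is alive.
class ViewIndex {
public:
  explicit ViewIndex(const Slab &S) : Symbols(&S) {}
  std::size_t size() const { return Symbols->size(); }

private:
  const Slab *Symbols;
};

// Owning index: the slab is moved in, so it lives as long as the index.
class OwningIndex {
public:
  static std::unique_ptr<OwningIndex> build(Slab S) {
    return std::unique_ptr<OwningIndex>(new OwningIndex(std::move(S)));
  }
  std::size_t size() const { return Symbols.size(); }
  const std::string &at(std::size_t I) const { return Symbols[I]; }

private:
  explicit OwningIndex(Slab S) : Symbols(std::move(S)) {}
  Slab Symbols;
};

// Returning a ViewIndex over `Local` here would dangle once the function
// returns; handing the slab to build() keeps the data alive.
std::unique_ptr<OwningIndex> makeIndex() {
  Slab Local = {"myfunc", "otherfunc"};
  return OwningIndex::build(std::move(Local));
}
```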


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105177/new/

https://reviews.llvm.org/D105177



[PATCH] D107850: [asan] Implemented custom calling convention similar to the one used by HWASan for X86.

2021-08-19 Thread Vitaly Buka via Phabricator via cfe-commits
vitalybuka added inline comments.



Comment at: llvm/include/llvm/IR/Intrinsics.td:1640
 
+def int_asan_check_memaccess :
+  Intrinsic<[],[llvm_ptr_ty, llvm_i8_ty, llvm_i8_ty],

PTAL at llvm.read_register.i32

How about:

llvm.asan.check.memaccess ->
  llvm.asan.check_read
  llvm.asan.check_write
  llvm.asan.kernel.check_read
  llvm.asan.kernel.check_write

Even better
  llvm.asan.check_read.{i8, i16, i32, ...}
  llvm.asan.check_write.{i8, i16, i32, ...}
  llvm.asan.kernel.check_read.{i8, i16, i32, ...}
  llvm.asan.kernel.check_write.{i8, i16, i32, ...}



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107850/new/

https://reviews.llvm.org/D107850



[PATCH] D108380: [openmp][nfc] Refactor GridValues

2021-08-19 Thread Johannes Doerfert via Phabricator via cfe-commits
jdoerfert added inline comments.



Comment at: llvm/include/llvm/Frontend/OpenMP/OMPGridValues.h:102
+return R;
+  }
 };

It should be in the device rtl then, no?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108380/new/

https://reviews.llvm.org/D108380



[PATCH] D108320: Add semantic token modifier for non-const reference parameter

2021-08-19 Thread Nathan Ridge via Phabricator via cfe-commits
nridge added inline comments.



Comment at: clang-tools-extra/clangd/SemanticHighlighting.cpp:314
 //   (these tend to be vague, like Type or Unknown)
+// - Resolved tokens (i.e. without the "dependent-name" modifier) with kind
+//   "Unknown" are less reliable than resolved tokens with other kinds

sammccall wrote:
> nridge wrote:
> > We should consider the case where a dependent name is passed by non-const 
> > reference, for example:
> > 
> > ```
> > void increment_counter(int&);
> > 
> > template <typename T>
> > void bar() {
> >increment_counter(T::static_counter);
> > }
> > ```
> > 
> > This case does not work yet with the current patch (the dependent name is a 
> > `DependentScopeDeclRefExpr` rather than a `DeclRefExpr`), but we'll want to 
> > make it work in the future.
> > 
> > With the conflict resolution logic in this patch, the `Unknown` token kind 
> > from `highlightPassedByNonConstReference()` will be chosen over the 
> > dependent token kind.
> > 
> > As it happens, the dependent token kind for expressions is also `Unknown` 
> > so it doesn't matter, but perhaps we shouldn't be relying on this. Perhaps 
> > the following would make more sense:
> > 
> > 1. a token with `Unknown` as the kind has the lowest priority
> > 2. then a token with the `DependentName` modifier (next lowest)
> > 3. then everything else?
> The conflict-resolution idea is subtle (and IME hard to debug). I'm wary of 
> overloading it by deliberately introducing "conflicts" that should actually 
> be merged. Did you consider the idea of tracking extra modifiers separately 
> and merging them in at the end?
> 
> ---
> 
> BTW: we're stretching the meaning of `Unknown` here. There are two subtly 
> different concepts:
>  - clangd happens not to have determined the kind of this token, e.g. because 
> we missed a case (uses in this patch)
>  - clangd has determined that per C++ rules the kind of token is ambiguous 
> (uses prior to this patch)
> Call me weird, but I have "Unknown" highlighted in bright orange in my 
> editor, because I want to know about the second case :-)
I don't have a strong opinion on the options here, just wanted to chime in and 
say I also highlight `Unknown` prominently for similar reasons :)
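
For reference, the three-level ordering proposed in the quoted comment (tokens of kind `Unknown` lowest, then tokens carrying the dependent-name modifier, then everything else) could be expressed roughly as below. This is a sketch only: `Kind`, `Token`, `priority` and `resolve` are hypothetical and much simpler than clangd's actual highlighting types, and it does not address the alternative of merging modifiers instead of resolving conflicts.

```cpp
#include <vector>

// Simplified stand-ins for highlighting kinds; not clangd's real enum.
enum class Kind { Unknown, Variable, Parameter };

struct Token {
  Kind K;
  bool DependentName; // models the "dependent-name" modifier
};

// Higher value wins. Unknown is lowest, dependent names come next,
// everything else is highest.
int priority(const Token &T) {
  if (T.K == Kind::Unknown)
    return 0;
  if (T.DependentName)
    return 1;
  return 2;
}

// Picks the winner among tokens that cover the same range. The first token
// wins ties; the input is assumed to be non-empty.
Token resolve(const std::vector<Token> &Conflicting) {
  Token Best = Conflicting.front();
  for (const Token &T : Conflicting)
    if (priority(T) > priority(Best))
      Best = T;
  return Best;
}
```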


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108320/new/

https://reviews.llvm.org/D108320



[PATCH] D108247: [CUDA] Improve CUDA version detection and diagnostics.

2021-08-19 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added inline comments.



Comment at: clang/lib/Driver/ToolChains/Cuda.cpp:209-211
+  Version = FS.exists(LibDevicePath + "/libdevice.10.bc")
+? Version = CudaVersion::NEW
+: Version = CudaVersion::CUDA_70;

tra wrote:
> Hahnfeld wrote:
> > The compiler is now warning here because of the assignment to `Version` in 
> > the ternary operator
> I'll fix it shortly. I also need to figure out why my build does not produce 
> the warning. 
The exact warning is
```
LLVM/src/clang/lib/Driver/ToolChains/Cuda.cpp: In constructor ‘clang::driver::CudaInstallationDetector::CudaInstallationDetector(const clang::driver::Driver&, const llvm::Triple&, const llvm::opt::ArgList&)’:
LLVM/src/clang/lib/Driver/ToolChains/Cuda.cpp:207:15: warning: operation on ‘((clang::driver::CudaInstallationDetector*)this)->clang::driver::CudaInstallationDetector::Version’ may be undefined [-Wsequence-point]
   Version = FS.exists(LibDevicePath + "/libdevice.10.bc")
   ^~~
 ? Version = CudaVersion::NEW
 
 : Version = CudaVersion::CUDA_70;
 ~~~
```
Re-reading the code, I actually think this is a false positive of GCC 8.3.1 (on 
CentOS 8) and `Version` is never undefined. But it was redundant, so it's good 
to remove it anyway.
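
To illustrate the redundancy being removed, here is a minimal sketch with a single assignment of the conditional's value. `CudaVersion` below is a hypothetical stand-in that models only the two enumerators quoted above, and `pickVersion` is an illustrative helper, not a function from the patch.

```cpp
// Hypothetical stand-in for clang's CudaVersion; only the two
// enumerators mentioned in the review are modelled.
enum class CudaVersion { CUDA_70, NEW };

// The flagged form was
//   Version = Cond ? Version = A : Version = B;
// which assigns inside both arms and then once more on the outside.
// Assigning the value of the conditional once expresses the same intent.
CudaVersion pickVersion(bool HasLibdevice10) {
  CudaVersion Version =
      HasLibdevice10 ? CudaVersion::NEW : CudaVersion::CUDA_70;
  return Version;
}
```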


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108247/new/

https://reviews.llvm.org/D108247



[PATCH] D108392: [OpenCL] Fix parsing of opencl-c.h in CL 3.0 with device-scope atomics enabled

2021-08-19 Thread Kévin Petit via Phabricator via cfe-commits
kpet created this revision.
kpet added a reviewer: Anastasia.
Herald added subscribers: ldrumm, jfb, kristof.beyls, yaxunl.
kpet requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Also declare and test the __opencl_c_atomic_scope_device and
__opencl_c_atomic_scope_all_devices features. The header is testing for them
but they were never defined by Clang.

With the new features declared, test/Headers/opencl-c-header.cl does
exercise the declaration this change fixes.

Signed-off-by: Kevin Petit 


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D108392

Files:
  clang/include/clang/Basic/OpenCLExtensions.def
  clang/lib/Headers/opencl-c.h
  clang/test/SemaOpenCL/features.cl


Index: clang/test/SemaOpenCL/features.cl
===
--- clang/test/SemaOpenCL/features.cl
+++ clang/test/SemaOpenCL/features.cl
@@ -24,6 +24,8 @@
 // FEATURES: #define __opencl_c_3d_image_writes 1
 // FEATURES: #define __opencl_c_atomic_order_acq_rel 1
 // FEATURES: #define __opencl_c_atomic_order_seq_cst 1
+// FEATURES: #define __opencl_c_atomic_scope_all_devices 1
+// FEATURES: #define __opencl_c_atomic_scope_device 1
 // FEATURES: #define __opencl_c_device_enqueue 1
 // FEATURES: #define __opencl_c_fp64 1
 // FEATURES: #define __opencl_c_generic_address_space 1
@@ -38,6 +40,8 @@
 // NO-FEATURES-NOT: __opencl_c_3d_image_writes
 // NO-FEATURES-NOT: __opencl_c_atomic_order_acq_rel
 // NO-FEATURES-NOT: __opencl_c_atomic_order_seq_cst
+// NO-FEATURES-NOT: __opencl_c_atomic_scope_all_devices
+// NO-FEATURES-NOT: __opencl_c_atomic_scope_device
 // NO-FEATURES-NOT: __opencl_c_device_enqueue
 // NO-FEATURES-NOT: __opencl_c_fp64
 // NO-FEATURES-NOT: __opencl_c_generic_address_space
Index: clang/lib/Headers/opencl-c.h
===
--- clang/lib/Headers/opencl-c.h
+++ clang/lib/Headers/opencl-c.h
@@ -13378,7 +13378,7 @@
 int __ovld atomic_fetch_xor(volatile __global atomic_int *object, int operand);
 int __ovld atomic_fetch_xor(volatile __local atomic_int *object, int operand);
 uint __ovld atomic_fetch_xor(volatile __global atomic_uint *object, uint 
operand);
-uint __ovld atomic_fetch_xor(volatile __local atomic_uint *object, uint 
operand);i
+uint __ovld atomic_fetch_xor(volatile __local atomic_uint *object, uint 
operand);
 int __ovld atomic_fetch_and(volatile __global atomic_int *object, int operand);
 int __ovld atomic_fetch_and(volatile __local atomic_int *object, int operand);
 uint __ovld atomic_fetch_and(volatile __global atomic_uint *object, uint 
operand);
Index: clang/include/clang/Basic/OpenCLExtensions.def
===
--- clang/include/clang/Basic/OpenCLExtensions.def
+++ clang/include/clang/Basic/OpenCLExtensions.def
@@ -110,6 +110,8 @@
 OPENCL_OPTIONALCOREFEATURE(__opencl_c_generic_address_space, false, 300, 
OCL_C_30)
 OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_order_acq_rel, false, 300, 
OCL_C_30)
 OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_order_seq_cst, false, 300, 
OCL_C_30)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_scope_device, false, 300, 
OCL_C_30)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_scope_all_devices, false, 300, 
OCL_C_30)
 OPENCL_OPTIONALCOREFEATURE(__opencl_c_subgroups, false, 300, OCL_C_30)
 OPENCL_OPTIONALCOREFEATURE(__opencl_c_3d_image_writes, false, 300, OCL_C_30)
 OPENCL_OPTIONALCOREFEATURE(__opencl_c_device_enqueue, false, 300, OCL_C_30)



[PATCH] D107878: [SampleFDO] Flow Sensitive Sample FDO (FSAFDO) profile loader

2021-08-19 Thread Duncan P. N. Exon Smith via Phabricator via cfe-commits
dexonsmith added inline comments.



Comment at: llvm/lib/CodeGen/MIRSampleProfile.cpp:289
+
+bool MIRProfileLoaderPass::runOnMachineFunction(MachineFunction ) {
+  if (!MIRSampleLoader->isValid())

JDevlieghere wrote:
> Why is this outside the `llvm` namespace? 
I think it's common style in LLVM to have function definitions outside of 
namespaces -- IMO, the odd thing here is that the preceding function 
definitions were inside the namespace.
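
A small sketch of the style described here: the class is declared inside a namespace, and its member function is defined outside the namespace with a qualified name. The names `demo` and `Pass` are hypothetical and unrelated to the patch.

```cpp
namespace demo {
// Declaration lives inside the namespace (typically in a header).
class Pass {
public:
  bool run(int X);
};
} // namespace demo

// Definition lives outside the namespace and is fully qualified, so a
// typo in the name or signature is a compile error instead of silently
// declaring a new, unrelated function.
bool demo::Pass::run(int X) { return X > 0; }
```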


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107878/new/

https://reviews.llvm.org/D107878



[PATCH] D107690: [Modules] Do not remove failed modules after the control block phase

2021-08-19 Thread Yaron Keren via Phabricator via cfe-commits
yaron.keren added inline comments.



Comment at: clang/lib/Serialization/ASTReader.cpp:4268
 // Read the AST block.
 if (ASTReadResult Result = ReadASTBlock(F, ClientLoadCapabilities))
+  return Failure;

vsapsai wrote:
> yaron.keren wrote:
> > Result is unused now.
> Thanks for pointing it out. Are there any bots failing because of that now? 
> Asking whether we need a small urgent fix or whether it can wait for 
> https://reviews.llvm.org/D108268 to land.
Just a compiler warning, could wait.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107690/new/

https://reviews.llvm.org/D107690



[PATCH] D108377: [asan] Implemented custom calling convention similar used by HWASan for X86.

2021-08-19 Thread Vitaly Buka via Phabricator via cfe-commits
vitalybuka added inline comments.



Comment at: llvm/include/llvm/Transforms/Instrumentation/AddressSanitizer.h:151
+// Get AddressSanitizer parameters.
+void getAddressSanitizerParams(Module , uint64_t *ShadowBase,
+   int *MappingScale, bool *OrShadowOffset);

Could you please replace Module with targetTriple and pointerSizeInBits?
We don't need the entire module for that, even if the current callers have it.




Comment at: 
llvm/include/llvm/Transforms/Instrumentation/AddressSanitizerCommon.h:90
 
+void getASanShadowMapping(int *Scale, uint64_t *Offset, bool *OrShadowOffset);
+

I don't see an implementation of this one.

And I expected getAddressSanitizerParams here, not in AddressSanitizer.h. Is 
there a reason to declare it there?




Comment at: llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp:566
 
+void getAddressSanitizerParams(Module , uint64_t *ShadowBase,
+   int *MappingScale, bool *OrShadowOffset) {

It would be nice to introduce this function as a separate NFC patch.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108377/new/

https://reviews.llvm.org/D108377



[PATCH] D107690: [Modules] Do not remove failed modules after the control block phase

2021-08-19 Thread Volodymyr Sapsai via Phabricator via cfe-commits
vsapsai added inline comments.



Comment at: clang/lib/Serialization/ASTReader.cpp:4268
 // Read the AST block.
 if (ASTReadResult Result = ReadASTBlock(F, ClientLoadCapabilities))
+  return Failure;

yaron.keren wrote:
> Result is unused now.
Thanks for pointing it out. Are there any bots failing because of that now? 
Asking whether we need a small urgent fix or whether it can wait for 
https://reviews.llvm.org/D108268 to land.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107690/new/

https://reviews.llvm.org/D107690



[PATCH] D108301: [MSP430][Clang] Update hard-coded MCU data

2021-08-19 Thread Jozef Lawrynowicz via Phabricator via cfe-commits
jozefl added a comment.

Now that I am in the process of implementing the processing of the "CPU"
feature, I've realized the decision to store the CPU and HWMult information as
enums instead of strings has some downsides that may outweigh the benefits:

- All string values passed to options need to be first converted to enums 
before they can be processed
- Enums need to be converted back to strings for use in diagnostics
- Additional code is required to perform these conversions

If all CPU and HWMult features are stored as strings, no conversions are
necessary, and the overall amount of code that needs to change for this MCU
data update is small.

As mentioned in the original commit message, the benefits of using enums are
that we have a canonical "invalid" value for when the user input is invalid,
and the hard-coded data is guaranteed to be valid (but not necessarily
correct) since all enums used to define features must be defined.

The processing of the CPU feature is much simpler than the hwmult features,
which is why all the conversions to and from enums seem even more unnecessary
now.
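
As a rough sketch of the conversion overhead described above, consider a hypothetical hwmult enum. The enum, its enumerator names, the function names, and the accepted spellings are all illustrative assumptions, not the code from this patch.

```cpp
#include <string>

// Hypothetical enum; enumerator names are illustrative only.
enum class HWMult { Invalid, None, Mult16, Mult32, F5Series };

// String -> enum: needed before any option value can be processed.
HWMult parseHWMult(const std::string &S) {
  if (S == "none") return HWMult::None;
  if (S == "16bit") return HWMult::Mult16;
  if (S == "32bit") return HWMult::Mult32;
  if (S == "f5series") return HWMult::F5Series;
  return HWMult::Invalid; // canonical "invalid" value for bad input
}

// Enum -> string: needed again whenever a diagnostic is emitted.
std::string toString(HWMult M) {
  switch (M) {
  case HWMult::None: return "none";
  case HWMult::Mult16: return "16bit";
  case HWMult::Mult32: return "32bit";
  case HWMult::F5Series: return "f5series";
  case HWMult::Invalid: break;
  }
  return "invalid";
}
```

With plain strings, both helpers disappear, at the cost of losing the single `Invalid` value and the compile-time check that every name used in the data table is declared.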

What do you think @asl? Should I get rid of the enums?
That is what I'm leaning towards right now.

Thanks,
Jozef


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108301/new/

https://reviews.llvm.org/D108301



[PATCH] D108003: [Clang] Extend -Wbool-operation to warn about bitwise and of bools with side effects

2021-08-19 Thread Ryan Beltran via Phabricator via cfe-commits
rpbeltran added a comment.

This patch seems like a great contribution! Really glad to see this being 
added. I did have a question though on why this only appears to catch "&" vs 
"&&" instead of doing the same for "|" vs "||". It seems like both operators 
have roughly the same potential for confusion. Could we add support for bitwise 
vs logical or in this?
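
A minimal sketch of why both operator pairs carry the same risk: the bitwise forms always evaluate the right-hand operand, while the logical forms short-circuit. The helper names below are hypothetical and exist only for this illustration.

```cpp
// Counts evaluations of the right-hand operand.
static int Calls = 0;

static bool touch() {
  ++Calls;
  return true;
}

// Bitwise forms: both operands are always evaluated.
int callsBitAnd(bool Lhs) {
  Calls = 0;
  bool R = Lhs & touch();
  (void)R;
  return Calls;
}

int callsBitOr(bool Lhs) {
  Calls = 0;
  bool R = Lhs | touch();
  (void)R;
  return Calls;
}

// Logical forms: the right-hand side is skipped when the left-hand
// side already decides the result.
int callsLogAnd(bool Lhs) {
  Calls = 0;
  bool R = Lhs && touch();
  (void)R;
  return Calls;
}

int callsLogOr(bool Lhs) {
  Calls = 0;
  bool R = Lhs || touch();
  (void)R;
  return Calls;
}
```

A compiler that implements the proposed diagnostic may warn on the two bitwise helpers, which is the behavior under discussion.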


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108003/new/

https://reviews.llvm.org/D108003



[PATCH] D108387: [WebAssembly] Restore builtins and intrinsics for pmin/pmax

2021-08-19 Thread Thomas Lively via Phabricator via cfe-commits
tlively created this revision.
tlively added reviewers: aheejin, dschuff.
Herald added subscribers: wingo, ecnelises, sunfish, hiraditya, 
jgravelle-google, sbc100.
tlively requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

Partially reverts 85157c007903, which had removed these builtins and intrinsics
in favor of normal codegen patterns. It turns out, however, that the patterns
can be split over multiple basic blocks, in which case DAG ISel is not able to
select them to the pmin/pmax instructions. To make sure the SIMD intrinsics
generate the correct instructions in these cases, reintroduce the clang builtins
and corresponding LLVM intrinsics, while keeping the normal pattern matching.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D108387

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/Headers/wasm_simd128.h
  clang/test/CodeGen/builtins-wasm.c
  clang/test/Headers/wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll

Index: llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
+++ llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
@@ -540,6 +540,26 @@
   ret <4 x float> %a
 }
 
+; CHECK-LABEL: pmin_v4f32:
+; CHECK-NEXT: .functype pmin_v4f32 (v128, v128) -> (v128){{$}}
+; CHECK-NEXT: f32x4.pmin $push[[R:[0-9]+]]=, $0, $1{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <4 x float> @llvm.wasm.pmin.v4f32(<4 x float>, <4 x float>)
+define <4 x float> @pmin_v4f32(<4 x float> %a, <4 x float> %b) {
+  %v = call <4 x float> @llvm.wasm.pmin.v4f32(<4 x float> %a, <4 x float> %b)
+  ret <4 x float> %v
+}
+
+; CHECK-LABEL: pmax_v4f32:
+; CHECK-NEXT: .functype pmax_v4f32 (v128, v128) -> (v128){{$}}
+; CHECK-NEXT: f32x4.pmax $push[[R:[0-9]+]]=, $0, $1{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <4 x float> @llvm.wasm.pmax.v4f32(<4 x float>, <4 x float>)
+define <4 x float> @pmax_v4f32(<4 x float> %a, <4 x float> %b) {
+  %v = call <4 x float> @llvm.wasm.pmax.v4f32(<4 x float> %a, <4 x float> %b)
+  ret <4 x float> %v
+}
+
 ; CHECK-LABEL: ceil_v4f32:
 ; CHECK-NEXT: .functype ceil_v4f32 (v128) -> (v128){{$}}
 ; CHECK-NEXT: f32x4.ceil $push[[R:[0-9]+]]=, $0{{$}}
@@ -595,6 +615,26 @@
   ret <2 x double> %a
 }
 
+; CHECK-LABEL: pmin_v2f64:
+; CHECK-NEXT: .functype pmin_v2f64 (v128, v128) -> (v128){{$}}
+; CHECK-NEXT: f64x2.pmin $push[[R:[0-9]+]]=, $0, $1{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <2 x double> @llvm.wasm.pmin.v2f64(<2 x double>, <2 x double>)
+define <2 x double> @pmin_v2f64(<2 x double> %a, <2 x double> %b) {
+  %v = call <2 x double> @llvm.wasm.pmin.v2f64(<2 x double> %a, <2 x double> %b)
+  ret <2 x double> %v
+}
+
+; CHECK-LABEL: pmax_v2f64:
+; CHECK-NEXT: .functype pmax_v2f64 (v128, v128) -> (v128){{$}}
+; CHECK-NEXT: f64x2.pmax $push[[R:[0-9]+]]=, $0, $1{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <2 x double> @llvm.wasm.pmax.v2f64(<2 x double>, <2 x double>)
+define <2 x double> @pmax_v2f64(<2 x double> %a, <2 x double> %b) {
+  %v = call <2 x double> @llvm.wasm.pmax.v2f64(<2 x double> %a, <2 x double> %b)
+  ret <2 x double> %v
+}
+
 ; CHECK-LABEL: ceil_v2f64:
 ; CHECK-NEXT: .functype ceil_v2f64 (v128) -> (v128){{$}}
 ; CHECK-NEXT: f64x2.ceil $push[[R:[0-9]+]]=, $0{{$}}
Index: llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
===
--- llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
+++ llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
@@ -1165,6 +1165,16 @@
   (pmax $lhs, $rhs)>;
 }
 
+// And match the pmin/pmax LLVM intrinsics as well
+def : Pat<(v4f32 (int_wasm_pmin (v4f32 V128:$lhs), (v4f32 V128:$rhs))),
+  (PMIN_F32x4 V128:$lhs, V128:$rhs)>;
+def : Pat<(v4f32 (int_wasm_pmax (v4f32 V128:$lhs), (v4f32 V128:$rhs))),
+  (PMAX_F32x4 V128:$lhs, V128:$rhs)>;
+def : Pat<(v2f64 (int_wasm_pmin (v2f64 V128:$lhs), (v2f64 V128:$rhs))),
+  (PMIN_F64x2 V128:$lhs, V128:$rhs)>;
+def : Pat<(v2f64 (int_wasm_pmax (v2f64 V128:$lhs), (v2f64 V128:$rhs))),
+  (PMAX_F64x2 V128:$lhs, V128:$rhs)>;
+
 //===--===//
 // Conversions
 //===--===//
Index: llvm/include/llvm/IR/IntrinsicsWebAssembly.td
===
--- llvm/include/llvm/IR/IntrinsicsWebAssembly.td
+++ llvm/include/llvm/IR/IntrinsicsWebAssembly.td
@@ -164,6 +164,15 @@
 [llvm_v8i16_ty, llvm_v8i16_ty],
 [IntrNoMem, IntrSpeculatable]>;
 
+def int_wasm_pmin :

[PATCH] D108247: [CUDA] Improve CUDA version detection and diagnostics.

2021-08-19 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 367532.
tra added a comment.

Fixed an error spotted by reviewer.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108247/new/

https://reviews.llvm.org/D108247

Files:
  clang/include/clang/Basic/Cuda.h
  clang/include/clang/Basic/DiagnosticDriverKinds.td
  clang/lib/Basic/Cuda.cpp
  clang/lib/Driver/ToolChains/Cuda.cpp
  clang/lib/Driver/ToolChains/Cuda.h
  clang/test/Driver/Inputs/CUDA-new/usr/local/cuda/bin/.keep
  clang/test/Driver/Inputs/CUDA-new/usr/local/cuda/include/.keep
  clang/test/Driver/Inputs/CUDA-new/usr/local/cuda/include/cuda.h
  clang/test/Driver/Inputs/CUDA-new/usr/local/cuda/lib/.keep
  clang/test/Driver/Inputs/CUDA-new/usr/local/cuda/lib64/.keep
  
clang/test/Driver/Inputs/CUDA-new/usr/local/cuda/nvvm/libdevice/libdevice.10.bc
  clang/test/Driver/Inputs/CUDA-unknown/usr/local/cuda/version.txt
  clang/test/Driver/Inputs/CUDA_80/usr/local/cuda/include/cuda.h
  clang/test/Driver/Inputs/CUDA_80/usr/local/cuda/version.txt
  clang/test/Driver/Inputs/CUDA_90/usr/local/cuda/include/cuda.h
  clang/test/Driver/cuda-version-check.cu

Index: clang/test/Driver/cuda-version-check.cu
===
--- clang/test/Driver/cuda-version-check.cu
+++ clang/test/Driver/cuda-version-check.cu
@@ -8,15 +8,12 @@
 // RUN:FileCheck %s --check-prefix=OK
 // RUN: %clang --target=x86_64-linux -v -### --cuda-gpu-arch=sm_60 --cuda-path=%S/Inputs/CUDA_80/usr/local/cuda 2>&1 %s | \
 // RUN:FileCheck %s --check-prefix=OK
-// Test version guess when no version.txt or cuda.h are found
+// Test version guess when cuda.h has not been found
 // RUN: %clang --target=x86_64-linux -v -### --cuda-gpu-arch=sm_60 --cuda-path=%S/Inputs/CUDA-unknown/usr/local/cuda 2>&1 %s | \
 // RUN:FileCheck %s --check-prefix=UNKNOWN_VERSION
-// Unknown version with version.txt present
-// RUN: %clang --target=x86_64-linux -v -### --cuda-gpu-arch=sm_60 --cuda-path=%S/Inputs/CUDA_102/usr/local/cuda 2>&1 %s | \
-// RUN:FileCheck %s --check-prefix=UNKNOWN_VERSION_V
-// Unknown version with no version.txt but with version info present in cuda.h
-// RUN: %clang --target=x86_64-linux -v -### --cuda-gpu-arch=sm_60 --cuda-path=%S/Inputs/CUDA_111/usr/local/cuda 2>&1 %s | \
-// RUN:FileCheck %s --check-prefix=UNKNOWN_VERSION_H
+// Unknown version info present in cuda.h
+// RUN: %clang --target=x86_64-linux -v -### --cuda-gpu-arch=sm_60 --cuda-path=%S/Inputs/CUDA-new/usr/local/cuda 2>&1 %s | \
+// RUN:FileCheck %s --check-prefix=UNKNOWN_VERSION
 // Make sure that we don't warn about CUDA version during C++ compilation.
 // RUN: %clang --target=x86_64-linux -v -### -x c++ --cuda-gpu-arch=sm_60 \
 // RUN:--cuda-path=%S/Inputs/CUDA-unknown/usr/local/cuda 2>&1 %s | \
@@ -66,13 +63,14 @@
 // OK_SM35-NOT: error: GPU arch sm_35
 
 // We should only get one error per architecture.
+// ERR_SM20: error: GPU arch sm_20 {{.*}}
+// ERR_SM20-NOT: error: GPU arch sm_20
+
 // ERR_SM60: error: GPU arch sm_60 {{.*}}
 // ERR_SM60-NOT: error: GPU arch sm_60
 
 // ERR_SM61: error: GPU arch sm_61 {{.*}}
 // ERR_SM61-NOT: error: GPU arch sm_61
 
-// UNKNOWN_VERSION_V: unknown CUDA version: version.txt:{{.*}}; assuming the latest supported version
-// UNKNOWN_VERSION_H: unknown CUDA version: cuda.h: CUDA_VERSION={{.*}}; assuming the latest supported version
-// UNKNOWN_VERSION: unknown CUDA version: no version found in version.txt or cuda.h; assuming the latest supported version
+// UNKNOWN_VERSION: CUDA version is newer than the latest{{.*}} supported version
 // UNKNOWN_VERSION_CXX-NOT: unknown CUDA version
Index: clang/test/Driver/Inputs/CUDA_90/usr/local/cuda/include/cuda.h
===
--- /dev/null
+++ clang/test/Driver/Inputs/CUDA_90/usr/local/cuda/include/cuda.h
@@ -0,0 +1,7 @@
+//
+// Placeholder file for testing CUDA version detection
+//
+
+#define CUDA_VERSION 9000
+
+//
Index: clang/test/Driver/Inputs/CUDA_80/usr/local/cuda/version.txt
===
--- clang/test/Driver/Inputs/CUDA_80/usr/local/cuda/version.txt
+++ /dev/null
@@ -1 +0,0 @@
-CUDA Version 8.0.42
Index: clang/test/Driver/Inputs/CUDA_80/usr/local/cuda/include/cuda.h
===
--- /dev/null
+++ clang/test/Driver/Inputs/CUDA_80/usr/local/cuda/include/cuda.h
@@ -0,0 +1,7 @@
+//
+// Placeholder file for testing CUDA version detection
+//
+
+#define CUDA_VERSION 8000
+
+//
Index: clang/test/Driver/Inputs/CUDA-unknown/usr/local/cuda/version.txt
===
--- clang/test/Driver/Inputs/CUDA-unknown/usr/local/cuda/version.txt
+++ /dev/null
@@ -1 +0,0 @@
-CUDA Version 999.999.999
Index: clang/test/Driver/Inputs/CUDA-new/usr/local/cuda/include/cuda.h
===
--- /dev/null
+++ 

[PATCH] D107690: [Modules] Do not remove failed modules after the control block phase

2021-08-19 Thread Yaron Keren via Phabricator via cfe-commits
yaron.keren added inline comments.



Comment at: clang/lib/Serialization/ASTReader.cpp:4268
 // Read the AST block.
 if (ASTReadResult Result = ReadASTBlock(F, ClientLoadCapabilities))
+  return Failure;

Result is unused now.



Comment at: clang/lib/Serialization/ASTReader.cpp:4279
 while (!SkipCursorToBlock(F.Stream, EXTENSION_BLOCK_ID)) {
   if (ASTReadResult Result = ReadExtensionBlock(F))
+return Failure;

Same here.
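
One possible cleanup, shown as a sketch and not necessarily what eventually landed: once every failing result maps to `Failure`, the named variable in the condition is no longer needed. `ASTReadResult`, `readBlock`, and `load` below are simplified, hypothetical stand-ins for the ASTReader code.

```cpp
// Hypothetical stand-in for ASTReader's result type; the real enum has
// more enumerators. Success is the zero value.
enum ASTReadResult { Success = 0, Failure = 1 };

// Hypothetical stand-in for ReadASTBlock / ReadExtensionBlock.
ASTReadResult readBlock(bool Ok) { return Ok ? Success : Failure; }

// Every non-Success result is reported as Failure, so the condition
// can test the call directly without binding it to a variable.
ASTReadResult load(bool Ok) {
  if (readBlock(Ok))
    return Failure;
  return Success;
}
```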


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107690/new/

https://reviews.llvm.org/D107690


