from:"Phoebe Wang via Phabricator via cfe\-commits"

[PATCH] D86855: Convert __m64 intrinsics to unconditionally use SSE2 instead of MMX instructions.

2023-11-19 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added a comment.

Reverse ping. Any progress or plan for this patch?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86855/new/

https://reviews.llvm.org/D86855

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D159250: [X86][RFC] Add new option `-m[no-]evex512` to disable ZMM and 64-bit mask instructions for AVX512 features

2023-09-10 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: llvm/lib/Target/X86/X86Subtarget.cpp:271
 
+  // Attach EVEX512 feature when we have AVX512 features and EVEX512 is not 
set.
+  size_t posNoEVEX512 = FS.rfind("-evex512");

pengfei wrote:
> skan wrote:
> > It seems the change in X86.cpp is redundant?
> It's not. We need `FeatureEVEX512` because it's independent of 
> `FeatureAVX512`. We will have future AVX10-256 targets that have 
> `FeatureAVX512` only.
> Here we handle old IR that don't set `evex512` in function attributes.
It has affects to AVX10-256 targets too, we need to restrict it to default CPU, 
see https://github.com/llvm/llvm-project/pull/65920


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159250/new/

https://reviews.llvm.org/D159250

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D159250: [X86][RFC] Add new option `-m[no-]evex512` to disable ZMM and 64-bit mask instructions for AVX512 features

2023-09-07 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: clang/lib/CodeGen/Targets/X86.cpp:1493
  const llvm::StringMap ,
  QualType Ty, StringRef Feature,
  bool IsArgument) {

RKSimon wrote:
> Remove Feature argument and hardcode to "avx" now that it only has 1 (avx) 
> caller?
We still have call to "avx512" at line 1525.
We have strict rule for AVX512-256, but should not change the legacy check.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159250/new/

https://reviews.llvm.org/D159250

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D159250: [X86][RFC] Add new option `-m[no-]evex512` to disable ZMM and 64-bit mask instructions for AVX512 features

2023-09-06 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added a comment.

ping


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159250/new/

https://reviews.llvm.org/D159250

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D159250: [X86][RFC] Add new option `-m[no-]evex512` to disable ZMM and 64-bit mask instructions for AVX512 features

2023-09-04 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: llvm/lib/Target/X86/X86Subtarget.cpp:271
 
+  // Attach EVEX512 feature when we have AVX512 features and EVEX512 is not 
set.
+  size_t posNoEVEX512 = FS.rfind("-evex512");

skan wrote:
> It seems the change in X86.cpp is redundant?
It's not. We need `FeatureEVEX512` because it's independent of `FeatureAVX512`. 
We will have future AVX10-256 targets that have `FeatureAVX512` only.
Here we handle old IR that don't set `evex512` in function attributes.



Comment at: llvm/lib/Target/X86/X86Subtarget.cpp:275
+  size_t posEVEX512 = FS.rfind("+evex512");
+  size_t posAVX512F = FS.rfind("+avx512");
+

skan wrote:
> Missing f?
No, it's intentional. Sometimes, feature attributes may not have a full set of 
AVX512 features. If user only use e.g., "avx512bw", we should make sure 
"evex512" attached too.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159250/new/

https://reviews.llvm.org/D159250

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D159250: [X86][RFC] Add new option `-m[no-]evex512` to disable ZMM and 64-bit mask instructions for AVX512 features

2023-09-01 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added a comment.

In D159250#4634774 , @jyknight wrote:

> In D159250#4633530 , @pengfei wrote:
>
>> In D159250#4631786 , @RKSimon 
>> wrote:
>>
>>> Would it be possible to add function multiversioning tests to ensure the 
>>> evex512 attribute would work with it?
>>
>> Function multiversioning is orthogonal to evex512 feature.
>>
>> When user uses `-mno-evex512` in command line, all the code generation, 
>> including function multiversioning are limited to 256-bit vector and 32-bit 
>> mask.
>>
>> User is not suggested to use `avx512xxx,evex512` in function attributes for 
>> function multiversioning, because EVEX512 is a software concept and the 
>> dispatcher cannot distinguish between `avx512xxx` and `avx512xxx,evex512`.
>
> If the dispatcher is updated to take into account AVX10.1 CPUID, it could 
> distinguish the different hardware support.
>
> That is:
>
> - to check for AVX512xxx with evex512 //enabled//, the dispatcher need only 
> check for the AVX512xxx CPUID bit, since according to the doc, a CPU which 
> implements AVX10.1 with 512-bit register size will also set the corresponding 
> AVX512 CPUID bits. No change there.
> - to check for AVX512xxx with evex512 //disabled//, the dispatcher function 
> should check that either CPUID reports the AVX512xxx bit OR that the CPUID 
> reports AVX10.1 with support for at least 256-bit register size. (But only 
> for the 'AVX512xxx' features which are actually included in AVX10.1, of 
> course).

Let's not to mix evex512 with AVX10.1 512-bit register size enumeration bit. 
EVEX512 is intended for AVX512xxx only. It's not supposed to use for AVX10. And 
it conflicts with the functionality of the AVX10.1 bit in same way.
For example, to maintain backward compatibility, EVEX512 is designed to be a 
default by on feature. That says, if users don't disable EVEX512 in the command 
line explicitly, and use `avx512xxx` only in function attributes, compiler will 
attach a `evex512` implicitly. If we map `evex512` to the AVX10.1 bit, the 
function will never be dispatched on prior-AVX10 targets.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159250/new/

https://reviews.llvm.org/D159250

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D159250: [X86][RFC] Add new option `-m[no-]evex512` to disable ZMM and 64-bit mask instructions for AVX512 features

2023-09-01 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: clang/lib/CodeGen/Targets/X86.cpp:1517
+  bool Caller256 = CallerMap.lookup("avx512f") && !CallerMap.lookup("evex512");
+  bool Callee256 = CallerMap.lookup("avx512f") && !CallerMap.lookup("evex512");
+

RKSimon wrote:
> typo in Callee256?
Good catch!



Comment at: clang/test/CodeGen/X86/avx512-error.c:9
+  return __builtin_ia32_sqrtpd512(a, _MM_FROUND_CUR_DIRECTION); // 
expected-error {{'__builtin_ia32_sqrtpd512' needs target feature evex512}}
+}

RKSimon wrote:
> add  __mmask64 test ? _knot_mask64 or _cvtmask64_u64 maybe?
Good point! This exposed a design problem. We cannot only check for 512-bit 
vector, instead, we need to add `evex512` to all ZMM or 64-bit mask 
builtin/intrinsic attribute list.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159250/new/

https://reviews.llvm.org/D159250

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D157485: [X86][RFC] Support new feature AVX10

2023-08-31 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei planned changes to this revision.
pengfei added a comment.

In D157485#4597603 , @e-kud wrote:

> Just curious, in RFC we have `-mavx10.x-256/-mavx10.x-512` but here we refer 
> to `-mavx10.x/-mavx10.x,-mavx10-512bit`. Is it compliant with GCC, or the 
> revision is just for the illustrative purpose?

Sorry for the late reply. We have received a couple concerns about how to 
interpret these options, especially when used together with AVX512 options. We 
decided not to provide AVX10.1 options at the present, instead, we just provide 
`-m[no-]evex512` to disable ZMM and 64-bit mask instructions for AVX512 
features. For more details, lease take a look at D159250 
.




Comment at: clang/lib/Basic/Targets/X86.cpp:739
+  if (HasAVX10_512BIT)
+Builder.defineMacro("__AVX10_512BIT__");
+

MaskRay wrote:
> This is untested?
Done in an alternative reversion D159250.



Comment at: clang/lib/Driver/ToolChains/Arch/X86.cpp:261
+if (AVXVecSize == 256)
+  D.Diag(diag::warn_drv_overriding_flag_option) << "AVX10-256"
+<< "AVX10-512";

MaskRay wrote:
> `warn_drv_overriding_flag_option` is under the group `-Woverriding-t-option`, 
> which was intended for clang-cl `/T*` options (`D1290`).
> 
> I created D158137 to add `-Woverriding-option`.
Thanks! This is not needed in the new version.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157485/new/

https://reviews.llvm.org/D157485

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D159250: [X86][RFC] Add new option `-m[no-]evex512` to disable ZMM and 64-bit mask instructions for AVX512 features

2023-08-31 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei created this revision.
pengfei added reviewers: RKSimon, skan, jyknight, e-kud.
Herald added a subscriber: hiraditya.
Herald added a project: All.
pengfei requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

This is an alternative of D157485  and a 
pre-feature to support AVX10.

AVX10 Architecture Specification: 
https://cdrdv2.intel.com/v1/dl/getContent/784267
AVX10 Technical Paper: https://cdrdv2.intel.com/v1/dl/getContent/784343
RFC: https://discourse.llvm.org/t/rfc-design-for-avx10-feature-support/72661

Based on the feedbacks from LLVM and GCC community, we have agreed to
start from supporting `-m[no-]evex512` on existing AVX512 features.
The option `-mno-evex512` can be used with `-mavx512xxx` to build
binaries that can run on both legacy AVX512 targets and AVX10-256.

There're still arguments about what's the expected behavior when this
option as well as `-mavx512xxx` used together with `-mavx10.1-256`. We
decided to defer the support of `-mavx10.1` after we made consensus.
Or furthermore, we start from supporting AVX10.2 and not providing any
AVX10.1 options.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D159250

Files:
  clang/docs/ReleaseNotes.rst
  clang/include/clang/Driver/Options.td
  clang/lib/Basic/Targets/X86.cpp
  clang/lib/Basic/Targets/X86.h
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/CodeGen/CodeGenFunction.cpp
  clang/lib/CodeGen/CodeGenFunction.h
  clang/lib/CodeGen/Targets/X86.cpp
  clang/test/CodeGen/X86/avx512-error.c
  clang/test/CodeGen/attr-cpuspecific.c
  clang/test/CodeGen/attr-target-x86.c
  clang/test/CodeGen/regcall2.c
  clang/test/CodeGen/target-avx-abi-diag.c
  clang/test/Driver/x86-target-features.c
  clang/test/Preprocessor/x86_target_features.c
  llvm/include/llvm/TargetParser/X86TargetParser.def
  llvm/lib/IR/Verifier.cpp
  llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp
  llvm/lib/Target/X86/X86.td
  llvm/lib/Target/X86/X86InstrInfo.td
  llvm/lib/Target/X86/X86RegisterInfo.cpp
  llvm/lib/Target/X86/X86Subtarget.cpp
  llvm/lib/Target/X86/X86Subtarget.h
  llvm/lib/TargetParser/X86TargetParser.cpp
  llvm/test/CodeGen/X86/avx512bwvl-arith.ll
  llvm/test/CodeGen/X86/avx512vl-arith.ll

Index: llvm/test/CodeGen/X86/avx512vl-arith.ll
===
--- llvm/test/CodeGen/X86/avx512vl-arith.ll
+++ llvm/test/CodeGen/X86/avx512vl-arith.ll
@@ -1,5 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=knl -mattr=+avx512vl --show-mc-encoding| FileCheck %s
+; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=knl -mattr=+avx512vl,-evex512 --show-mc-encoding| FileCheck %s
 
 ; 256-bit
 
Index: llvm/test/CodeGen/X86/avx512bwvl-arith.ll
===
--- llvm/test/CodeGen/X86/avx512bwvl-arith.ll
+++ llvm/test/CodeGen/X86/avx512bwvl-arith.ll
@@ -1,5 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx512bw,+avx512vl | FileCheck %s
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx512bw,+avx512vl,-evex512 | FileCheck %s
 
 ; 256-bit
 
Index: llvm/lib/TargetParser/X86TargetParser.cpp
===
--- llvm/lib/TargetParser/X86TargetParser.cpp
+++ llvm/lib/TargetParser/X86TargetParser.cpp
@@ -72,7 +72,7 @@
 constexpr FeatureBitset FeaturesX86_64_V3 =
 FeaturesX86_64_V2 | FeatureAVX2 | FeatureBMI | FeatureBMI2 | FeatureF16C |
 FeatureFMA | FeatureLZCNT | FeatureMOVBE | FeatureXSAVE;
-constexpr FeatureBitset FeaturesX86_64_V4 = FeaturesX86_64_V3 |
+constexpr FeatureBitset FeaturesX86_64_V4 = FeaturesX86_64_V3 | FeatureEVEX512 |
 FeatureAVX512BW | FeatureAVX512CD |
 FeatureAVX512DQ | FeatureAVX512VL;
 
@@ -96,8 +96,8 @@
 // Intel Knights Landing and Knights Mill
 // Knights Landing has feature parity with Broadwell.
 constexpr FeatureBitset FeaturesKNL =
-FeaturesBroadwell | FeatureAES | FeatureAVX512F | FeatureAVX512CD |
-FeatureAVX512ER | FeatureAVX512PF | FeaturePREFETCHWT1;
+FeaturesBroadwell | FeatureAES | FeatureAVX512F | FeatureEVEX512 |
+FeatureAVX512CD | FeatureAVX512ER | FeatureAVX512PF | FeaturePREFETCHWT1;
 constexpr FeatureBitset FeaturesKNM = FeaturesKNL | FeatureAVX512VPOPCNTDQ;
 
 // Intel Skylake processors.
@@ -107,9 +107,9 @@
 // SkylakeServer inherits all SkylakeClient features except SGX.
 // FIXME: That doesn't match gcc.
 constexpr FeatureBitset FeaturesSkylakeServer =
-(FeaturesSkylakeClient & ~FeatureSGX) | FeatureAVX512F | FeatureAVX512CD |
-FeatureAVX512DQ | FeatureAVX512BW | FeatureAVX512VL | FeatureCLWB |
-FeaturePKU;
+(FeaturesSkylakeClient &

[PATCH] D159068: [clang][X86] Update excessive register save diagnostic to more closely follow the interrupt attribute spec

2023-08-29 Thread Phoebe Wang via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rGfa1dc06a1b39: [clang][X86] Update excessive register save 
diagnostic to more closely follow… (authored by antangelo, committed by 
pengfei).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159068/new/

https://reviews.llvm.org/D159068

Files:
  clang/docs/ReleaseNotes.rst
  clang/include/clang/Basic/AttrDocs.td
  clang/include/clang/Basic/DiagnosticSemaKinds.td
  clang/lib/Sema/SemaExpr.cpp
  clang/test/Sema/attr-x86-interrupt.c
  clang/test/Sema/x86-diag-excessive-regsave.c

Index: clang/test/Sema/x86-diag-excessive-regsave.c
===
--- /dev/null
+++ clang/test/Sema/x86-diag-excessive-regsave.c
@@ -0,0 +1,34 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -fsyntax-only -verify %s
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -fsyntax-only -verify %s -DNO_CALLER_SAVED_REGISTERS=1
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature -x87 -target-feature -mmx -target-feature -sse -fsyntax-only -verify %s -DGPRONLY=1
+
+#if defined(NO_CALLER_SAVED_REGS) || defined(GPRONLY)
+// expected-no-diagnostics
+#else
+#define EXPECT_WARNING
+#endif
+
+#ifdef NO_CALLER_SAVED_REGS
+__attribute__((no_caller_saved_registers))
+#endif
+#ifdef EXPECT_WARNING
+// expected-note@+3 {{'foo' declared here}}
+// expected-note@+2 {{'foo' declared here}}
+#endif
+extern void foo(void *);
+
+__attribute__((no_caller_saved_registers))
+void no_caller_saved_registers(void *arg) {
+#ifdef EXPECT_WARNING
+// expected-warning@+2 {{function with attribute 'no_caller_saved_registers' should only call a function with attribute 'no_caller_saved_registers' or be compiled with '-mgeneral-regs-only'}}
+#endif
+foo(arg);
+}
+
+__attribute__((interrupt))
+void interrupt(void *arg) {
+#ifdef EXPECT_WARNING
+// expected-warning@+2 {{interrupt service routine should only call a function with attribute 'no_caller_saved_registers' or be compiled with '-mgeneral-regs-only'}}
+#endif
+foo(arg);
+}
Index: clang/test/Sema/attr-x86-interrupt.c
===
--- clang/test/Sema/attr-x86-interrupt.c
+++ clang/test/Sema/attr-x86-interrupt.c
@@ -40,23 +40,6 @@
 __attribute__((interrupt)) void foo7(int *a, unsigned b) {}
 __attribute__((interrupt)) void foo8(int *a) {}
 
-#ifdef _LP64
-typedef unsigned long Arg2Type;
-#elif defined(__x86_64__)
-typedef unsigned long long Arg2Type;
-#else
-typedef unsigned int Arg2Type;
-#endif
-#ifndef NOCALLERSAVE
-__attribute__((no_caller_saved_registers))
-#else
-// expected-note@+3 {{'foo9' declared here}}
-// expected-warning@+4 {{interrupt service routine should only call a function with attribute 'no_caller_saved_registers'}}
-#endif
-void foo9(int *a, Arg2Type b) {}
-__attribute__((interrupt)) void fooA(int *a, Arg2Type b) {
-  foo9(a, b);
-}
 void g(void (*fp)(int *));
 int main(int argc, char **argv) {
   void *ptr = (void *)
Index: clang/lib/Sema/SemaExpr.cpp
===
--- clang/lib/Sema/SemaExpr.cpp
+++ clang/lib/Sema/SemaExpr.cpp
@@ -7413,11 +7413,18 @@
   Diag(FDecl->getLocation(), diag::note_callee_decl) << FDecl;
   }
 }
-if (Caller->hasAttr() &&
-((!FDecl || !FDecl->hasAttr( {
-  Diag(Fn->getExprLoc(), diag::warn_anyx86_interrupt_regsave);
-  if (FDecl)
-Diag(FDecl->getLocation(), diag::note_callee_decl) << FDecl;
+if (Caller->hasAttr() ||
+Caller->hasAttr()) {
+  const TargetInfo  = Context.getTargetInfo();
+  bool HasNonGPRRegisters =
+  TI.hasFeature("sse") || TI.hasFeature("x87") || TI.hasFeature("mmx");
+  if (HasNonGPRRegisters &&
+  (!FDecl || !FDecl->hasAttr())) {
+Diag(Fn->getExprLoc(), diag::warn_anyx86_excessive_regsave)
+<< (Caller->hasAttr() ? 0 : 1);
+if (FDecl)
+  Diag(FDecl->getLocation(), diag::note_callee_decl) << FDecl;
+  }
 }
   }
 
Index: clang/include/clang/Basic/DiagnosticSemaKinds.td
===
--- clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -310,10 +310,12 @@
   "a pointer as the first parameter|a %2 type as the second parameter}1">;
 def err_anyx86_interrupt_called : Error<
   "interrupt service routine cannot be called directly">;
-def warn_anyx86_interrupt_regsave : Warning<
-  "interrupt service routine should only call a function"
-  " with attribute 'no_caller_saved_registers'">,
-  InGroup>;
+def warn_anyx86_excessive_regsave : Warning<
+  "%select{interrupt service routine|function with attribute"
+  " 'no_caller_saved_registers'}0 should only call a function"
+  " with attribute 'no_caller_saved_registers'"
+  " or be compiled with '-mgeneral-regs-only'">,
+  InGroup>;
 def

[PATCH] D159068: [clang][X86] Update excessive register save diagnostic to more closely follow the interrupt attribute spec

2023-08-29 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei accepted this revision.
pengfei added a comment.
This revision is now accepted and ready to land.

I don't have prior experience about interrupt diagnostic but know something 
about mgeneral-regs-only. I think the diagnostic great.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159068/new/

https://reviews.llvm.org/D159068

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D159068: [clang][X86] Update excessive register save diagnostic to more closely follow the interrupt attribute spec

2023-08-29 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: clang/include/clang/Basic/DiagnosticSemaKinds.td:317
+  " with attribute 'no_caller_saved_registers'"
+  " or be compiled with '-mgeneral-regs-only'">,
+  InGroup>;

aaron.ballman wrote:
> Can you add a test case showing that `-mgeneral-regs-only` causes the 
> diagnostic to be silenced?
It's a driver only option. I think we are not suggested to invoke %clang out of 
driver testsing, right?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159068/new/

https://reviews.llvm.org/D159068

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D158811: [X86] __builtin_cpu_supports: support x86-64{,-v2,-v3,-v4}

2023-08-25 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei accepted this revision.
pengfei added a comment.
This revision is now accepted and ready to land.

LGTM.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158811/new/

https://reviews.llvm.org/D158811

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D158811: [X86] __builtin_cpu_supports: support x86-64{,-v2,-v3,-v4}

2023-08-25 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: llvm/include/llvm/TargetParser/X86TargetParser.h:63
+
+#define X86_FEATURE(ENUM, STRING)
+#define X86_MICROARCH_LEVEL(ENUM, STRING, PRIORITY) FEATURE_##ENUM = PRIORITY,

Not needed.



Comment at: llvm/lib/TargetParser/X86TargetParser.cpp:718
 ;
-FeaturesMask |= (1ULL << Feature);
+FeatureMask[Feature / 32] |= 1U << (Feature % 32);
   }

Should we use vector for future expansion, or add an assert to make sure it 
won't exceed current limitation?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158811/new/

https://reviews.llvm.org/D158811

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D158329: [X86] Support arch=x86-64{,-v2,-v3,-v4} for target_clones attribute

2023-08-23 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei accepted this revision.
pengfei added a comment.
This revision is now accepted and ready to land.

LGTM.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158329/new/

https://reviews.llvm.org/D158329

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D158329: [X86] Support arch=x86-64{,-v2,-v3,-v4} for target_clones attribute

2023-08-20 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: clang/lib/CodeGen/CGBuiltin.cpp:13318-13320
+Value *Idxs[] = {Builder.getInt32(0), Builder.getInt32(i - 1)};
+Value *Features = Builder.CreateAlignedLoad(
+Int32Ty, Builder.CreateGEP(ATy, CpuFeatures2, Idxs),

MaskRay wrote:
> pengfei wrote:
> > Do we have problem when linking with old version of libgcc?
> If we don't use `__cpu_features2[1..3]`, no problem with older libgcc.
> 
> The way GCC upgraded `__cpu_features2` to an array is compatible with scalar 
> `__cpu_features2`.
I assume GCC seldom links to older libgcc, but there's no guarantee Clang can 
link to latest libgcc.
And we cannot assume user won't use it once the method provided.
I don't have a good idea for it, but I think we should write the requirement 
done in release note or somewhere if we cannot find a better way.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158329/new/

https://reviews.llvm.org/D158329

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D158329: [X86] Support arch=x86-64{,-v2,-v3,-v4} for target_clones attribute

2023-08-19 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: clang/lib/CodeGen/CGBuiltin.cpp:13272
+  uint64_t Mask = llvm::X86::getCpuSupportsMask(FeatureStrs);
+  uint32_t FeaturesMask[4] = {uint32_t(Mask), uint32_t(Mask >> 32), 0, 0};
+  return EmitX86CpuSupports(FeaturesMask);

Use `Lo_32(Mask)`, `Hi_32(Mask)`?



Comment at: clang/lib/CodeGen/CGBuiltin.cpp:13318-13320
+Value *Idxs[] = {Builder.getInt32(0), Builder.getInt32(i - 1)};
+Value *Features = Builder.CreateAlignedLoad(
+Int32Ty, Builder.CreateGEP(ATy, CpuFeatures2, Idxs),

Do we have problem when linking with old version of libgcc?



Comment at: compiler-rt/lib/builtins/cpu_model.c:808
+  if (HasExtLeaf1) {
+if (ECX & 1)
+  setFeature(FEATURE_LAHF_LM);

Can we reuse bit defination in cpuid.h, e.g., `if (ECX & bit_LAHF_LM)`? The 
same below.



Comment at: compiler-rt/lib/builtins/cpu_model.c:808
+  if (HasExtLeaf1) {
+if (ECX & 1)
+  setFeature(FEATURE_LAHF_LM);

pengfei wrote:
> Can we reuse bit defination in cpuid.h, e.g., `if (ECX & bit_LAHF_LM)`? The 
> same below.
Can we reuse bit defination in cpuid.h, e.g., `if (ECX & bit_LAHF_LM)`? The 
same below.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158329/new/

https://reviews.llvm.org/D158329

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D157297: [clang] Fixes compile error that double colon operator cannot resolve macro with parentheses.

2023-08-16 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added a comment.

In D157297#4592851 , @aaron.ballman 
wrote:

> In D157297#4590886 , @pengfei wrote:
>
>> BTW, maybe @aaron.ballman knows why we don't support such syntax in C++.
>
> The return statement expects an expression or braced init list: 
> http://eel.is/c++draft/stmt.jump#general-1 but we really only need to care 
> about the expression form (there's no braces in sight).
> Grammatically, the leading `::` will parse as a qualified-id because it 
> matches the production for nested-name-specifier: 
> http://eel.is/c++draft/expr.prim.id.qual#nt:qualified-id
> That needs to be followed by an unqualified-id 
> (http://eel.is/c++draft/expr.prim.id.unqual#nt:unqualified-id), but the open 
> paren does not match any of the grammar productions, so this is a syntax 
> error.

Thanks @aaron.ballman for the detailed explanation! I have learnt it. @lygstate 
maybe include Aaron's explanation in commit message?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157297/new/

https://reviews.llvm.org/D157297

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D157485: [X86][RFC] Support new feature AVX10

2023-08-16 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: clang/lib/CodeGen/CodeGenFunction.cpp:2581
+ unsigned VectorWidth) {
+  if (!getTarget().getTriple().isX86() || VectorWidth < 512)
+return;

skan wrote:
> Minor suggestion. The code here may be refined to be better extended by other 
> targets, like
> ```
>   llvm::Triple::ArchType ArchType =
>   getContext().getTargetInfo().getTriple().getArch();
> 
>   switch (ArchType) {
>   case llvm::Triple::x86:
>   case llvm::Triple::x86_64: {
>  
>}
>default:
> return;
> 
> ```
We have a few place code using `isX86`. I think it's more convenient to use a 
single condition. We can refactor when necessary.



Comment at: llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp:927
+  !STI.hasFeature(X86::FeatureAVX10_512bit))
+report_fatal_error("ZMM registers are not supported without AVX10-512BIT");
   switch (TSFlags & X86II::OpPrefixMask) {

skan wrote:
> skan wrote:
> > -mavx10.1 does not work for assembler. So if such instruction is generated 
> > w/o AVX10-512BIT support, it must be compiler's issue instead of user's. An 
> > `assert` should be more appropriate here.
> > -mavx10.1 does not work for assembler. So if such instruction is generated 
> > w/o AVX10-512BIT support, it must be compiler's issue instead of user's. An 
> > `assert` should be more appropriate here.
> 
> Reference: https://llvm.org/docs/CodingStandards.html#assert-liberally
We need to report fatal error for this case even if it's a compiler bug. 
Otherwise, user may observe the crash issue in runtime and hard to find the 
reason.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157485/new/

https://reviews.llvm.org/D157485

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D157485: [X86][RFC] Support new feature AVX10

2023-08-16 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei updated this revision to Diff 550669.
pengfei marked an inline comment as done.
pengfei added a comment.

Address comment.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157485/new/

https://reviews.llvm.org/D157485

Files:
  clang/docs/ReleaseNotes.rst
  clang/include/clang/Driver/Options.td
  clang/lib/Basic/Targets/X86.cpp
  clang/lib/Basic/Targets/X86.h
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/CodeGen/CodeGenFunction.cpp
  clang/lib/CodeGen/CodeGenFunction.h
  clang/lib/CodeGen/Targets/X86.cpp
  clang/lib/Driver/ToolChains/Arch/X86.cpp
  clang/test/CodeGen/X86/avx10-error.c
  clang/test/CodeGen/attr-target-x86.c
  clang/test/CodeGen/target-avx-abi-diag.c
  clang/test/Driver/x86-target-features.c
  clang/test/Preprocessor/x86_target_features.c
  llvm/docs/ReleaseNotes.rst
  llvm/include/llvm/TargetParser/X86TargetParser.def
  llvm/lib/IR/Verifier.cpp
  llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp
  llvm/lib/Target/X86/X86.td
  llvm/lib/Target/X86/X86InstrInfo.td
  llvm/lib/Target/X86/X86RegisterInfo.cpp
  llvm/lib/Target/X86/X86Subtarget.h
  llvm/lib/TargetParser/Host.cpp
  llvm/lib/TargetParser/X86TargetParser.cpp
  llvm/test/CodeGen/X86/avx512-arith.ll
  llvm/test/CodeGen/X86/avx512-broadcast-arith.ll
  llvm/test/CodeGen/X86/avx512bw-arith.ll
  llvm/test/CodeGen/X86/avx512bwvl-arith.ll
  llvm/test/CodeGen/X86/avx512fp16-arith.ll
  llvm/test/CodeGen/X86/avx512vl-arith.ll

Index: llvm/test/CodeGen/X86/avx512vl-arith.ll
===
--- llvm/test/CodeGen/X86/avx512vl-arith.ll
+++ llvm/test/CodeGen/X86/avx512vl-arith.ll
@@ -1,5 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=knl -mattr=+avx512vl --show-mc-encoding| FileCheck %s
+; RUN: llc < %s -mtriple=x86_64-apple-darwin -mattr=+avx10.1 --show-mc-encoding| FileCheck %s
 
 ; 256-bit
 
Index: llvm/test/CodeGen/X86/avx512fp16-arith.ll
===
--- llvm/test/CodeGen/X86/avx512fp16-arith.ll
+++ llvm/test/CodeGen/X86/avx512fp16-arith.ll
@@ -1,5 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=skx -mattr=+avx512fp16 | FileCheck %s
+; RUN: llc < %s -mtriple=x86_64-apple-darwin -mattr=+avx10.1,+avx10-512bit | FileCheck %s
 
 define <32 x half> @vaddph_512_test(<32 x half> %i, <32 x half> %j) nounwind readnone {
 ; CHECK-LABEL: vaddph_512_test:
Index: llvm/test/CodeGen/X86/avx512bwvl-arith.ll
===
--- llvm/test/CodeGen/X86/avx512bwvl-arith.ll
+++ llvm/test/CodeGen/X86/avx512bwvl-arith.ll
@@ -1,5 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx512bw,+avx512vl | FileCheck %s
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx10.1 | FileCheck %s
 
 ; 256-bit
 
Index: llvm/test/CodeGen/X86/avx512bw-arith.ll
===
--- llvm/test/CodeGen/X86/avx512bw-arith.ll
+++ llvm/test/CodeGen/X86/avx512bw-arith.ll
@@ -1,5 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx512bw | FileCheck %s
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx10.1,+avx10-512bit | FileCheck %s
 
 define <64 x i8> @vpaddb512_test(<64 x i8> %i, <64 x i8> %j) nounwind readnone {
 ; CHECK-LABEL: vpaddb512_test:
Index: llvm/test/CodeGen/X86/avx512-broadcast-arith.ll
===
--- llvm/test/CodeGen/X86/avx512-broadcast-arith.ll
+++ llvm/test/CodeGen/X86/avx512-broadcast-arith.ll
@@ -1,6 +1,7 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc < %s -mtriple=x86_64-- -mattr=+avx512f   | FileCheck %s --check-prefixes=AVX512F
 ; RUN: llc < %s -mtriple=x86_64-- -mattr=+avx512f,+avx512bw | FileCheck %s --check-prefixes=AVX512BW
+; RUN: llc < %s -mtriple=x86_64-- -mattr=+avx10.1,+avx10-512bit | FileCheck %s --check-prefixes=AVX512BW
 
 ; PR34666
 define <64 x i8> @add_v64i8_broadcasts(<64 x i8> %a0, i64 %a1, i8 %a2) {
Index: llvm/test/CodeGen/X86/avx512-arith.ll
===
--- llvm/test/CodeGen/X86/avx512-arith.ll
+++ llvm/test/CodeGen/X86/avx512-arith.ll
@@ -4,6 +4,7 @@
 ; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx512bw | FileCheck %s --check-prefix=CHECK --check-prefix=AVX512BW
 ; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx512dq | FileCheck %s --check-prefix=CHECK --check-prefix=AVX512DQ
 ; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx512dq,+avx512bw,+avx512vl | FileCheck %s --check-prefix=CHECK --check-prefix=SKX
+;

[PATCH] D157680: [X86]Support options -mno-gather -mno-scatter

2023-08-16 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei accepted this revision.
pengfei added a comment.
This revision is now accepted and ready to land.

LGTM.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157680/new/

https://reviews.llvm.org/D157680

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D157485: [X86][RFC] Support new feature AVX10

2023-08-15 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added a comment.

Ping~ It looks to me there's no concern about this solution in the RFC 
. I 
think we can move forward to land it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157485/new/

https://reviews.llvm.org/D157485

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D157297: [clang] Fixes compile error that double colon operator cannot resolve macro with parentheses.

2023-08-15 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added a subscriber: aaron.ballman.
pengfei added a comment.

BTW, maybe @aaron.ballman knows why we don't support such syntax in C++.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157297/new/

https://reviews.llvm.org/D157297

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D157297: [clang] Fixes compile error that double colon operator cannot resolve macro with parentheses.

2023-08-15 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei accepted this revision.
pengfei added a comment.
This revision is now accepted and ready to land.

LGTM.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157297/new/

https://reviews.llvm.org/D157297

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D157297: [clang] Fixes compile error that double colon operator cannot resolve macro with parentheses.

2023-08-15 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added a comment.

In D157297#4590692 , @lygstate wrote:

> In D157297#4590580 , @pengfei wrote:
>
>> 
>
>
>
>> I'd prefer macro to duplicated definitions. We have such precedents, e.g., 
>> https://github.com/llvm/llvm-project/blob/main/clang/lib/Headers/xmmintrin.h#L2994
>
> Do you means revert to the first version?

Yes.

>> Besides, you should update the summary as well.
>
> I don't know what kind of summary you want, I can give me that so I can 
> update it

A reliable source about double colon operator cannot resolve function with 
parentheses in C++ is the best I expect, however I haven't find one so far.
Maybe a short description about the phenomenon on GCC/Clang is better than just 
giving error message. I don't have strong request for this.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157297/new/

https://reviews.llvm.org/D157297

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D157297: [clang] Fixes compile error that double colon operator cannot resolve macro with parentheses.

2023-08-15 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added a comment.

I'd prefer macro to duplicated definitions. We have such precedents, e.g., 
https://github.com/llvm/llvm-project/blob/main/clang/lib/Headers/xmmintrin.h#L2994
Besides, you should update the summary as well.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157297/new/

https://reviews.llvm.org/D157297

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D157297: [clang] Fixes compile error like error: expected unqualified-id for ::_tzcnt_u32(mask);

2023-08-14 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added a comment.

In D157297#4583802 , @lygstate wrote:

> In D157297#4571572 , @pengfei wrote:
>
>> The description is not clear to me. You should describe the reason rather 
>> than phenomenon.
>>
>> My understanding is double colon operator cannot resolve functions with 
>> parentheses. https://godbolt.org/z/6P47se9WW
>
> error: expected unqualified-id
>
>   return ::_tzcnt_u32(a);
>^
>
> /opt/compiler-explorer/clang-16.0.0/lib/clang/16/include/bmiintrin.h:74:27: 
> note: expanded from macro '_tzcnt_u32'
> #define _tzcnt_u32(a) (__tzcnt_u32((a)))
>
> Looks like double colon operator cannot resolve macros with parentheses, is 
> that a compiler bug or we should handled it in code?
>
>> But I didn't find enough proof in Google. It'd be more persuasive if you can 
>> find it and add to the description.

GCC doesn't resolve it either. https://godbolt.org/z/Prb4s5d6o. So it should 
not be a bug.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157297/new/

https://reviews.llvm.org/D157297

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D157680: [X86]Support options -mno-gather -mno-scatter

2023-08-11 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: llvm/lib/Target/X86/X86.td:437
+: SubtargetFeature<"prefer-no-gather", "PreferGather", "false",
+   "Indicates if gather prefer to be disabled">;
+def FeaturePreferNoScatter

XinWang10 wrote:
> skan wrote:
> > Does "Prefer no gather instructions" sound better?
> > 
> > I think these two should be put under "X86 Subtarget Tuning features"?
> I think the two options are to mitigate security issues. Could refer to link 
> in summary.
It depends on if the micro code was applied. We should assume user care of this 
option should have applied the micro code. So it's more like a performance 
turning rather than mitigation. And you cannot disable all gather/scatter 
instructions with these options.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157680/new/

https://reviews.llvm.org/D157680

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D157680: [X86]Support options -mno-gather -mno-scatter

2023-08-11 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: clang/include/clang/Driver/Options.td:903-906
+def mno_gather : Flag<["-"], "mno-gather">, Flags<[NoXarchOption]>,
+ HelpText<"Disable generation of gather instructions in 
auto-vectorization(x86 only)">;
+def mno_scatter : Flag<["-"], "mno-scatter">, Flags<[NoXarchOption]>,
+  HelpText<"Disable generation of scatter instructions in 
auto-vectorization(x86 only)">;

Move under "// X86 feature flags"



Comment at: clang/lib/Driver/ToolChains/Arch/X86.cpp:272-277
+  if (Args.hasArg(options::OPT_mno_gather)) {
+Features.push_back("+prefer-no-gather");
+  }
+  if (Args.hasArg(options::OPT_mno_scatter)) {
+Features.push_back("+prefer-no-scatter");
+  }

No need parentheses



Comment at: llvm/test/CodeGen/X86/x86-prefer-no-gather-no-scatter.ll:9-10
+
+target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+

Don't need these and is duplicated with RUN line triple.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157680/new/

https://reviews.llvm.org/D157680

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D157566: [SEH] fix assertion when -fasy-exceptions is used.

2023-08-10 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei accepted this revision.
pengfei added a comment.
This revision is now accepted and ready to land.

LGTM.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157566/new/

https://reviews.llvm.org/D157566

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D157485: [X86][RFC] Support new feature AVX10

2023-08-10 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: clang/lib/Basic/Targets/X86.h:99
+  bool HasAVX10_1 = false;
+  bool HasAVX10_512BIT = false;
   bool HasAVX512CD = false;

goldstein.w.n wrote:
> Maybe should be HasAVX10_1_512? As brought up the rfc, there might be an 
> avx10.2-512
> 
> Likewise elsewhere, or is this to match GCC?
No. `HasAVX10_512BIT` is a single feature which can be combined with 
`HasAVX10_1`, `HasAVX10_2` etc.
AFAIK, GCC chooses the similar idea here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157485/new/

https://reviews.llvm.org/D157485

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D157251: [X86][regcall] Do not produce @ number suffix if it is regcall4

2023-08-09 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei accepted this revision.
pengfei added a comment.
This revision is now accepted and ready to land.

LGTM.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157251/new/

https://reviews.llvm.org/D157251

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D157485: [X86][RFC] Support new feature AVX10

2023-08-09 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei created this revision.
pengfei added reviewers: RKSimon, craig.topper, skan, e-kud.
Herald added a subscriber: hiraditya.
Herald added a project: All.
pengfei requested review of this revision.
Herald added subscribers: llvm-commits, cfe-commits, MaskRay.
Herald added projects: clang, LLVM.

AVX10 Architecture Specification: 
https://cdrdv2.intel.com/v1/dl/getContent/784267
AVX10 Technical Paper: https://cdrdv2.intel.com/v1/dl/getContent/784343


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D157485

Files:
  clang/docs/ReleaseNotes.rst
  clang/include/clang/Driver/Options.td
  clang/lib/Basic/Targets/X86.cpp
  clang/lib/Basic/Targets/X86.h
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/CodeGen/CodeGenFunction.cpp
  clang/lib/CodeGen/CodeGenFunction.h
  clang/lib/CodeGen/Targets/X86.cpp
  clang/lib/Driver/ToolChains/Arch/X86.cpp
  clang/test/CodeGen/X86/avx10-error.c
  clang/test/CodeGen/attr-target-x86.c
  clang/test/CodeGen/target-avx-abi-diag.c
  clang/test/Driver/x86-target-features.c
  clang/test/Preprocessor/x86_target_features.c
  llvm/docs/ReleaseNotes.rst
  llvm/include/llvm/TargetParser/X86TargetParser.def
  llvm/lib/IR/Verifier.cpp
  llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp
  llvm/lib/Target/X86/X86.td
  llvm/lib/Target/X86/X86InstrInfo.td
  llvm/lib/Target/X86/X86RegisterInfo.cpp
  llvm/lib/Target/X86/X86Subtarget.h
  llvm/lib/TargetParser/Host.cpp
  llvm/lib/TargetParser/X86TargetParser.cpp
  llvm/test/CodeGen/X86/avx512-arith.ll
  llvm/test/CodeGen/X86/avx512-broadcast-arith.ll
  llvm/test/CodeGen/X86/avx512bw-arith.ll
  llvm/test/CodeGen/X86/avx512bwvl-arith.ll
  llvm/test/CodeGen/X86/avx512fp16-arith.ll
  llvm/test/CodeGen/X86/avx512vl-arith.ll

Index: llvm/test/CodeGen/X86/avx512vl-arith.ll
===
--- llvm/test/CodeGen/X86/avx512vl-arith.ll
+++ llvm/test/CodeGen/X86/avx512vl-arith.ll
@@ -1,5 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=knl -mattr=+avx512vl --show-mc-encoding| FileCheck %s
+; RUN: llc < %s -mtriple=x86_64-apple-darwin -mattr=+avx10.1 --show-mc-encoding| FileCheck %s
 
 ; 256-bit
 
Index: llvm/test/CodeGen/X86/avx512fp16-arith.ll
===
--- llvm/test/CodeGen/X86/avx512fp16-arith.ll
+++ llvm/test/CodeGen/X86/avx512fp16-arith.ll
@@ -1,5 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=skx -mattr=+avx512fp16 | FileCheck %s
+; RUN: llc < %s -mtriple=x86_64-apple-darwin -mattr=+avx10.1,+avx10-512bit | FileCheck %s
 
 define <32 x half> @vaddph_512_test(<32 x half> %i, <32 x half> %j) nounwind readnone {
 ; CHECK-LABEL: vaddph_512_test:
Index: llvm/test/CodeGen/X86/avx512bwvl-arith.ll
===
--- llvm/test/CodeGen/X86/avx512bwvl-arith.ll
+++ llvm/test/CodeGen/X86/avx512bwvl-arith.ll
@@ -1,5 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx512bw,+avx512vl | FileCheck %s
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx10.1 | FileCheck %s
 
 ; 256-bit
 
Index: llvm/test/CodeGen/X86/avx512bw-arith.ll
===
--- llvm/test/CodeGen/X86/avx512bw-arith.ll
+++ llvm/test/CodeGen/X86/avx512bw-arith.ll
@@ -1,5 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx512bw | FileCheck %s
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx10.1,+avx10-512bit | FileCheck %s
 
 define <64 x i8> @vpaddb512_test(<64 x i8> %i, <64 x i8> %j) nounwind readnone {
 ; CHECK-LABEL: vpaddb512_test:
Index: llvm/test/CodeGen/X86/avx512-broadcast-arith.ll
===
--- llvm/test/CodeGen/X86/avx512-broadcast-arith.ll
+++ llvm/test/CodeGen/X86/avx512-broadcast-arith.ll
@@ -1,6 +1,7 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc < %s -mtriple=x86_64-- -mattr=+avx512f   | FileCheck %s --check-prefixes=AVX512F
 ; RUN: llc < %s -mtriple=x86_64-- -mattr=+avx512f,+avx512bw | FileCheck %s --check-prefixes=AVX512BW
+; RUN: llc < %s -mtriple=x86_64-- -mattr=+avx10.1,+avx10-512bit | FileCheck %s --check-prefixes=AVX512BW
 
 ; PR34666
 define <64 x i8> @add_v64i8_broadcasts(<64 x i8> %a0, i64 %a1, i8 %a2) {
Index: llvm/test/CodeGen/X86/avx512-arith.ll
===
--- llvm/test/CodeGen/X86/avx512-arith.ll
+++ llvm/test/CodeGen/X86/avx512-arith.ll
@@ -4,6 +4,7 @@
 ; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx512bw | FileCheck %s --check-prefix=CHECK --check-prefix=AVX512BW
 ; RUN: llc

[PATCH] D157297: [clang] Fixes compile error like error: expected unqualified-id for ::_tzcnt_u32(mask);

2023-08-08 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added a comment.

The description is not clear to me. You should describe the reason rather than 
phenomenon.

My understanding is double colon cannot operator cannot resolve functions with 
parentheses. But I didn't find enough proof in Google. It'd be more persuasive 
if you can find it and add to the description.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157297/new/

https://reviews.llvm.org/D157297

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155145: [X86] Add AVX-VNNI-INT16 instructions.

2023-08-02 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added a comment.

In D155145#4556178 , @craig.topper 
wrote:

> In D155145#4556157 , @anna wrote:
>
>> In D155145#4554786 , @anna wrote:
>>
 Can you capture the values of EAX, EBX, ECX, and EDX after the two calls 
 to getX86CpuIDAndInfoEx that have 0x7 as the first argument? Maybe there's 
 a bug in CPUID on Sandy Bridge.
>>>
>>> Sure, on the original code before the patch you suggested right?
>>> The two calls are:
>>>
>>>bool HasLeaf7 =
>>>   MaxLevel >= 7 && !getX86CpuIDAndInfoEx(0x7, 0x0, , , 
>>> , );
>>>   +   llvm::errs() << "Before setting fsgsbase the value for EAX: " << EAX
>>>   + << " EBX: " << EBX << " ECX: " << ECX << "  EDX: " 
>>> << EDX
>>>   + << "\n";
>>> 
>>>   Features["fsgsbase"]   = HasLeaf7 && ((EBX >>  0) & 1);
>>>   
>>>   bool HasLeaf7Subleaf1 =
>>>   MaxLevel >= 7 && !getX86CpuIDAndInfoEx(0x7, 0x1, , , 
>>> , );
>>>   +   llvm::errs() << "Before setting sha512 the value for EAX: " << EAX
>>>   + << " EBX: " << EBX << " ECX: " << ECX << "  EDX: " 
>>> << EDX
>>>   + << "\n";
>>>   Features["sha512"] = HasLeaf7Subleaf1 && ((EAX >> 0) & 1);
>>>   ...
>>>   we set avxvnniint16 after this
>>>
>>> Takes a while to get a build on this machine, should have the output soon.
>>
>> @craig.topper here is the output:
>>
>>   Before setting fsgsbase the value for EAX: 0 EBX: 0 ECX: 0  EDX: 
>> 2617246720 // this is after the HasLeaf7 calculation
>>   Before setting sha512 the value for EAX: 0 EBX: 0 ECX: 0  EDX: 2617246720 
>> // this is after the HasLeaf7Subleaf1 calculation
>>
>> So, with your patch `HasLeaf7Subleaf1` is 0 as EAX is 0. Pls let me know if 
>> you need  any additional diagnostics output (we actually lose access to the 
>> machine on friday, since it is being retired!).
>>
>>> The documentation says that invalid subleaves of leaf 7 should return all 
>>> 0s. So we thought it was safe to check the bits of sub leaf 1 even if eax 
>>> from subleaf 0 doesn't say subleaf 1 is supported.
>>
>> This means the CPUID doesn't satisfy the documentation since EDX != 0 for 
>> SubLeaf1?
>
> Interestingly all of the bits set in EDX are features that were things that 
> were added in microcode patches in the wake of vulnerabilities like Spectre 
> and Meltdown. Maybe the microcode patch forgot to check the subleaf since 
> there was no subleaf implemented when sandy bridge was originally made.
>
> I think my patch is the correct fix given that information. I'll post a patch 
> for review shortly.

Thanks Craig! That makes sense to me.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155145/new/

https://reviews.llvm.org/D155145

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155145: [X86] Add AVX-VNNI-INT16 instructions.

2023-08-02 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added a comment.

In D155145#4556157 , @anna wrote:

> In D155145#4554786 , @anna wrote:
>
>>> Can you capture the values of EAX, EBX, ECX, and EDX after the two calls to 
>>> getX86CpuIDAndInfoEx that have 0x7 as the first argument? Maybe there's a 
>>> bug in CPUID on Sandy Bridge.
>>
>> Sure, on the original code before the patch you suggested right?
>> The two calls are:
>>
>>bool HasLeaf7 =
>>   MaxLevel >= 7 && !getX86CpuIDAndInfoEx(0x7, 0x0, , , , 
>> );
>>   +   llvm::errs() << "Before setting fsgsbase the value for EAX: " << EAX
>>   + << " EBX: " << EBX << " ECX: " << ECX << "  EDX: " 
>> << EDX
>>   + << "\n";
>> 
>>   Features["fsgsbase"]   = HasLeaf7 && ((EBX >>  0) & 1);
>>   
>>   bool HasLeaf7Subleaf1 =
>>   MaxLevel >= 7 && !getX86CpuIDAndInfoEx(0x7, 0x1, , , , 
>> );
>>   +   llvm::errs() << "Before setting sha512 the value for EAX: " << EAX
>>   + << " EBX: " << EBX << " ECX: " << ECX << "  EDX: " 
>> << EDX
>>   + << "\n";
>>   Features["sha512"] = HasLeaf7Subleaf1 && ((EAX >> 0) & 1);
>>   ...
>>   we set avxvnniint16 after this
>>
>> Takes a while to get a build on this machine, should have the output soon.
>
> @craig.topper here is the output:
>
>   Before setting fsgsbase the value for EAX: 0 EBX: 0 ECX: 0  EDX: 2617246720 
> // this is after the HasLeaf7 calculation
>   Before setting sha512 the value for EAX: 0 EBX: 0 ECX: 0  EDX: 2617246720 
> // this is after the HasLeaf7Subleaf1 calculation
>
> So, with your patch `HasLeaf7Subleaf1` is 0 as EAX is 0. Pls let me know if 
> you need  any additional diagnostics output (we actually lose access to the 
> machine on friday, since it is being retired!).
>
>> The documentation says that invalid subleaves of leaf 7 should return all 
>> 0s. So we thought it was safe to check the bits of sub leaf 1 even if eax 
>> from subleaf 0 doesn't say subleaf 1 is supported.
>
> This means the CPUID doesn't satisfy the documentation since EDX != 0 for 
> SubLeaf1?

The identical EDX value looks dubious to me. Could you compile and run above 
code and paste the result here? Thanks!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155145/new/

https://reviews.llvm.org/D155145

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155145: [X86] Add AVX-VNNI-INT16 instructions.

2023-08-02 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added a comment.

Thanks @anna and @craig.topper 
I think we can dump the value with the simple code

  $ cat cpuid.c
  #include 
  #include 
  
  int main() {
unsigned int info[4];
for (int i = 0; i < 2; ++i) {
  __get_cpuid_count(7, 1, info, info + 1, info + 2, info + 3);
  printf("%08x\n", info[0]);
  printf("%08x\n", info[1]);
  printf("%08x\n", info[2]);
  printf("%08x\n", info[3]);
}
  }
  
  $ clang cpuid.c && ./a.out


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155145/new/

https://reviews.llvm.org/D155145

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155863: [X86][Regcall] Add an option to respect regcall ABI v.4 in win64

2023-08-01 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei accepted this revision.
pengfei added a comment.
This revision is now accepted and ready to land.

LGTM.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155863/new/

https://reviews.llvm.org/D155863

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155145: [X86] Add AVX-VNNI-INT16 instructions.

2023-07-28 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added a comment.

In D155145#4543326 , @anna wrote:

> We see a crash bisected to this patch about using an illegal instruction. 
> Here's the CPUInfo for the machine:
>
>   CPU info:
>   current cpu id: 22
>   total 32(physical cores 16) (assigned logical cores 32) (assigned physical 
> cores 16) (assigned_sockets:2 of 2) (8 cores per cpu, 2 threads per core) 
> family 6 model 45 stepping 7 microcode 0x71a, cmov, cx8, fxsr, mmx, sse, 
> sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, aes, clmul, ht, 
> tsc, tscinvbit, tscinv, clflush
>   AvgLoads: 0.30, 0.10, 0.18
>   CPU Model and flags from /proc/cpuinfo:
>   model name  : Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
>   flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca 
> cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx 
> pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology 
> nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est 
> tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt 
> tsc_deadline_timer aes xsave avx lahf_lm epb pti ssbd ibrs ibpb stibp 
> tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm ida arat pln pts 
> md_clear flush_l1d
>   Online cpus: 0-31
>   Offline cpus:
>   BIOS frequency limitation: 
>   Frequency switch latency (ns): 2
>   Available cpu frequencies: 
>   Current governor: schedutil
>   Core performance/turbo boost: 
>
> I don't see `avxvnniint16` in the flags list nor avx2. So, this (relatively 
> new) instruction shouldn't be generated for this machine. Any ideas on why 
> this might be happening?

As far as I can see from the patch, the only way to generate avxvnniint16 
instructions is to call its specific intrinsics explicitly. And we will check 
compiling options in FE before allowing to call the instructions. We do have an 
optimization to generate vnni instructions without intrinsics, but we haven't 
extend it to avxvnniint16 so far.
So I don't know what's wrong in your case, could you provide a reproducer for 
your problem?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155145/new/

https://reviews.llvm.org/D155145

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D156239: [X86] Support -march=arrowlake, arrowlake-s, lunarlake

2023-07-27 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei accepted this revision.
pengfei added a comment.
This revision is now accepted and ready to land.

LGTM.




Comment at: llvm/lib/Target/X86/X86.td:1730
+ProcessorFeatures.SRFFeatures, ProcessorFeatures.ADLTuning>;
+foreach P = ["arrowlake-s", "arrowlake_s"] in {
+def : ProcModelhttps://reviews.llvm.org/D156239/new/

https://reviews.llvm.org/D156239

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155798: [X86] Support -march=graniterapids-d and update -march=graniterapids

2023-07-24 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei accepted this revision.
pengfei added a comment.
This revision is now accepted and ready to land.

LGTM, but please wait one or two days for other reviewers.




Comment at: llvm/lib/TargetParser/X86TargetParser.cpp:430
   // Granite Rapids microarchitecture based processors.
-  { {"graniterapids"}, CK_Graniterapids, FEATURE_AVX512BF16, 
FeaturesGraniteRapids, 'n', false },
+  { {"graniterapids"}, CK_Graniterapids, FEATURE_AVX512BF16, 
FeaturesGraniteRapids , 'n', false },
+  // Granite Rapids D microarchitecture based processors.

Remove the space.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155798/new/

https://reviews.llvm.org/D155798

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155863: [X86][Regcall] Add an option to respect regcall ABI v.4 in win64

2023-07-24 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: llvm/lib/Target/X86/X86CallingConv.td:98-103
+def RC_X86_64_RegCallv4_Win : RC_X86_64_RegCall {
+  let GPR_8 = [AL, CL, DL, DIL, SIL, R8B, R9B, R11B, R12B, R14B, R15B];
+  let GPR_16 = [AX, CX, DX, DI, SI, R8W, R9W, R11W, R12W, R14W, R15W];
+  let GPR_32 = [EAX, ECX, EDX, EDI, ESI, R8D, R9D, R11D, R12D, R14D, R15D];
+  let GPR_64 = [RAX, RCX, RDX, RDI, RSI, R8, R9, R11, R12, R14, R15];
+}

According to the spec, Win64 calling convention is identical to Linux64 on V4, 
i.e., both `R10` and `R11` are reserved. I think you can reuse 
`RC_X86_64_RegCall_SysV` instead.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155863/new/

https://reviews.llvm.org/D155863

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155863: [X86][Regcall] Add an option to respect regcall ABI v.4 in win64

2023-07-23 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:7897-7899
+  if (Args.hasArg(options::OPT_regcall4)) {
+CmdArgs.push_back("-regcall4");
+  }

Remove parentheses



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:7937-7939
+  if (Args.hasArg(options::OPT__SLASH_Gregcall4)) {
+CmdArgs.push_back("-regcall4");
+  }

ditto.



Comment at: llvm/lib/Target/X86/X86CallingConv.td:468
+defm X86_32_RegCallv4_Win :
+X86_RegCall_base;
 defm X86_Win64_RegCall :

This will define RetCC_* as well but it is not used, hence will emit warning. 
Any way to solve it?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155863/new/

https://reviews.llvm.org/D155863

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155798: [X86] Support -march=graniterapids-d and update -march=graniterapids

2023-07-20 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: clang/test/Preprocessor/predefined-arch-macros.c:1925
 // CHECK_GNR_M64: #define __AMX_BF16__ 1
-// CHECK_GNR_M64: #define __AMX_COMPLEX__ 1
+// CHECK_GNR_M64-NOT: #define __AMX_COMPLEX__ 1
+// CHECK_GNRD_M64: #define __AMX_COMPLEX__ 1

FreddyYe wrote:
> RKSimon wrote:
> > Won't this fail on the graniterapids-d run?
> Whops, I didn't realize this problem before! But it indeed doesn't fail. Need 
> to figure out why...
I'm guessing when using multi prefixes, it will try to match with the second 
one if the first failed. It's common and easy to understand for positive check 
but a bit confusing for negative one.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155798/new/

https://reviews.llvm.org/D155798

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155861: [Headers][doc] Add SHA1/SHA256 intrinsic descriptions

2023-07-20 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei accepted this revision.
pengfei added a comment.
This revision is now accepted and ready to land.

LGTM.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155861/new/

https://reviews.llvm.org/D155861

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155859: [Headers][doc] Add misc non-AVX2 intrinsic descriptions

2023-07-20 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: clang/lib/Headers/rdseedintrin.h:56
+/// ELSE
+///   Store16(__p, 0)
+///   result := 0

32



Comment at: clang/lib/Headers/rdseedintrin.h:84
+/// ELSE
+///   Store16(__p, 0)
+///   result := 0

64



Comment at: clang/lib/Headers/xsavecintrin.h:34
+/// ENDFOR
+/// __p.Header.XSTATE_BV := mask
+/// \endcode

It's not `mask` but `mask AND XINUSE[62:0]`. The bit[1] also relies on `MXCSR ≠ 
1F80H`. I think we can simply use `__p.Header.XSTATE_BV[62:0] := 
INIT_FUNCTION(mask[62:0])`



Comment at: clang/lib/Headers/xsavecintrin.h:65
+/// ENDFOR
+/// __p.Header.XSTATE_BV := mask
+/// \endcode

ditto.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155859/new/

https://reviews.llvm.org/D155859

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155784: [X86] Update features for sierraforest, grandridge

2023-07-20 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: clang/test/Preprocessor/predefined-arch-macros.c:2496
 // CHECK_SRF_M32: #define __PTWRITE__ 1
+// CHECK_GRR_M32: #define __RAOINT__ 1
 // CHECK_SRF_M32: #define __RDPID__ 1

This is easy to be confused. How about add a not check? e.g.,

```
... -check-prefix=CHECK_SRF_GRR_M32,CHECK_SRF_M32
... -check-prefixes=CHECK_SRF_GRR_M32,CHECK_GRR_M32
...
// CHECK_SRF_GRR_M32: #define __PTWRITE__ 1
// CHECK_SRF_M32-NOT: __RAOINT__
// CHECK_GRR_M32: #define __RAOINT__ 1
```



Comment at: clang/test/Preprocessor/predefined-arch-macros.c:2568
 // CHECK_SRF_M64: #define __PTWRITE__ 1
+// CHECK_GRR_M64: #define __RAOINT__ 1
 // CHECK_SRF_M64: #define __RDPID__ 1

ditto.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155784/new/

https://reviews.llvm.org/D155784

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155145: [X86] Add AVX-VNNI-INT16 instructions.

2023-07-19 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei accepted this revision.
pengfei added a comment.
This revision is now accepted and ready to land.

LGTM.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155145/new/

https://reviews.llvm.org/D155145

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D86310: [X86] Align i128 to 16 bytes in x86 datalayouts

2023-07-19 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added a comment.

Just FYI. There are a few reports about the compatibility issues, e.g., #41784 
. There's also concern about 
the alignment difference between `_BitInt(128)` and `__int128`, see #60925 



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86310/new/

https://reviews.llvm.org/D86310

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155147: [X86] Add SM3 instructions.

2023-07-18 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: clang/lib/Headers/sm3intrin.h:162
+/// \param imm8
+///A 128-bit vector of [4 x int].
+/// \returns

This is `int`


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155147/new/

https://reviews.llvm.org/D155147

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155147: [X86] Add SM3 instructions.

2023-07-18 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: clang/lib/Headers/sm3intrin.h:161-164
+/// \param imm8
+///A 128-bit vector of [4 x int].
+/// \returns
+///A 32-bit int.

The description should invert


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155147/new/

https://reviews.llvm.org/D155147

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155147: [X86] Add SM3 instructions.

2023-07-18 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei accepted this revision.
pengfei added a comment.
This revision is now accepted and ready to land.

LGTM.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155147/new/

https://reviews.llvm.org/D155147

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155147: [X86] Add SM3 instructions.

2023-07-18 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: clang/lib/Headers/sm3intrin.h:69
+/// dst.dword[2] := P1(TMP2)
+/// dst.dword[3] := P1(TMP3)
+/// \endcode

`DEST[MAX:128] := 0` the same to below.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155147/new/

https://reviews.llvm.org/D155147

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155147: [X86] Add SM3 instructions.

2023-07-18 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: clang/lib/Headers/sm3intrin.h:28
+/// \code
+/// _mm_sm3msg1_epi32(__m128i __A, __m128i __B, __m128i __C)
+/// \endcode

Add return type too.



Comment at: clang/lib/Headers/sm3intrin.h:85
+/// \code
+/// _mm_sm3msg2_epi32(__m128i __A, __m128i __B, __m128i __C)
+/// \endcode

ditto.



Comment at: clang/lib/Headers/sm3intrin.h:148
+/// \code
+/// _mm_sm3rnds2_epi32(__m128i __A, __m128i __B, __m128i __C, const int imm8)
+/// \endcode

ditto.



Comment at: clang/lib/Headers/sm3intrin.h:230
+/// \endcode
+#define _mm_sm3rnds2_epi32(A, B, C, D) 
\
+  (__m128i) __builtin_ia32_vsm3rnds2((__v4su)A, (__v4su)B, (__v4su)C, (int)D)

Missing `__` for variables.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155147/new/

https://reviews.llvm.org/D155147

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155148: [X86] Add SM4 instructions.

2023-07-18 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei accepted this revision.
pengfei added a comment.
This revision is now accepted and ready to land.

In D155148#4510472 , @RKSimon wrote:

> @pengfei Are you happy with the intrinsics doxygen descriptions?

LGTM except for one comment.




Comment at: clang/lib/Headers/sm4intrin.h:23
+/// \code
+/// _mm_sm4key4_epi32(__m128i __A, __m128i __B)
+/// \endcode

Missing return type here. The same for below.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155148/new/

https://reviews.llvm.org/D155148

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155145: Add AVX-VNNI-INT16 instructions.

2023-07-13 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: llvm/test/CodeGen/X86/avxvnniint16-intrinsics.ll:3
+; RUN: llc < %s -verify-machineinstrs -mtriple=x86_64-unknown-unknown 
--show-mc-encoding -mattr=+avx2,+avxvnniint16 | FileCheck %s 
--check-prefixes=X64
+; RUN: llc < %s -verify-machineinstrs -mtriple=i686-unknown-unknown 
--show-mc-encoding -mattr=+avx2,+avxvnniint16 | FileCheck %s 
--check-prefixes=X86
+

craig.topper wrote:
> pengfei wrote:
> > `X86,CHECK`
> I thought the common prefix had to be first? But I might be wrong
You are right 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155145/new/

https://reviews.llvm.org/D155145

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155146: Add SHA512 instructions.

2023-07-13 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: llvm/lib/Target/X86/X86.td:243
+  "Support SHA512 instructions",
+  [FeatureAVX]>;
 // Processor supports CET SHSTK - Control-Flow Enforcement Technology

craig.topper wrote:
> AVX2 like other integer features?
ISE says it only requires AVX


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155146/new/

https://reviews.llvm.org/D155146

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155148: Add SM4 instructions.

2023-07-13 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: llvm/test/CodeGen/X86/sm4-intrinsics.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=x86_64-unknown-unknown 
--show-mc-encoding -mattr=+sm4 | FileCheck %s
+; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=i686-unknown-unknown 
--show-mc-encoding -mattr=+sm4 | FileCheck %s

Remove O0.



Comment at: llvm/test/CodeGen/X86/sm4-intrinsics.ll:3
+; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=x86_64-unknown-unknown 
--show-mc-encoding -mattr=+sm4 | FileCheck %s
+; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=i686-unknown-unknown 
--show-mc-encoding -mattr=+sm4 | FileCheck %s
+

ditto.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155148/new/

https://reviews.llvm.org/D155148

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155148: Add SM4 instructions.

2023-07-13 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: llvm/test/MC/Disassembler/X86/sm4-64.txt:3-4
+
+# RUN: llvm-mc --disassemble %s -triple=x86_64 | FileCheck %s 
--check-prefixes=ATT
+# RUN: llvm-mc --disassemble %s -triple=x86_64 -x86-asm-syntax=intel 
--output-asm-variant=1 | FileCheck %s --check-prefixes=INTEL
+

Merge 64-bit tests.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155148/new/

https://reviews.llvm.org/D155148

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155147: Add SM3 instructions.

2023-07-13 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: llvm/test/MC/Disassembler/X86/sm3-32.txt:1
+
+# RUN: llvm-mc --disassemble %s -triple=i386-unknown-unknown | FileCheck %s 
--check-prefixes=ATT

Remove blank line



Comment at: llvm/test/MC/Disassembler/X86/sm3-64.txt:1-2
+
+# isadb version: 6aa5c837365e1d1e6e53b3fc3d95ab184e9c06eb
+

Remove



Comment at: llvm/test/MC/Disassembler/X86/sm3-64.txt:4
+
+# RUN: llvm-mc --disassemble %s -triple=x86_64 | FileCheck %s 
--check-prefixes=ATT
+# RUN: llvm-mc --disassemble %s -triple=x86_64 -x86-asm-syntax=intel 
--output-asm-variant=1 | FileCheck %s --check-prefixes=INTEL

We can merge 64-bit tests into 32-bit ones. The same below.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155147/new/

https://reviews.llvm.org/D155147

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155146: Add SHA512 instructions.

2023-07-13 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: clang/lib/Headers/CMakeLists.txt:207
   shaintrin.h
+  sha512intrin.h
   smmintrin.h

alphabetical order



Comment at: llvm/include/llvm/IR/IntrinsicsX86.td:5112
+def int_x86_vsha512msg1 : ClangBuiltin<"__builtin_ia32_vsha512msg1">,
+Intrinsic<[llvm_v4i64_ty], [llvm_v4i64_ty, llvm_v2i64_ty], 
[IntrNoMem]>;
+def int_x86_vsha512msg2 : ClangBuiltin<"__builtin_ia32_vsha512msg2">,

DefaultAttrsIntrinsic



Comment at: llvm/include/llvm/IR/IntrinsicsX86.td:5114
+def int_x86_vsha512msg2 : ClangBuiltin<"__builtin_ia32_vsha512msg2">,
+Intrinsic<[llvm_v4i64_ty], [llvm_v4i64_ty, llvm_v4i64_ty], 
[IntrNoMem]>;
+def int_x86_vsha512rnds2 : ClangBuiltin<"__builtin_ia32_vsha512rnds2">,

ditto.



Comment at: llvm/include/llvm/IR/IntrinsicsX86.td:5116
+def int_x86_vsha512rnds2 : ClangBuiltin<"__builtin_ia32_vsha512rnds2">,
+Intrinsic<[llvm_v4i64_ty], [llvm_v4i64_ty, llvm_v4i64_ty, 
llvm_v2i64_ty],
+[IntrNoMem]>;

ditto.



Comment at: llvm/test/CodeGen/X86/sha512-intrinsics.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; ; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=x86_64-unknown-unknown 
--show-mc-encoding -mattr=+sha512 | FileCheck %s --check-prefixes=X64
+; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=i686-unknown-unknown 
--show-mc-encoding -mattr=+sha512 | FileCheck %s --check-prefixes=X86

`X64,CHECK`



Comment at: llvm/test/CodeGen/X86/sha512-intrinsics.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; ; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=x86_64-unknown-unknown 
--show-mc-encoding -mattr=+sha512 | FileCheck %s --check-prefixes=X64
+; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=i686-unknown-unknown 
--show-mc-encoding -mattr=+sha512 | FileCheck %s --check-prefixes=X86

pengfei wrote:
> `X64,CHECK`
No need O0



Comment at: llvm/test/CodeGen/X86/sha512-intrinsics.ll:3
+; ; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=x86_64-unknown-unknown 
--show-mc-encoding -mattr=+sha512 | FileCheck %s --check-prefixes=X64
+; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=i686-unknown-unknown 
--show-mc-encoding -mattr=+sha512 | FileCheck %s --check-prefixes=X86
+

`X86,CHECK`



Comment at: llvm/test/CodeGen/X86/sha512-intrinsics.ll:3
+; ; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=x86_64-unknown-unknown 
--show-mc-encoding -mattr=+sha512 | FileCheck %s --check-prefixes=X64
+; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=i686-unknown-unknown 
--show-mc-encoding -mattr=+sha512 | FileCheck %s --check-prefixes=X86
+

pengfei wrote:
> `X86,CHECK`
ditto.



Comment at: llvm/test/MC/Disassembler/X86/sha512-64.txt:1-2
+# RUN: llvm-mc --disassemble %s -triple=x86_64 | FileCheck %s 
--check-prefixes=ATT
+# RUN: llvm-mc --disassemble %s -triple=x86_64 -x86-asm-syntax=intel 
--output-asm-variant=1 | FileCheck %s --check-prefixes=INTEL
+

This can be merged with `sha512-32.txt`



Comment at: llvm/test/MC/X86/sha512-att-64.s:1
+// RUN: llvm-mc -triple x86_64 --show-encoding %s | FileCheck %s
+

ditto.



Comment at: llvm/test/MC/X86/sha512-intel-64.s:1
+// RUN: llvm-mc -triple x86_64 -x86-asm-syntax=intel -output-asm-variant=1 
--show-encoding %s | FileCheck %s
+

ditto.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155146/new/

https://reviews.llvm.org/D155146

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155145: Add AVX-VNNI-INT16 instructions.

2023-07-13 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: clang/lib/Basic/Targets/X86.cpp:1059
   .Case("avx512vp2intersect", HasAVX512VP2INTERSECT)
+  .Case("avxvnniint16", HasAVXVNNIINT16)
   .Case("avxifma", HasAVXIFMA)

alphabetical order.



Comment at: llvm/test/CodeGen/X86/avxvnniint16-intrinsics.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s -verify-machineinstrs -mtriple=x86_64-unknown-unknown 
--show-mc-encoding -mattr=+avx2,+avxvnniint16 | FileCheck %s 
--check-prefixes=X64
+; RUN: llc < %s -verify-machineinstrs -mtriple=i686-unknown-unknown 
--show-mc-encoding -mattr=+avx2,+avxvnniint16 | FileCheck %s 
--check-prefixes=X86

`X64,CHECK`



Comment at: llvm/test/CodeGen/X86/avxvnniint16-intrinsics.ll:3
+; RUN: llc < %s -verify-machineinstrs -mtriple=x86_64-unknown-unknown 
--show-mc-encoding -mattr=+avx2,+avxvnniint16 | FileCheck %s 
--check-prefixes=X64
+; RUN: llc < %s -verify-machineinstrs -mtriple=i686-unknown-unknown 
--show-mc-encoding -mattr=+avx2,+avxvnniint16 | FileCheck %s 
--check-prefixes=X86
+

`X86,CHECK`



Comment at: llvm/test/CodeGen/X86/stack-folding-int-avxvnniint16.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s -O3 -disable-peephole -verify-machineinstrs 
-mtriple=x86_64-unknown-unknown --show-mc-encoding -mattr=+avx2,+avxvnniint16 | 
FileCheck %s
+

Is this required?



Comment at: llvm/test/CodeGen/X86/stack-folding-int-avxvnniint16.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s -O3 -disable-peephole -verify-machineinstrs 
-mtriple=x86_64-unknown-unknown --show-mc-encoding -mattr=+avx2,+avxvnniint16 | 
FileCheck %s
+

pengfei wrote:
> Is this required?
Don't need `avx2`



Comment at: llvm/test/MC/Disassembler/X86/avx-vnni-int16.txt:1
+
+# RUN: llvm-mc --disassemble %s -triple=i386-unknown-unknown | FileCheck %s 
--check-prefixes=ATT

Remove blank line.



Comment at: llvm/test/MC/Disassembler/X86/x86-64-avx-vnni-int16.txt:1-2
+
+# isadb version: 90b0204ffb4d39b8a5726db9c49d298aea6164f8
+

Remove this.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155145/new/

https://reviews.llvm.org/D155145

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155160: Allow immediate integer for a "p" inline asm constraint

2023-07-13 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei created this revision.
Herald added a project: All.
pengfei requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D155160

Files:
  clang/lib/Basic/Targets/X86.cpp


Index: clang/lib/Basic/Targets/X86.cpp
===
--- clang/lib/Basic/Targets/X86.cpp
+++ clang/lib/Basic/Targets/X86.cpp
@@ -1551,8 +1551,8 @@
 return std::string("{si}");
   case 'D':
 return std::string("{di}");
-  case 'p': // Keep 'p' constraint (address).
-return std::string("p");
+  case 'p': // immediate integer or address.
+return std::string("ip");
   case 't': // top of floating point stack.
 return std::string("{st}");
   case 'u':// second from top of floating point stack.


Index: clang/lib/Basic/Targets/X86.cpp
===
--- clang/lib/Basic/Targets/X86.cpp
+++ clang/lib/Basic/Targets/X86.cpp
@@ -1551,8 +1551,8 @@
 return std::string("{si}");
   case 'D':
 return std::string("{di}");
-  case 'p': // Keep 'p' constraint (address).
-return std::string("p");
+  case 'p': // immediate integer or address.
+return std::string("ip");
   case 't': // top of floating point stack.
 return std::string("{st}");
   case 'u':// second from top of floating point stack.
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D153993: [Headers][doc] Add load/store/cmp/cvt intrinsic descriptions to avx2intrin.h

2023-06-30 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei accepted this revision.
pengfei added a comment.
This revision is now accepted and ready to land.

LGTM.




Comment at: clang/lib/Headers/avx2intrin.h:3474
+///   IF __M[j+31] == 1
+/// result[j+31:j] := Load32(__X+(i*4))
+///   ELSE

probinson wrote:
> pengfei wrote:
> > probinson wrote:
> > > pengfei wrote:
> > > > probinson wrote:
> > > > > pengfei wrote:
> > > > > > A more intrinsic guide format is `MEM[__X+j:j]`
> > > > > LoadXX is the syntax in the gather intrinsics, e.g. 
> > > > > _mm_mask_i32gather_pd. I'd prefer to be consistent.
> > > > I think the problem here is the measurement is easily confusing.
> > > > From C point of view, `__X` is a `int` pointer, so we should `+ i` 
> > > > rather than `i * 4`
> > > > From the other part of the code, we are measuring in bits, but here `i 
> > > > * 4` is a byte offset.
> > > Well, the pseudo-code is clearly not C. If you look at the gather code, 
> > > it computes a byte address using an offset multiplied by an explicit 
> > > scale factor. I am doing exactly the same here.
> > > 
> > > The syntax `MEM[__X+j:j]` is mixing a byte address with a bit offset, 
> > > which I think is more confusing. To be fully consistent, using `[]` with 
> > > bit offsets only, it should be
> > > ```
> > > k := __X*8 + i*32
> > > result[j+31:j] := MEM[k+31:k]
> > > ```
> > > which I think obscures more than it explains.
> > Yeah, it's not C code here. But we are easy to fall into C concepts, e.g., 
> > why assuming __X is measuring in bytes?
> > That's why I think it's clear to make both in bits.
> > I made a mistake here, I wanted to propose `MEM[__X+j+31: __X+j]`. It 
> > matches with [[ 
> > https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#ig_expand=4057,4058,4059,4053,665,3890,5959,5910,3870,4280=_mm256_maskload_epi32
> >  | Intrinsic Guide ]].
> > 
> We assume `__X` is in bytes because that's how addresses work on X86. Adding 
> a bit offset to a byte address makes no sense. I see that is how existing 
> Intel documentation works, which does not make it correct.
> 
> To "make both in bits" means multiplying `__X` by 8, as in the example in my 
> previous comment. Or coming up with a different syntax that makes the 
> difference clear.
> `MEM(__X)[j+31:j]` or even `MEM[__X][j+31:j]` would be preferable.
My intention is to match with Intrinsic Guide as possible. Multiplying by 8 
cannot achive it, but I cannot deny `__X` in bytes makes sense.
So I'm fine to use a different syntax.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153993/new/

https://reviews.llvm.org/D153993

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D153993: [Headers][doc] Add load/store/cmp/cvt intrinsic descriptions to avx2intrin.h

2023-06-30 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: clang/lib/Headers/avx2intrin.h:3474
+///   IF __M[j+31] == 1
+/// result[j+31:j] := Load32(__X+(i*4))
+///   ELSE

probinson wrote:
> pengfei wrote:
> > probinson wrote:
> > > pengfei wrote:
> > > > A more intrinsic guide format is `MEM[__X+j:j]`
> > > LoadXX is the syntax in the gather intrinsics, e.g. 
> > > _mm_mask_i32gather_pd. I'd prefer to be consistent.
> > I think the problem here is the measurement is easily confusing.
> > From C point of view, `__X` is a `int` pointer, so we should `+ i` rather 
> > than `i * 4`
> > From the other part of the code, we are measuring in bits, but here `i * 4` 
> > is a byte offset.
> Well, the pseudo-code is clearly not C. If you look at the gather code, it 
> computes a byte address using an offset multiplied by an explicit scale 
> factor. I am doing exactly the same here.
> 
> The syntax `MEM[__X+j:j]` is mixing a byte address with a bit offset, which I 
> think is more confusing. To be fully consistent, using `[]` with bit offsets 
> only, it should be
> ```
> k := __X*8 + i*32
> result[j+31:j] := MEM[k+31:k]
> ```
> which I think obscures more than it explains.
Yeah, it's not C code here. But we are easy to fall into C concepts, e.g., why 
assuming __X is measuring in bytes?
That's why I think it's clear to make both in bits.
I made a mistake here, I wanted to propose `MEM[__X+j+31: __X+j]`. It matches 
with [[ 
https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#ig_expand=4057,4058,4059,4053,665,3890,5959,5910,3870,4280=_mm256_maskload_epi32
 | Intrinsic Guide ]].



CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153993/new/

https://reviews.llvm.org/D153993

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D153993: [Headers][doc] Add load/store/cmp/cvt intrinsic descriptions to avx2intrin.h

2023-06-29 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: clang/lib/Headers/avx2intrin.h:3474
+///   IF __M[j+31] == 1
+/// result[j+31:j] := Load32(__X+(i*4))
+///   ELSE

probinson wrote:
> pengfei wrote:
> > A more intrinsic guide format is `MEM[__X+j:j]`
> LoadXX is the syntax in the gather intrinsics, e.g. _mm_mask_i32gather_pd. 
> I'd prefer to be consistent.
I think the problem here is the measurement is easily confusing.
From C point of view, `__X` is a `int` pointer, so we should `+ i` rather than 
`i * 4`
From the other part of the code, we are measuring in bits, but here `i * 4` is 
a byte offset.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153993/new/

https://reviews.llvm.org/D153993

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D151696: [x86] Remove CPU_SPECIFIC* MACROs and add getCPUDispatchMangling

2023-06-29 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added a comment.

It looks to me the failed unit tests might be related to this patch, please 
take a look.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D151696/new/

https://reviews.llvm.org/D151696

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D151696: [x86] Remove CPU_SPECIFIC* MACROs and add getCPUDispatchMangling

2023-06-29 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei accepted this revision.
pengfei added a comment.
This revision is now accepted and ready to land.

I have some concerns for RULE 3, especially `core_aes_pclmulqdq -> westmere` 
and `atom_sse4_2_movbe -> silvermont`.
Sometimes, we have minor feature differences in the same generation targets. I 
guess that's why we use `arch_feature` naming like core_2_duo_ssse3. Merging 
them into the same generation or the next generation might corrup the intention 
here. But I'm not expert in CPUDispatch, and I don't see any existing tests for 
them, so I won't block the patch since it's an improvement in general.
Please wait a few days for other reviewers' opinions.

Comment at: clang/test/CodeGen/attr-cpuspecific.c:56
 // WINDOWS: %[[FEAT_INIT:.+]] = load i32, ptr getelementptr inbounds ({ i32, 
i32, i32, [1 x i32] }, ptr @__cpu_model, i32 0, i32 3, i32 0), align 4
-// WINDOWS: %[[FEAT_JOIN:.+]] = and i32 %[[FEAT_INIT]], 1023
-// WINDOWS: %[[FEAT_CHECK:.+]] = icmp eq i32 %[[FEAT_JOIN]], 1023
+// WINDOWS: %[[FEAT_JOIN:.+]] = and i32 %[[FEAT_INIT]], 525311
+// WINDOWS: %[[FEAT_CHECK:.+]] = icmp eq i32 %[[FEAT_JOIN]], 525311

FreddyYe wrote:
> This value change is because the feature list of knl described in 
> X86TargetParser.def before missed feature "bmi2" and "aes".
The comment is for TwoVersions?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D151696/new/

https://reviews.llvm.org/D151696

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D153993: [Headers][doc] Add load/store/cmp/cvt intrinsic descriptions to avx2intrin.h

2023-06-29 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: clang/lib/Headers/avx2intrin.h:1324
+///   k := i*16
+///   result[k+15:k] := SignExtend(__V[j+7:7])
+/// ENDFOR

j



Comment at: clang/lib/Headers/avx2intrin.h:1352
+///   k := i*32
+///   result[k+31:k] := SignExtend(__V[j+7:7])
+/// ENDFOR

j



Comment at: clang/lib/Headers/avx2intrin.h:1407
+///   k := i*32
+///   result[k+31:k] := SignExtend(__V[j+15:7])
+/// ENDFOR

j



Comment at: clang/lib/Headers/avx2intrin.h:1483
+///   k := i*16
+///   result[k+15:k] := ZeroExtend(__V[j+7:7])
+/// ENDFOR

j



Comment at: clang/lib/Headers/avx2intrin.h:1509
+///   k := i*32
+///   result[k+31:k] := ZeroExtend(__V[j+7:7])
+/// ENDFOR

j



Comment at: clang/lib/Headers/avx2intrin.h:1560
+///   k := i*32
+///   result[k+31:k] := ZeroExtend(__V[j+15:7])
+/// ENDFOR

j



Comment at: clang/lib/Headers/avx2intrin.h:3474
+///   IF __M[j+31] == 1
+/// result[j+31:j] := Load32(__X+(i*4))
+///   ELSE

A more intrinsic guide format is `MEM[__X+j:j]`



Comment at: clang/lib/Headers/avx2intrin.h:3506
+///   IF __M[j+63] == 1
+/// result[j+63:j] := Load64(__X+(i*8))
+///   ELSE

ditto.



Comment at: clang/lib/Headers/avx2intrin.h:3538
+///   IF __M[j+31] == 1
+/// result[j+31:j] := Load32(__X+(i*4))
+///   ELSE

ditto.



Comment at: clang/lib/Headers/avx2intrin.h:3570
+///   IF __M[j+63] == 1
+/// result[j+63:j] := Load64(__X+(i*8))
+///   ELSE

ditto.



Comment at: clang/lib/Headers/avx2intrin.h:3602
+///   IF __M[j+31] == 1
+/// Store32(__X+(i*4), __Y[j+31:j])
+///   FI

MEM[j+31:j] := __Y[j+31:j]



Comment at: clang/lib/Headers/avx2intrin.h:3632
+///   IF __M[j+63] == 1
+/// Store64(__X+(i*8), __Y[j+63:j])
+///   FI

ditto.



Comment at: clang/lib/Headers/avx2intrin.h:3662
+///   IF __M[j+31] == 1
+/// Store32(__X+(i*4), __Y[j+31:j])
+///   FI

ditto.



Comment at: clang/lib/Headers/avx2intrin.h:3692
+///   IF __M[j+63] == 1
+/// Store64(__X+(i*8), __Y[j+63:j])
+///   FI

ditto.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153993/new/

https://reviews.llvm.org/D153993

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D152989: Pre-commit test for D151696.

2023-06-27 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei accepted this revision.
pengfei added a comment.
This revision is now accepted and ready to land.

LGTM.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152989/new/

https://reviews.llvm.org/D152989

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D152989: Pre-commit test for D151696.

2023-06-27 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: clang/test/CodeGen/attr-cpuspecific-cpus.c:1-2
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -emit-llvm -o - %s
+// RUN: %clang_cc1 -triple x86_64-windows-pc -fms-compatibility -emit-llvm -o 
- %s
+

Is it only to check no compile warning/error? Should we add a `-verify`?



Comment at: clang/test/CodeGen/attr-cpuspecific-cpus.c:8
+#define ATTR(X) __attribute__((X))
+#endif // _MSC_VER
+

_MSC_VER or _WIN64?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152989/new/

https://reviews.llvm.org/D152989

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D153681: [X86] Move back _mulx_u32 to 32-bit only

2023-06-23 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei created this revision.
pengfei added reviewers: probinson, craig.topper, RKSimon.
Herald added a project: All.
pengfei requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

We cannot lower it to mulx at the present due to backend reason.
Since GCC doesn't support it on 64-bit, it's OK to leave it 32-bit only.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D153681

Files:
  clang/lib/Headers/bmi2intrin.h


Index: clang/lib/Headers/bmi2intrin.h
===
--- clang/lib/Headers/bmi2intrin.h
+++ clang/lib/Headers/bmi2intrin.h
@@ -35,14 +35,6 @@
   return __builtin_ia32_pext_si(__X, __Y);
 }
 
-static __inline__ unsigned int __DEFAULT_FN_ATTRS
-_mulx_u32(unsigned int __X, unsigned int __Y, unsigned int *__P)
-{
-  unsigned long long __res = (unsigned long long) __X * __Y;
-  *__P = (unsigned int)(__res >> 32);
-  return (unsigned int)__res;
-}
-
 #ifdef  __x86_64__
 
 static __inline__ unsigned long long __DEFAULT_FN_ATTRS
@@ -72,7 +64,17 @@
   return (unsigned long long) __res;
 }
 
-#endif /* __x86_64__  */
+#else /* !__x86_64__ */
+
+static __inline__ unsigned int __DEFAULT_FN_ATTRS
+_mulx_u32 (unsigned int __X, unsigned int __Y, unsigned int *__P)
+{
+  unsigned long long __res = (unsigned long long) __X * __Y;
+  *__P = (unsigned int) (__res >> 32);
+  return (unsigned int) __res;
+}
+
+#endif /* !__x86_64__ */
 
 #undef __DEFAULT_FN_ATTRS
 


Index: clang/lib/Headers/bmi2intrin.h
===
--- clang/lib/Headers/bmi2intrin.h
+++ clang/lib/Headers/bmi2intrin.h
@@ -35,14 +35,6 @@
   return __builtin_ia32_pext_si(__X, __Y);
 }
 
-static __inline__ unsigned int __DEFAULT_FN_ATTRS
-_mulx_u32(unsigned int __X, unsigned int __Y, unsigned int *__P)
-{
-  unsigned long long __res = (unsigned long long) __X * __Y;
-  *__P = (unsigned int)(__res >> 32);
-  return (unsigned int)__res;
-}
-
 #ifdef  __x86_64__
 
 static __inline__ unsigned long long __DEFAULT_FN_ATTRS
@@ -72,7 +64,17 @@
   return (unsigned long long) __res;
 }
 
-#endif /* __x86_64__  */
+#else /* !__x86_64__ */
+
+static __inline__ unsigned int __DEFAULT_FN_ATTRS
+_mulx_u32 (unsigned int __X, unsigned int __Y, unsigned int *__P)
+{
+  unsigned long long __res = (unsigned long long) __X * __Y;
+  *__P = (unsigned int) (__res >> 32);
+  return (unsigned int) __res;
+}
+
+#endif /* !__x86_64__ */
 
 #undef __DEFAULT_FN_ATTRS
 
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D153576: [Headers] Fix up some conditionals

2023-06-23 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added a comment.

In D153576#4442096 , @craig.topper 
wrote:

> The mulx function being 32-bit mode only is also true in gcc. It probably 
> won't generate a mulx instruction on x86-64. Maybe that's why it was 32-bit 
> only? But it should still be functionally correct.

It's not functionally correct. Candidate fix D153620 
.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153576/new/

https://reviews.llvm.org/D153576

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D151696: [x86] Remove CPU_SPECIFIC* MACROs and add getCPUDispatchMangling

2023-06-21 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: clang/test/CodeGen/attr-cpuspecific.c:342
 
-// CHECK: attributes #[[S]] = 
{{.*}}"target-features"="+avx,+cmov,+crc32,+cx8,+f16c,+mmx,+popcnt,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87,+xsave"
 // CHECK-SAME: "tune-cpu"="ivybridge"

Why `cmov` disappearing? This feature is supported by most recent targets. Same 
below.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D151696/new/

https://reviews.llvm.org/D151696

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D151749: [Headers][doc] Add "shuffle-like" intrinsic descriptions to avx2intrin.h

2023-05-31 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei accepted this revision.
pengfei added a comment.
This revision is now accepted and ready to land.

LGTM.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D151749/new/

https://reviews.llvm.org/D151749

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D150645: [Driver] Support multi /guard: options

2023-05-29 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added a comment.

In D150645#4379536 , @glandium wrote:

> In D150645#4351266 , @aeubanks 
> wrote:
>
>> this causes `./build/rel/bin/clang-cl /c /tmp/a.cc /Fo/tmp/a 
>> /guard:cf,nochecks` to go from no warnings to
>>
>>   clang-cl: warning: argument unused during compilation: 
>> '/guard:cf,nochecks' [-Wunused-command-line-argument]
>
> This affects 16.0.4

Sorry for making the trouble. I have made a backport request targeting 16.0.5: 
https://github.com/llvm/llvm-project/issues/62921


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150645/new/

https://reviews.llvm.org/D150645

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D151610: [Docs] Fix Sphinx documentation formatting issues in LanguageExtensions.rst

2023-05-27 Thread Phoebe Wang via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGdd2e681612f1: [Docs] Fix Sphinx documentation formatting 
issues in LanguageExtensions.rst (authored by codemzs, committed by pengfei).

Changed prior to commit:
  https://reviews.llvm.org/D151610?vs=526246=526248#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D151610/new/

https://reviews.llvm.org/D151610

Files:
  clang/docs/LanguageExtensions.rst


Index: clang/docs/LanguageExtensions.rst
===
--- clang/docs/LanguageExtensions.rst
+++ clang/docs/LanguageExtensions.rst
@@ -784,8 +784,8 @@
 to ``float``; see below for more information on this emulation.
 
 * ``__fp16`` is supported on all targets. The special semantics of this
-type mean that no arithmetic is ever performed directly on ``__fp16`` values;
-see below.
+  type mean that no arithmetic is ever performed directly on ``__fp16`` values;
+  see below.
 
 * ``_Float16`` is supported on the following targets:
   * 32-bit ARM (natively on some architecture versions)
@@ -809,7 +809,7 @@
 exponent range as `float`, just with greatly reduced precision.
 
 ``_Float16`` and ``__bf16`` follow the usual rules for arithmetic
-floating-point types.  Most importantly, this means that arithmetic operations
+floating-point types. Most importantly, this means that arithmetic operations
 on operands of these types are formally performed in the type and produce
 values of the type. ``__fp16`` does not follow those rules: most operations
 immediately promote operands of type ``__fp16`` to ``float``, and so
@@ -833,28 +833,29 @@
 
 The use of excess precision can be independently controlled for these two
 types with the ``-ffloat16-excess-precision=`` and
-``-fbfloat16-excess-precision=`` options.  Valid values include:
-- ``none`` (meaning to perform strict operation-by-operation emulation)
-- ``standard`` (meaning that excess precision is permitted under the rules
-  described in the standard, i.e. never across explicit casts or statements)
-- ``fast`` (meaning that excess precision is permitted whenever the
+``-fbfloat16-excess-precision=`` options. Valid values include:
+
+* ``none``: meaning to perform strict operation-by-operation emulation
+* ``standard``: meaning that excess precision is permitted under the rules
+  described in the standard, i.e. never across explicit casts or statements
+* ``fast``: meaning that excess precision is permitted whenever the
   optimizer sees an opportunity to avoid truncations; currently this has no
-  effect beyond ``standard``)
+  effect beyond ``standard``
 
 The ``_Float16`` type is an interchange floating type specified in
- ISO/IEC TS 18661-3:2015 ("Floating-point extensions for C").  It will
+ISO/IEC TS 18661-3:2015 ("Floating-point extensions for C"). It will
 be supported on more targets as they define ABIs for it.
 
 The ``__bf16`` type is a non-standard extension, but it generally follows
 the rules for arithmetic interchange floating types from ISO/IEC TS
-18661-3:2015.  In previous versions of Clang, it was a storage-only type
-that forbade arithmetic operations.  It will be supported on more targets
+18661-3:2015. In previous versions of Clang, it was a storage-only type
+that forbade arithmetic operations. It will be supported on more targets
 as they define ABIs for it.
 
 The ``__fp16`` type was originally an ARM extension and is specified
 by the `ARM C Language Extensions 
`_.
 Clang uses the ``binary16`` format from IEEE 754-2008 for ``__fp16``,
-not the ARM alternative format.  Operators that expect arithmetic operands
+not the ARM alternative format. Operators that expect arithmetic operands
 immediately promote ``__fp16`` operands to ``float``.
 
 It is recommended that portable code use ``_Float16`` instead of ``__fp16``,
@@ -870,7 +871,7 @@
 
 Because default argument promotion only applies to the standard floating-point
 types, ``_Float16`` values are not promoted to ``double`` when passed as 
variadic
-or untyped arguments.  As a consequence, some caution must be taken when using
+or untyped arguments. As a consequence, some caution must be taken when using
 certain library facilities with ``_Float16``; for example, there is no 
``printf`` format
 specifier for ``_Float16``, and (unlike ``float``) it will not be implicitly 
promoted to
 ``double`` when passed to ``printf``, so the programmer must explicitly cast 
it to


Index: clang/docs/LanguageExtensions.rst
===
--- clang/docs/LanguageExtensions.rst
+++ clang/docs/LanguageExtensions.rst
@@ -784,8 +784,8 @@
 to ``float``; see below for more information on this emulation.
 
 * ``__fp16`` is supported on all targets. The special semantics of this
-type mean that no arithmetic is ever

[PATCH] D151610: [Docs] Fix Sphinx documentation formatting issues in LanguageExtensions.rst

2023-05-27 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei accepted this revision.
pengfei added a comment.
This revision is now accepted and ready to land.

LGTM.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D151610/new/

https://reviews.llvm.org/D151610

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D150913: [Clang][BFloat16] Upgrade __bf16 to arithmetic type, change mangling, and extend excess precision support.

2023-05-26 Thread Phoebe Wang via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGe62175736551: [Clang][BFloat16] Upgrade __bf16 to arithmetic 
type, change mangling, and… (authored by codemzs, committed by pengfei).

Changed prior to commit:
  https://reviews.llvm.org/D150913?vs=526072=526243#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150913/new/

https://reviews.llvm.org/D150913

Files:
  clang/docs/LanguageExtensions.rst
  clang/include/clang/Basic/DiagnosticSemaKinds.td
  clang/include/clang/Basic/FPOptions.def
  clang/include/clang/Basic/LangOptions.def
  clang/include/clang/Basic/TargetInfo.h
  clang/include/clang/Driver/Options.td
  clang/lib/AST/Type.cpp
  clang/lib/Basic/TargetInfo.cpp
  clang/lib/Basic/Targets/AMDGPU.h
  clang/lib/Basic/Targets/ARM.cpp
  clang/lib/Basic/Targets/NVPTX.h
  clang/lib/Basic/Targets/X86.cpp
  clang/lib/Basic/Targets/X86.h
  clang/lib/CodeGen/CGExprScalar.cpp
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Sema/SemaCast.cpp
  clang/lib/Sema/SemaExpr.cpp
  clang/lib/Sema/SemaOverload.cpp
  clang/test/CodeGen/X86/avx512bf16-error.c
  clang/test/CodeGen/X86/bfloat-mangle.cpp
  clang/test/CodeGen/X86/bfloat16.cpp
  clang/test/CodeGen/X86/fexcess-precision-bfloat16.c
  clang/test/CodeGenCUDA/amdgpu-bf16.cu
  clang/test/CodeGenCUDA/bf16.cu
  clang/test/Driver/fexcess-precision.c
  clang/test/Sema/arm-bf16-forbidden-ops.c
  clang/test/Sema/arm-bf16-forbidden-ops.cpp
  clang/test/Sema/arm-bfloat.cpp
  clang/test/SemaCUDA/amdgpu-bf16.cu
  clang/test/SemaCUDA/bf16.cu

Index: clang/test/SemaCUDA/bf16.cu
===
--- clang/test/SemaCUDA/bf16.cu
+++ clang/test/SemaCUDA/bf16.cu
@@ -2,32 +2,32 @@
 // REQUIRES: x86-registered-target
 
 // RUN: %clang_cc1 "-triple" "x86_64-unknown-linux-gnu" "-aux-triple" "nvptx64-nvidia-cuda" \
-// RUN:"-target-cpu" "x86-64" -fsyntax-only -verify=scalar %s
+// RUN:"-target-cpu" "x86-64" -fsyntax-only -verify=scalar -Wno-unused %s
 // RUN: %clang_cc1 "-aux-triple" "x86_64-unknown-linux-gnu" "-triple" "nvptx64-nvidia-cuda" \
-// RUN:-fcuda-is-device "-aux-target-cpu" "x86-64" -fsyntax-only -verify=scalar %s
+// RUN:-fcuda-is-device "-aux-target-cpu" "x86-64" -fsyntax-only -verify=scalar -Wno-unused %s
 
 #include "Inputs/cuda.h"
 
 __device__ void test(bool b, __bf16 *out, __bf16 in) {
   __bf16 bf16 = in; // No error on using the type itself.
 
-  bf16 + bf16; // scalar-error {{invalid operands to binary expression ('__bf16' and '__bf16')}}
-  bf16 - bf16; // scalar-error {{invalid operands to binary expression ('__bf16' and '__bf16')}}
-  bf16 * bf16; // scalar-error {{invalid operands to binary expression ('__bf16' and '__bf16')}}
-  bf16 / bf16; // scalar-error {{invalid operands to binary expression ('__bf16' and '__bf16')}}
+  bf16 + bf16;
+  bf16 - bf16;
+  bf16 * bf16;
+  bf16 / bf16;
 
   __fp16 fp16;
 
-  bf16 + fp16; // scalar-error {{invalid operands to binary expression ('__bf16' and '__fp16')}}
-  fp16 + bf16; // scalar-error {{invalid operands to binary expression ('__fp16' and '__bf16')}}
-  bf16 - fp16; // scalar-error {{invalid operands to binary expression ('__bf16' and '__fp16')}}
-  fp16 - bf16; // scalar-error {{invalid operands to binary expression ('__fp16' and '__bf16')}}
-  bf16 * fp16; // scalar-error {{invalid operands to binary expression ('__bf16' and '__fp16')}}
-  fp16 * bf16; // scalar-error {{invalid operands to binary expression ('__fp16' and '__bf16')}}
-  bf16 / fp16; // scalar-error {{invalid operands to binary expression ('__bf16' and '__fp16')}}
-  fp16 / bf16; // scalar-error {{invalid operands to binary expression ('__fp16' and '__bf16')}}
+  bf16 + fp16;
+  fp16 + bf16;
+  bf16 - fp16;
+  fp16 - bf16;
+  bf16 * fp16;
+  fp16 * bf16;
+  bf16 / fp16;
+  fp16 / bf16;
   bf16 = fp16; // scalar-error {{assigning to '__bf16' from incompatible type '__fp16'}}
   fp16 = bf16; // scalar-error {{assigning to '__fp16' from incompatible type '__bf16'}}
-  bf16 + (b ? fp16 : bf16); // scalar-error {{incompatible operand types ('__fp16' and '__bf16')}}
+  bf16 + (b ? fp16 : bf16);
   *out = bf16;
 }
Index: clang/test/SemaCUDA/amdgpu-bf16.cu
===
--- clang/test/SemaCUDA/amdgpu-bf16.cu
+++ clang/test/SemaCUDA/amdgpu-bf16.cu
@@ -1,13 +1,8 @@
 // REQUIRES: amdgpu-registered-target
 // REQUIRES: x86-registered-target
 
-// RUN: %clang_cc1 "-triple" "x86_64-unknown-linux-gnu" "-aux-triple" "amdgcn-amd-amdhsa"\
-// RUN:"-target-cpu" "x86-64" -fsyntax-only -verify=amdgcn %s
-// RUN: %clang_cc1 "-aux-triple" "x86_64-unknown-linux-gnu" "-triple" "amdgcn-amd-amdhsa"\
-// RUN:-fcuda-is-device "-aux-target-cpu" "x86-64" -fsyntax-only -verify=amdgcn %s
-
 // RUN: %clang_cc1 "-aux-triple" "x86_64-unknown-linux-gnu" "-triple" "r600-unknown-unknown"\
-// RUN:

[PATCH] D151509: [Driver][X86] Reject unsupported value for -mabi=

2023-05-25 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei accepted this revision.
pengfei added a comment.
This revision is now accepted and ready to land.

LGTM.




Comment at: clang/test/Driver/x86-mabi.c:6
+
+// CHECK-NOT: {{error|warning}}:
+// ERR: error: unsupported option '-mabi=' for target '{{.*}}'

MaskRay wrote:
> pengfei wrote:
> > What's the expected result for e.g. `x86_64-pc-windows-gnu/mingw/cygnus` ?
> They use `-mabi=ms`, but I feel it's excessive to test all combinations.
> (I think `*windows-cygnus` triples are likely dead.)
Agreed, just want to double confirm it. Thanks!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D151509/new/

https://reviews.llvm.org/D151509

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D151509: [Driver][X86] Reject unsupported value for -mabi=

2023-05-25 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: clang/test/Driver/x86-mabi.c:6
+
+// CHECK-NOT: {{error|warning}}:
+// ERR: error: unsupported option '-mabi=' for target '{{.*}}'

What's the expected result for e.g. `x86_64-pc-windows-gnu/mingw/cygnus` ?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D151509/new/

https://reviews.llvm.org/D151509

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D150913: [Clang][BFloat16] Upgrade __bf16 to arithmetic type, change mangling, and extend excess precision support.

2023-05-25 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: clang/docs/LanguageExtensions.rst:852
 ``double`` when passed to ``printf``, so the programmer must explicitly cast 
it to
 ``double`` before using it with an ``%f`` or similar specifier.
 

rjmccall wrote:
> codemzs wrote:
> > pengfei wrote:
> > > rjmccall wrote:
> > > > pengfei wrote:
> > > > > rjmccall wrote:
> > > > > > Suggested rework:
> > > > > > 
> > > > > > ```
> > > > > > Clang supports three half-precision (16-bit) floating point types: 
> > > > > > ``__fp16``,
> > > > > > ``_Float16`` and ``__bf16``.  These types are supported in all 
> > > > > > language
> > > > > > modes, but not on all targets:
> > > > > > 
> > > > > > - ``__fp16`` is supported on every target.
> > > > > > 
> > > > > > - ``_Float16`` is currently supported on the following targets:
> > > > > >   * 32-bit ARM (natively on some architecture versions)
> > > > > >   * 64-bit ARM (AArch64) (natively on ARMv8.2a and above)
> > > > > >   * AMDGPU (natively)
> > > > > >   * SPIR (natively)
> > > > > >   * X86 (if SSE2 is available; natively if AVX512-FP16 is also 
> > > > > > available)
> > > > > > 
> > > > > > - ``__bf16`` is currently supported on the following targets:
> > > > > >   * 32-bit ARM
> > > > > >   * 64-bit ARM (AArch64)
> > > > > >   * X86 (when SSE2 is available)
> > > > > > 
> > > > > > (For X86, SSE2 is available on 64-bit and all recent 32-bit 
> > > > > > processors.)
> > > > > > 
> > > > > > ``__fp16`` and ``_Float16`` both use the binary16 format from IEEE
> > > > > > 754-2008, which provides a 5-bit exponent and an 11-bit significand
> > > > > > (counting the implicit leading 1).  ``__bf16`` uses the `bfloat16
> > > > > > `_ 
> > > > > > format,
> > > > > > which provides an 8-bit exponent and an 8-bit significand; this is 
> > > > > > the same
> > > > > > exponent range as `float`, just with greatly reduced precision.
> > > > > > 
> > > > > > ``_Float16`` and ``__bf16`` follow the usual rules for arithmetic
> > > > > > floating-point types.  Most importantly, this means that arithmetic 
> > > > > > operations
> > > > > > on operands of these types are formally performed in the type and 
> > > > > > produce
> > > > > > values of the type.  ``__fp16`` does not follow those rules: most 
> > > > > > operations
> > > > > > immediately promote operands of type ``__fp16`` to ``float``, and so
> > > > > > arithmetic operations are defined to be performed in ``float`` and 
> > > > > > so result in
> > > > > > a value of type ``float`` (unless further promoted because of other 
> > > > > > operands).
> > > > > > See below for more information on the exact specifications of these 
> > > > > > types.
> > > > > > 
> > > > > > Only some of the supported processors for ``__fp16`` and ``__bf16`` 
> > > > > > offer
> > > > > > native hardware support for arithmetic in their corresponding 
> > > > > > formats.
> > > > > > The exact conditions are described in the lists above.  When 
> > > > > > compiling for a
> > > > > > processor without native support, Clang will perform the arithmetic 
> > > > > > in
> > > > > > ``float``, inserting extensions and truncations as necessary.  This 
> > > > > > can be
> > > > > > done in a way that exactly emulates the behavior of hardware 
> > > > > > support for
> > > > > > arithmetic, but it can require many extra operations.  By default, 
> > > > > > Clang takes
> > > > > > advantage of the C standard's allowances for excess precision in 
> > > > > > intermediate
> > > > > > operands in order to eliminate intermediate truncations within 
> > > > > > statements.
> > > > > > This is generally much faster but can generate different results 
> > > > > > from strict
> > > > > > operation-by-operation emulation.
> > > > > > 
> > > > > > The use of excess precision can be independently controlled for 
> > > > > > these two
> > > > > > types with the ``-ffloat16-excess-precision=`` and
> > > > > > ``-fbfloat16-excess-precision=`` options.  Valid values include:
> > > > > > - ``none`` (meaning to perform strict operation-by-operation 
> > > > > > emulation)
> > > > > > - ``standard`` (meaning that excess precision is permitted under 
> > > > > > the rules
> > > > > >   described in the standard, i.e. never across explicit casts or 
> > > > > > statements)
> > > > > > - ``fast`` (meaning that excess precision is permitted whenever the
> > > > > >   optimizer sees an opportunity to avoid truncations; currently 
> > > > > > this has no
> > > > > >   effect beyond ``standard``)
> > > > > > 
> > > > > > The ``_Float16`` type is an interchange floating type specified in
> > > > > >  ISO/IEC TS 18661-3:2015 ("Floating-point extensions for C").  It 
> > > > > > will
> > > > > > be supported on more targets as they define ABIs for it.
> > > > > > 
> > > > > > The ``__bf16`` type is a non-standard extension, but it generally 
> > > > > > follows
> > > > > > the rules

[PATCH] D150114: [Headers][doc] Add "add/sub/mul" intrinsic descriptions to avx2intrin.h

2023-05-25 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei accepted this revision.
pengfei added a comment.
This revision is now accepted and ready to land.

LGTM except for a possible typo.




Comment at: clang/lib/Headers/avx2intrin.h:412
+///vectors of [16 x i16] and returns the lower 16 bits of each difference
+///in an element of the [16 x i16] result (overflow is ignored).
+///Differences from \a __a are returned in the lower 64 bits of each

underflow?



Comment at: clang/lib/Headers/avx2intrin.h:448
+///vectors of [8 x i32] and returns the lower 32 bits of each difference in
+///an element of the [8 x i31] result (overflow is ignored). Differences
+///from \a __a are returned in the lower 64 bits of each 128-bit half of

typo or intended?



Comment at: clang/lib/Headers/avx2intrin.h:448
+///vectors of [8 x i32] and returns the lower 32 bits of each difference in
+///an element of the [8 x i31] result (overflow is ignored). Differences
+///from \a __a are returned in the lower 64 bits of each 128-bit half of

pengfei wrote:
> typo or intended?
underflow.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150114/new/

https://reviews.llvm.org/D150114

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D150913: [Clang][BFloat16] Upgrade __bf16 to arithmetic type, change mangling, and extend excess precision support.

2023-05-25 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: clang/docs/LanguageExtensions.rst:852
 ``double`` when passed to ``printf``, so the programmer must explicitly cast 
it to
 ``double`` before using it with an ``%f`` or similar specifier.
 

rjmccall wrote:
> pengfei wrote:
> > rjmccall wrote:
> > > Suggested rework:
> > > 
> > > ```
> > > Clang supports three half-precision (16-bit) floating point types: 
> > > ``__fp16``,
> > > ``_Float16`` and ``__bf16``.  These types are supported in all language
> > > modes, but not on all targets:
> > > 
> > > - ``__fp16`` is supported on every target.
> > > 
> > > - ``_Float16`` is currently supported on the following targets:
> > >   * 32-bit ARM (natively on some architecture versions)
> > >   * 64-bit ARM (AArch64) (natively on ARMv8.2a and above)
> > >   * AMDGPU (natively)
> > >   * SPIR (natively)
> > >   * X86 (if SSE2 is available; natively if AVX512-FP16 is also available)
> > > 
> > > - ``__bf16`` is currently supported on the following targets:
> > >   * 32-bit ARM
> > >   * 64-bit ARM (AArch64)
> > >   * X86 (when SSE2 is available)
> > > 
> > > (For X86, SSE2 is available on 64-bit and all recent 32-bit processors.)
> > > 
> > > ``__fp16`` and ``_Float16`` both use the binary16 format from IEEE
> > > 754-2008, which provides a 5-bit exponent and an 11-bit significand
> > > (counting the implicit leading 1).  ``__bf16`` uses the `bfloat16
> > > `_ format,
> > > which provides an 8-bit exponent and an 8-bit significand; this is the 
> > > same
> > > exponent range as `float`, just with greatly reduced precision.
> > > 
> > > ``_Float16`` and ``__bf16`` follow the usual rules for arithmetic
> > > floating-point types.  Most importantly, this means that arithmetic 
> > > operations
> > > on operands of these types are formally performed in the type and produce
> > > values of the type.  ``__fp16`` does not follow those rules: most 
> > > operations
> > > immediately promote operands of type ``__fp16`` to ``float``, and so
> > > arithmetic operations are defined to be performed in ``float`` and so 
> > > result in
> > > a value of type ``float`` (unless further promoted because of other 
> > > operands).
> > > See below for more information on the exact specifications of these types.
> > > 
> > > Only some of the supported processors for ``__fp16`` and ``__bf16`` offer
> > > native hardware support for arithmetic in their corresponding formats.
> > > The exact conditions are described in the lists above.  When compiling 
> > > for a
> > > processor without native support, Clang will perform the arithmetic in
> > > ``float``, inserting extensions and truncations as necessary.  This can be
> > > done in a way that exactly emulates the behavior of hardware support for
> > > arithmetic, but it can require many extra operations.  By default, Clang 
> > > takes
> > > advantage of the C standard's allowances for excess precision in 
> > > intermediate
> > > operands in order to eliminate intermediate truncations within statements.
> > > This is generally much faster but can generate different results from 
> > > strict
> > > operation-by-operation emulation.
> > > 
> > > The use of excess precision can be independently controlled for these two
> > > types with the ``-ffloat16-excess-precision=`` and
> > > ``-fbfloat16-excess-precision=`` options.  Valid values include:
> > > - ``none`` (meaning to perform strict operation-by-operation emulation)
> > > - ``standard`` (meaning that excess precision is permitted under the rules
> > >   described in the standard, i.e. never across explicit casts or 
> > > statements)
> > > - ``fast`` (meaning that excess precision is permitted whenever the
> > >   optimizer sees an opportunity to avoid truncations; currently this has 
> > > no
> > >   effect beyond ``standard``)
> > > 
> > > The ``_Float16`` type is an interchange floating type specified in
> > >  ISO/IEC TS 18661-3:2015 ("Floating-point extensions for C").  It will
> > > be supported on more targets as they define ABIs for it.
> > > 
> > > The ``__bf16`` type is a non-standard extension, but it generally follows
> > > the rules for arithmetic interchange floating types from ISO/IEC TS
> > > 18661-3:2015.  In previous versions of Clang, it was a storage-only type
> > > that forbade arithmetic operations.  It will be supported on more targets
> > > as they define ABIs for it.
> > > 
> > > The ``__fp16`` type was originally an ARM extension and is specified
> > > by the `ARM C Language Extensions 
> > > `_.
> > > Clang uses the ``binary16`` format from IEEE 754-2008 for ``__fp16``,
> > > not the ARM alternative format.  Operators that expect arithmetic operands
> > > immediately promote ``__fp16`` operands to ``float``.
> > > 
> > > It is recommended that portable code use ``_Float16`` instead of 
> > > ``__fp16``,
> > > as it has been

[PATCH] D150913: [Clang][Bfloat16] Upgrade __bf16 to arithmetic type, change mangling, and extend excess precision support.

2023-05-24 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: clang/docs/LanguageExtensions.rst:852
 ``double`` when passed to ``printf``, so the programmer must explicitly cast 
it to
 ``double`` before using it with an ``%f`` or similar specifier.
 

rjmccall wrote:
> Suggested rework:
> 
> ```
> Clang supports three half-precision (16-bit) floating point types: ``__fp16``,
> ``_Float16`` and ``__bf16``.  These types are supported in all language
> modes, but not on all targets:
> 
> - ``__fp16`` is supported on every target.
> 
> - ``_Float16`` is currently supported on the following targets:
>   * 32-bit ARM (natively on some architecture versions)
>   * 64-bit ARM (AArch64) (natively on ARMv8.2a and above)
>   * AMDGPU (natively)
>   * SPIR (natively)
>   * X86 (if SSE2 is available; natively if AVX512-FP16 is also available)
> 
> - ``__bf16`` is currently supported on the following targets:
>   * 32-bit ARM
>   * 64-bit ARM (AArch64)
>   * X86 (when SSE2 is available)
> 
> (For X86, SSE2 is available on 64-bit and all recent 32-bit processors.)
> 
> ``__fp16`` and ``_Float16`` both use the binary16 format from IEEE
> 754-2008, which provides a 5-bit exponent and an 11-bit significand
> (counting the implicit leading 1).  ``__bf16`` uses the `bfloat16
> `_ format,
> which provides an 8-bit exponent and an 8-bit significand; this is the same
> exponent range as `float`, just with greatly reduced precision.
> 
> ``_Float16`` and ``__bf16`` follow the usual rules for arithmetic
> floating-point types.  Most importantly, this means that arithmetic operations
> on operands of these types are formally performed in the type and produce
> values of the type.  ``__fp16`` does not follow those rules: most operations
> immediately promote operands of type ``__fp16`` to ``float``, and so
> arithmetic operations are defined to be performed in ``float`` and so result 
> in
> a value of type ``float`` (unless further promoted because of other operands).
> See below for more information on the exact specifications of these types.
> 
> Only some of the supported processors for ``__fp16`` and ``__bf16`` offer
> native hardware support for arithmetic in their corresponding formats.
> The exact conditions are described in the lists above.  When compiling for a
> processor without native support, Clang will perform the arithmetic in
> ``float``, inserting extensions and truncations as necessary.  This can be
> done in a way that exactly emulates the behavior of hardware support for
> arithmetic, but it can require many extra operations.  By default, Clang takes
> advantage of the C standard's allowances for excess precision in intermediate
> operands in order to eliminate intermediate truncations within statements.
> This is generally much faster but can generate different results from strict
> operation-by-operation emulation.
> 
> The use of excess precision can be independently controlled for these two
> types with the ``-ffloat16-excess-precision=`` and
> ``-fbfloat16-excess-precision=`` options.  Valid values include:
> - ``none`` (meaning to perform strict operation-by-operation emulation)
> - ``standard`` (meaning that excess precision is permitted under the rules
>   described in the standard, i.e. never across explicit casts or statements)
> - ``fast`` (meaning that excess precision is permitted whenever the
>   optimizer sees an opportunity to avoid truncations; currently this has no
>   effect beyond ``standard``)
> 
> The ``_Float16`` type is an interchange floating type specified in
>  ISO/IEC TS 18661-3:2015 ("Floating-point extensions for C").  It will
> be supported on more targets as they define ABIs for it.
> 
> The ``__bf16`` type is a non-standard extension, but it generally follows
> the rules for arithmetic interchange floating types from ISO/IEC TS
> 18661-3:2015.  In previous versions of Clang, it was a storage-only type
> that forbade arithmetic operations.  It will be supported on more targets
> as they define ABIs for it.
> 
> The ``__fp16`` type was originally an ARM extension and is specified
> by the `ARM C Language Extensions 
> `_.
> Clang uses the ``binary16`` format from IEEE 754-2008 for ``__fp16``,
> not the ARM alternative format.  Operators that expect arithmetic operands
> immediately promote ``__fp16`` operands to ``float``.
> 
> It is recommended that portable code use ``_Float16`` instead of ``__fp16``,
> as it has been defined by the C standards committee and has behavior that is
> more familiar to most programmers.
> 
> Because ``__fp16`` operands are always immediately promoted to ``float``, the
> common real type of ``__fp16`` and ``_Float16`` for the purposes of the usual
> arithmetic conversions is ``float``.
> 
> A literal can be given ``_Float16`` type using the suffix ``f16``. For 
> example,
> ``3.14f16``.
> 
> Because default argument promotion

[PATCH] D150913: [Clang][Bfloat16] Upgrade __bf16 to arithmetic type, change mangling, and extend excess precision support.

2023-05-23 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added a comment.

LGTM.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150913/new/

https://reviews.llvm.org/D150913

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D136919: [X86][RFC] Change mangle name of bf16 from u6bf16 to DF16b

2023-05-21 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei abandoned this revision.
pengfei added a comment.

Abandon this in favor of D150913 .


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136919/new/

https://reviews.llvm.org/D136919

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D150645: [Driver] Support multi /guard: options

2023-05-21 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added a comment.

Sorry for the back and forth! I just want to add a `:` after `warning`, but I 
forgot to amend it in the first reland..


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150645/new/

https://reviews.llvm.org/D150645

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D150645: [Driver] Support multi /guard: options

2023-05-20 Thread Phoebe Wang via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rGa4f366dcd85c: Reland [Driver] Support multi /guard: 
options (authored by pengfei).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150645/new/

https://reviews.llvm.org/D150645

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Driver/ToolChains/MSVC.cpp
  clang/test/Driver/cl-options.c


Index: clang/test/Driver/cl-options.c
===
--- clang/test/Driver/cl-options.c
+++ clang/test/Driver/cl-options.c
@@ -647,6 +647,13 @@
 // RUN: %clang_cl /guard:ehcont -### -- %s 2>&1 | FileCheck 
-check-prefix=EHCONTGUARD %s
 // EHCONTGUARD: -ehcontguard
 
+// RUN: %clang_cl /guard:cf /guard:ehcont -Wall -Wno-msvc-not-found -### -- %s 
2>&1 | \
+// RUN:   FileCheck -check-prefix=BOTHGUARD %s --implicit-check-not=warning
+// BOTHGUARD: -cfguard
+// BOTHGUARD-SAME: -ehcontguard
+// BOTHGUARD: -guard:cf
+// BOTHGUARD-SAME: -guard:ehcont
+
 // RUN: %clang_cl /guard:foo -### -- %s 2>&1 | FileCheck 
-check-prefix=CFGUARDINVALID %s
 // CFGUARDINVALID: invalid value 'foo' in '/guard:'
 
Index: clang/lib/Driver/ToolChains/MSVC.cpp
===
--- clang/lib/Driver/ToolChains/MSVC.cpp
+++ clang/lib/Driver/ToolChains/MSVC.cpp
@@ -227,7 +227,7 @@
   Args.AddAllArgValues(CmdArgs, options::OPT__SLASH_link);
 
   // Control Flow Guard checks
-  if (Arg *A = Args.getLastArg(options::OPT__SLASH_guard)) {
+  for (const Arg *A : Args.filtered(options::OPT__SLASH_guard)) {
 StringRef GuardArgs = A->getValue();
 if (GuardArgs.equals_insensitive("cf") ||
 GuardArgs.equals_insensitive("cf,nochecks")) {
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -7876,7 +7876,7 @@
   if (Args.hasArg(options::OPT__SLASH_kernel))
 CmdArgs.push_back("-fms-kernel");
 
-  if (Arg *A = Args.getLastArg(options::OPT__SLASH_guard)) {
+  for (const Arg *A : Args.filtered(options::OPT__SLASH_guard)) {
 StringRef GuardArgs = A->getValue();
 // The only valid options are "cf", "cf,nochecks", "cf-", "ehcont" and
 // "ehcont-".
@@ -7895,6 +7895,7 @@
 } else {
   D.Diag(diag::err_drv_invalid_value) << A->getSpelling() << GuardArgs;
 }
+A->claim();
   }
 }
 


Index: clang/test/Driver/cl-options.c
===
--- clang/test/Driver/cl-options.c
+++ clang/test/Driver/cl-options.c
@@ -647,6 +647,13 @@
 // RUN: %clang_cl /guard:ehcont -### -- %s 2>&1 | FileCheck -check-prefix=EHCONTGUARD %s
 // EHCONTGUARD: -ehcontguard
 
+// RUN: %clang_cl /guard:cf /guard:ehcont -Wall -Wno-msvc-not-found -### -- %s 2>&1 | \
+// RUN:   FileCheck -check-prefix=BOTHGUARD %s --implicit-check-not=warning
+// BOTHGUARD: -cfguard
+// BOTHGUARD-SAME: -ehcontguard
+// BOTHGUARD: -guard:cf
+// BOTHGUARD-SAME: -guard:ehcont
+
 // RUN: %clang_cl /guard:foo -### -- %s 2>&1 | FileCheck -check-prefix=CFGUARDINVALID %s
 // CFGUARDINVALID: invalid value 'foo' in '/guard:'
 
Index: clang/lib/Driver/ToolChains/MSVC.cpp
===
--- clang/lib/Driver/ToolChains/MSVC.cpp
+++ clang/lib/Driver/ToolChains/MSVC.cpp
@@ -227,7 +227,7 @@
   Args.AddAllArgValues(CmdArgs, options::OPT__SLASH_link);
 
   // Control Flow Guard checks
-  if (Arg *A = Args.getLastArg(options::OPT__SLASH_guard)) {
+  for (const Arg *A : Args.filtered(options::OPT__SLASH_guard)) {
 StringRef GuardArgs = A->getValue();
 if (GuardArgs.equals_insensitive("cf") ||
 GuardArgs.equals_insensitive("cf,nochecks")) {
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -7876,7 +7876,7 @@
   if (Args.hasArg(options::OPT__SLASH_kernel))
 CmdArgs.push_back("-fms-kernel");
 
-  if (Arg *A = Args.getLastArg(options::OPT__SLASH_guard)) {
+  for (const Arg *A : Args.filtered(options::OPT__SLASH_guard)) {
 StringRef GuardArgs = A->getValue();
 // The only valid options are "cf", "cf,nochecks", "cf-", "ehcont" and
 // "ehcont-".
@@ -7895,6 +7895,7 @@
 } else {
   D.Diag(diag::err_drv_invalid_value) << A->getSpelling() << GuardArgs;
 }
+A->claim();
   }
 }
 
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D150913: [Clang][Bfloat16] Upgrade __bf16 to arithmetic type, change mangling, and extend excess precision support.

2023-05-20 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: clang/lib/Basic/Targets/X86.cpp:381
 
 HasBFloat16 = SSELevel >= SSE2;
 

codemzs wrote:
> rjmccall wrote:
> > pengfei wrote:
> > > I'm not sure if I understand the meaning of `HasFullBFloat16`. If it is 
> > > used for target that supports arithmetic `__bf16`, we should not use 
> > > `+fullbf16` but always enable it for SSE2, i.e., `HasFullBFloat16 = 
> > > SSELevel >= SSE2`. Because X86 GCC already supports arithmetic for 
> > > `__bf16`.
> > > 
> > > If this is used in the way like `HasLegalHalfType`, we should enable it 
> > > once we have a full BF16 ISA on X86. `fullbf16` doesn't make much sense 
> > > to me.
> > At the moment, we haven't done the work to emulate BFloat16 arithmetic in 
> > any of the three ways we can do that: Clang doesn't promote it in IRGen, 
> > LLVM doesn't promote it in legalization, and we don't have compiler-rt 
> > functions for it.  If we emit these instructions, they'll just sail through 
> > LLVM and fail in the backend.  So in the short term, we have to restrict 
> > this to targets that directly support BFloat16 arithmetic in hardware, 
> > which doesn't include x86.
> > 
> > Once we have that emulation support, I agree that the x86 targets should 
> > enable this whenever they would enable `__bf16`.
> @rjmccall, I concur and just wanted to confirm this change indeed intends to 
> provide `BFloat16` emulation support, utilizing excess precision for 
> promotion to `float`. The `HasFullBFloat16` switch is designed to determine 
> excess precision support automatically when the hardware does not natively 
> support `bfloat16` arithmetic.
> LLVM doesn't promote it in legalization, and we don't have compiler-rt 
> functions for it.

That's not true: https://godbolt.org/z/jxf5E83vG.

> The `HasFullBFloat16` switch is designed to determine excess precision 
> support automatically when the hardware does not natively support bfloat16 
> arithmetic.

Makes sense to me.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150913/new/

https://reviews.llvm.org/D150913

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D150913: [Clang][Bfloat16] Upgrade __bf16 to arithmetic type, change mangling, and extend excess precision support.

2023-05-19 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added a comment.

Great work! Thanks for the patch!




Comment at: clang/include/clang/AST/ASTContext.h:1102
   CanQualType HalfTy; // [OpenCL 6.1.1.1], ARM NEON
-  CanQualType BFloat16Ty;
+  CanQualType BFloat16Ty; // ISO/IEC/IEEE 60559.
   CanQualType Float16Ty; // C11 extension ISO/IEC TS 18661-3

Don't have a look at ISO/IEC/IEEE 60559, but I doubt BF16 is still not a IEEE 
type for now.



Comment at: clang/lib/Basic/Targets/AMDGPU.h:121
   bool hasBFloat16Type() const override { return isAMDGCN(getTriple()); }
-  const char *getBFloat16Mangling() const override { return "u6__bf16"; };
+  const char *getBFloat16Mangling() const override { return "DF16b"; };
 

I think it's time to bring D139608 back with this patch :)



Comment at: clang/lib/Basic/Targets/X86.cpp:362-363
   HasX87 = true;
+} else if (Feature == "+fullbf16") {
+  HasFullBFloat16 = true;
 }

Maybe not need it.



Comment at: clang/lib/Basic/Targets/X86.cpp:381
 
 HasBFloat16 = SSELevel >= SSE2;
 

I'm not sure if I understand the meaning of `HasFullBFloat16`. If it is used 
for target that supports arithmetic `__bf16`, we should not use `+fullbf16` but 
always enable it for SSE2, i.e., `HasFullBFloat16 = SSELevel >= SSE2`. Because 
X86 GCC already supports arithmetic for `__bf16`.

If this is used in the way like `HasLegalHalfType`, we should enable it once we 
have a full BF16 ISA on X86. `fullbf16` doesn't make much sense to me.



Comment at: clang/lib/Basic/Targets/X86.cpp:1122
   .Case("xsaveopt", HasXSAVEOPT)
+  .Case("fullbf16", HasFullBFloat16)
   .Default(false);

ditto.



Comment at: clang/test/CodeGen/X86/bfloat16.cpp:2-3
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 2
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +fullbf16 
-S -emit-llvm %s -o - | FileCheck %s
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -S -emit-llvm %s -o - | 
FileCheck -check-prefix=CHECK-NBF16 %s
+

The backend has already support lowering of `bfloat`, I don't think it's 
necessary to do extra work in FE unless for excess-precision.



Comment at: clang/test/CodeGen/X86/fexcess-precision-bfloat16.c:7
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
+// RUN: -fbfloat16-excess-precision=fast -target-feature +fullbf16 \
+// RUN: -emit-llvm -o - %s | FileCheck -check-prefixes=CHECK-NO-EXT %s

The tests here make me guess you want to use `fullbf16` the same as 
`HasLegalHalfType`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150913/new/

https://reviews.llvm.org/D150913

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D150645: [Driver] Support multi /guard: options

2023-05-19 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei updated this revision to Diff 523745.
pengfei added a comment.

In D150645#4353408 , @aeubanks wrote:

> if you add a test for the repro, relanding lgtm

Thanks, done.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150645/new/

https://reviews.llvm.org/D150645

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Driver/ToolChains/MSVC.cpp
  clang/test/Driver/cl-options.c


Index: clang/test/Driver/cl-options.c
===
--- clang/test/Driver/cl-options.c
+++ clang/test/Driver/cl-options.c
@@ -647,6 +647,13 @@
 // RUN: %clang_cl /guard:ehcont -### -- %s 2>&1 | FileCheck 
-check-prefix=EHCONTGUARD %s
 // EHCONTGUARD: -ehcontguard
 
+// RUN: %clang_cl /guard:cf /guard:ehcont -Wall -Wno-msvc-not-found -### -- %s 
2>&1 | \
+// RUN:   FileCheck -check-prefix=BOTHGUARD %s --implicit-check-not=warning
+// BOTHGUARD: -cfguard
+// BOTHGUARD-SAME: -ehcontguard
+// BOTHGUARD: -guard:cf
+// BOTHGUARD-SAME: -guard:ehcont
+
 // RUN: %clang_cl /guard:foo -### -- %s 2>&1 | FileCheck 
-check-prefix=CFGUARDINVALID %s
 // CFGUARDINVALID: invalid value 'foo' in '/guard:'
 
Index: clang/lib/Driver/ToolChains/MSVC.cpp
===
--- clang/lib/Driver/ToolChains/MSVC.cpp
+++ clang/lib/Driver/ToolChains/MSVC.cpp
@@ -227,7 +227,7 @@
   Args.AddAllArgValues(CmdArgs, options::OPT__SLASH_link);
 
   // Control Flow Guard checks
-  if (Arg *A = Args.getLastArg(options::OPT__SLASH_guard)) {
+  for (const Arg *A : Args.filtered(options::OPT__SLASH_guard)) {
 StringRef GuardArgs = A->getValue();
 if (GuardArgs.equals_insensitive("cf") ||
 GuardArgs.equals_insensitive("cf,nochecks")) {
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -7868,7 +7868,7 @@
   if (Args.hasArg(options::OPT__SLASH_kernel))
 CmdArgs.push_back("-fms-kernel");
 
-  if (Arg *A = Args.getLastArg(options::OPT__SLASH_guard)) {
+  for (const Arg *A : Args.filtered(options::OPT__SLASH_guard)) {
 StringRef GuardArgs = A->getValue();
 // The only valid options are "cf", "cf,nochecks", "cf-", "ehcont" and
 // "ehcont-".
@@ -7887,6 +7887,7 @@
 } else {
   D.Diag(diag::err_drv_invalid_value) << A->getSpelling() << GuardArgs;
 }
+A->claim();
   }
 }
 


Index: clang/test/Driver/cl-options.c
===
--- clang/test/Driver/cl-options.c
+++ clang/test/Driver/cl-options.c
@@ -647,6 +647,13 @@
 // RUN: %clang_cl /guard:ehcont -### -- %s 2>&1 | FileCheck -check-prefix=EHCONTGUARD %s
 // EHCONTGUARD: -ehcontguard
 
+// RUN: %clang_cl /guard:cf /guard:ehcont -Wall -Wno-msvc-not-found -### -- %s 2>&1 | \
+// RUN:   FileCheck -check-prefix=BOTHGUARD %s --implicit-check-not=warning
+// BOTHGUARD: -cfguard
+// BOTHGUARD-SAME: -ehcontguard
+// BOTHGUARD: -guard:cf
+// BOTHGUARD-SAME: -guard:ehcont
+
 // RUN: %clang_cl /guard:foo -### -- %s 2>&1 | FileCheck -check-prefix=CFGUARDINVALID %s
 // CFGUARDINVALID: invalid value 'foo' in '/guard:'
 
Index: clang/lib/Driver/ToolChains/MSVC.cpp
===
--- clang/lib/Driver/ToolChains/MSVC.cpp
+++ clang/lib/Driver/ToolChains/MSVC.cpp
@@ -227,7 +227,7 @@
   Args.AddAllArgValues(CmdArgs, options::OPT__SLASH_link);
 
   // Control Flow Guard checks
-  if (Arg *A = Args.getLastArg(options::OPT__SLASH_guard)) {
+  for (const Arg *A : Args.filtered(options::OPT__SLASH_guard)) {
 StringRef GuardArgs = A->getValue();
 if (GuardArgs.equals_insensitive("cf") ||
 GuardArgs.equals_insensitive("cf,nochecks")) {
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -7868,7 +7868,7 @@
   if (Args.hasArg(options::OPT__SLASH_kernel))
 CmdArgs.push_back("-fms-kernel");
 
-  if (Arg *A = Args.getLastArg(options::OPT__SLASH_guard)) {
+  for (const Arg *A : Args.filtered(options::OPT__SLASH_guard)) {
 StringRef GuardArgs = A->getValue();
 // The only valid options are "cf", "cf,nochecks", "cf-", "ehcont" and
 // "ehcont-".
@@ -7887,6 +7887,7 @@
 } else {
   D.Diag(diag::err_drv_invalid_value) << A->getSpelling() << GuardArgs;
 }
+A->claim();
   }
 }
 
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D150645: [Driver] Support multi /guard: options

2023-05-17 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei updated this revision to Diff 523256.
pengfei added a comment.

Sorry for the noise, I guess I missed a `A->claim()` here. Updated.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150645/new/

https://reviews.llvm.org/D150645

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Driver/ToolChains/MSVC.cpp
  clang/test/Driver/cl-options.c


Index: clang/test/Driver/cl-options.c
===
--- clang/test/Driver/cl-options.c
+++ clang/test/Driver/cl-options.c
@@ -647,6 +647,10 @@
 // RUN: %clang_cl /guard:ehcont -### -- %s 2>&1 | FileCheck 
-check-prefix=EHCONTGUARD %s
 // EHCONTGUARD: -ehcontguard
 
+// RUN: %clang_cl /guard:cf /guard:ehcont -### -- %s 2>&1 | FileCheck 
-check-prefix=BOTHGUARD %s
+// BOTHGUARD: -cfguard
+// BOTHGUARD-SAME: -ehcontguard
+
 // RUN: %clang_cl /guard:foo -### -- %s 2>&1 | FileCheck 
-check-prefix=CFGUARDINVALID %s
 // CFGUARDINVALID: invalid value 'foo' in '/guard:'
 
Index: clang/lib/Driver/ToolChains/MSVC.cpp
===
--- clang/lib/Driver/ToolChains/MSVC.cpp
+++ clang/lib/Driver/ToolChains/MSVC.cpp
@@ -227,7 +227,7 @@
   Args.AddAllArgValues(CmdArgs, options::OPT__SLASH_link);
 
   // Control Flow Guard checks
-  if (Arg *A = Args.getLastArg(options::OPT__SLASH_guard)) {
+  for (const Arg *A : Args.filtered(options::OPT__SLASH_guard)) {
 StringRef GuardArgs = A->getValue();
 if (GuardArgs.equals_insensitive("cf") ||
 GuardArgs.equals_insensitive("cf,nochecks")) {
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -7868,7 +7868,7 @@
   if (Args.hasArg(options::OPT__SLASH_kernel))
 CmdArgs.push_back("-fms-kernel");
 
-  if (Arg *A = Args.getLastArg(options::OPT__SLASH_guard)) {
+  for (const Arg *A : Args.filtered(options::OPT__SLASH_guard)) {
 StringRef GuardArgs = A->getValue();
 // The only valid options are "cf", "cf,nochecks", "cf-", "ehcont" and
 // "ehcont-".
@@ -7887,6 +7887,7 @@
 } else {
   D.Diag(diag::err_drv_invalid_value) << A->getSpelling() << GuardArgs;
 }
+A->claim();
   }
 }
 


Index: clang/test/Driver/cl-options.c
===
--- clang/test/Driver/cl-options.c
+++ clang/test/Driver/cl-options.c
@@ -647,6 +647,10 @@
 // RUN: %clang_cl /guard:ehcont -### -- %s 2>&1 | FileCheck -check-prefix=EHCONTGUARD %s
 // EHCONTGUARD: -ehcontguard
 
+// RUN: %clang_cl /guard:cf /guard:ehcont -### -- %s 2>&1 | FileCheck -check-prefix=BOTHGUARD %s
+// BOTHGUARD: -cfguard
+// BOTHGUARD-SAME: -ehcontguard
+
 // RUN: %clang_cl /guard:foo -### -- %s 2>&1 | FileCheck -check-prefix=CFGUARDINVALID %s
 // CFGUARDINVALID: invalid value 'foo' in '/guard:'
 
Index: clang/lib/Driver/ToolChains/MSVC.cpp
===
--- clang/lib/Driver/ToolChains/MSVC.cpp
+++ clang/lib/Driver/ToolChains/MSVC.cpp
@@ -227,7 +227,7 @@
   Args.AddAllArgValues(CmdArgs, options::OPT__SLASH_link);
 
   // Control Flow Guard checks
-  if (Arg *A = Args.getLastArg(options::OPT__SLASH_guard)) {
+  for (const Arg *A : Args.filtered(options::OPT__SLASH_guard)) {
 StringRef GuardArgs = A->getValue();
 if (GuardArgs.equals_insensitive("cf") ||
 GuardArgs.equals_insensitive("cf,nochecks")) {
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -7868,7 +7868,7 @@
   if (Args.hasArg(options::OPT__SLASH_kernel))
 CmdArgs.push_back("-fms-kernel");
 
-  if (Arg *A = Args.getLastArg(options::OPT__SLASH_guard)) {
+  for (const Arg *A : Args.filtered(options::OPT__SLASH_guard)) {
 StringRef GuardArgs = A->getValue();
 // The only valid options are "cf", "cf,nochecks", "cf-", "ehcont" and
 // "ehcont-".
@@ -7887,6 +7887,7 @@
 } else {
   D.Diag(diag::err_drv_invalid_value) << A->getSpelling() << GuardArgs;
 }
+A->claim();
   }
 }
 
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D150645: [Driver] Support multi /guard: options

2023-05-16 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added a comment.

Thanks @rnk !


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150645/new/

https://reviews.llvm.org/D150645

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D150645: [Driver] Support multi /guard: options

2023-05-16 Thread Phoebe Wang via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG3b6f7e45a209: [Driver] Support multi /guard: options 
(authored by pengfei).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150645/new/

https://reviews.llvm.org/D150645

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Driver/ToolChains/MSVC.cpp
  clang/test/Driver/cl-options.c


Index: clang/test/Driver/cl-options.c
===
--- clang/test/Driver/cl-options.c
+++ clang/test/Driver/cl-options.c
@@ -647,6 +647,10 @@
 // RUN: %clang_cl /guard:ehcont -### -- %s 2>&1 | FileCheck 
-check-prefix=EHCONTGUARD %s
 // EHCONTGUARD: -ehcontguard
 
+// RUN: %clang_cl /guard:cf /guard:ehcont -### -- %s 2>&1 | FileCheck 
-check-prefix=BOTHGUARD %s
+// BOTHGUARD: -cfguard
+// BOTHGUARD-SAME: -ehcontguard
+
 // RUN: %clang_cl /guard:foo -### -- %s 2>&1 | FileCheck 
-check-prefix=CFGUARDINVALID %s
 // CFGUARDINVALID: invalid value 'foo' in '/guard:'
 
Index: clang/lib/Driver/ToolChains/MSVC.cpp
===
--- clang/lib/Driver/ToolChains/MSVC.cpp
+++ clang/lib/Driver/ToolChains/MSVC.cpp
@@ -227,7 +227,7 @@
   Args.AddAllArgValues(CmdArgs, options::OPT__SLASH_link);
 
   // Control Flow Guard checks
-  if (Arg *A = Args.getLastArg(options::OPT__SLASH_guard)) {
+  for (const Arg *A : Args.filtered(options::OPT__SLASH_guard)) {
 StringRef GuardArgs = A->getValue();
 if (GuardArgs.equals_insensitive("cf") ||
 GuardArgs.equals_insensitive("cf,nochecks")) {
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -7865,7 +7865,7 @@
   if (Args.hasArg(options::OPT__SLASH_kernel))
 CmdArgs.push_back("-fms-kernel");
 
-  if (Arg *A = Args.getLastArg(options::OPT__SLASH_guard)) {
+  for (const Arg *A : Args.filtered(options::OPT__SLASH_guard)) {
 StringRef GuardArgs = A->getValue();
 // The only valid options are "cf", "cf,nochecks", "cf-", "ehcont" and
 // "ehcont-".


Index: clang/test/Driver/cl-options.c
===
--- clang/test/Driver/cl-options.c
+++ clang/test/Driver/cl-options.c
@@ -647,6 +647,10 @@
 // RUN: %clang_cl /guard:ehcont -### -- %s 2>&1 | FileCheck -check-prefix=EHCONTGUARD %s
 // EHCONTGUARD: -ehcontguard
 
+// RUN: %clang_cl /guard:cf /guard:ehcont -### -- %s 2>&1 | FileCheck -check-prefix=BOTHGUARD %s
+// BOTHGUARD: -cfguard
+// BOTHGUARD-SAME: -ehcontguard
+
 // RUN: %clang_cl /guard:foo -### -- %s 2>&1 | FileCheck -check-prefix=CFGUARDINVALID %s
 // CFGUARDINVALID: invalid value 'foo' in '/guard:'
 
Index: clang/lib/Driver/ToolChains/MSVC.cpp
===
--- clang/lib/Driver/ToolChains/MSVC.cpp
+++ clang/lib/Driver/ToolChains/MSVC.cpp
@@ -227,7 +227,7 @@
   Args.AddAllArgValues(CmdArgs, options::OPT__SLASH_link);
 
   // Control Flow Guard checks
-  if (Arg *A = Args.getLastArg(options::OPT__SLASH_guard)) {
+  for (const Arg *A : Args.filtered(options::OPT__SLASH_guard)) {
 StringRef GuardArgs = A->getValue();
 if (GuardArgs.equals_insensitive("cf") ||
 GuardArgs.equals_insensitive("cf,nochecks")) {
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -7865,7 +7865,7 @@
   if (Args.hasArg(options::OPT__SLASH_kernel))
 CmdArgs.push_back("-fms-kernel");
 
-  if (Arg *A = Args.getLastArg(options::OPT__SLASH_guard)) {
+  for (const Arg *A : Args.filtered(options::OPT__SLASH_guard)) {
 StringRef GuardArgs = A->getValue();
 // The only valid options are "cf", "cf,nochecks", "cf-", "ehcont" and
 // "ehcont-".
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D150645: [Driver] Support multi /guard: options

2023-05-16 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei created this revision.
pengfei added reviewers: rnk, arlosi, thakis, ajpaverd.
Herald added a project: All.
pengfei requested review of this revision.
Herald added subscribers: cfe-commits, MaskRay.
Herald added a project: clang.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D150645

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Driver/ToolChains/MSVC.cpp
  clang/test/Driver/cl-options.c


Index: clang/test/Driver/cl-options.c
===
--- clang/test/Driver/cl-options.c
+++ clang/test/Driver/cl-options.c
@@ -647,6 +647,10 @@
 // RUN: %clang_cl /guard:ehcont -### -- %s 2>&1 | FileCheck 
-check-prefix=EHCONTGUARD %s
 // EHCONTGUARD: -ehcontguard
 
+// RUN: %clang_cl /guard:cf /guard:ehcont -### -- %s 2>&1 | FileCheck 
-check-prefix=BOTHGUARD %s
+// BOTHGUARD: -cfguard
+// BOTHGUARD-SAME: -ehcontguard
+
 // RUN: %clang_cl /guard:foo -### -- %s 2>&1 | FileCheck 
-check-prefix=CFGUARDINVALID %s
 // CFGUARDINVALID: invalid value 'foo' in '/guard:'
 
Index: clang/lib/Driver/ToolChains/MSVC.cpp
===
--- clang/lib/Driver/ToolChains/MSVC.cpp
+++ clang/lib/Driver/ToolChains/MSVC.cpp
@@ -227,7 +227,7 @@
   Args.AddAllArgValues(CmdArgs, options::OPT__SLASH_link);
 
   // Control Flow Guard checks
-  if (Arg *A = Args.getLastArg(options::OPT__SLASH_guard)) {
+  for (const Arg *A : Args.filtered(options::OPT__SLASH_guard)) {
 StringRef GuardArgs = A->getValue();
 if (GuardArgs.equals_insensitive("cf") ||
 GuardArgs.equals_insensitive("cf,nochecks")) {
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -7839,7 +7839,7 @@
   if (Args.hasArg(options::OPT__SLASH_kernel))
 CmdArgs.push_back("-fms-kernel");
 
-  if (Arg *A = Args.getLastArg(options::OPT__SLASH_guard)) {
+  for (const Arg *A : Args.filtered(options::OPT__SLASH_guard)) {
 StringRef GuardArgs = A->getValue();
 // The only valid options are "cf", "cf,nochecks", "cf-", "ehcont" and
 // "ehcont-".


Index: clang/test/Driver/cl-options.c
===
--- clang/test/Driver/cl-options.c
+++ clang/test/Driver/cl-options.c
@@ -647,6 +647,10 @@
 // RUN: %clang_cl /guard:ehcont -### -- %s 2>&1 | FileCheck -check-prefix=EHCONTGUARD %s
 // EHCONTGUARD: -ehcontguard
 
+// RUN: %clang_cl /guard:cf /guard:ehcont -### -- %s 2>&1 | FileCheck -check-prefix=BOTHGUARD %s
+// BOTHGUARD: -cfguard
+// BOTHGUARD-SAME: -ehcontguard
+
 // RUN: %clang_cl /guard:foo -### -- %s 2>&1 | FileCheck -check-prefix=CFGUARDINVALID %s
 // CFGUARDINVALID: invalid value 'foo' in '/guard:'
 
Index: clang/lib/Driver/ToolChains/MSVC.cpp
===
--- clang/lib/Driver/ToolChains/MSVC.cpp
+++ clang/lib/Driver/ToolChains/MSVC.cpp
@@ -227,7 +227,7 @@
   Args.AddAllArgValues(CmdArgs, options::OPT__SLASH_link);
 
   // Control Flow Guard checks
-  if (Arg *A = Args.getLastArg(options::OPT__SLASH_guard)) {
+  for (const Arg *A : Args.filtered(options::OPT__SLASH_guard)) {
 StringRef GuardArgs = A->getValue();
 if (GuardArgs.equals_insensitive("cf") ||
 GuardArgs.equals_insensitive("cf,nochecks")) {
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -7839,7 +7839,7 @@
   if (Args.hasArg(options::OPT__SLASH_kernel))
 CmdArgs.push_back("-fms-kernel");
 
-  if (Arg *A = Args.getLastArg(options::OPT__SLASH_guard)) {
+  for (const Arg *A : Args.filtered(options::OPT__SLASH_guard)) {
 StringRef GuardArgs = A->getValue();
 // The only valid options are "cf", "cf,nochecks", "cf-", "ehcont" and
 // "ehcont-".
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D150114: [Headers][doc] Add "add/sub/mul" intrinsic descriptions to avx2intrin.h

2023-05-15 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: clang/lib/Headers/avx2intrin.h:1043
+///corresponding byte of the 256-bit integer vector result (overflow is
+///ignored). For each byte, computes  result = __a - __b .
+///

It better to move it to `\code{.operation}` for consistency. Same for the below.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150114/new/

https://reviews.llvm.org/D150114

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D149920: ms inline asm: recognize case-insensitive JMP and CALL as TargetLowering::C_Address

2023-05-05 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei accepted this revision.
pengfei added a comment.
This revision is now accepted and ready to land.

LGTM.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149920/new/

https://reviews.llvm.org/D149920

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D149920: ms inline asm: recognize case-insensitive JMP and CALL as TargetLowering::C_Address

2023-05-04 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei added inline comments.



Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:33955
 const SmallVectorImpl , unsigned OpNo) const {
-  StringRef InstrStr = getInstrStrFromOpNo(AsmStrs, OpNo);
-
-  if (InstrStr.contains("call"))
-return true;
-
-  return false;
+  // In a __asm block, __asm inst foo where inst is CALL or FOO should be
+  // changed from indirect TargetLowering::C_Memory to direct

JMP?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149920/new/

https://reviews.llvm.org/D149920

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D149205: [Headers][doc] Add "gather" intrinsic descriptions to avx2intrin.h

2023-04-25 Thread Phoebe Wang via Phabricator via cfe-commits

pengfei accepted this revision.
pengfei added a comment.

LGTM with one nit.




Comment at: clang/lib/Headers/avx2intrin.h:942
+///
+/// \code
+/// FOR element := 0 to 1

Use `\code{.operation}` please, the same below. Our internal tool will 
recognize this pattern.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149205/new/

https://reviews.llvm.org/D149205

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

1 2 3 4 5 6 >

1 - 100 of 515 matches

Mail list logo