from:"Sanjay Patel via Phabricator via cfe\-commits"

[PATCH] D139006: [UpdateTestChecks] Match define for labels

2022-12-16 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

In D139006#4000992 , @sebastian-ne 
wrote:

> Adding `--function-signature` by default sounds like a good idea to me.
> Is there any reason why we wouldn’t want to enable this by default (for new 
> tests)?

No objection from me for new files. There's some small time cost for added 
checks, but AFAIK nobody cares about that, so the added verification to also 
make sure we didn't accidentally swap function args during regex is worth it.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D139006/new/

https://reviews.llvm.org/D139006

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D139006: [UpdateTestChecks] Match define for labels

2022-12-13 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added subscribers: arsenm, spatel.
spatel added a comment.

I agree with the revert - we'd need lots of commits like this:
b7232dafe69eb04c14217 
 (cc 
@arsenm)

Another possibility to fix this with less churn:
Add a warning during (re-)generation for any test where the CHECK-LABEL could 
be going wrong. So scan the file for a call to a function name that is also 
defined in that file?
We already have a warning like that for variables named `%tmp`. So if a test 
file contains the potentially buggy construct, then warn the user that they 
should regenerate with "--function-signature" to be safe.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D139006/new/

https://reviews.llvm.org/D139006

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D136806: [Pipelines] Introduce SROA after (final, full) loop unrolling

2022-11-16 Thread Sanjay Patel via Phabricator via cfe-commits

spatel accepted this revision.
spatel added a comment.
This revision is now accepted and ready to land.

LGTM based on the timing, results, and alternatives discussed
There's some testing in progress according to previous comments, so best to 
wait for that to finish in case it turns up anything new.




Comment at: llvm/lib/Passes/PassBuilderPipelines.cpp:1119
+  if (IsFullLTO) {
+FPM.addPass(SROAPass());
+  }

Is there a reason to put this down here vs. tacking it on the end of the 
previous IsFullLTO block? 
If LoopUnroll is the reason for adding SROA, then mention that specifically in 
the comment?

IIRC, all of the FullLTO predicates in this set of passes were questionable 
(see TODO comment above this function). They just accumulated because the code 
was duplicated and diverged over time.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136806/new/

https://reviews.llvm.org/D136806

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D131980: [Passes] Don't run tail-call-elim in -O1

2022-08-16 Thread Sanjay Patel via Phabricator via cfe-commits

spatel accepted this revision.
spatel added a comment.
This revision is now accepted and ready to land.

LGTM
These clang tests are just awful, but I don't have the patience to fix them...




Comment at: llvm/lib/Passes/PassBuilderPipelines.cpp:245
 
 // TODO: Investigate the cost/benefit of tail call elimination on debugging.
 FunctionPassManager

Update this comment? IIUC, we are decisively saying that tail call elim is not 
worth the impact to debuggability for -O1. That way, someone -- possibly me 
again :) -- won't try to unknowingly change this.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D131980/new/

https://reviews.llvm.org/D131980

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D130374: [Passes] add a tail-call-elim pass near the end of the opt pipeline

2022-07-26 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

In D130374#3679550 , @hiraditya wrote:

> In D130374#3675677 , @vdsered wrote:
>
>> Just JFYI :)
>> Yes, probably not worth it
>
> that is interesting. do we know why?

Based on this data: 
https://llvm-compile-time-tracker.com/compare.php?from=95f4ca7f5db623bacc2e34548d39fe5b28d47bad=bfb9b8e075ee32197157ccaf0c301122ca9b81af=instructions

This patch (adding a late round of TCE) caused a ~0.06% compile-time regression 
for a -O3 build (less for the LTO variants). So the value of splitting the pass 
as proposed in D60031  depends on whether we 
think it's worth trying to save some (unknown?) fraction of the 0.06%.
We decided to push this for the known codegen wins, but someone can still 
revive the pass-splitting patch if it seems worthwhile.

One more note that I failed to mention while updating all of those clang tests: 
the reason those tests did not show "tail" before is because we only ran TCE 
with -O{2/3/s/z}, not -O1. This patch enabled TCE for all -O levels. I don't 
know the history/motivation for not including TCE at -O1 before, but it did not 
seem worth excluding based on compile-time cost. If there's another reason, we 
can add that limitation to the late invocation too (and it should cause 
most/all of the clang test diffs to revert).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130374/new/

https://reviews.llvm.org/D130374

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D130374: [Passes] add a tail-call-elim pass near the end of the opt pipeline

2022-07-25 Thread Sanjay Patel via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGbfb9b8e075ee: [Passes] add a tail-call-elim pass near the 
end of the opt pipeline (authored by spatel).
Herald added subscribers: cfe-commits, pmatos, asb, arphaman, aheejin, sbc100.
Herald added a project: clang.

Changed prior to commit:
  https://reviews.llvm.org/D130374?vs=447156=447431#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130374/new/

https://reviews.llvm.org/D130374

Files:
  clang/test/CodeGen/aarch64-ls64-inline-asm.c
  clang/test/CodeGen/aarch64-neon-vcmla.c
  clang/test/CodeGen/aarch64-sve-acle-__ARM_FEATURE_SVE_VECTOR_OPERATORS.c
  clang/test/CodeGen/aarch64-sve-acle-__ARM_FEATURE_SVE_VECTOR_OPERATORS.cpp
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_abd.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_abs.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_acge.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_acgt.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_acle.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_aclt.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_add.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_adda.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_addv.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_adrb.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_adrd.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_adrh.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_adrw.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_and.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_andv.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_asr.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_asrd.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_bfdot.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_bfmlalb.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_bfmlalt.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_bfmmla.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_bic.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_brka.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_brkb.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_brkn.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_brkpa.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_brkpb.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cadd.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_clasta-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_clasta.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_clastb-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_clastb.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cls.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_clz.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cmla.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cmpeq.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cmpge.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cmpgt.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cmple.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cmplt.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cmpne.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cmpuo.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cnot.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cnt-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cnt.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cntb.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cntd.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cnth.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cntp.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cntw.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_compact.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_create2-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_create2.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_create3-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_create3.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_create4-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_create4.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cvt-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cvt.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cvtnt.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_div.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_divr.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dot.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dup-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dup.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dupq-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dupq.c

[PATCH] D127460: Rename GCCBuiltin into ClangBuiltin

2022-06-14 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added inline comments.



Comment at: llvm/include/llvm/IR/Intrinsics.td:387
 
-/// GCCBuiltin - If this intrinsic exactly corresponds to a GCC builtin, this
+/// ClangBuiltin - If this intrinsic exactly corresponds to a GCC builtin, this
 /// specifies the name of the builtin.  This provides automatic CBE and CFE

Replace "GCC" with "Clang" ?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127460/new/

https://reviews.llvm.org/D127460

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D124551: [Driver] Add f16 support to -mrecip parsing.

2022-04-28 Thread Sanjay Patel via Phabricator via cfe-commits

spatel accepted this revision.
spatel added a comment.
This revision is now accepted and ready to land.

LGTM.

AFAIK, these FP reciprocal options have never been officially documented. Would 
that go under here:
https://clang.llvm.org/docs/UsersManual.html#controlling-code-generation ?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124551/new/

https://reviews.llvm.org/D124551

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D122077: [InstCombine] Fold (ctpop(X) == 1) | (X == 0) into ctpop(X) < 2

2022-03-29 Thread Sanjay Patel via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGa3cffc11509b: [InstCombine] Fold (ctpop(X) == 1) | (X == 0) 
into ctpop(X)  2 (authored by hkmatsumoto, committed by spatel).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122077/new/

https://reviews.llvm.org/D122077

Files:
  llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
  llvm/test/Transforms/InstCombine/ispow2.ll

Index: llvm/test/Transforms/InstCombine/ispow2.ll
===
--- llvm/test/Transforms/InstCombine/ispow2.ll
+++ llvm/test/Transforms/InstCombine/ispow2.ll
@@ -740,10 +740,8 @@
 define i1 @is_pow2or0_ctpop(i32 %x) {
 ; CHECK-LABEL: @is_pow2or0_ctpop(
 ; CHECK-NEXT:[[T0:%.*]] = tail call i32 @llvm.ctpop.i32(i32 [[X:%.*]]), !range [[RNG0]]
-; CHECK-NEXT:[[CMP:%.*]] = icmp eq i32 [[T0]], 1
-; CHECK-NEXT:[[ISZERO:%.*]] = icmp eq i32 [[X]], 0
-; CHECK-NEXT:[[R:%.*]] = or i1 [[ISZERO]], [[CMP]]
-; CHECK-NEXT:ret i1 [[R]]
+; CHECK-NEXT:[[TMP1:%.*]] = icmp ult i32 [[T0]], 2
+; CHECK-NEXT:ret i1 [[TMP1]]
 ;
   %t0 = tail call i32 @llvm.ctpop.i32(i32 %x)
   %cmp = icmp eq i32 %t0, 1
@@ -755,10 +753,8 @@
 define i1 @is_pow2or0_ctpop_swap_cmp(i32 %x) {
 ; CHECK-LABEL: @is_pow2or0_ctpop_swap_cmp(
 ; CHECK-NEXT:[[T0:%.*]] = tail call i32 @llvm.ctpop.i32(i32 [[X:%.*]]), !range [[RNG0]]
-; CHECK-NEXT:[[CMP:%.*]] = icmp eq i32 [[T0]], 1
-; CHECK-NEXT:[[ISZERO:%.*]] = icmp eq i32 [[X]], 0
-; CHECK-NEXT:[[R:%.*]] = or i1 [[CMP]], [[ISZERO]]
-; CHECK-NEXT:ret i1 [[R]]
+; CHECK-NEXT:[[TMP1:%.*]] = icmp ult i32 [[T0]], 2
+; CHECK-NEXT:ret i1 [[TMP1]]
 ;
   %t0 = tail call i32 @llvm.ctpop.i32(i32 %x)
   %cmp = icmp eq i32 %t0, 1
@@ -770,10 +766,8 @@
 define i1 @is_pow2or0_ctpop_logical(i32 %x) {
 ; CHECK-LABEL: @is_pow2or0_ctpop_logical(
 ; CHECK-NEXT:[[T0:%.*]] = tail call i32 @llvm.ctpop.i32(i32 [[X:%.*]]), !range [[RNG0]]
-; CHECK-NEXT:[[CMP:%.*]] = icmp eq i32 [[T0]], 1
-; CHECK-NEXT:[[ISZERO:%.*]] = icmp eq i32 [[X]], 0
-; CHECK-NEXT:[[R:%.*]] = or i1 [[ISZERO]], [[CMP]]
-; CHECK-NEXT:ret i1 [[R]]
+; CHECK-NEXT:[[TMP1:%.*]] = icmp ult i32 [[T0]], 2
+; CHECK-NEXT:ret i1 [[TMP1]]
 ;
   %t0 = tail call i32 @llvm.ctpop.i32(i32 %x)
   %cmp = icmp eq i32 %t0, 1
@@ -785,10 +779,8 @@
 define <2 x i1> @is_pow2or0_ctpop_commute_vec(<2 x i8> %x) {
 ; CHECK-LABEL: @is_pow2or0_ctpop_commute_vec(
 ; CHECK-NEXT:[[T0:%.*]] = tail call <2 x i8> @llvm.ctpop.v2i8(<2 x i8> [[X:%.*]])
-; CHECK-NEXT:[[CMP:%.*]] = icmp eq <2 x i8> [[T0]], 
-; CHECK-NEXT:[[ISZERO:%.*]] = icmp eq <2 x i8> [[X]], zeroinitializer
-; CHECK-NEXT:[[R:%.*]] = or <2 x i1> [[ISZERO]], [[CMP]]
-; CHECK-NEXT:ret <2 x i1> [[R]]
+; CHECK-NEXT:[[TMP1:%.*]] = icmp ult <2 x i8> [[T0]], 
+; CHECK-NEXT:ret <2 x i1> [[TMP1]]
 ;
   %t0 = tail call <2 x i8> @llvm.ctpop.v2i8(<2 x i8> %x)
   %cmp = icmp eq <2 x i8> %t0, 
@@ -807,8 +799,8 @@
 ; CHECK-NEXT:call void @use_i1(i1 [[CMP]])
 ; CHECK-NEXT:[[ISZERO:%.*]] = icmp eq i32 [[X]], 0
 ; CHECK-NEXT:call void @use_i1(i1 [[ISZERO]])
-; CHECK-NEXT:[[R:%.*]] = or i1 [[ISZERO]], [[CMP]]
-; CHECK-NEXT:ret i1 [[R]]
+; CHECK-NEXT:[[TMP1:%.*]] = icmp ult i32 [[T0]], 2
+; CHECK-NEXT:ret i1 [[TMP1]]
 ;
   %t0 = tail call i32 @llvm.ctpop.i32(i32 %x)
   call void @use(i32 %t0)
@@ -828,8 +820,8 @@
 ; CHECK-NEXT:call void @use_i1(i1 [[CMP]])
 ; CHECK-NEXT:[[ISZERO:%.*]] = icmp eq i32 [[X]], 0
 ; CHECK-NEXT:call void @use_i1(i1 [[ISZERO]])
-; CHECK-NEXT:[[R:%.*]] = or i1 [[ISZERO]], [[CMP]]
-; CHECK-NEXT:ret i1 [[R]]
+; CHECK-NEXT:[[TMP1:%.*]] = icmp ult i32 [[T0]], 2
+; CHECK-NEXT:ret i1 [[TMP1]]
 ;
   %t0 = tail call i32 @llvm.ctpop.i32(i32 %x)
   call void @use(i32 %t0)
@@ -940,10 +932,8 @@
 define i1 @isnot_pow2nor0_ctpop(i32 %x) {
 ; CHECK-LABEL: @isnot_pow2nor0_ctpop(
 ; CHECK-NEXT:[[T0:%.*]] = tail call i32 @llvm.ctpop.i32(i32 [[X:%.*]]), !range [[RNG0]]
-; CHECK-NEXT:[[CMP:%.*]] = icmp ne i32 [[T0]], 1
-; CHECK-NEXT:[[NOTZERO:%.*]] = icmp ne i32 [[X]], 0
-; CHECK-NEXT:[[R:%.*]] = and i1 [[NOTZERO]], [[CMP]]
-; CHECK-NEXT:ret i1 [[R]]
+; CHECK-NEXT:[[TMP1:%.*]] = icmp ugt i32 [[T0]], 1
+; CHECK-NEXT:ret i1 [[TMP1]]
 ;
   %t0 = tail call i32 @llvm.ctpop.i32(i32 %x)
   %cmp = icmp ne i32 %t0, 1
@@ -955,10 +945,8 @@
 define i1 @isnot_pow2nor0_ctpop_swap_cmp(i32 %x) {
 ; CHECK-LABEL: @isnot_pow2nor0_ctpop_swap_cmp(
 ; CHECK-NEXT:[[T0:%.*]] = tail call i32 @llvm.ctpop.i32(i32 [[X:%.*]]), !range [[RNG0]]
-; CHECK-NEXT:[[CMP:%.*]] = icmp ne i32 [[T0]], 1
-; CHECK-NEXT:[[NOTZERO:%.*]] = icmp ne i32 [[X]], 0
-; CHECK-NEXT:[[R:%.*]] = and i1 [[CMP]], [[NOTZERO]]
-; CHECK-NEXT:ret i1 [[R]]
+; CHECK-NEXT:[[TMP1:%.*]] = icmp ugt i32 [[T0]], 1
+; CHECK-NEXT:ret i1

[PATCH] D122077: [InstCombine] Fold (ctpop(X) == 1) | (X == 0) into ctpop(X) < 2

2022-03-29 Thread Sanjay Patel via Phabricator via cfe-commits

spatel accepted this revision.
spatel added a comment.
This revision is now accepted and ready to land.

LGTM


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122077/new/

https://reviews.llvm.org/D122077

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D122077: [InstCombine] Fold (ctpop(X) == 1) | (X == 0) into ctpop(X) < 2

2022-03-29 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added inline comments.



Comment at: llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp:915
+/// Fold (icmp eq ctpop(X) 1) | (icmp eq X 0) into (icmp ult ctpop(X) 2) and
+/// fold (icmp ne ctpop(X) 1) & (icmp ne X 0) into (icmp uge ctpop(X) 2).
+static Value *foldIsPowerOf2OrZero(ICmpInst *Cmp0, ICmpInst *Cmp1, bool IsAnd,

Why create ">= 2" instead of "> 1" directly? 

I don't think it makes the transform or code any clearer with ">= 2", and we 
will always canonicalize to the other form, so I would prefer to go directly to 
the final result for efficiency.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122077/new/

https://reviews.llvm.org/D122077

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D122077: [InstCombine] Fold (ctpop(X) == 1) | (X == 0) into ctpop(X) < 2

2022-03-29 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

In D122077#3413565 , @xbolva00 wrote:

> In D122077#3413550 , @joerg wrote:
>
>> Why is this fold preferable to `(X & (X-1)) == 0`? At least on all 
>> architectures without native population count, the binary-and based test is 
>> preferable and it might even be better with it.
>
> Less IR instructions. I think SDAG already expands some ctpop patterns to 
> logic (backends should decide about optimal form)

Correct - we try to convert the setcc(ctpop) pattern here:
https://github.com/llvm/llvm-project/blob/f1d8e46258c6a08ca1a375dc9670dd5581d6cf65/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp#L3772

That doesn't check whether the target has a popcount instruction for scalar 
types, so it is potentially too aggressive. For x86, we have a bonus 
optimization after that, so it should not show up there:
https://godbolt.org/z/oc16Kdhcf


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122077/new/

https://reviews.llvm.org/D122077

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D122077: [InstCombine] Fold (ctpop(X) == 1) | (X == 0) into ctpop(X) < 2

2022-03-27 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

Thank you for the making the test changes. I pushed the baseline tests on your 
behalf here:
ebaa28e0750b 


Please rebase and update the patch here in Phabricator - it should only show 
changes in the CHECK lines after the update.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122077/new/

https://reviews.llvm.org/D122077

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D122077: [InstCombine] Fold (ctpop(X) == 1) | (X == 0) into ctpop(X) < 2

2022-03-24 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

A few changes for tests suggested inline.

There might be some generalization of ctpop analysis that we can make as a 
follow-up patch.
For example, I was looking at a "wrong predicate" combination and noticed that 
we miss possible optimizations like this:
https://alive2.llvm.org/ce/z/RRk_d9

Comment at: llvm/test/Transforms/InstCombine/icmp-or.ll:412
+  %4 = icmp eq <2 x i32> %0, 
+  %5 = or <2 x i1> %4, %3
+  ret <2 x i1> %5

We can swap the operand order of the "or" for more coverage of the commuted 
pattern.

Comment at: llvm/test/Transforms/InstCombine/icmp-or.ll:444
+  %4 = icmp ne i32 %0, 0
+  %5 = and i1 %4, %3
+  ret i1 %5

We are testing the 'and' pattern too now, so it doesn't match the name of the 
file. I think it would be better to add the new tests next to the changed tests 
in `ispow2.ll`.

Comment at: llvm/test/Transforms/InstCombine/icmp-or.ll:478
+
+; negative test - wrong constants
+

Instead of checking the same wrong constant again, it would be better to change 
to a wrong predicate(s).

Comment at: llvm/test/Transforms/InstCombine/icmp-or.ll:382
+  ret i1 %5
+}
+

craig.topper wrote:
> RKSimon wrote:
> > What about if the ctpop has multi uses?
> The ctpop isn't being changed. Does it matter if it has more uses?
> 
> What is interesting is if the (icmp eq (ctpop(x)), 1) has another user other 
> than the or.
This transform only replaces the logic op with a cmp, so I think we want to do 
it even if all of the intermediate values have other uses.
Either way, we should have at least one test where there are other uses of the 
icmps.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122077/new/

https://reviews.llvm.org/D122077

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D122077: [InstCombine] Fold (ctpop(X) == 1) | (X == 0) into ctpop(X) < 2

2022-03-21 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added inline comments.



Comment at: llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp:916
+/// fold (icmp ne ctpop(X) 1) & (icmp ne X 0) into (icmp uge ctpop(X) 2).
+static Value *foldOrOfCtpop(ICmpInst *Cmp0, ICmpInst *Cmp1, bool IsAnd,
+InstCombiner::BuilderTy ) {

This name is too specific now that we are handling 'and'. It could be called 
"foldIsPowerOf2OrZero" to match the transform below?



Comment at: llvm/test/Transforms/InstCombine/ispow2.ll:540
 
 define i1 @isnot_pow2_ctpop_wrong_pred1(i32 %x) {
 ; CHECK-LABEL: @isnot_pow2_ctpop_wrong_pred1(

It seems better to rename this test and the next one now that we can optimize 
it - "is_pow2_or_zero"  or similar.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122077/new/

https://reviews.llvm.org/D122077

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D121306: [Sema] add warning for tautological FP compare with literal

2022-03-17 Thread Sanjay Patel via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rGab982eace6e4: [Sema] add warning for tautological FP compare 
with literal (authored by spatel).
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121306/new/

https://reviews.llvm.org/D121306

Files:
  clang/docs/ReleaseNotes.rst
  clang/include/clang/Basic/DiagnosticSemaKinds.td
  clang/include/clang/Sema/Sema.h
  clang/lib/Sema/SemaChecking.cpp
  clang/lib/Sema/SemaExpr.cpp
  clang/test/Sema/floating-point-compare.c

Index: clang/test/Sema/floating-point-compare.c
===
--- clang/test/Sema/floating-point-compare.c
+++ clang/test/Sema/floating-point-compare.c
@@ -12,6 +12,7 @@
   return x == x; // no-warning
 }
 
+// 0.0 can be represented exactly, so don't warn.
 int f4(float x) {
   return x == 0.0; // no-warning {{comparing}}
 }
@@ -20,6 +21,43 @@
   return x == __builtin_inf(); // no-warning
 }
 
-int f7(float x) {
-  return x == 3.14159; // expected-warning {{comparing}}
+// The literal is a double that can't be represented losslessly as a float.
+int tautological_FP_compare(float x) {
+  return x == 3.14159; // expected-warning {{floating-point comparison is always false}}
+}
+
+int tautological_FP_compare_inverse(float x) {
+  return x != 3.14159; // expected-warning {{floating-point comparison is always true}}
+}
+
+// The literal is a double that can be represented losslessly as a long double,
+// but this might not be the comparison what was intended.
+int not_tautological_FP_compare(long double f) {
+  return f == 0.1; // expected-warning {{comparing floating point with ==}}
+}
+
+int tautological_FP_compare_commute(float f) {
+  return 0.3 == f; // expected-warning {{floating-point comparison is always false}}
+}
+
+int tautological_FP_compare_commute_inverse(float f) {
+  return 0.3 != f; // expected-warning {{floating-point comparison is always true}}
+}
+
+int tautological_FP_compare_explicit_upcast(float f) {
+  return 0.3 == (double) f; // expected-warning {{floating-point comparison is always false}}
+}
+
+int tautological_FP_compare_explicit_downcast(double d) {
+  return 0.3 == (float) d; // expected-warning {{floating-point comparison is always false}}
+}
+
+int tautological_FP_compare_ignore_parens(float f) {
+  return f == (0.3); // expected-warning {{floating-point comparison is always false}}
+}
+
+#define CST 0.3
+
+int tautological_FP_compare_macro(float f) {
+  return f == CST; // expected-warning {{floating-point comparison is always false}}
 }
Index: clang/lib/Sema/SemaExpr.cpp
===
--- clang/lib/Sema/SemaExpr.cpp
+++ clang/lib/Sema/SemaExpr.cpp
@@ -12022,7 +12022,7 @@
 
   // Check for comparisons of floating point operands using != and ==.
   if (Type->hasFloatingRepresentation() && BinaryOperator::isEqualityOp(Opc))
-S.CheckFloatComparison(Loc, LHS.get(), RHS.get());
+S.CheckFloatComparison(Loc, LHS.get(), RHS.get(), Opc);
 
   // The result of comparisons is 'bool' in C++, 'int' in C.
   return S.Context.getLogicalOperationType();
@@ -12618,7 +12618,7 @@
   if (BinaryOperator::isEqualityOp(Opc) &&
   LHSType->hasFloatingRepresentation()) {
 assert(RHS.get()->getType()->hasFloatingRepresentation());
-CheckFloatComparison(Loc, LHS.get(), RHS.get());
+CheckFloatComparison(Loc, LHS.get(), RHS.get(), Opc);
   }
 
   // Return a signed type for the vector.
Index: clang/lib/Sema/SemaChecking.cpp
===
--- clang/lib/Sema/SemaChecking.cpp
+++ clang/lib/Sema/SemaChecking.cpp
@@ -11436,12 +11436,40 @@
 CheckPPCMMAType(RetValExp->getType(), ReturnLoc);
 }
 
-//===--- CHECK: Floating-Point comparisons (-Wfloat-equal) ---===//
+/// Check for comparisons of floating-point values using == and !=. Issue a
+/// warning if the comparison is not likely to do what the programmer intended.
+void Sema::CheckFloatComparison(SourceLocation Loc, Expr *LHS, Expr *RHS,
+BinaryOperatorKind Opcode) {
+  // Match and capture subexpressions such as "(float) X == 0.1".
+  FloatingLiteral *FPLiteral;
+  CastExpr *FPCast;
+  auto getCastAndLiteral = [, ](Expr *L, Expr *R) {
+FPLiteral = dyn_cast(L->IgnoreParens());
+FPCast = dyn_cast(R->IgnoreParens());
+return FPLiteral && FPCast;
+  };
+
+  if (getCastAndLiteral(LHS, RHS) || getCastAndLiteral(RHS, LHS)) {
+auto *SourceTy = FPCast->getSubExpr()->getType()->getAs();
+auto *TargetTy = FPLiteral->getType()->getAs();
+if (SourceTy && TargetTy && SourceTy->isFloatingPoint() &&
+TargetTy->isFloatingPoint()) {
+  bool Lossy;
+  llvm::APFloat TargetC = FPLiteral->getValue();
+  TargetC.convert(Context.getFloatTypeSemantics(QualType(SourceTy, 0)),
+

[PATCH] D115886: [CodeGen] remove creation of FP cast function attribute

2021-12-19 Thread Sanjay Patel via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG1965cc469539: [CodeGen] remove creation of FP cast function 
attribute (authored by spatel).
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115886/new/

https://reviews.llvm.org/D115886

Files:
  clang/lib/CodeGen/CGCall.cpp
  clang/test/CodeGen/no-junk-ftrunc.c


Index: clang/test/CodeGen/no-junk-ftrunc.c
===
--- clang/test/CodeGen/no-junk-ftrunc.c
+++ clang/test/CodeGen/no-junk-ftrunc.c
@@ -1,11 +1,12 @@
 // RUN: %clang_cc1 -S -fno-strict-float-cast-overflow %s -emit-llvm -o - | 
FileCheck %s --check-prefix=NOSTRICT
 
 // When compiling with non-standard semantics, use intrinsics to inhibit the 
optimizer.
+// This used to require a function attribute, so we check that it is NOT here 
anymore.
 
 // NOSTRICT-LABEL: main
 // NOSTRICT: call i32 @llvm.fptosi.sat.i32.f64
 // NOSTRICT: call i32 @llvm.fptoui.sat.i32.f64
-// NOSTRICT: attributes #0 = {{.*}}"strict-float-cast-overflow"="false"{{.*}}
+// NOSTRICT-NOT: strict-float-cast-overflow
 
 // The workaround attribute is not applied by default.
 
Index: clang/lib/CodeGen/CGCall.cpp
===
--- clang/lib/CodeGen/CGCall.cpp
+++ clang/lib/CodeGen/CGCall.cpp
@@ -1831,11 +1831,6 @@
 if (LangOpts.getFPExceptionMode() == LangOptions::FPE_Ignore)
   FuncAttrs.addAttribute("no-trapping-math", "true");
 
-// Strict (compliant) code is the default, so only add this attribute to
-// indicate that we are trying to workaround a problem case.
-if (!CodeGenOpts.StrictFloatCastOverflow)
-  FuncAttrs.addAttribute("strict-float-cast-overflow", "false");
-
 // TODO: Are these all needed?
 // unsafe/inf/nan/nsz are handled by instruction-level FastMathFlags.
 if (LangOpts.NoHonorInfs)


Index: clang/test/CodeGen/no-junk-ftrunc.c
===
--- clang/test/CodeGen/no-junk-ftrunc.c
+++ clang/test/CodeGen/no-junk-ftrunc.c
@@ -1,11 +1,12 @@
 // RUN: %clang_cc1 -S -fno-strict-float-cast-overflow %s -emit-llvm -o - | FileCheck %s --check-prefix=NOSTRICT
 
 // When compiling with non-standard semantics, use intrinsics to inhibit the optimizer.
+// This used to require a function attribute, so we check that it is NOT here anymore.
 
 // NOSTRICT-LABEL: main
 // NOSTRICT: call i32 @llvm.fptosi.sat.i32.f64
 // NOSTRICT: call i32 @llvm.fptoui.sat.i32.f64
-// NOSTRICT: attributes #0 = {{.*}}"strict-float-cast-overflow"="false"{{.*}}
+// NOSTRICT-NOT: strict-float-cast-overflow
 
 // The workaround attribute is not applied by default.
 
Index: clang/lib/CodeGen/CGCall.cpp
===
--- clang/lib/CodeGen/CGCall.cpp
+++ clang/lib/CodeGen/CGCall.cpp
@@ -1831,11 +1831,6 @@
 if (LangOpts.getFPExceptionMode() == LangOptions::FPE_Ignore)
   FuncAttrs.addAttribute("no-trapping-math", "true");
 
-// Strict (compliant) code is the default, so only add this attribute to
-// indicate that we are trying to workaround a problem case.
-if (!CodeGenOpts.StrictFloatCastOverflow)
-  FuncAttrs.addAttribute("strict-float-cast-overflow", "false");
-
 // TODO: Are these all needed?
 // unsafe/inf/nan/nsz are handled by instruction-level FastMathFlags.
 if (LangOpts.NoHonorInfs)
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D115804: [CodeGen] use saturating FP casts when compiling with "no-strict-float-cast-overflow"

2021-12-19 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added subscribers: kpn, sepavloff, andrew.w.kaylor.
spatel added a comment.

In D115804#3201044 , @craig.topper 
wrote:

> What's the plan for constrained intrinsics versions of these intrinsics? The 
> IRBuilder calls for CreateFPToSI and CreateFPToUI are strict FP aware, but 
> this new code isn't.

Not sure. cc'ing @kpn @sepavloff @andrew.w.kaylor  
The saturating intrinsics implement non-standard behavior for C languages 
AFAIK, so we might want to warn if someone tries to use 
"-fno-strict-float-cast-overflow" and "-ffp-exception-behavior=strict" at the 
same time? Or we try to support that corner case by adding even more FP 
intrinsics?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115804/new/

https://reviews.llvm.org/D115804

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D115804: [CodeGen] use saturating FP casts when compiling with "no-strict-float-cast-overflow"

2021-12-16 Thread Sanjay Patel via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG8c7f2a4f8719: [CodeGen] use saturating FP casts when 
compiling with no-strict-float-cast… (authored by spatel).
Herald added a project: clang.

Changed prior to commit:
  https://reviews.llvm.org/D115804?vs=394571=394850#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115804/new/

https://reviews.llvm.org/D115804

Files:
  clang/lib/CodeGen/CGExprScalar.cpp
  clang/test/CodeGen/no-junk-ftrunc.c


Index: clang/test/CodeGen/no-junk-ftrunc.c
===
--- clang/test/CodeGen/no-junk-ftrunc.c
+++ clang/test/CodeGen/no-junk-ftrunc.c
@@ -1,14 +1,22 @@
 // RUN: %clang_cc1 -S -fno-strict-float-cast-overflow %s -emit-llvm -o - | 
FileCheck %s --check-prefix=NOSTRICT
+
+// When compiling with non-standard semantics, use intrinsics to inhibit the 
optimizer.
+
 // NOSTRICT-LABEL: main
+// NOSTRICT: call i32 @llvm.fptosi.sat.i32.f64
+// NOSTRICT: call i32 @llvm.fptoui.sat.i32.f64
 // NOSTRICT: attributes #0 = {{.*}}"strict-float-cast-overflow"="false"{{.*}}
 
 // The workaround attribute is not applied by default.
 
 // RUN: %clang_cc1 -S %s -emit-llvm -o - | FileCheck %s --check-prefix=STRICT
 // STRICT-LABEL: main
+// STRICT: = fptosi
+// STRICT: = fptoui
 // STRICT-NOT: strict-float-cast-overflow
 
+
 int main() {
-  return 0;
+  double d = 1e20;
+  return (int)d != 1e20 && (unsigned)d != 1e20;
 }
-
Index: clang/lib/CodeGen/CGExprScalar.cpp
===
--- clang/lib/CodeGen/CGExprScalar.cpp
+++ clang/lib/CodeGen/CGExprScalar.cpp
@@ -1240,7 +1240,18 @@
 
   if (isa(DstElementTy)) {
 assert(SrcElementTy->isFloatingPointTy() && "Unknown real conversion");
-if (DstElementType->isSignedIntegerOrEnumerationType())
+bool IsSigned = DstElementType->isSignedIntegerOrEnumerationType();
+
+// If we can't recognize overflow as undefined behavior, assume that
+// overflow saturates. This protects against normal optimizations if we are
+// compiling with non-standard FP semantics.
+if (!CGF.CGM.getCodeGenOpts().StrictFloatCastOverflow) {
+  llvm::Intrinsic::ID IID =
+  IsSigned ? llvm::Intrinsic::fptosi_sat : llvm::Intrinsic::fptoui_sat;
+  return Builder.CreateCall(CGF.CGM.getIntrinsic(IID, {DstTy, SrcTy}), 
Src);
+}
+
+if (IsSigned)
   return Builder.CreateFPToSI(Src, DstTy, "conv");
 return Builder.CreateFPToUI(Src, DstTy, "conv");
   }


Index: clang/test/CodeGen/no-junk-ftrunc.c
===
--- clang/test/CodeGen/no-junk-ftrunc.c
+++ clang/test/CodeGen/no-junk-ftrunc.c
@@ -1,14 +1,22 @@
 // RUN: %clang_cc1 -S -fno-strict-float-cast-overflow %s -emit-llvm -o - | FileCheck %s --check-prefix=NOSTRICT
+
+// When compiling with non-standard semantics, use intrinsics to inhibit the optimizer.
+
 // NOSTRICT-LABEL: main
+// NOSTRICT: call i32 @llvm.fptosi.sat.i32.f64
+// NOSTRICT: call i32 @llvm.fptoui.sat.i32.f64
 // NOSTRICT: attributes #0 = {{.*}}"strict-float-cast-overflow"="false"{{.*}}
 
 // The workaround attribute is not applied by default.
 
 // RUN: %clang_cc1 -S %s -emit-llvm -o - | FileCheck %s --check-prefix=STRICT
 // STRICT-LABEL: main
+// STRICT: = fptosi
+// STRICT: = fptoui
 // STRICT-NOT: strict-float-cast-overflow
 
+
 int main() {
-  return 0;
+  double d = 1e20;
+  return (int)d != 1e20 && (unsigned)d != 1e20;
 }
-
Index: clang/lib/CodeGen/CGExprScalar.cpp
===
--- clang/lib/CodeGen/CGExprScalar.cpp
+++ clang/lib/CodeGen/CGExprScalar.cpp
@@ -1240,7 +1240,18 @@
 
   if (isa(DstElementTy)) {
 assert(SrcElementTy->isFloatingPointTy() && "Unknown real conversion");
-if (DstElementType->isSignedIntegerOrEnumerationType())
+bool IsSigned = DstElementType->isSignedIntegerOrEnumerationType();
+
+// If we can't recognize overflow as undefined behavior, assume that
+// overflow saturates. This protects against normal optimizations if we are
+// compiling with non-standard FP semantics.
+if (!CGF.CGM.getCodeGenOpts().StrictFloatCastOverflow) {
+  llvm::Intrinsic::ID IID =
+  IsSigned ? llvm::Intrinsic::fptosi_sat : llvm::Intrinsic::fptoui_sat;
+  return Builder.CreateCall(CGF.CGM.getIntrinsic(IID, {DstTy, SrcTy}), Src);
+}
+
+if (IsSigned)
   return Builder.CreateFPToSI(Src, DstTy, "conv");
 return Builder.CreateFPToUI(Src, DstTy, "conv");
   }
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D115804: [CodeGen] use saturating FP casts when compiling with "no-strict-float-cast-overflow"

2021-12-15 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

In D115804#3195002 , @nikic wrote:

> Looks reasonable. Making float to int cast well defined is exactly why these 
> intrinsics exist. Should we also drop the "string-float-cast-overflow" 
> attribute, as UB is now prevented in a different way?

Yes, that sounds like a good enhancement. The only in-tree use of the attribute 
is in DAGCombiner. Make that a follow-up patch or two since it crosses into 
LLVM? 
If we care about perf of the generated code in those cases, then we need to add 
something to the optimizer or codegen to recognize patterns like this:

  define float @s(float %x) {
%i = call i32 @llvm.fptosi.sat.i32.f32(float %x)
%f = sitofp i32 %i to float
ret float %f
  }

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115804/new/

https://reviews.llvm.org/D115804

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D92270: [ConstantFold] Fold more operations to poison

2021-12-15 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added inline comments.



Comment at: llvm/lib/IR/ConstantFold.cpp:633
 // the input constant.
-return UndefValue::get(DestTy);
+return PoisonValue::get(DestTy);
   }

aqjune wrote:
> neildhar wrote:
> > spatel wrote:
> > > MatzeB wrote:
> > > > I believe this is causing some of our clients trouble, especially since 
> > > > we somehow have a `-fno-strict-float-cast-overflow` flag in clang that 
> > > > says float->int overflows are not UB... (CC @spatel )
> > > I can guess at what the example looks like, but it would be great to have 
> > > a reduced test.
> > > There should be a function attribute in IR corresponding to that clang 
> > > flag, so we could alter the behavior here based on checking that? Not 
> > > sure if there's precedence for that kind of transform though.
> > Here's a minimal repro for the issue we ran into: 
> > https://godbolt.org/z/Wdr7q1a9M
> Clang is lowering fp-to-int casts into fptosi/ui 
> (https://godbolt.org/z/Gz3Y7YKKf), but I think in this case clang must emit 
> the fptosi.sat intrinsic: 
> https://llvm.org/docs/LangRef.html#llvm-fptosi-sat-intrinsic
> It guarantees that the result is well-defined.
I agree with this suggestion. Here's a patch proposal:
D115804


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D92270/new/

https://reviews.llvm.org/D92270

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D115804: [CodeGen] use saturating FP casts when compiling with "no-strict-float-cast-overflow"

2021-12-15 Thread Sanjay Patel via Phabricator via cfe-commits

spatel created this revision.
spatel added reviewers: MatzeB, neildhar, nikic, aqjune, dmgreen.
Herald added a subscriber: mcrosier.
spatel requested review of this revision.

We got an unintended consequence of the optimizer getting smarter when 
compiling in a non-standard mode, and there's no good way to inhibit those 
optimizations at a later stage. The test is based on an example linked from 
D92270 .

We allow the "no-strict-float-cast-overflow" exception to normal C cast rules 
to preserve legacy code that does not expect overflowing casts from FP to int 
to produce UB. See D46236  for details.


https://reviews.llvm.org/D115804

Files:
  clang/lib/CodeGen/CGExprScalar.cpp
  clang/test/CodeGen/no-junk-ftrunc.c


Index: clang/test/CodeGen/no-junk-ftrunc.c
===
--- clang/test/CodeGen/no-junk-ftrunc.c
+++ clang/test/CodeGen/no-junk-ftrunc.c
@@ -1,14 +1,23 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
 // RUN: %clang_cc1 -S -fno-strict-float-cast-overflow %s -emit-llvm -o - | 
FileCheck %s --check-prefix=NOSTRICT
+
+// When compiling with non-standard semantics, use intrinsics to inhibit the 
optimizer.
+
 // NOSTRICT-LABEL: main
+// NOSTRICT: call i32 @llvm.fptosi.sat.i32.f64
+// NOSTRICT: call i32 @llvm.fptoui.sat.i32.f64
 // NOSTRICT: attributes #0 = {{.*}}"strict-float-cast-overflow"="false"{{.*}}
 
 // The workaround attribute is not applied by default.
 
 // RUN: %clang_cc1 -S %s -emit-llvm -o - | FileCheck %s --check-prefix=STRICT
 // STRICT-LABEL: main
+// STRICT: = fptosi
+// STRICT: = fptoui
 // STRICT-NOT: strict-float-cast-overflow
 
+
 int main() {
-  return 0;
+  double d = 1e20;
+  return (int)d != 1e20 && (unsigned)d != 1e20;
 }
-
Index: clang/lib/CodeGen/CGExprScalar.cpp
===
--- clang/lib/CodeGen/CGExprScalar.cpp
+++ clang/lib/CodeGen/CGExprScalar.cpp
@@ -1240,7 +1240,18 @@
 
   if (isa(DstElementTy)) {
 assert(SrcElementTy->isFloatingPointTy() && "Unknown real conversion");
-if (DstElementType->isSignedIntegerOrEnumerationType())
+bool IsSigned = DstElementType->isSignedIntegerOrEnumerationType();
+
+// If we can't recognize overflow as undefined behavior, assume that
+// overflow saturates. This protects against normal optimizations if we are
+// compiling with non-standard FP semantics.
+if (!CGF.CGM.getCodeGenOpts().StrictFloatCastOverflow) {
+  llvm::Intrinsic::ID IID =
+  IsSigned ? llvm::Intrinsic::fptosi_sat : llvm::Intrinsic::fptoui_sat;
+  return Builder.CreateCall(CGF.CGM.getIntrinsic(IID, {DstTy, SrcTy}), 
Src);
+}
+
+if (IsSigned)
   return Builder.CreateFPToSI(Src, DstTy, "conv");
 return Builder.CreateFPToUI(Src, DstTy, "conv");
   }


Index: clang/test/CodeGen/no-junk-ftrunc.c
===
--- clang/test/CodeGen/no-junk-ftrunc.c
+++ clang/test/CodeGen/no-junk-ftrunc.c
@@ -1,14 +1,23 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
 // RUN: %clang_cc1 -S -fno-strict-float-cast-overflow %s -emit-llvm -o - | FileCheck %s --check-prefix=NOSTRICT
+
+// When compiling with non-standard semantics, use intrinsics to inhibit the optimizer.
+
 // NOSTRICT-LABEL: main
+// NOSTRICT: call i32 @llvm.fptosi.sat.i32.f64
+// NOSTRICT: call i32 @llvm.fptoui.sat.i32.f64
 // NOSTRICT: attributes #0 = {{.*}}"strict-float-cast-overflow"="false"{{.*}}
 
 // The workaround attribute is not applied by default.
 
 // RUN: %clang_cc1 -S %s -emit-llvm -o - | FileCheck %s --check-prefix=STRICT
 // STRICT-LABEL: main
+// STRICT: = fptosi
+// STRICT: = fptoui
 // STRICT-NOT: strict-float-cast-overflow
 
+
 int main() {
-  return 0;
+  double d = 1e20;
+  return (int)d != 1e20 && (unsigned)d != 1e20;
 }
-
Index: clang/lib/CodeGen/CGExprScalar.cpp
===
--- clang/lib/CodeGen/CGExprScalar.cpp
+++ clang/lib/CodeGen/CGExprScalar.cpp
@@ -1240,7 +1240,18 @@
 
   if (isa(DstElementTy)) {
 assert(SrcElementTy->isFloatingPointTy() && "Unknown real conversion");
-if (DstElementType->isSignedIntegerOrEnumerationType())
+bool IsSigned = DstElementType->isSignedIntegerOrEnumerationType();
+
+// If we can't recognize overflow as undefined behavior, assume that
+// overflow saturates. This protects against normal optimizations if we are
+// compiling with non-standard FP semantics.
+if (!CGF.CGM.getCodeGenOpts().StrictFloatCastOverflow) {
+  llvm::Intrinsic::ID IID =
+  IsSigned ? llvm::Intrinsic::fptosi_sat : llvm::Intrinsic::fptoui_sat;
+  return Builder.CreateCall(CGF.CGM.getIntrinsic(IID, {DstTy, SrcTy}), Src);
+}
+
+if (IsSigned)
   return Builder.CreateFPToSI(Src, DstTy, "conv");
 return Builder.CreateFPToUI(Src, DstTy,

[PATCH] D92270: [ConstantFold] Fold more operations to poison

2021-10-28 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added inline comments.



Comment at: llvm/lib/IR/ConstantFold.cpp:633
 // the input constant.
-return UndefValue::get(DestTy);
+return PoisonValue::get(DestTy);
   }

MatzeB wrote:
> I believe this is causing some of our clients trouble, especially since we 
> somehow have a `-fno-strict-float-cast-overflow` flag in clang that says 
> float->int overflows are not UB... (CC @spatel )
I can guess at what the example looks like, but it would be great to have a 
reduced test.
There should be a function attribute in IR corresponding to that clang flag, so 
we could alter the behavior here based on checking that? Not sure if there's 
precedence for that kind of transform though.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D92270/new/

https://reviews.llvm.org/D92270

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D109925: [AST] add warnings for out-of-bounds FP values when using relaxed FP math compile flags

2021-09-16 Thread Sanjay Patel via Phabricator via cfe-commits

spatel created this revision.
spatel added reviewers: sepavloff, jyknight, efriedma, dim, kparzysz, rsmith.
Herald added a subscriber: mcrosier.
spatel requested review of this revision.

There's currently ongoing discussion on the dev lists about how to handle 
related cases with isnan() / isinf(), but I don't think the problem addressed 
by this patch is controversial:
If the compile specified loose FP math because special values like NaN / Inf / 
-0.0 are not expected, then we should warn if the code contains an obvious 
occurrence of any of those values.

My understanding of clang isn't good though. Is this the right place to detect 
the unexpected values? I commented in the test file that I don't see the 
expected warnings in all cases before CodeGen. But if we run through to IR 
creation, then I sometimes see duplicate warnings for the same line of code.

For example, I see duplicate 3 warnings on this program based on 
https://llvm.org/PR51775 :

  % cat 51775.c 
  #include 
  #include 
  int main() {
  const double d = strtod("1E+100", NULL);
  return d == HUGE_VAL;
  }



  % clang -O1 -ffast-math 51775.c -S -o -
  51775.c:5:17: warning: floating-point infinity may be optimized out of 
computation or comparison [-Wliteral-range]
  return d == HUGE_VAL;
  ^
  
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/math.h:57:28:
 note: expanded from macro 'HUGE_VAL'
  #   defineHUGE_VAL __builtin_huge_val()
 ^
  51775.c:5:17: warning: floating-point infinity may be optimized out of 
computation or comparison [-Wliteral-range]
  
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/math.h:57:28:
 note: expanded from macro 'HUGE_VAL'
  #   defineHUGE_VAL __builtin_huge_val()
 ^
  51775.c:5:17: warning: floating-point infinity may be optimized out of 
computation or comparison [-Wliteral-range]
  
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/math.h:57:28:
 note: expanded from macro 'HUGE_VAL'
  #   defineHUGE_VAL __builtin_huge_val()
  ...


https://reviews.llvm.org/D109925

Files:
  clang/include/clang/Basic/DiagnosticASTKinds.td
  clang/lib/AST/ExprConstant.cpp
  clang/test/AST/warn-fp-values.c

Index: clang/test/AST/warn-fp-values.c
===
--- /dev/null
+++ clang/test/AST/warn-fp-values.c
@@ -0,0 +1,50 @@
+// RUN: %clang_cc1 %s -triple x86_64-unknown-unknown -verify -fsyntax-only -ffast-math -Wliteral-range
+
+#define INFINITY_MATH_MACRO  1.0 / 0.0
+
+#define NAN_MATH_MACRO  0.0f / 0.0f
+
+// Infinity
+
+double inf_by_macro() {
+  double d = INFINITY_MATH_MACRO; // expected-warning {{floating-point infinity may be optimized}}
+  return d;
+}
+
+float inf_by_overflow() {
+  const float huge_literal = 10e100f; // expected-warning {{magnitude of floating-point constant too large}} {{floating-point infinity may be optimized}}
+  return huge_literal;
+}
+
+double inf_by_builtin() {
+  double b = __builtin_inf(); // shows warning if compile continues with -emit-llvm
+  return b;
+}
+
+float inf_by_builtin_cast() {
+  float b = __builtin_inf(); // expected-warning {{floating-point infinity may be optimized}}
+  return b;
+}
+
+// NaN
+
+float nan_by_macro() {
+  float f = NAN_MATH_MACRO; // expected-warning {{floating-point NaN may be optimized}}
+  return f;
+}
+
+double nan_by_builtin() {
+  double b = __builtin_nan(""); // shows warning if compile continues with -emit-llvm
+  return b;
+}
+
+float nan_by_builtin_cast() {
+  float b = __builtin_nan(""); // expected-warning {{floating-point NaN may be optimized}}
+  return b;
+}
+
+// -0.0
+
+double neg_zero_literal() {
+  return -0.0; // shows warning if compile continues with -emit-llvm
+}
Index: clang/lib/AST/ExprConstant.cpp
===
--- clang/lib/AST/ExprConstant.cpp
+++ clang/lib/AST/ExprConstant.cpp
@@ -13563,10 +13563,22 @@
 };
 } // end anonymous namespace
 
-static bool EvaluateFloat(const Expr* E, APFloat& Result, EvalInfo ) {
+static bool EvaluateFloat(const Expr *E, APFloat , EvalInfo ) {
   assert(!E->isValueDependent());
   assert(E->isPRValue() && E->getType()->isRealFloatingType());
-  return FloatExprEvaluator(Info, Result).Visit(E);
+  bool Status = FloatExprEvaluator(Info, Result).Visit(E);
+
+  // Warn if we encounter a floating-point special value that is potentially
+  // constant-folded by the optimizer.
+  const FPOptions  = E->getFPFeaturesInEffect(Info.Ctx.getLangOpts());
+  if (Result.isInfinity() && FPO.getNoHonorInfs())
+Info.Ctx.getDiagnostics().Report(E->getExprLoc(), diag::warn_fp_infinity);
+  else if (Result.isNaN() && FPO.getNoHonorNaNs())
+Info.Ctx.getDiagnostics().Report(E->getExprLoc(), diag::warn_fp_nan);
+  else

[PATCH] D108826: [SLP][LTO][WIP]Allow full SLP in LTO only at link time.

2021-08-27 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

In D108826#2969604 , @ABataev wrote:

> In D108826#2969594 , @lebedev.ri 
> wrote:
>
>> Aha, so full lto. That is consistent with the phase ordering dilemma @spatel 
>> discovered: D102002 
>
> Aha, do I understand correctly that he tries to add a flag(s) that we have a 
> compile without LTO, compile at LTO and link at LTO? Or something else? Or he 
> just tries to reorder passes depending whether we're in LTO or not in LTO?

We found that there were differences between regular and LTO for the passes 
invoked, their orderings, and parameters used to enable extra optimizations. 
(There was also inconsistency between new and old pass manager, but we can 
probably just focus on NPM now.)
I suspect that almost none of those differences were intentional - people just 
made changes for whatever pipeline they were interested in at the time and 
didn't realize there was divergence.
So we now have things refactored with this note:

  /// TODO: Should LTO cause any differences to this set of passes?
  void PassBuilder::addVectorPasses(OptimizationLevel Level,
FunctionPassManager , bool IsFullLTO) {

So if there really is a reason for something to be different with LTO, it's set 
up to make that easily visible at least. :)
I made a couple of small fixes in there already, but basically any place where 
we do something differently for FullLTO should be investigated.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108826/new/

https://reviews.llvm.org/D108826

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D101759: [PowerPC] Scalar IBM MASS library conversion pass

2021-08-27 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a reviewer: efriedma.
spatel added a comment.

I'm not familiar with this library, and I haven't looked at current state of 
how we enable/map optional libs in a while...
We definitely want to avoid adding another target option/debug flag, and if we 
can avoid relying on a function parameter too, that would be even better.
Ie, the "afn" fast-math-flag (possibly in combination with some other IR- or 
node-level flags) seems like it should be enough to allow this 
transform/lowering. 
Scanning the earlier review comments, there was some concern about the 
semantics wrt errno. If we need to adjust the "afn" definition, it's probably 
fine. There haven't been many uses of that flag AFAIK.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101759/new/

https://reviews.llvm.org/D101759

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D104854: Introduce intrinsic llvm.isnan

2021-08-20 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

In D104854#2957582 , @lebedev.ri 
wrote:

> In D104854#2957529 , @spatel wrote:
>
>> In D104854#2957471 , @sepavloff 
>> wrote:
>>
>>> In D104854#2957423 , @spatel 
>>> wrote:
>>>
 Is it intentional that we are not canonicalizing the intrinsic call back 
 to `fcmp uno` in the default FP environment?
>>>
>>> It is lowered to unordered comparison by default. Changing `llvm.isnan` to  
>>> `fcmp uno` somewhere in IR would make it possible to optimize out the 
>>> latter if fast-math mode is on. Preserving semantics of `isnan` when 
>>> fast-math is in effect was one of the goals of this change.
>>
>> I understand that the codegen was supposed to be no worse, but the 
>> difference in IR causes optimizer regressions like: 
>> https://llvm.org/PR51556
>>
>> If we want this intrinsic (and its siblings that haven't been created yet) 
>> to survive through IR, then we have to enhance IR passes to recognize the 
>> new patterns. 
>> It would be easier to do this in steps: (1) create the intrinsic only if not 
>> in the default FP env, (2) update IR analysis/passes to recognize the 
>> intrinsic, (3) create the intrinsic in the default FP env with no FMF, (4) 
>> create the intrinsic always.
>
> +1, but right now i'm not sold on the behavior of not optimizing away NaN 
> checks in no-NaN's mode.
> At least that part should be reconciled now. It *might* be an improvement, 
> but it caters to expectations
> of one group while catering away from the documentation and existing 
> expectations of other groups.
> This shouldn't be decided in a review, it should be driven by an RFC.

Agree. I think a revert followed by RFC to make sure there is consensus on 
semantics is needed.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104854/new/

https://reviews.llvm.org/D104854

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D104854: Introduce intrinsic llvm.isnan

2021-08-20 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

In D104854#2957471 , @sepavloff wrote:

> In D104854#2957423 , @spatel wrote:
>
>> Is it intentional that we are not canonicalizing the intrinsic call back to 
>> `fcmp uno` in the default FP environment?
>
> It is lowered to unordered comparison by default. Changing `llvm.isnan` to  
> `fcmp uno` somewhere in IR would make it possible to optimize out the latter 
> if fast-math mode is on. Preserving semantics of `isnan` when fast-math is in 
> effect was one of the goals of this change.

I understand that the codegen was supposed to be no worse, but the difference 
in IR causes optimizer regressions like: 
https://llvm.org/PR51556

If we want this intrinsic (and its siblings that haven't been created yet) to 
survive through IR, then we have to enhance IR passes to recognize the new 
patterns. 
It would be easier to do this in steps: (1) create the intrinsic only if not in 
the default FP env, (2) update IR analysis/passes to recognize the intrinsic, 
(3) create the intrinsic in the default FP env with no FMF, (4) create the 
intrinsic always.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104854/new/

https://reviews.llvm.org/D104854

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D104854: Introduce intrinsic llvm.isnan

2021-08-20 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

Is it intentional that we are not canonicalizing the intrinsic call back to 
`fcmp uno` in the default FP environment?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104854/new/

https://reviews.llvm.org/D104854

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D107843: [X86] Add parentheses around casts in some of the X86 intrinsic headers.

2021-08-13 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

Probably want to address the other cleanups in another patch; the parens fixes 
and test LGTM.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107843/new/

https://reviews.llvm.org/D107843

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D107843: [X86] Add parentheses around casts in some of the X86 intrinsic headers.

2021-08-10 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

Hard to see through all of the lint noise, but seems like a mechanical fix.
Can we add a test like the one in the bug report?
https://godbolt.org/z/sPT8e9vx9


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107843/new/

https://reviews.llvm.org/D107843

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D105268: [X86] AVX512FP16 instructions enabling 5/6

2021-07-05 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added inline comments.



Comment at: llvm/test/CodeGen/X86/stack-folding-fp-avx512fp16vl-fma.ll:193-194
+define <8 x half> @stack_fold_fmsub123ph(<8 x half> %a0, <8 x half> %a1, <8 x 
half> %a2) {
+  ;check-label: stack_fold_fmsub123ph:
+  ;check:   vfmsub213ph {{-?[0-9]*}}(%rsp), {{%xmm[0-9][0-9]*}}, 
{{%xmm[0-9][0-9]*}} {{.*#+}} 16-byte folded reload
+  %1 = tail call <2 x i64> asm sideeffect "nop", 
"=x,~{xmm3},~{xmm4},~{xmm5},~{xmm6},~{xmm7},~{xmm8},~{xmm9},~{xmm10},~{xmm11},~{xmm12},~{xmm13},~{xmm14},~{xmm15},~{xmm16},~{xmm17},~{xmm18},~{xmm19},~{xmm20},~{xmm21},~{xmm22},~{xmm23},~{xmm24},~{xmm25},~{xmm26},~{xmm27},~{xmm28},~{xmm29},~{xmm30},~{xmm31},~{flags}"()

I was just scanning through this patch and noticed the capitalization mismatch 
on these lines and others. This test has no valid checks as written?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105268/new/

https://reviews.llvm.org/D105268

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D101191: [InstCombine] Fully disable select to and/or i1 folding

2021-05-30 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

In D101191#2783570 , @rupprecht wrote:

> In D101191#2782963 , @rupprecht 
> wrote:
>
>> The issue I'm seeing seems more directly caused by SLP vectorization, as it 
>> goes away with `-fno-slp-vectorize`. This patch merely unblocks that bad 
>> optimization AFAICT.
>
> Filed as http://llvm.org/PR50500

We needed SLP to get to the problem state in that example, but the bug was in 
instcombine/instsimplify. 
Should be fixed with:
7bb8bfa0622b 



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101191/new/

https://reviews.llvm.org/D101191

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D101191: [InstCombine] Fully disable select to and/or i1 folding

2021-05-25 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

In D101191#2779631 , @spatel wrote:

> In D101191#2779007 , @fhahn wrote:
>
>> Looks like this is causing an infinite loop in instcombine: 
>> https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=34661
>
> This seems likely to be another partial undef/poison vector problem ( 
> 1894c6c59e 
>  ). I'll 
> take a look.

The fuzzer test can be reduced to a single line. I committed a minimal fix here:
ae1bc9ebf3 
...but some follow-up seems necessary.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101191/new/

https://reviews.llvm.org/D101191

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D101191: [InstCombine] Fully disable select to and/or i1 folding

2021-05-25 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

In D101191#2779007 , @fhahn wrote:

> Looks like this is causing an infinite loop in instcombine: 
> https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=34661

This seems likely to be another partial undef/poison vector problem ( 
1894c6c59e 
 ). I'll 
take a look.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101191/new/

https://reviews.llvm.org/D101191

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D101191: [InstCombine] Fully disable select to and/or i1 folding

2021-04-30 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added inline comments.



Comment at: 
llvm/test/Transforms/PhaseOrdering/unsigned-multiply-overflow-check.ll:20
 
+; FIXME: noundef should be attached to args
 define i1 @will_not_overflow(i64 %arg, i64 %arg1) {

aqjune wrote:
> nikic wrote:
> > spatel wrote:
> > > aqjune wrote:
> > > > aqjune wrote:
> > > > > spatel wrote:
> > > > > > Any ideas about what it will take to get those argument attributes 
> > > > > > for the C++ source example shown in the code comment?
> > > > > > 
> > > > > > SDAG is still going to convert the `select` to `and`, so we can 
> > > > > > probably avoid regressions by replicating InstSimplify's 
> > > > > > omitCheckForZeroBeforeMulWithOverflow() as a DAG combine. Let me 
> > > > > > know if I should do that.
> > > > > I promised to do the patch at D82317, but these days I'm occupied 
> > > > > with other things, so it might not be a recent future (not in a month 
> > > > > at least)...
> > > > > 
> > > > > I think it is a good chance to use freeze here: We can add
> > > > > ```
> > > > > %cond = icmp ne %x, 0
> > > > > %v = call @llvm.umul.with.overflow(%x, %y)
> > > > > %ov = extractvalue %v, 1
> > > > > %res = select i1 %cond, %ov, false
> > > > >   =>
> > > > > %y.fr = freeze %y
> > > > > %v = call @llvm.umul.with.overflow(%x, %y.fr)
> > > > > %ov = extractvalue %v, 1
> > > > > %res = %ov
> > > > > ```
> > > > > into CodeGenPrepare.
> > > > > What do you think? CodeGenPrepare is already using freeze for a few 
> > > > > transformations.
> > > > Alive2 proof: https://alive2.llvm.org/ce/z/zgPUGT
> > > I don't think we want to add code to CGP if we can avoid it. CGP is 
> > > supposed to be a temporary hack that is not needed with GlobalISel. I do 
> > > acknowledge that "temporary" in the code may outlive the people working 
> > > on it today (!).
> > > 
> > > If we don't care about undef correctness in codegen (in other words, the 
> > > select->and transform will live on in codegen forever), then we might as 
> > > well add a DAG combine. Alternatively, are we comfortable creating freeze 
> > > in instcombine for rare code like umul.with.ov?
> > I think this is one of the rare cases where inserting Freeze in InstCombine 
> > makes sense. There's not much (any?) further optimization opportunities for 
> > a umul.with.overflow with non-constant operands.
> InstCombine sounds good as well!
For reference, that part is D101423, and it will be a preliminary patch before 
this one.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101191/new/

https://reviews.llvm.org/D101191

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D101191: [InstCombine] Fully disable select to and/or i1 folding

2021-04-27 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added inline comments.



Comment at: 
llvm/test/Transforms/PhaseOrdering/unsigned-multiply-overflow-check.ll:20
 
+; FIXME: noundef should be attached to args
 define i1 @will_not_overflow(i64 %arg, i64 %arg1) {

aqjune wrote:
> aqjune wrote:
> > spatel wrote:
> > > Any ideas about what it will take to get those argument attributes for 
> > > the C++ source example shown in the code comment?
> > > 
> > > SDAG is still going to convert the `select` to `and`, so we can probably 
> > > avoid regressions by replicating InstSimplify's 
> > > omitCheckForZeroBeforeMulWithOverflow() as a DAG combine. Let me know if 
> > > I should do that.
> > I promised to do the patch at D82317, but these days I'm occupied with 
> > other things, so it might not be a recent future (not in a month at 
> > least)...
> > 
> > I think it is a good chance to use freeze here: We can add
> > ```
> > %cond = icmp ne %x, 0
> > %v = call @llvm.umul.with.overflow(%x, %y)
> > %ov = extractvalue %v, 1
> > %res = select i1 %cond, %ov, false
> >   =>
> > %y.fr = freeze %y
> > %v = call @llvm.umul.with.overflow(%x, %y.fr)
> > %ov = extractvalue %v, 1
> > %res = %ov
> > ```
> > into CodeGenPrepare.
> > What do you think? CodeGenPrepare is already using freeze for a few 
> > transformations.
> Alive2 proof: https://alive2.llvm.org/ce/z/zgPUGT
I don't think we want to add code to CGP if we can avoid it. CGP is supposed to 
be a temporary hack that is not needed with GlobalISel. I do acknowledge that 
"temporary" in the code may outlive the people working on it today (!).

If we don't care about undef correctness in codegen (in other words, the 
select->and transform will live on in codegen forever), then we might as well 
add a DAG combine. Alternatively, are we comfortable creating freeze in 
instcombine for rare code like umul.with.ov?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101191/new/

https://reviews.llvm.org/D101191

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D101191: [InstCombine] Fully disable select to and/or i1 folding

2021-04-27 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added inline comments.



Comment at: 
llvm/test/Transforms/PhaseOrdering/unsigned-multiply-overflow-check.ll:20
 
+; FIXME: noundef should be attached to args
 define i1 @will_not_overflow(i64 %arg, i64 %arg1) {

Any ideas about what it will take to get those argument attributes for the C++ 
source example shown in the code comment?

SDAG is still going to convert the `select` to `and`, so we can probably avoid 
regressions by replicating InstSimplify's 
omitCheckForZeroBeforeMulWithOverflow() as a DAG combine. Let me know if I 
should do that.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101191/new/

https://reviews.llvm.org/D101191

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D101191: [InstCombine] Fully disable select to and/or i1 folding

2021-04-27 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added inline comments.



Comment at: llvm/test/Transforms/PhaseOrdering/X86/vector-reductions.ll:282
 
+; FIXME: this should be vectorized
 define i1 @cmp_lt_gt(double %a, double %b, double %c) {

I don't think we need to worry about regressing this. It's not ideal before or 
after, but we probably have a better chance of getting to the goal by making 
instcombine/simplifycfg correct and consistent.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101191/new/

https://reviews.llvm.org/D101191

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D99898: [clang, test] Fix use of undef FileCheck var

2021-04-06 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added inline comments.



Comment at: clang/test/CodeGen/libcalls.c:127
 // CHECK-NO-DAG: attributes [[NUW_RN]] = { nounwind readnone{{.*}} }
+// CHECK-YES-NOT: attributes [[NUW_RN]] = { nounwind readnone{{.*}} }
 // CHECK-NO-DAG: attributes [[NUW_RNI]] = { nofree nosync nounwind readnone 
speculatable willreturn }

Can we use a positive CHECK for the expected attributes instead?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99898/new/

https://reviews.llvm.org/D99898

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D98945: [BranchProbability] move options for 'likely' and 'unlikely'

2021-03-20 Thread Sanjay Patel via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rGee8b53815ddf: [BranchProbability] move options for 
likely and unlikely (authored by spatel).
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98945/new/

https://reviews.llvm.org/D98945

Files:
  clang/lib/CodeGen/CodeGenFunction.cpp
  llvm/include/llvm/Support/BranchProbability.h
  llvm/include/llvm/Transforms/Scalar/LowerExpectIntrinsic.h
  llvm/lib/Support/BranchProbability.cpp
  llvm/lib/Transforms/Scalar/LowerExpectIntrinsic.cpp

Index: llvm/lib/Transforms/Scalar/LowerExpectIntrinsic.cpp
===
--- llvm/lib/Transforms/Scalar/LowerExpectIntrinsic.cpp
+++ llvm/lib/Transforms/Scalar/LowerExpectIntrinsic.cpp
@@ -24,6 +24,7 @@
 #include "llvm/IR/Metadata.h"
 #include "llvm/InitializePasses.h"
 #include "llvm/Pass.h"
+#include "llvm/Support/BranchProbability.h"
 #include "llvm/Support/Debug.h"
 #include "llvm/Transforms/Scalar.h"
 
@@ -34,25 +35,6 @@
 STATISTIC(ExpectIntrinsicsHandled,
   "Number of 'expect' intrinsic instructions handled");
 
-// These default values are chosen to represent an extremely skewed outcome for
-// a condition, but they leave some room for interpretation by later passes.
-//
-// If the documentation for __builtin_expect() was made explicit that it should
-// only be used in extreme cases, we could make this ratio higher. As it stands,
-// programmers may be using __builtin_expect() / llvm.expect to annotate that a
-// branch is likely or unlikely to be taken.
-//
-// There is a known dependency on this ratio in CodeGenPrepare when transforming
-// 'select' instructions. It may be worthwhile to hoist these values to some
-// shared space, so they can be used directly by other passes.
-
-cl::opt llvm::LikelyBranchWeight(
-"likely-branch-weight", cl::Hidden, cl::init(2000),
-cl::desc("Weight of the branch likely to be taken (default = 2000)"));
-cl::opt llvm::UnlikelyBranchWeight(
-"unlikely-branch-weight", cl::Hidden, cl::init(1),
-cl::desc("Weight of the branch unlikely to be taken (default = 1)"));
-
 static std::tuple
 getBranchWeight(Intrinsic::ID IntrinsicID, CallInst *CI, int BranchCount) {
   if (IntrinsicID == Intrinsic::expect) {
Index: llvm/lib/Support/BranchProbability.cpp
===
--- llvm/lib/Support/BranchProbability.cpp
+++ llvm/lib/Support/BranchProbability.cpp
@@ -19,6 +19,20 @@
 
 using namespace llvm;
 
+// These default values are chosen to represent an extremely skewed outcome for
+// a condition, but they leave some room for interpretation by later passes.
+//
+// If the documentation for __builtin_expect() was made explicit that it should
+// only be used in extreme cases, we could make this ratio higher. As it stands,
+// programmers may be using __builtin_expect() / llvm.expect to annotate that a
+// branch is only mildly likely or unlikely to be taken.
+cl::opt llvm::LikelyBranchWeight(
+"likely-branch-weight", cl::Hidden, cl::init(2000),
+cl::desc("Weight of the branch likely to be taken (default = 2000)"));
+cl::opt llvm::UnlikelyBranchWeight(
+"unlikely-branch-weight", cl::Hidden, cl::init(1),
+cl::desc("Weight of the branch unlikely to be taken (default = 1)"));
+
 constexpr uint32_t BranchProbability::D;
 
 raw_ostream ::print(raw_ostream ) const {
Index: llvm/include/llvm/Transforms/Scalar/LowerExpectIntrinsic.h
===
--- llvm/include/llvm/Transforms/Scalar/LowerExpectIntrinsic.h
+++ llvm/include/llvm/Transforms/Scalar/LowerExpectIntrinsic.h
@@ -17,7 +17,6 @@
 
 #include "llvm/IR/Function.h"
 #include "llvm/IR/PassManager.h"
-#include "llvm/Support/CommandLine.h"
 
 namespace llvm {
 
@@ -32,8 +31,6 @@
   PreservedAnalyses run(Function , FunctionAnalysisManager &);
 };
 
-extern cl::opt LikelyBranchWeight;
-extern cl::opt UnlikelyBranchWeight;
 }
 
 #endif
Index: llvm/include/llvm/Support/BranchProbability.h
===
--- llvm/include/llvm/Support/BranchProbability.h
+++ llvm/include/llvm/Support/BranchProbability.h
@@ -13,6 +13,7 @@
 #ifndef LLVM_SUPPORT_BRANCHPROBABILITY_H
 #define LLVM_SUPPORT_BRANCHPROBABILITY_H
 
+#include "llvm/Support/CommandLine.h"
 #include "llvm/Support/DataTypes.h"
 #include 
 #include 
@@ -21,6 +22,9 @@
 
 namespace llvm {
 
+extern cl::opt LikelyBranchWeight;
+extern cl::opt UnlikelyBranchWeight;
+
 class raw_ostream;
 
 // This class represents Branch Probability as a non-negative fraction that is
Index: clang/lib/CodeGen/CodeGenFunction.cpp
===
--- clang/lib/CodeGen/CodeGenFunction.cpp
+++ clang/lib/CodeGen/CodeGenFunction.cpp
@@ -42,8 +42,8 @@
 #include

[PATCH] D93179: [X86] Convert fmin/fmax _mm_reduce_* intrinsics to emit llvm.reduction intrinsics (PR47506)

2021-02-14 Thread Sanjay Patel via Phabricator via cfe-commits

spatel accepted this revision.
spatel added a comment.
This revision is now accepted and ready to land.

Thanks! LGTM.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93179/new/

https://reviews.llvm.org/D93179

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D93179: [X86] Convert fmin/fmax _mm_reduce_* intrinsics to emit llvm.reduction intrinsics (PR47506)

2021-02-14 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added inline comments.



Comment at: clang/lib/Headers/avx512fintrin.h:9305
+ * 1. The elements are reassociable when using fadd/fmul intrinsics;
+ * 2. There's no nan and signed zero in the elements when using fmin/max
+ intrinsics;

pengfei wrote:
> spatel wrote:
> > If I understand correctly, there's nothing preventing -0.0 or NaN values in 
> > these ops. But if either of those are present, then the result is 
> > potentially indeterminate.
> > 
> > For the LLVM intrinsic, we have this text in LangRef:
> > "The result will always be a number unless all elements of the vector are 
> > NaN. For a vector with minimum element magnitude 0.0 and containing both 
> > +0.0 and -0.0 elements, the sign of the result is unspecified."
> Thanks @spatel for the information. I checked that the LLVM intrinsic does 
> ignore the sign of zeros. https://godbolt.org/z/a9Yj8a. So we can remove the 
> no signed zero assumption.
> But X86 fmin/fmax instructions have difference with the LLVM intrinsic, i.e. 
> the result might be NaN even there's only one NaN in elements.
> Besides, the LangRef also says "If the intrinsic call has the nnan fast-math 
> flag, then the operation can assume that NaNs are not present in the input 
> vector."
> So we still need the no nan assumption here.
I understand the behavior that we want to specify for this API, but we are not 
stating it clearly for the user. I think we should do that. How about:

```
 * For floating-point intrinsics:
 * 1. When using fadd/fmul intrinsics, the order of operations within the 
vector is unspecified (associative math).
 * 2. When using fmin/fmax intrinsics, NaN or -0.0 elements within the vector 
produce unspecified results. 
```


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93179/new/

https://reviews.llvm.org/D93179

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D93179: [X86] Convert fmin/fmax _mm_reduce_* intrinsics to emit llvm.reduction intrinsics (PR47506)

2021-02-10 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added inline comments.



Comment at: clang/lib/Headers/avx512fintrin.h:9305
+ * 1. The elements are reassociable when using fadd/fmul intrinsics;
+ * 2. There's no nan and signed zero in the elements when using fmin/max
+ intrinsics;

If I understand correctly, there's nothing preventing -0.0 or NaN values in 
these ops. But if either of those are present, then the result is potentially 
indeterminate.

For the LLVM intrinsic, we have this text in LangRef:
"The result will always be a number unless all elements of the vector are NaN. 
For a vector with minimum element magnitude 0.0 and containing both +0.0 and 
-0.0 elements, the sign of the result is unspecified."


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93179/new/

https://reviews.llvm.org/D93179

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D96231: [X86] Always assign reassoc flag for intrinsics *reduce_add/mul_ps/pd.

2021-02-09 Thread Sanjay Patel via Phabricator via cfe-commits

spatel accepted this revision.
spatel added a comment.
This revision is now accepted and ready to land.

LGTM


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96231/new/

https://reviews.llvm.org/D96231

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D96231: [X86] Always assign reassoc flag for intrinsics *reduce_add/mul_ps/pd.

2021-02-08 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added inline comments.



Comment at: clang/lib/Headers/avx512fintrin.h:9352
 static __inline__ double __DEFAULT_FN_ATTRS512 _mm512_reduce_add_pd(__m512d 
__W) {
   return __builtin_ia32_reduce_fadd_pd512(0.0, __W);
 }

Ah - this is where the +0.0 is specified. This should be -0.0. We could still 
add 'nsz' flag to be safe.



Comment at: clang/lib/Headers/avx512fintrin.h:9362
   __W = _mm512_maskz_mov_pd(__M, __W);
   return __builtin_ia32_reduce_fadd_pd512(0.0, __W);
 }

This also should be changed to -0.0?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96231/new/

https://reviews.llvm.org/D96231

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D96231: [X86] Always assign reassoc flag for intrinsics *reduce_add/mul_ps/pd.

2021-02-08 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added inline comments.

Comment at: clang/lib/CodeGen/CGBuiltin.cpp:13829
 CGM.getIntrinsic(Intrinsic::vector_reduce_fadd, Ops[1]->getType());
+Builder.getFastMathFlags().setAllowReassoc(true);
 return Builder.CreateCall(F, {Ops[0], Ops[1]});

I haven't looked at this part of the compiler in a long time, so I was 
wondering how we handle FMF scope. It looks like there is already an FMFGuard 
object in place -- CodeGenFunction::CGFPOptionsRAII(). So setting FMF here will 
not affect anything but this CreateCall(). 

Does that match your understanding? Should we have an extra regression test to 
make sure that does not change?

I am imagining something like:

```
double test_mm512_reduce_add_pd(__m512d __W, double ExtraAddOp) {
  double S = _mm512_reduce_add_pd(__W) + ExtraAddOp;
  return S;
}

```

Then we could confirm that `reassoc` is not applied to the `fadd` that follows 
the reduction call.

Comment at: clang/lib/CodeGen/CGBuiltin.cpp:13829
 CGM.getIntrinsic(Intrinsic::vector_reduce_fadd, Ops[1]->getType());
+Builder.getFastMathFlags().setAllowReassoc(true);
 return Builder.CreateCall(F, {Ops[0], Ops[1]});

spatel wrote:
> I haven't looked at this part of the compiler in a long time, so I was 
> wondering how we handle FMF scope. It looks like there is already an FMFGuard 
> object in place -- CodeGenFunction::CGFPOptionsRAII(). So setting FMF here 
> will not affect anything but this CreateCall(). 
> 
> Does that match your understanding? Should we have an extra regression test 
> to make sure that does not change?
> 
> I am imagining something like:
> 
> ```
> double test_mm512_reduce_add_pd(__m512d __W, double ExtraAddOp) {
>   double S = _mm512_reduce_add_pd(__W) + ExtraAddOp;
>   return S;
> }
> 
> ```
> 
> Then we could confirm that `reassoc` is not applied to the `fadd` that 
> follows the reduction call.
Currently (and we could say that this is an LLVM codegen bug), we will not 
generate the optimal/expected reduction with `reassoc` alone.

I think the x86 reduction definition is implicitly assuming that -0.0 is not 
meaningful here, so we should add `nsz` too.

The backend is expecting an explicit `nsz` on this op. Ie, I see this x86 asm 
currently with only `reassoc`:

```
vextractf64x4   $1, %zmm0, %ymm1
vaddpd  %zmm1, %zmm0, %zmm0
vextractf128$1, %ymm0, %xmm1
vaddpd  %xmm1, %xmm0, %xmm0
vpermilpd   $1, %xmm0, %xmm1  
vaddsd  %xmm1, %xmm0, %xmm0 
vxorpd  %xmm1, %xmm1, %xmm1   <--- create 0.0
vaddsd  %xmm1, %xmm0, %xmm0   <--- add it to the reduction result
```

Alternatively (and I'm not sure where it is specified), we could replace the 
default 0.0 argument with -0.0?

Comment at: clang/lib/Headers/avx512fintrin.h:9300
  * outputs. This class of vector operation forms the basis of many scientific
  * computations. In vector-reduction arithmetic, the evaluation off is
  * independent of the order of the input elements of V.

This is an existing text bug, but if we are changing this text, we might as 
well fix it in this patch - I'm not sure what "off" refers to here. Should that 
be "order"?

Comment at: clang/lib/Headers/avx512fintrin.h:9303

+ * For floating points type, we always assume the elements are reassociable 
even
+ * if -fast-math is off.

Typo: "floating-point types"

Comment at: clang/lib/Headers/avx512fintrin.h:9304
+ * For floating points type, we always assume the elements are reassociable 
even
+ * if -fast-math is off.
+

Also mention that sign of zero is indeterminate. We might use the LangRef text 
as a model for what to say here:
https://llvm.org/docs/LangRef.html#llvm-vector-reduce-fadd-intrinsic

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96231/new/

https://reviews.llvm.org/D96231

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D93923: Use unary CreateShuffleVector if possible

2020-12-30 Thread Sanjay Patel via Phabricator via cfe-commits

spatel accepted this revision.
spatel added a comment.
This revision is now accepted and ready to land.

See inline comments to avoid bot failures - otherwise, LGTM. Thanks for the 
cleanup!




Comment at: llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp:3193
 IRBuilder<> IRB();
 Type *ShadowTy = getShadowTy();
 unsigned Width =

ShadowTy is now an unused variable. Delete to avoid a compile warning.



Comment at: llvm/lib/Transforms/Scalar/LowerMatrixIntrinsics.cpp:337
   Value *Vec = isColumnMajor() ? getColumn(J) : getRow(I);
   Value *Undef = UndefValue::get(Vec->getType());
   return Builder.CreateShuffleVector(

`Undef` is now an unused variable. Delete to avoid a compile warning.



Comment at: llvm/lib/Transforms/Scalar/LowerMatrixIntrinsics.cpp:450
 SmallVector SplitVecs;
 Value *Undef = UndefValue::get(VType);
 for (unsigned MaskStart = 0;

Undef is now an unused variable. Delete to avoid a compile warning.



Comment at: llvm/lib/Transforms/Scalar/LowerMatrixIntrinsics.cpp:944
 
 Value *Undef = UndefValue::get(Block->getType());
 Block = Builder.CreateShuffleVector(

Undef is now an unused variable. Delete to avoid a compile warning.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93923/new/

https://reviews.llvm.org/D93923

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D93793: [IR] Let IRBuilder's CreateVectorSplat/CreateShuffleVector use poison as placeholder

2020-12-29 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added inline comments.



Comment at: llvm/include/llvm/IR/IRBuilder.h:2527
   /// Create a unary shuffle. The second vector operand of the IR instruction
   /// is undefined.
   Value *CreateShuffleVector(Value *V, ArrayRef Mask,

update code comment: undefined -> poison



Comment at: llvm/lib/IR/IRBuilder.cpp:1020
   Value *Zeros = ConstantAggregateZero::get(VectorType::get(I32Ty, EC));
-  return CreateShuffleVector(V, Undef, Zeros, Name + ".splat");
+  return CreateShuffleVector(V, Poison, Zeros, Name + ".splat");
 }

I'm still catching up on reviews/mails, so ignore if this was already discussed:
Can we change this usage of CreateShuffleVector to the unary operand version, 
so we're reducing the number of explicit places where we need to specify poison?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93793/new/

https://reviews.llvm.org/D93793

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D93586: [InstCombine] use poison as placeholder for undemanded elems

2020-12-22 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

In D93586#2467248 , @aqjune wrote:

>> There are 792 files in llvm/test, most of which are in test/Transform (119) 
>> and test/CodeGen(655).
>> The transition will be swiftly done (if there's no other issue hopefully) by 
>> the next weekend.
>
> Thinking about these again, do we need to make a poison copy for 
> test/CodeGen? I don't think so, since the backend won't be changed anyway.

It would be good to update those for consistency; the codegen tests are 
supposed to be representative of what comes out of the IR optimizer. IIUC, we 
could do the substitution on those files already, and it would not change 
anything. But let's sort out the IR changes first?

In D93586#2466284 , @aqjune wrote:

> My concern is that some tests aren't using `utils/update_test_checks.py`, and 
> this makes things complicated. :(
> Simply replacing `insertelement undef` at `CHECK:` isn't enough (ex: 
> Transforms/SLPVectorizer/X86/alternate-cast.ll).

Yes, tests that don't have scripted CHECK lines require more work to 
understand. That SLP test file is scripted though. Is there another problem 
with that one?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93586/new/

https://reviews.llvm.org/D93586

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D93586: [InstCombine] use poison as placeholder for undemanded elems

2020-12-20 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added reviewers: nlopes, regehr, RKSimon, zhengyangl, nikic, hfinkel.
spatel added a comment.

Thank you for working on this! Looking back at the bug comments (and adding 
reviewers based on those comments), this is a step towards killing undef that 
has been discussed for a long time now. :)

Besides changing IRBuilder and shufflevector's definition, I think we'll also 
need updates in the vectorizers, InstSimplify, and other places in InstCombine 
that use UndefValue for InsertElement and shuffle transforms.

> A conservative way is copying all tests having insertelement undef & 
> replacing it with insertelement poison & run Alive2 on it, but it will create 
> many tests and people won’t like it. :(

Do you have an estimate of how many tests are out there? If it's a temporary 
increase and not huge, I don't think there is much downside. But if we think 
the transition will take a long time, then maybe we don't want the duplication.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93586/new/

https://reviews.llvm.org/D93586

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D93179: [X86] Convert fmin/fmax _mm_reduce_* intrinsics to emit llvm.reduction intrinsics (PR47506)

2020-12-14 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

If we're going by existing behavior/compatibility, gcc/icc use packed ops too:
https://godbolt.org/z/9jEhaW
...so there's an implicit 'nnan nsz' in these intrinsics (and that should be 
documented in the header file (and file a bug for Intel's page at 
https://software.intel.com/sites/landingpage/IntrinsicsGuide/ ?).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93179/new/

https://reviews.llvm.org/D93179

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D87188: [InstCombine] Canonicalize SPF to abs intrinc

2020-12-10 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

In D87188#2445506 , @lebedev.ri wrote:

> Partial rebase (without updating test coverage)

Thanks for reducing!
If I'm seeing it correctly, the codegen looks fine - the difference is in the 
IR icmp predicate changing from `slt` to `ult`, and that diff appears to be 
correct independent of whether or not we use the abs intrinsic:
https://alive2.llvm.org/ce/z/VDDqBj

So either the source/test has an incorrect expectation and/or something outside 
of this function is wrong?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D87188/new/

https://reviews.llvm.org/D87188

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D88712: [CGBuiltin] Respect asm labels and redefine_extname for builtins with specialized emitting

2020-11-24 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

In D88712#2413877 , 
@venkataramanan.kumar.llvm wrote:

> In D88712#2413688 , @spatel wrote:
>
>> In D88712#2412874 , 
>> @venkataramanan.kumar.llvm wrote:
>>
>>> In D88712#2412366 , @MaskRay wrote:
>>>
 In D88712#2411841 , 
 @venkataramanan.kumar.llvm wrote:

> 

 For your example:

   extern double log (double) asm ("" "log_finite") __attribute__ 
 ((nothrow));
   
   double mylog (double d) { return log(d); }

 The intention is to emit `log_finite`, no matter the `-ffast-math`. Both 
 GCC and Clang (after this patch) emit `jmp log_finite`.

 The previous behavior (according to your comment, with an older glibc) was 
 undesired. So I think the right suggestion is to upgrade glibc.
>>>
>>> Then there optimizations like vectorization, inst combine which works on 
>>> the LLVM intrinsics.  But they will be not happening now  with  log_finite 
>>> calls.
>>
>> I'm not sure about the expected semantics/lowering for the finite calls, but 
>> can you add something under 
>> LibCallSimplifier::optimizeFloatingPointLibCall() that would turn it into 
>> the LLVM log intrinsic with appropriate FMF? Codegen would need to be 
>> responsible for converting it back to finite call(s) if those are available?
>
> Hi Sanjay I thought codegen no longer lowers them back to finite calls 
> https://reviews.llvm.org/rGcd0926d087a85c5ee1222ca80980b4440214a822

There was a comment in D74712  that we might 
be moving too fast. If you would like to revert/adjust that patch, raise a bug 
or post a patch to discuss? If the goal is to be able to vectorize the finite 
calls, then we will need some way to represent those. Alternatively, we could 
change something in the cost models to enable more unrolling, etc?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88712/new/

https://reviews.llvm.org/D88712

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D88712: [CGBuiltin] Respect asm labels and redefine_extname for builtins with specialized emitting

2020-11-24 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

In D88712#2412874 , 
@venkataramanan.kumar.llvm wrote:

> In D88712#2412366 , @MaskRay wrote:
>
>> In D88712#2411841 , 
>> @venkataramanan.kumar.llvm wrote:
>>
>>> 
>>
>> For your example:
>>
>>   extern double log (double) asm ("" "log_finite") __attribute__ ((nothrow));
>>   
>>   double mylog (double d) { return log(d); }
>>
>> The intention is to emit `log_finite`, no matter the `-ffast-math`. Both GCC 
>> and Clang (after this patch) emit `jmp log_finite`.
>>
>> The previous behavior (according to your comment, with an older glibc) was 
>> undesired. So I think the right suggestion is to upgrade glibc.
>
> Then there optimizations like vectorization, inst combine which works on the 
> LLVM intrinsics.  But they will be not happening now  with  log_finite calls.

I'm not sure about the expected semantics/lowering for the finite calls, but 
can you add something under LibCallSimplifier::optimizeFloatingPointLibCall() 
that would turn it into the LLVM log intrinsic with appropriate FMF? Codegen 
would need to be responsible for converting it back to finite call(s) if those 
are available?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88712/new/

https://reviews.llvm.org/D88712

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D87188: [InstCombine] Canonicalize SPF to abs intrinc

2020-11-10 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

In D87188#2348343 , @spatel wrote:

> In D87188#2348326 , @nikic wrote:
>
>> Reopening this so we don't forget...
>>
>> I believe @spatel is working on the cost modelling. I did not have much luck 
>> tracking down the miscompile, at least did not spot anything incriminating 
>> in the llvm-diff.
>
> Yes, I'm working through the mess of the TTI cost model with things like:
> c963bde0152a 
> 
> It's a slog...

D90554  / f7eac51b9b 

I think that change makes this patch ready to try again. If we do see 
regressions, then it should now be easy to adjust target-specific cost modeling 
of abs() intrinsics to fix those.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D87188/new/

https://reviews.llvm.org/D87188

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D87188: [InstCombine] Canonicalize SPF to abs intrinc

2020-10-22 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

In D87188#2348326 , @nikic wrote:

> Reopening this so we don't forget...
>
> I believe @spatel is working on the cost modelling. I did not have much luck 
> tracking down the miscompile, at least did not spot anything incriminating in 
> the llvm-diff.

Yes, I'm working through the mess of the TTI cost model with things like:
c963bde0152a 

It's a slog...


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D87188/new/

https://reviews.llvm.org/D87188

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D88154: Initial support for vectorization using Libmvec (GLIBC vector math library).

2020-10-22 Thread Sanjay Patel via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rG57cdc52c4df0: Initial support for vectorization using 
Libmvec (GLIBC vector math library) (authored by venkataramanan.kumar.llvm, 
committed by spatel).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88154/new/

https://reviews.llvm.org/D88154

Files:
  clang/include/clang/Basic/CodeGenOptions.def
  clang/include/clang/Basic/CodeGenOptions.h
  clang/include/clang/Driver/Options.td
  clang/lib/CodeGen/BackendUtil.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/Driver/autocomplete.c
  clang/test/Driver/fveclib.c
  llvm/include/llvm/Analysis/TargetLibraryInfo.h
  llvm/include/llvm/Analysis/VecFuncs.def
  llvm/lib/Analysis/TargetLibraryInfo.cpp
  llvm/test/Transforms/LoopVectorize/X86/libm-vector-calls-VF2-VF8.ll
  llvm/test/Transforms/LoopVectorize/X86/libm-vector-calls-finite.ll
  llvm/test/Transforms/LoopVectorize/X86/libm-vector-calls.ll
  llvm/test/Transforms/Util/add-TLI-mappings.ll

Index: llvm/test/Transforms/Util/add-TLI-mappings.ll
===
--- llvm/test/Transforms/Util/add-TLI-mappings.ll
+++ llvm/test/Transforms/Util/add-TLI-mappings.ll
@@ -3,6 +3,8 @@
 ; RUN: opt -vector-library=MASSV  -inject-tli-mappings-S < %s | FileCheck %s  --check-prefixes=COMMON,MASSV
 ; RUN: opt -vector-library=MASSV  -passes=inject-tli-mappings -S < %s | FileCheck %s  --check-prefixes=COMMON,MASSV
 ; RUN: opt -vector-library=Accelerate -inject-tli-mappings-S < %s | FileCheck %s  --check-prefixes=COMMON,ACCELERATE
+; RUN: opt -vector-library=LIBMVEC-X86 -inject-tli-mappings-S < %s | FileCheck %s  --check-prefixes=COMMON,LIBMVEC-X86
+; RUN: opt -vector-library=LIBMVEC-X86 -passes=inject-tli-mappings -S < %s | FileCheck %s  --check-prefixes=COMMON,LIBMVEC-X86
 ; RUN: opt -vector-library=Accelerate -passes=inject-tli-mappings -S < %s | FileCheck %s  --check-prefixes=COMMON,ACCELERATE
 
 target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
@@ -21,6 +23,9 @@
 ; MASSV-SAME: i8* bitcast (<4 x float> (<4 x float>)* @__log10f4_massv to i8*)
 ; ACCELERATE-SAME:  [1 x i8*] [
 ; ACCELERATE-SAME:i8* bitcast (<4 x float> (<4 x float>)* @vlog10f to i8*)
+; LIBMVEC-X86-SAME: [2 x i8*] [
+; LIBMVEC-X86-SAME:   i8* bitcast (<2 x double> (<2 x double>)* @_ZGVbN2v_sin to i8*),
+; LIBMVEC-X86-SAME:   i8* bitcast (<4 x double> (<4 x double>)* @_ZGVdN4v_sin to i8*)
 ; COMMON-SAME:  ], section "llvm.metadata"
 
 define double @sin_f64(double %in) {
@@ -28,6 +33,7 @@
 ; SVML: call double @sin(double %{{.*}}) #[[SIN:[0-9]+]]
 ; MASSV:call double @sin(double %{{.*}}) #[[SIN:[0-9]+]]
 ; ACCELERATE:   call double @sin(double %{{.*}})
+; LIBMVEC-X86:  call double @sin(double %{{.*}}) #[[SIN:[0-9]+]]
 ; No mapping of "sin" to a vector function for Accelerate.
 ; ACCELERATE-NOT: _ZGV_LLVM_{{.*}}_sin({{.*}})
   %call = tail call double @sin(double %in)
@@ -39,10 +45,12 @@
 define float @call_llvm.log10.f32(float %in) {
 ; COMMON-LABEL: @call_llvm.log10.f32(
 ; SVML: call float @llvm.log10.f32(float %{{.*}})
+; LIBMVEC-X86:  call float @llvm.log10.f32(float %{{.*}})
 ; MASSV:call float @llvm.log10.f32(float %{{.*}}) #[[LOG10:[0-9]+]]
 ; ACCELERATE:   call float @llvm.log10.f32(float %{{.*}}) #[[LOG10:[0-9]+]]
 ; No mapping of "llvm.log10.f32" to a vector function for SVML.
 ; SVML-NOT: _ZGV_LLVM_{{.*}}_llvm.log10.f32({{.*}})
+; LIBMVEC-X86-NOT: _ZGV_LLVM_{{.*}}_llvm.log10.f32({{.*}})
   %call = tail call float @llvm.log10.f32(float %in)
   ret float %call
 }
@@ -62,3 +70,7 @@
 
 ; ACCELERATE:  attributes #[[LOG10]] = { "vector-function-abi-variant"=
 ; ACCELERATE-SAME:   "_ZGV_LLVM_N4v_llvm.log10.f32(vlog10f)" }
+
+; LIBMVEC-X86:  attributes #[[SIN]] = { "vector-function-abi-variant"=
+; LIBMVEC-X86-SAME:   "_ZGV_LLVM_N2v_sin(_ZGVbN2v_sin),
+; LIBMVEC-X86-SAME:   _ZGV_LLVM_N4v_sin(_ZGVdN4v_sin)" }
Index: llvm/test/Transforms/LoopVectorize/X86/libm-vector-calls.ll
===
--- /dev/null
+++ llvm/test/Transforms/LoopVectorize/X86/libm-vector-calls.ll
@@ -0,0 +1,373 @@
+; RUN: opt -vector-library=LIBMVEC-X86  -inject-tli-mappings -loop-vectorize -S < %s | FileCheck %s
+
+target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+define void @sin_f64(double* nocapture %varray) {
+; CHECK-LABEL: @sin_f64(
+; CHECK-LABEL:vector.body
+; CHECK:[[TMP5:%.*]] = call <4 x double> @_ZGVdN4v_sin(<4 x double> [[TMP4:%.*]])
+;
+entry:
+  br label %for.body
+
+for.body:
+  %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
+  %tmp = trunc i64 %iv to i32
+  %conv = sitofp i32 %tmp to double
+  %call = tail call double @sin(double %conv)
+  %arrayidx = getelementptr inbounds double, double* %varray, i64 %iv
+  store double %call, double* %arrayidx,

[PATCH] D88154: Initial support for vectorization using Libmvec (GLIBC vector math library).

2020-10-20 Thread Sanjay Patel via Phabricator via cfe-commits

spatel accepted this revision.
spatel added a comment.
This revision is now accepted and ready to land.

LGTM.
I'm not sure why we need 3 different test files to test the very similar 
patterns -  just programmer preference?
Wait a day or so to commit in case anyone else has comments.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88154/new/

https://reviews.llvm.org/D88154

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D88154: Initial support for vectorization using Libmvec (GLIBC vector math library).

2020-10-16 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a subscriber: aeubanks.
spatel added a comment.

In D88154#2328312 , 
@venkataramanan.kumar.llvm wrote:

> I can add few more float type tests with meta data for VF=8.   please let me 
> know your suggestions.

I may be missing some subtlety of the vectorizer behavior. Can we vary the test 
types + metadata in 1 file,s o that there is coverage for something like this 
v2f64 call : `TLI_DEFINE_VECFUNC("llvm.sin.f64", "_ZGVbN2v_sin", 2)`?
I'm just trying to make sure we don't fall into some blind-spot by only testing 
VF=4.

Comment at: llvm/test/Transforms/LoopVectorize/X86/libm-vector-calls-finite.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt -vector-library=LIBMVEC-X86 -inject-tli-mappings -loop-vectorize -S 
< %s | FileCheck %s
+target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

fhahn wrote:
> fpetrogalli wrote:
> > `-inject-tli-mappings` is not required here, as a pass itself is required 
> > by the loop vectorizer.
> I guess it still doesn't hurt to be explicit. Also, can you add a line for 
> the new pass manager?
We need to be explicit about that pass with new-pass-manager as shown here:
df5576a

cc @aeubanks as I'm not sure if we want to update tests with NPM RUN lines or 
if we want to silently transition whenever the default gets changed.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88154/new/

https://reviews.llvm.org/D88154

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D88154: Initial support for vectorization using Libmvec (GLIBC vector math library).

2020-10-13 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added inline comments.

Comment at: llvm/test/Transforms/LoopVectorize/X86/libm-vector-calls-finite.ll:9
+define void @exp_f32(float* nocapture %varray) {
+; CHECK-LABEL: @exp_f32(
+; CHECK-NEXT:  entry:

fpetrogalli wrote:
> I think you are over-testing here. It is enough to check that inside the 
> vector body there is a call to the vector function you have listed in the 
> mapping. You are not checking for the whole auto-vectorization process here, 
> just the vectorization of the function call. Same for all the tests for this 
> patch in which you are doing something similar to this one test.
> 
> ```
> ; CHECK-LABEL: @exp_f32(
> ; CHECK-LABEL:   vector.body:
> ; CHECK: call fast <4 x float> @_ZGVbN4v___expf_finite(<4 x float> 
> ```
> 
I requested using "utils/update_test_checks.py" to auto-generate the assertions 
consistently.

We have standardized on this practice for tests in several passes because it 
provides extra test coverage (at the risk of over-testing), and it makes 
updating tests in the future nearly automatic.

The time cost of checking the extra lines is negligible vs. the benefit that we 
have gotten in finding/avoiding bugs. If the consensus is that it's not worth 
it on this particular file, I'm ok with that. But the general trend is 
definitely towards auto-generating full checks.

Comment at: 
llvm/test/Transforms/LoopVectorize/X86/libm-vector-calls-finite.ll:69
+  %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
+  %tmp = trunc i64 %indvars.iv to i32
+  %conv = sitofp i32 %tmp to float

The script should have warned you about using variables named "tmp". 
Independent of whether we choose to use the scripted assertions or not, you 
should change this value name (even plain "t" for "trunc" is an improvement 
over "tmp").

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88154/new/

https://reviews.llvm.org/D88154

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D88154: Initial support for vectorization using Libmvec (GLIBC vector math library).

2020-10-12 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

In D88154#2325713 , 
@venkataramanan.kumar.llvm wrote:

> As per review comments from Sanjay,  updated the test case to use metadata.  
> Also autogenerated the checks in the test cases using 
> llvm/utils/update_test_checks.py.

Thanks. But wouldn't it be better test coverage to vary the 
"llvm.loop.vectorize.width". Why is it always "4"?
I don't have any experience with this lib, so no real feedback on the code 
itself. Hopefully another reviewer can approve if there are no other concerns.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88154/new/

https://reviews.llvm.org/D88154

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D88154: Initial support for vectorization using Libmvec (GLIBC vector math library).

2020-10-06 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

In D88154#2310653 , 
@venkataramanan.kumar.llvm wrote:

> In D88154#2290205 , @abique wrote:
>
>> Looks good to me.
>> Regarding the tests, it seems that you check if auto-vectorization takes 
>> advantages of libmvec?
>> Would it be interesting to have a test which declares a vector and call the 
>> builtin sin on it?
>>
>> Thank you very much for the changes! :)
>
> do we we have built-in support for sin that takes vector types?
>
> I tried
>
> __m128d compute_sin(__m128d x)
> {
>
>   return __builtin_sin(x);
>
> }
>
>>> error: passing '__m128d' (vector of 2 'double' values) to parameter of 
>>> incompatible type 'double'

We have LLVM intrinsics for sin/cos that may use vector types:
http://llvm.org/docs/LangRef.html#llvm-sin-intrinsic
...but I don't know of a way to produce those directly from C source.

Comment at: llvm/test/Transforms/LoopVectorize/X86/libm-vector-calls.ll:1
+; RUN: opt -vector-library=LIBMVEC  -inject-tli-mappings -force-vector-width=4 
-force-vector-interleave=1 -loop-vectorize -S < %s | FileCheck %s
+

Why does this test file use command-line options to specify the vector factor 
and the other uses metadata?
If we can use metadata, then can you vary it to get better coverage (for 
example <2 x double> or <8 x float>)?

Comment at: llvm/test/Transforms/LoopVectorize/X86/libm-vector-calls.ll:221-224
+; CHECK-LABEL: @exp_f32_intrin
+; CHECK: <4 x float> @_ZGVbN4v_expf
+; CHECK: ret
+

It would be better to consistently put the FileCheck lines after the 'define'.
Can you auto-generate the CHECK lines using  llvm/utils/update_test_checks.py ?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88154/new/

https://reviews.llvm.org/D88154

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D88238: [APFloat] convert SNaN to QNaN in convert() and raise Invalid signal

2020-10-01 Thread Sanjay Patel via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rG149f5b573c79: [APFloat] convert SNaN to QNaN in convert() 
and raise Invalid signal (authored by spatel).
Herald added subscribers: cfe-commits, jrtc27.
Herald added a project: clang.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88238/new/

https://reviews.llvm.org/D88238

Files:
  clang/test/CodeGen/builtin-nan-exception.c
  clang/test/CodeGen/builtin-nan-legacy.c
  clang/test/CodeGen/mips-unsupported-nan.c
  llvm/lib/AsmParser/LLParser.cpp
  llvm/lib/IR/AsmWriter.cpp
  llvm/lib/Support/APFloat.cpp
  llvm/test/Transforms/InstSimplify/ConstProp/cast.ll
  llvm/test/Transforms/PhaseOrdering/X86/nancvt.ll
  llvm/unittests/ADT/APFloatTest.cpp

Index: llvm/unittests/ADT/APFloatTest.cpp
===
--- llvm/unittests/ADT/APFloatTest.cpp
+++ llvm/unittests/ADT/APFloatTest.cpp
@@ -1816,11 +1816,12 @@
   EXPECT_FALSE(losesInfo);
 
   test = APFloat::getSNaN(APFloat::IEEEsingle());
-  APFloat X87SNaN = APFloat::getSNaN(APFloat::x87DoubleExtended());
   APFloat::opStatus status = test.convert(APFloat::x87DoubleExtended(), APFloat::rmNearestTiesToEven, );
-  EXPECT_TRUE(test.bitwiseIsEqual(X87SNaN));
+  // Conversion quiets the SNAN, so now 2 bits of the 64-bit significand should be set.
+  APInt topTwoBits(64, 0x6000);
+  EXPECT_TRUE(test.bitwiseIsEqual(APFloat::getQNaN(APFloat::x87DoubleExtended(), false, )));
   EXPECT_FALSE(losesInfo);
-  EXPECT_EQ(status, APFloat::opOK);
+  EXPECT_EQ(status, APFloat::opInvalidOp);
 
   test = APFloat::getQNaN(APFloat::IEEEsingle());
   APFloat X87QNaN = APFloat::getQNaN(APFloat::x87DoubleExtended());
@@ -1832,6 +1833,7 @@
   test = APFloat::getSNaN(APFloat::x87DoubleExtended());
   test.convert(APFloat::x87DoubleExtended(), APFloat::rmNearestTiesToEven,
);
+  APFloat X87SNaN = APFloat::getSNaN(APFloat::x87DoubleExtended());
   EXPECT_TRUE(test.bitwiseIsEqual(X87SNaN));
   EXPECT_FALSE(losesInfo);
 
@@ -1841,13 +1843,13 @@
   EXPECT_TRUE(test.bitwiseIsEqual(X87QNaN));
   EXPECT_FALSE(losesInfo);
 
-  // The payload is lost in truncation, but we must retain NaN, so we set the bit after the quiet bit.
+  // The payload is lost in truncation, but we retain NaN by setting the quiet bit.
   APInt payload(52, 1);
   test = APFloat::getSNaN(APFloat::IEEEdouble(), false, );
   status = test.convert(APFloat::IEEEsingle(), APFloat::rmNearestTiesToEven, );
-  EXPECT_EQ(0x7fa0, test.bitcastToAPInt());
+  EXPECT_EQ(0x7fc0, test.bitcastToAPInt());
   EXPECT_TRUE(losesInfo);
-  EXPECT_EQ(status, APFloat::opOK);
+  EXPECT_EQ(status, APFloat::opInvalidOp);
 
   // The payload is lost in truncation. QNaN remains QNaN.
   test = APFloat::getQNaN(APFloat::IEEEdouble(), false, );
Index: llvm/test/Transforms/PhaseOrdering/X86/nancvt.ll
===
--- llvm/test/Transforms/PhaseOrdering/X86/nancvt.ll
+++ llvm/test/Transforms/PhaseOrdering/X86/nancvt.ll
@@ -18,6 +18,9 @@
 
 @var = external global i32
 
+; SNAN becomes QNAN on fptrunc:
+; 2147228864 = 0x7ffc1cc0 : QNAN
+
 define i32 @main() {
 ; CHECK-LABEL: @main(
 ; CHECK-NEXT:  entry:
@@ -30,15 +33,15 @@
 ; CHECK-NEXT:store volatile i32 2147228864, i32* @var, align 4
 ; CHECK-NEXT:store volatile i32 2147228864, i32* @var, align 4
 ; CHECK-NEXT:store volatile i32 2147228864, i32* @var, align 4
-; CHECK-NEXT:store volatile i32 2146502828, i32* @var, align 4
+; CHECK-NEXT:store volatile i32 2147027116, i32* @var, align 4
 ; CHECK-NEXT:store volatile i32 -1610612736, i32* @var, align 4
-; CHECK-NEXT:store volatile i32 2146502828, i32* @var, align 4
+; CHECK-NEXT:store volatile i32 2147027116, i32* @var, align 4
 ; CHECK-NEXT:store volatile i32 -2147483648, i32* @var, align 4
-; CHECK-NEXT:store volatile i32 2146502828, i32* @var, align 4
+; CHECK-NEXT:store volatile i32 2147027116, i32* @var, align 4
 ; CHECK-NEXT:store volatile i32 -1073741824, i32* @var, align 4
-; CHECK-NEXT:store volatile i32 2143034560, i32* @var, align 4
-; CHECK-NEXT:store volatile i32 2143034560, i32* @var, align 4
-; CHECK-NEXT:store volatile i32 2143034560, i32* @var, align 4
+; CHECK-NEXT:store volatile i32 2147228864, i32* @var, align 4
+; CHECK-NEXT:store volatile i32 2147228864, i32* @var, align 4
+; CHECK-NEXT:store volatile i32 2147228864, i32* @var, align 4
 ; CHECK-NEXT:ret i32 undef
 ;
 entry:
Index: llvm/test/Transforms/InstSimplify/ConstProp/cast.ll
===
--- llvm/test/Transforms/InstSimplify/ConstProp/cast.ll
+++ llvm/test/Transforms/InstSimplify/ConstProp/cast.ll
@@ -40,24 +40,24 @@
 }
 
 ; https://llvm.org/PR43907 - make sure that NaN doesn't morph into Inf.
-; SNaN remains SNaN.
+; SNaN becomes QNaN.
 
 define float @nan_f64_trunc() {
 ;

[PATCH] D88664: [AST] do not error on APFloat invalidOp in default mode

2020-10-01 Thread Sanjay Patel via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rG686eb0d8ded9: [AST] do not error on APFloat invalidOp in 
default mode (authored by spatel).
Herald added a project: clang.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88664/new/

https://reviews.llvm.org/D88664

Files:
  clang/lib/AST/ExprConstant.cpp


Index: clang/lib/AST/ExprConstant.cpp
===
--- clang/lib/AST/ExprConstant.cpp
+++ clang/lib/AST/ExprConstant.cpp
@@ -2439,7 +2439,8 @@
 return false;
   }
 
-  if (St & APFloat::opStatus::opInvalidOp) {
+  if ((St & APFloat::opStatus::opInvalidOp) &&
+  FPO.getFPExceptionMode() != LangOptions::FPE_Ignore) {
 // There is no usefully definable result.
 Info.FFDiag(E);
 return false;


Index: clang/lib/AST/ExprConstant.cpp
===
--- clang/lib/AST/ExprConstant.cpp
+++ clang/lib/AST/ExprConstant.cpp
@@ -2439,7 +2439,8 @@
 return false;
   }
 
-  if (St & APFloat::opStatus::opInvalidOp) {
+  if ((St & APFloat::opStatus::opInvalidOp) &&
+  FPO.getFPExceptionMode() != LangOptions::FPE_Ignore) {
 // There is no usefully definable result.
 Info.FFDiag(E);
 return false;
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D88664: [AST] do not error on APFloat invalidOp in default mode

2020-10-01 Thread Sanjay Patel via Phabricator via cfe-commits

spatel created this revision.
spatel added reviewers: rjmccall, efriedma, sepavloff.
Herald added a subscriber: mcrosier.
spatel requested review of this revision.

If FP exceptions are ignored, we should not error out of compilation just 
because APFloat indicated an exception. 
This is required as a preliminary step for D88238 
 which changes APFloat behavior for signaling 
NaN convert() to set the opInvalidOp exception status.

Currently, there is no way to trigger this error because convert() never sets 
opInvalidOp. FP binops that set opInvalidOp also create a NaN, so the path to 
checkFloatingPointResult() is blocked by a different diagnostic:

  // [expr.pre]p4:
  //   If during the evaluation of an expression, the result is not
  //   mathematically defined [...], the behavior is undefined.
  // FIXME: C++ rules require us to not conform to IEEE 754 here.
  if (LHS.isNaN()) {
Info.CCEDiag(E, diag::note_constexpr_float_arithmetic) << LHS.isNaN();
return Info.noteUndefinedBehavior();
  }
  return checkFloatingPointResult(Info, E, St);


https://reviews.llvm.org/D88664

Files:
  clang/lib/AST/ExprConstant.cpp


Index: clang/lib/AST/ExprConstant.cpp
===
--- clang/lib/AST/ExprConstant.cpp
+++ clang/lib/AST/ExprConstant.cpp
@@ -2439,7 +2439,8 @@
 return false;
   }
 
-  if (St & APFloat::opStatus::opInvalidOp) {
+  if ((St & APFloat::opStatus::opInvalidOp) &&
+  FPO.getFPExceptionMode() != LangOptions::FPE_Ignore) {
 // There is no usefully definable result.
 Info.FFDiag(E);
 return false;


Index: clang/lib/AST/ExprConstant.cpp
===
--- clang/lib/AST/ExprConstant.cpp
+++ clang/lib/AST/ExprConstant.cpp
@@ -2439,7 +2439,8 @@
 return false;
   }
 
-  if (St & APFloat::opStatus::opInvalidOp) {
+  if ((St & APFloat::opStatus::opInvalidOp) &&
+  FPO.getFPExceptionMode() != LangOptions::FPE_Ignore) {
 // There is no usefully definable result.
 Info.FFDiag(E);
 return false;
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D87188: [InstCombine] Canonicalize SPF to abs intrinc

2020-09-18 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

In D87188#2281957 , @sanwou01 wrote:

> I know this has already been reverted but just FYI that I've bisected a ~2% 
> regression in SPEC2017 x264_r on AArch64 to this commit. Presumably this is 
> due to the extra unrolling / cost modelling issue already mentioned?

That would be my guess. I'm not familiar with the unroller or inliner, but I 
don't even see them using the cost model at 1st glance. Are they purely 
counting IR instructions?

If I missed the cost model calls, the cost model is still not correct for 
`code-size`:

  $ cat cost.ll
  declare i32@llvm.abs.i32(i32, i1)
  define i32 @abs(i32 %x, i32 %y) {
%I32 = call i32 @llvm.abs.i32(i32 %x, i1 false)
%negx = sub i32 0, %x
%cmp = icmp sgt i32 %x, -1
%sel = select i1 %cmp, i32 %x, i32 %negx 
ret i32 %I32
  }
  
  $ opt  -cost-model -cost-kind=code-size -analyze -mtriple=aarch64--  cost.ll 
  Printing analysis 'Cost Model Analysis' for function 'abs':
  Cost Model: Found an estimated cost of 1 for instruction:   %I32 = call i32 
@llvm.abs.i32(i32 %x, i1 false)
  Cost Model: Found an estimated cost of 1 for instruction:   %negx = sub i32 
0, %x
  Cost Model: Found an estimated cost of 1 for instruction:   %cmp = icmp sgt 
i32 %x, -1
  Cost Model: Found an estimated cost of 1 for instruction:   %sel = select i1 
%cmp, i32 %x, i32 %negx

So we're assuming the size cost is either 1 or 3 depending on how we encoded 
abs() in IR, but neither is correct for AArch64 IIUC:

  $ llc -o - -mtriple=aarch64-- cost.ll 
cmp w0, #0  // =0
cnegw0, w0, mi
ret


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D87188/new/

https://reviews.llvm.org/D87188

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D87101: [X86] Update SSE/AVX ABS intrinsics to emit llvm.abs.* (PR46851)

2020-09-04 Thread Sanjay Patel via Phabricator via cfe-commits

spatel accepted this revision.
spatel added a comment.
This revision is now accepted and ready to land.

LGTM - I think the odds of code that is using intrinsic vector abs inter-mixing 
with auto-vectorized code using the abs idiom are small.

I noticed one other LLVM codegen test that might want to be updated to use the 
IR intrinsics to be more relevant -- 
llvm-project/llvm/test/CodeGen/X86/combine-abs.ll. That can be done with this 
patch or later.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D87101/new/

https://reviews.llvm.org/D87101

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D85787: [InstCombine] Aggregate reconstruction simplification (PR47060)

2020-08-16 Thread Sanjay Patel via Phabricator via cfe-commits

spatel accepted this revision.
spatel added a comment.
This revision is now accepted and ready to land.

This seems similar in spirit to the recursive element tracking that we do in 
collectShuffleElements() in this file or foldIdentityShuffles() in 
InstSimplify, but a bit more complicated (because of the phi translation?).
I've pointed out a few minor potential improvements, otherwise LGTM.




Comment at: llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp:730
+
+  static constexpr auto NotFound = None;
+

IIUC, you're intentionally giving the placeholder objects the same names as the 
enum values, but that confused me. Would it make sense to call this 
"InvalidValue" or similar?



Comment at: llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp:778
+
+  static constexpr auto FoundMismatch = nullptr;
+

Similar comment about the name as above for NotFound (and might be better to 
declare it on the next line after above?).



Comment at: llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp:780
+
+  enum class SourceAggegate {
+/// When analyzing the value that was inserted into an aggregate, we did

Spelling - "SourceAggregate"



Comment at: llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp:855
+  // Okay, what have we found? Does that correlate with previous findings?
+  switch (Describe(SourceAggregateForElement)) {
+  case SourceAggegate::NotFound:

Instead of switch, could we reduce to:
  if (Describe(SourceAggForElt) != SourceAggregate::Found)
return SourceAggForElt;



Comment at: llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp:907
+  // Don't bother if there are more than 64 predecessors.
+  if (UseBB->hasNPredecessorsOrMore(64 + 1))
+return nullptr;

Best to give that "64" a name in case we want to experiment with other settings.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D85787/new/

https://reviews.llvm.org/D85787

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D85787: [InstCombine] Aggregate reconstruction simplification (PR47060)

2020-08-13 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

In D85787#2214038 , @lebedev.ri wrote:

> @ reviewers - i'm not so much interested in deep code/algo review,
> but more like in the general direction disscussion, like, is this okay for 
> instcombine? :)

The test diffs look great, and it seems to at least pay for itself in 
compile-time impact, so I think it's a good direction. There's always a 
question of "is this substantial enough (or would it grow enough with the 
'TODO' items) to be a stand-alone pass?".

Haven't had a chance to look at the code itself yet. Do we have tests where 
there are extra uses of the extracted values?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D85787/new/

https://reviews.llvm.org/D85787

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D78478: [UpdateTestChecks] Add UTC_ARGS support for update_{llc,cc}_test_checks.py

2020-07-02 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

In D78478#2128631 , @xbolva00 wrote:

> >> UTC_ARGS (I can't help associating it with Coordinated Universal Time).
>
> Me too. Any suggestions for new name? TCU_ARGS?

I also reflexively think of universal time when reading "UTC". I think we 
always print this with the script name advertisement line, so "SCRIPT_ARGS" 
seems non-ambiguous.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78478/new/

https://reviews.llvm.org/D78478

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D80584: [utils] change update_test_checks.py use of 'TMP' value names

2020-06-01 Thread Sanjay Patel via Phabricator via cfe-commits

spatel closed this revision.
spatel added a comment.

In D80584#2066319 , @jdoerfert wrote:

> First, I think the warning is strictly a good thing. Thanks for keeping that 
> in!

Sure - we've been missing that for a long time.

> I don't think the options are really "complexity to the user" given a 
> sensible default (= the old value). If a user sees a warning that says: `Name 
> conflict, use --nameless-prefix=X, where X is a string not otherwise 
> used in your IR`, they will happily do so. The UTC_ARGS extension makes it 
> permanent and no one needs to worry about this until they see a warning 
> again. (If we can "hide" the option from the help message, I'm fine with that 
> too.)

I'm still not convinced that's worth doing. I did change instnamer though:
rGdd54432a0f5a 

...so between the warning and at least 1 of these not using 'tmp', I think 
we're good enough.

> Adding a test is just something we should do as it is easy to accidentally 
> break functionality like the warning.

Yep, I didn't remember that we had tests for scripts. So we have that now, and 
there's a clang test that wiggles on this too...although that seems like we 
should generalize that file to avoid killing all of the bots like I did. :)

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80584/new/

https://reviews.llvm.org/D80584

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D80584: [utils] change update_test_checks.py use of 'TMP' value names

2020-06-01 Thread Sanjay Patel via Phabricator via cfe-commits

spatel reopened this revision.
spatel added a comment.
This revision is now accepted and ready to land.

Effectively reverted (but still have the name conflict warning at least) with:
rGe5b877275673 


If we can change instnamer, we can probably keep living with this script as-is. 
Alternatively, let's re-fix this script and mass update test files, so there's 
no general concern about churn of existing tests.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80584/new/

https://reviews.llvm.org/D80584



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D80584: [utils] change update_test_checks.py use of 'TMP' value names

2020-06-01 Thread Sanjay Patel via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rGe5b877275673: [utils] change default nameless value to 
TMP (authored by spatel).
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Changed prior to commit:
  https://reviews.llvm.org/D80584?vs=267500=267582#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80584/new/

https://reviews.llvm.org/D80584

Files:
  clang/test/utils/update_cc_test_checks/Inputs/mangled_names.c.expected
  clang/test/utils/update_cc_test_checks/Inputs/mangled_names.c.funcsig.expected
  llvm/test/tools/UpdateTestChecks/update_test_checks/Inputs/basic.ll
  llvm/test/tools/UpdateTestChecks/update_test_checks/Inputs/basic.ll.expected
  
llvm/test/tools/UpdateTestChecks/update_test_checks/Inputs/basic.ll.funcsig.expected
  llvm/utils/UpdateTestChecks/common.py

Index: llvm/utils/UpdateTestChecks/common.py
===
--- llvm/utils/UpdateTestChecks/common.py
+++ llvm/utils/UpdateTestChecks/common.py
@@ -218,7 +218,7 @@
 # spaces, commas, paren, or end of the string
 IR_VALUE_RE = re.compile(r'(\s+)%([\w.-]+?)([,\s\(\)]|\Z)')
 
-NAMELESS_PREFIX = "NAMELESS"
+NAMELESS_PREFIX = "TMP"
 
 # Create a FileCheck variable name based on an IR name.
 def get_value_name(var):
Index: llvm/test/tools/UpdateTestChecks/update_test_checks/Inputs/basic.ll.funcsig.expected
===
--- llvm/test/tools/UpdateTestChecks/update_test_checks/Inputs/basic.ll.funcsig.expected
+++ llvm/test/tools/UpdateTestChecks/update_test_checks/Inputs/basic.ll.funcsig.expected
@@ -1,5 +1,5 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature
-; Example input for update_llc_test_checks (taken from test/Transforms/InstSimplify/add.ll)
+; Example input for update_test_checks (taken from test/Transforms/InstSimplify/add.ll)
 ; RUN: opt < %s -instsimplify -S | FileCheck %s
 
 define i32 @common_sub_operand(i32 %X, i32 %Y) {
@@ -53,3 +53,13 @@
   %r = add <2 x i8> %yx, %xy
   ret <2 x i8> %r
 }
+
+define i32 @nameless_value(i32 %X) {
+; CHECK-LABEL: define {{[^@]+}}@nameless_value
+; CHECK-SAME: (i32 [[X:%.*]])
+; CHECK-NEXT:[[TMP1:%.*]] = sub i32 42, [[X]]
+; CHECK-NEXT:ret i32 [[TMP1]]
+;
+  %1 = sub i32 42, %X
+  ret i32 %1
+}
Index: llvm/test/tools/UpdateTestChecks/update_test_checks/Inputs/basic.ll.expected
===
--- llvm/test/tools/UpdateTestChecks/update_test_checks/Inputs/basic.ll.expected
+++ llvm/test/tools/UpdateTestChecks/update_test_checks/Inputs/basic.ll.expected
@@ -1,5 +1,5 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
-; Example input for update_llc_test_checks (taken from test/Transforms/InstSimplify/add.ll)
+; Example input for update_test_checks (taken from test/Transforms/InstSimplify/add.ll)
 ; RUN: opt < %s -instsimplify -S | FileCheck %s
 
 define i32 @common_sub_operand(i32 %X, i32 %Y) {
@@ -48,3 +48,12 @@
   %r = add <2 x i8> %yx, %xy
   ret <2 x i8> %r
 }
+
+define i32 @nameless_value(i32 %X) {
+; CHECK-LABEL: @nameless_value(
+; CHECK-NEXT:[[TMP1:%.*]] = sub i32 42, [[X:%.*]]
+; CHECK-NEXT:ret i32 [[TMP1]]
+;
+  %1 = sub i32 42, %X
+  ret i32 %1
+}
Index: llvm/test/tools/UpdateTestChecks/update_test_checks/Inputs/basic.ll
===
--- llvm/test/tools/UpdateTestChecks/update_test_checks/Inputs/basic.ll
+++ llvm/test/tools/UpdateTestChecks/update_test_checks/Inputs/basic.ll
@@ -1,4 +1,4 @@
-; Example input for update_llc_test_checks (taken from test/Transforms/InstSimplify/add.ll)
+; Example input for update_test_checks (taken from test/Transforms/InstSimplify/add.ll)
 ; RUN: opt < %s -instsimplify -S | FileCheck %s
 
 define i32 @common_sub_operand(i32 %X, i32 %Y) {
@@ -47,3 +47,12 @@
   %r = add <2 x i8> %yx, %xy
   ret <2 x i8> %r
 }
+
+define i32 @nameless_value(i32 %X) {
+; CHECK-LABEL: @nameless_value(
+; CHECK-NEXT:[[TMP1:%.*]] = sub i32 42, [[X:%.*]]
+; CHECK-NEXT:ret i32 [[TMP1]]
+;
+  %1 = sub i32 42, %X
+  ret i32 %1
+}
Index: clang/test/utils/update_cc_test_checks/Inputs/mangled_names.c.funcsig.expected
===
--- clang/test/utils/update_cc_test_checks/Inputs/mangled_names.c.funcsig.expected
+++ clang/test/utils/update_cc_test_checks/Inputs/mangled_names.c.funcsig.expected
@@ -9,10 +9,10 @@
 // CHECK-NEXT:[[B_ADDR:%.*]] = alloca i32, align 4
 // CHECK-NEXT:store i64 [[A]], i64* [[A_ADDR]], align 8
 // CHECK-NEXT:store i32 [[B]], i32* [[B_ADDR]], align 4
-// CHECK-NEXT:[[NAMELESS0:%.*]] = load i64, i64* [[A_ADDR]], align 8
-// CHECK-NEXT:[[NAMELESS1:%.*]] = load i32, i32* [[B_ADDR]], align 4
-// CHECK-NEXT:[[CONV:%.*]] = sext i32 [[NAMELESS1]] to i64
-//

[PATCH] D80251: [X86] Update some av512 shift intrinsics to use "unsigned int" parameter instead of int to match Intel documentaiton

2020-05-20 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

Is it possible to fix those other 5 in the Intel docs for consistency, or is 
there some functional reason that those are different?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80251/new/

https://reviews.llvm.org/D80251



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D79472: [X86] Remove support some inline assembly constraints that are no longer supported by gcc.

2020-05-06 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added inline comments.



Comment at: llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp:154-156
+BasicBlock::iterator It(New);
+while (isa(*It) || isa(*It))
+  ++It;

Unrelated diff?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79472/new/

https://reviews.llvm.org/D79472



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D72467: Remove "mask" operand from shufflevector.

2020-03-27 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

Looks like D76649  landed and added another 
IRBuilder call that needs updating:

  [2974/5437] Building CXX object 
lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86PartialReduction.cpp.o
  FAILED: 
lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86PartialReduction.cpp.o 
  CCACHE_DIR=/mnt/disks/ssd0/ccache CCACHE_MAXSIZE=20G CCACHE_CPP2=yes 
CCACHE_HASHDIR=yes /usr/bin/ccache /usr/bin/clang++  -DGTEST_HAS_RTTI=0 
-D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS 
-D__STDC_LIMIT_MACROS -Ilib/Target/X86 
-I/mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/llvm/lib/Target/X86
 -I/usr/include/libxml2 -Iinclude 
-I/mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/llvm/include 
-gmlt -fPIC -fvisibility-inlines-hidden -Werror=date-time 
-Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter 
-Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic 
-Wno-long-long -Wimplicit-fallthrough -Wcovered-switch-default 
-Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor 
-Wstring-conversion -fdiagnostics-color -ffunction-sections -fdata-sections -O3 
 -fvisibility=hidden-fno-exceptions -fno-rtti -UNDEBUG -std=c++14 -MD -MT 
lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86PartialReduction.cpp.o -MF 
lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86PartialReduction.cpp.o.d -o 
lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86PartialReduction.cpp.o -c 
/mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/llvm/lib/Target/X86/X86PartialReduction.cpp
  
/mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/llvm/lib/Target/X86/X86PartialReduction.cpp:387:22:
 error: call to member function 'CreateShuffleVector' is ambiguous
  Ops[0] = Builder.CreateShuffleVector(Ops[0], Ops[0], {0, 1});
   ^~~
  
/mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/llvm/include/llvm/IR/IRBuilder.h:2625:10:
 note: candidate function
Value *CreateShuffleVector(Value *V1, Value *V2, ArrayRef Mask,
   ^
  
/mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/llvm/include/llvm/IR/IRBuilder.h:2633:10:
 note: candidate function
Value *CreateShuffleVector(Value *V1, Value *V2, ArrayRef Mask,
   ^
  
/mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/llvm/include/llvm/IR/IRBuilder.h:2618:10:
 note: candidate function not viable: cannot convert initializer list argument 
to 'llvm::Value *'
Value *CreateShuffleVector(Value *V1, Value *V2, Value *Mask,


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72467/new/

https://reviews.llvm.org/D72467



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D72675: [Clang][Driver] Fix -ffast-math/-ffp-contract interaction

2020-03-12 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

In D72675#1919793 , @mibintc wrote:

> In D72675#1919685 , @spatel wrote:
>
> > In D72675#1917844 , @wristow wrote:
> >
> > > Revisiting this patch.  I think that before fixing the 
> > > `-ffp-contract=off` problem I originally raised here, there are two 
> > > questions that have come up in this discussion that we should first 
> > > resolve.  Specifically:
> > >
> > > (1)  Do we want to enable FMA transformations by default (at appropriate 
> > > optimization levels), like GCC does?  That is, should FMA be done for 
> > > targets that support it with a command as simple as the following?
> > >
> > >   clang -O2 -c test.c
> >
> >
> > This has been discussed/tried a few times including very recently. I'm not 
> > sure where we stand currently, but here's some background:
> >  https://reviews.llvm.org/rL282259
> >  D74436   - cc @mibintc
> >
> > And (sorry for the various flavors of links to the dev-lists, but that's 
> > how it showed up in the search results):
> >  http://lists.llvm.org/pipermail/llvm-dev/2017-March/29.html
> >  http://lists.llvm.org/pipermail/cfe-dev/2020-February/064645.html
> >  https://groups.google.com/forum/#!msg/llvm-dev/nBiCD5KvYW0/yfjrVzR7DAAJ
> >  https://groups.google.com/forum/#!topic/llvm-dev/Aev2jQYepKQ
> >  
> > http://clang-developers.42468.n3.nabble.com/RFC-FP-contract-on-td4056277.html
>
>
> A few weeks ago there was a discussion on cfe-dev and llvm-dev initiated by 
> @andrew.w.kaylor and while the response wasn't unanimous, there was an 
> overwhelming sentiment (in my opinion) that it is correct and desireable to 
> enable ffp-contract=on by default even at -O0.  I had submitted a patch to do 
> that. However the patch caused performance regressions in about 20 LNT tests 
> and the patch was reverted, the regression was seen on multiple architecture. 
> I can send provide a few more details about the performance problems, I don't 
> have a solution to the LNT regression.


Thanks for the update! I saw your questions about digging into the LNT 
regressions, but I don't have the answers (ping that message in case someone 
out there missed it?). 
Let's break this into pieces and see if we can make progress apart from that:

1. Where is the list of LNT tests that showed problems?
2. Were the regressions on x86 bots? If so, many developers can likely repro 
those locally.
3. For regressions on other targets, we can guess at the problems by comparing 
asm (or recruit someone with the target hardware?).


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72675/new/

https://reviews.llvm.org/D72675



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D72675: [Clang][Driver] Fix -ffast-math/-ffp-contract interaction

2020-03-12 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a subscriber: mibintc.
spatel added a comment.

In D72675#1917844 , @wristow wrote:

> Revisiting this patch.  I think that before fixing the `-ffp-contract=off` 
> problem I originally raised here, there are two questions that have come up 
> in this discussion that we should first resolve.  Specifically:
>
> (1)  Do we want to enable FMA transformations by default (at appropriate 
> optimization levels), like GCC does?  That is, should FMA be done for targets 
> that support it with a command as simple as the following?
>
>   clang -O2 -c test.c


This has been discussed/tried a few times including very recently. I'm not sure 
where we stand currently, but here's some background:
https://reviews.llvm.org/rL282259
D74436   - cc @mibintc

And (sorry for the various flavors of links to the dev-lists, but that's how it 
showed up in the search results):
http://lists.llvm.org/pipermail/llvm-dev/2017-March/29.html
http://lists.llvm.org/pipermail/cfe-dev/2020-February/064645.html
https://groups.google.com/forum/#!msg/llvm-dev/nBiCD5KvYW0/yfjrVzR7DAAJ
https://groups.google.com/forum/#!topic/llvm-dev/Aev2jQYepKQ
http://clang-developers.42468.n3.nabble.com/RFC-FP-contract-on-td4056277.html


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72675/new/

https://reviews.llvm.org/D72675



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D75467: [instcombine] remove fsub to fneg hacks; only emit fneg

2020-03-10 Thread Sanjay Patel via Phabricator via cfe-commits

spatel accepted this revision.
spatel added a comment.

LGTM - see inline for 1 more comment nit.




Comment at: llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp:2132
 
   // Subtraction from -0.0 is the canonical form of fneg.
+  // fsub -0.0, X ==> fneg X

This comment is stale. Remove or update to something like:
  // Convert fsub from zero to the canonical fneg instruction.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75467/new/

https://reviews.llvm.org/D75467



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D75467: [instcombine] remove fsub to fneg hacks; only emit fneg

2020-03-10 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added inline comments.



Comment at: llvm/include/llvm/IR/PatternMatch.h:622
 inline bind_ty m_Instruction(Instruction *) { return I; }
 /// Match a binary operator, capturing it if we match.
+inline bind_ty m_UnOp(UnaryOperator *) { return I; }

Match a binary -> Match a unary



Comment at: llvm/include/llvm/IR/PatternMatch.h:802-803
+
+  // The evaluation order is always stable, regardless of Commutability.
+  // The LHS is always matched first.
+  AnyUnaryOp_match(const OP_t ) : X(X) {}

The copied comment doesn't make sense for a unary op; there's no 
commutability/LHS here.



Comment at: llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp:379
 
+  // TODO de-duplicate binary and unary matches cmatchers with n-ary matching
+  UnaryOperator *UO;

typo: cmatchers?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75467/new/

https://reviews.llvm.org/D75467



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D74500: clang: Treat ieee mode as the default for denormal-fp-math

2020-03-02 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added inline comments.



Comment at: llvm/include/llvm/ADT/FloatingPointMode.h:48
 
-  DenormalMode() = default;
-  DenormalMode(DenormalModeKind Out, DenormalModeKind In) :
+  constexpr DenormalMode() = default;
+  constexpr DenormalMode(DenormalModeKind Out, DenormalModeKind In) :

Can these constexpr diffs be separated out as an NFC preliminary commit?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74500/new/

https://reviews.llvm.org/D74500



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D69979: clang: Guess at some platform FTZ/DAZ default settings

2020-02-10 Thread Sanjay Patel via Phabricator via cfe-commits

spatel accepted this revision.
spatel added a comment.
This revision is now accepted and ready to land.

LGTM - the PS4 behavior was confirmed off-list.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69979/new/

https://reviews.llvm.org/D69979



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D72467: Remove "mask" operand from shufflevector.

2020-01-24 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

In D72467#1839172 , @efriedma wrote:

> In D72467#1838591 , @spatel wrote:
>
> > LGTM, but should get a 2nd opinion since I'm not familiar with some of the 
> > parts.
>
>
> Any specific part you're worried about?

I don't know anything about the scalable vector enhancements, but that seems 
covered now. Nuances of the bitcode serialization are beyond me, but I trust 
you've stepped through that. So, no worries. :)

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72467/new/

https://reviews.llvm.org/D72467

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D72467: Remove "mask" operand from shufflevector.

2020-01-24 Thread Sanjay Patel via Phabricator via cfe-commits

spatel accepted this revision.
spatel added a comment.
This revision is now accepted and ready to land.

LGTM, but should get a 2nd opinion since I'm not familiar with some of the 
parts.
Also, please update the related LangRef text for shufflevector in or alongside 
this patch.

Could add some ascii art to visualize the mask indexing as requested by 
@ctetreau ?

  shufflevector <3 x float> %x, <3 x float> %y,  %mask:
  +---+---+---+---+---+---+
  | x | y |  vector operands
  +---+---+---+---+---+---+
  | 0 | 1 | 2 | 0 | 1 | 2 |  per-operand indexing
  +---+---+---+---+---+---+
  | 0 | 1 | 2 | 3 | 4 | 5 |  shuffle mask indexing
  +---+---+---+---+---+---+




Comment at: llvm/include/llvm/IR/Instructions.h:2039
   /// Return the shuffle mask value of this instruction for the given element
   /// index. Return -1 if the element is undef.
   int getMaskValue(unsigned Elt) const {

Can replace the magic -1 with the named "UndefMaskElem" in this comment.



Comment at: llvm/include/llvm/IR/Instructions.h:2050
   /// Return the mask for this instruction as a vector of integers. Undefined
   /// elements of the mask are returned as -1.
   void getShuffleMask(SmallVectorImpl ) const {

Can replace the magic -1 with the named "UndefMaskElem" in this comment.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72467/new/

https://reviews.llvm.org/D72467



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D72467: Remove "mask" operand from shufflevector.

2020-01-23 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added inline comments.



Comment at: llvm/include/llvm/IR/Constants.h:1220
 
+  /// Assert that this is an shufflevector and return the mask. See class
+  /// ShuffleVectorInst for a description of the mask representation.

an shufflevector -> a shufflevector



Comment at: llvm/include/llvm/IR/Constants.h:1224
+
+  /// Assert that this is an shufflevector and return the mask.
+  ///

an shufflevector -> a shufflevector



Comment at: llvm/include/llvm/IR/IRBuilder.h:2544
+SmallVector IntMask;
+ShuffleVectorInst::getShuffleMask(cast(Mask), IntMask);
+return CreateShuffleVector(V1, V2, IntMask, Name);

Add an assert that Mask isa ?



Comment at: llvm/include/llvm/IR/Instructions.h:1985-1986
 ///
+/// The shuffle mask operand specifies, for each element of the result vector,
+/// which element of the two input vectors the result element gets. The
+/// shuffle mask is represented as an array of integers. Positive integers

This reads awkwardly to me (if you agree, we can update the LangRef too). 
How about:
"For each element of the result vector, the shuffle mask selects an element 
from one of the input vectors to copy to the result. Non-negative elements in 
the mask represent an index into the concatenated pair of input vectors. 
UndefMaskElem (-1) specifies that the result element is undefined."



Comment at: llvm/include/llvm/IR/Instructions.h:2015
 
-  // allocate space for exactly three operands
+  // allocate space for exactly two operands
   void *operator new(size_t s) {

This comment doesn't add value to me, so I'd just delete it. If we want to keep 
it, should fix it to be a proper sentence with a capital letter and period.



Comment at: llvm/include/llvm/IR/PatternMatch.h:1306
+
+struct m_ZeroMask {
+  bool match(ArrayRef Mask) {

IIUC, this is meant to replace the current uses of m_Zero() or m_ZeroInt() with 
shuffle pattern matching, but there's a logic difference because those matchers 
allow undefs, but this does not? Are we missing unittests to catch that case?



Comment at: llvm/include/llvm/IR/PatternMatch.h:1320
+
 /// Matches ShuffleVectorInst.
+template 

Update/add comment:
"Matches ShuffleVectorInst independently of mask value." ?



Comment at: llvm/lib/Bitcode/Writer/ValueEnumerator.cpp:937-939
+   if (auto *CE = dyn_cast(C))
+ if (CE->getOpcode() == Instruction::ShuffleVector)
+   EnumerateOperandType(CE->getShuffleMaskForBitcode());

Indentation looks off-by-1.



Comment at: llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp:1937
+  ArrayRef Mask;
+  if (const ShuffleVectorInst *SVI = dyn_cast())
+Mask = SVI->getShuffleMask();

if (auto *SVI = ...)



Comment at: llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:3587
+  ArrayRef Mask;
+  if (const ShuffleVectorInst *SVI = dyn_cast())
+Mask = SVI->getShuffleMask();

if (auto *SVI = ...)



Comment at: llvm/lib/ExecutionEngine/Interpreter/Execution.cpp:1890
   for( unsigned i=0; igetSyncScopeID() == cast(I2)->getSyncScopeID();
+  if (const ShuffleVectorInst *SVI = dyn_cast(I1))
+return SVI->getShuffleMask() == 
cast(I2)->getShuffleMask();

Would have suggested "auto *" here, but really the whole thing should be turned 
into a switch() as a preliminary cleanup?



Comment at: llvm/lib/IR/Instructions.cpp:1824-1825
+ Instruction *InsertBefore)
+: 
Instruction(VectorType::get(cast(V1->getType())->getElementType(),
+Mask.size(), V1->getType()->getVectorIsScalable()),
+  ShuffleVector,

I see this is copied, but indentation seems off.



Comment at: llvm/lib/Transforms/Scalar/GVN.cpp:304
   e.varargs.push_back(*II);
+  } else if (ShuffleVectorInst *SVI = dyn_cast(I)) {
+ArrayRef ShuffleMask = SVI->getShuffleMask();

Use "auto *" in each clause (or convert to switch).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72467/new/

https://reviews.llvm.org/D72467



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D72675: [Clang][Driver] Fix -ffast-math/-ffp-contract interaction

2020-01-20 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

In D72675#1827893 , @wristow wrote:

> How to handle this seems like an implementation question.  The code here is 
> just deciding whether or not we intend to pass "-ffast-math" to cc1 (it isn't 
> directly defining `__FAST_MATH__`).  If we do pass it to cc1, then 
> "clang/lib/Frontend/InitPreprocessor.cpp" will pre-define `__FAST_MATH__`:
>
>   if (LangOpts.FastMath)
> Builder.defineMacro("__FAST_MATH__");
>   
>
> It seems to me that a straightforward way to deal with this question is to 
> make the above test more elaborate (that is, if `LangOpts.FastMath` is set, 
> OR whatever the appropriate subset of fast-math switches are set).  If we do 
> that, we don't need to deal with the "umbrella" aspect of "-ffast-math", 
> which gets messy.
>
> Which is to say the approach here can stay the same as it currently is 
> (`-ffp-contract=off/on` suppressing the passing of "-ffast-math" to cc1).  
> Although the comment about `__FAST_MATH__` here in "Clang.cpp" should be 
> changed, if we take this approach.


That sounds fine to me, so could update that comment/this patch.

Let's give this a couple of days after updating to see if anyone else has 
feedback though. I'm never exactly sure what our responsibility is with respect 
to gcc compatibility.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72675/new/

https://reviews.llvm.org/D72675



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D72675: [Clang][Driver] Fix -ffast-math/-ffp-contract interaction

2020-01-17 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added reviewers: efriedma, scanon, arsenm, echristo, RKSimon.
spatel added a comment.
Herald added a subscriber: wdng.

Adding potential FP reviewers for question of gcc- (potentially-buggy-) 
compatibility.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72675/new/

https://reviews.llvm.org/D72675



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D72675: [Clang][Driver] Fix -ffast-math/-ffp-contract interaction

2020-01-17 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

This follows the reasoning that we discussed in the earlier discussion, but 
after re-reading the gcc comment in particular, I'm wondering whether this is 
what we really want to do...
If __FAST_MATH__ is purely here for compatibility with gcc, then should we 
mimic gcc behavior in setting that macro even if we think it's buggy? 
Ie, when we translate these settings to LLVM's FMF, we can still override the 
-ffast-math flag by checking the -ffp-contract flag (if I'm seeing it 
correctly, the existing code will pass that alongside -ffast-math when contract 
is set to on/off).




Comment at: clang/lib/Driver/ToolChains/Clang.cpp:2763
 
+  // If -ffp-contract=off has been specified on the command line, then we must
+  // suppress the emission of -ffast-math and -menable-unsafe-fp-math to cc1.

This comment doesn't match the code any more. Should be "If 
-ffp-contract=off/on, then..."



Comment at: clang/test/Driver/fast-math.c:184
+// -ffp-contract=off and -ffp-contract=on must disable the fast-math umbrella,
+// and the unsafe-fp-math umbrella (-ffp-conteact=fast leaves them enabled).
+// RUN: %clang -### -ffast-math -ffp-contract=off -c %s 2>&1 \

typo: conteact


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72675/new/

https://reviews.llvm.org/D72675



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D72675: Fix -ffast-math/-ffp-contract interaction

2020-01-16 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

In D72675#1824659 , @wristow wrote:

> This all sounds good to me.
>
> So to make sure we're all on the same page, my understanding is that the plan 
> forward is:
>
> 1. Make the Clang change first (including adding another pair of tests for 
> `-ffp-contract=on`).
> 2. Update the LLVM tests illustrating the current baseline state.
> 3. Make the LLVM change shown here, and update the tests with the fixes.
> 4. Bring in the DAG combiner work that Michael has done internally at Apple.
>
>   Points 1, 2, and 3 are essentially the points in the patch posted here, so 
> I'll do that.  And of course Michael will then take on point 4.
>
>   Is that the plan?  If yes, I'll transition this item to just be the Clang 
> pieces, and I'll spin off a new one to do the LLVM portion of what is posted 
> here.


SGTM

> Sanjay, regarding:
> 
>> But it would be better to have all of the baseline tests in place, so we 
>> know where we stand currently.
> 
> I'm interpreting that as applying to the LLVM tests.  That is, the Clang 
> change, along with the updated tests, can be in one commit, rather than first 
> updating the Clang driver tests with the current state.  If you'd prefer two 
> commits on the Clang side as well, let me know.

One commit for the clang changes should be ok; it's a very small diff. But I'm 
still not sure if the driver change induces frontend diffs that we should make 
visible via tests.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72675/new/

https://reviews.llvm.org/D72675



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D72675: Fix -ffast-math/-ffp-contract interaction

2020-01-16 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

In D72675#1823227 , @mcberg2017 wrote:

> We crossed that bridge internally at Apple a while ago, meaning I have some 
> code debt for cleaning up open source for fma formation that uses contract 
> and reassoc differently than we do today, both together and separately, case 
> by case.

Ah, great. I don't have a preference about the order of patches (ie, is it 
better to make this change first or do the backend cleanup first), so whatever 
works better.

But it would be better to have all of the baseline tests in place, so we know 
where we stand currently. So I'd prefer to have the PPC and x86 tests 
pre-committed to master (change test comments as needed to match current/future 
reality).

We could also decouple the clang part of this patch from the backend change 
IIUC. But do we need another change (or at least tests) to show how the driver 
difference translates in the clang front-end to LLVM FMF and/or function 
attributes?

Comment at: clang/test/Driver/fast-math.c:183-184
 //
+// -ffp-contract=off must disable the fast-math umbrella, and the 
unsafe-fp-math
+// umbrella.
+// RUN: %clang -### -ffast-math -ffp-contract=off -c %s 2>&1 \

Do we need another pair of tests for -ffp-contract=on?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72675/new/

https://reviews.llvm.org/D72675

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D72675: Fix -ffast-math/-ffp-contract interaction

2020-01-15 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added inline comments.



Comment at: llvm/test/CodeGen/PowerPC/fmf-propagation.ll:201-203
 ; fma(X, 7.0, X * 42.0) --> X * 49.0
-; This is the minimum FMF needed for this transform - the FMA allows 
reassociation.
+; This is the minimum FMF needed for this transform - the 'contract' allows 
the needed reassociation.
+

I was ok with the reasoning up to here, but this example lost me.
Why does 'contract' alone allow splitting an fma?
Is 'contract' relevant on anything besides fadd (with an fmul operand)?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72675/new/

https://reviews.llvm.org/D72675



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D72028: [CodeGen] Emit conj/conjf/confjl libcalls as fneg instructions if possible.

2019-12-31 Thread Sanjay Patel via Phabricator via cfe-commits

spatel accepted this revision.
spatel added a comment.
This revision is now accepted and ready to land.

Code change LGTM - see inline for a question about the testing.




Comment at: clang/test/CodeGen/complex-libcalls.c:115-118
+// NO__ERRNO-NOT: .conj
+// NO__ERRNO-NOT: @conj
+// HAS_ERRNO-NOT: .conj
+// HAS_ERRNO-NOT: @conj

Could we use positive matches for "fneg" here rather than NOT lines? If so, 
then do we need the new test file?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72028/new/

https://reviews.llvm.org/D72028



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D69979: clang: Guess at some platform FTZ/DAZ default settings

2019-12-11 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added inline comments.



Comment at: clang/test/Driver/default-denormal-fp-math.c:7
+// crtfastmath enables ftz and daz
+// RUN: %clang -### -target x86_64-unknown-linux-gnu -ffast-math 
--sysroot=%S/Inputs/basic_linux_tree -c %s -v 2>&1 | FileCheck 
-check-prefix=CHECK-ZEROSIGN %s
+

The prefix should be PRESERVE_SIGN to match the flag?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69979/new/

https://reviews.llvm.org/D69979



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D69979: clang: Guess at some platform FTZ/DAZ default settings

2019-12-02 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a subscriber: andreadb.
spatel added inline comments.



Comment at: clang/include/clang/Driver/ToolChain.h:580
+  /// This checks for presence of the -Ofast, -ffast-math or -funsafe-math 
flags.
+  virtual bool FastMathRuntimeIsAvailable(
+const llvm::opt::ArgList , std::string ) const;

Formatting nit - prefer to start with verb and lower-case: 
isFastMathRuntimeAvailable() or hasFastMathRuntime().



Comment at: clang/include/clang/Driver/ToolChain.h:587
   /// This checks for presence of the -Ofast, -ffast-math or -funsafe-math 
flags.
-  virtual bool AddFastMathRuntimeIfAvailable(
-  const llvm::opt::ArgList , llvm::opt::ArgStringList ) const;
+  bool AddFastMathRuntimeIfAvailable(
+const llvm::opt::ArgList , llvm::opt::ArgStringList ) const;

Add -> add



Comment at: clang/lib/Driver/ToolChains/PS4CPU.h:95-96
+const llvm::fltSemantics *FPType) const override {
+// DAZ and FTZ are on by default.
+return llvm::DenormalMode::getPreserveSign();
+  }

@probinson / @andreadb - is this correct for PS4? or is there some equivalent 
to the Linux startup file?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69979/new/

https://reviews.llvm.org/D69979



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D70488: [InstCombine] Infer fast math flags on fadd/fsub/fmul/fcmp

2019-11-20 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added reviewers: cameron.mcinally, mcberg2017, arsenm.
spatel added a comment.
Herald added a subscriber: wdng.

I like the idea, but I'd be more comfortable reviewing the diffs in stages, so 
we know that the test coverage for the value tracking calls is good. So I'd 
prefer if we split this somehow - either by the opcode callers (fadd, fsub, 
fmul...) or the the FMF analysis (nnan, nsz, ninf). That raises a few questions:

1. Why aren't fdiv and frem included?
2. Can we infer FMF for FP intrinsics/libcalls/select/phi? (follow-on patches)
3. We're moving away from FMF on fcmp (recent step: rGebf9bf2cbc8f 
), so is 
it worth including that change, or can we wait for that part to settle? (Side 
question may be if/when we're going to allow FMF on fptrunc/fpextend).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70488/new/

https://reviews.llvm.org/D70488



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D69979: clang: Guess at some platform FTZ/DAZ default settings

2019-11-18 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

In D69979#1749198 , @arsenm wrote:

> I just posted the test I wrote here: https://github.com/arsenm/subnormal_test


Thanks. I tried compiling with gcc (can't trust clang since it doesn't honor 
#pragma STDC FENV_ACCESS ON?). 
And running that on a Ubuntu 17.10 x86-64 system, it's behaving as I would 
expect. If you compile without -ffast-math,  it asserts:

  With denormals disabled
  a.out: subnormal_test.cpp:33: void fp32_denorm_test(): Assertion 
`std::fpclassify(subnormal) == FP_SUBNORMAL' failed.

And if you compile with -ffast-math, it asserts:

  In default FP mode
  a.out: subnormal_test.cpp:33: void fp32_denorm_test(): Assertion 
`std::fpclassify(subnormal) == FP_SUBNORMAL' failed.

This is what I see compiling Craig's csr tester:

  $ cc -O2 csr.c && ./a.out
  1f80
  $ cc -O2 csr.c -ffast-math && ./a.out
  9fc0

FZ is bit 15 (0x8000) and DAZ is bit 6 (0x0040), so they are clear in default 
(IEEE) mode and set with -ffast-math.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69979/new/

https://reviews.llvm.org/D69979



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D69979: clang: Guess at some platform FTZ/DAZ default settings

2019-11-14 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

In D69979#1740294 , @arsenm wrote:

> In D69979#1738099 , @craig.topper 
> wrote:
>
> > I checked Redhat 7.4 that's on the server I'm using for work. And I had a 
> > coworker check his Ubuntu 18.04 system with this program. And both systems 
> > printed 1f80 as the value of MXCSR which shows FTZ and DAZ are both 0. Are 
> > you seeing something different?
> >
> >   #include 
> >   #include 
> >  
> >   int main() {
> > int csr = _mm_getcsr();
> > printf("%x\n", csr);
> > return 0;
> >   }
> >
>
>
> I see the value as 1f80. However the test program I wrote suggests the 
> default is to flush (and what the comments in bug 34994 suggest?):

Is the test program attached somewhere? 
Bug 34994 (https://bugs.llvm.org/show_bug.cgi?id=34994) was limited to changing 
cases where we are running in some kind of loose-FP environment (otherwise, we 
would not be generating a sqrt estimate sequence at all). In the default 
(IEEE-compliant) environment, x86 would use a full-precision sqrt instruction 
or make a call to libm sqrt.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69979/new/

https://reviews.llvm.org/D69979

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D69979: clang: Guess at some platform FTZ/DAZ default settings

2019-11-08 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a reviewer: cameron.mcinally.
spatel added a comment.

Also, I may have missed some discussions. Does this patch series replace the 
proposal to add instruction-level FMF for denorms?
http://lists.llvm.org/pipermail/llvm-dev/2019-September/135183.html

Ie, did we decide that a function-level attribute is good enough?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69979/new/

https://reviews.llvm.org/D69979



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D69979: clang: Guess at some platform FTZ/DAZ default settings

2019-11-08 Thread Sanjay Patel via Phabricator via cfe-commits

spatel added a comment.

In D69979#1738099 , @craig.topper 
wrote:

> I checked Redhat 7.4 that's on the server I'm using for work. And I had a 
> coworker check his Ubuntu 18.04 system with this program. And both systems 
> printed 1f80 as the value of MXCSR which shows FTZ and DAZ are both 0. Are 
> you seeing something different?

AFAIK, x86(-64) Linux is IEEE-compliant by default. It's only when compiling 
with -ffast-math that clang/gcc link in the startup routine to set FTZ/DAZ. So 
this patch should use that same mechanism to set the denorm mode. See: 
https://reviews.llvm.org/rL165240

@RKSimon - is it the same on PS4?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69979/new/

https://reviews.llvm.org/D69979

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

1 2 3 >

1 - 100 of 225 matches

Mail list logo