[PATCH] maintainer-scripts: Adding GIT_CUSTOMREPO parameters to gcc_release script.
Hi GCC community, I need to have ability to point to custom repository in gcc_release script. This small patch 1) does add a parameter "-g" to add custom repository to gcc_release , 2) does add a line to download prerequisites before building GCC (download_prerequisites) which is not present in gcc_release right now. I tested the script on x86_64 Linux. Adding GIT_CUSTOMREPO parameters to gcc_release script. * maintainer-scripts/gcc_release Best wishes, Navid.From 730fef2cfd589b58e5f16ae765518754af3766b8 Mon Sep 17 00:00:00 2001 From: Navid Rahimi Date: Wed, 30 Mar 2022 10:34:24 -0700 Subject: [PATCH] Adding GIT_CUSTOMREPO parameters to gcc_release script. * maintainer-scripts/gcc_release --- maintainer-scripts/gcc_release | 16 +++- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/maintainer-scripts/gcc_release b/maintainer-scripts/gcc_release index 2456908d716..20ded30723d 100755 --- a/maintainer-scripts/gcc_release +++ b/maintainer-scripts/gcc_release @@ -79,6 +79,7 @@ Options: -t tag Tag to mark the release in git. -u username Username for upload operations. -b local-git-repoLocal git repository to speed up cloning. + -g custom-repo-link Link to custom git repository. EOF exit 1 } @@ -113,9 +114,9 @@ build_sources() { error "Could not check out release sources" fi - # If this is a final release, make sure that the ChangeLogs + # If this is a final release and it is not custom repository, make sure that the ChangeLogs # and version strings are updated. - if [ ${FINAL} -ne 0 ]; then + if [ ${FINAL} -ne 0 ] && [ -z $GIT_CUSTOMREPO ]; then inform "Updating ChangeLogs and version files" grep -q "gcc-${RELEASE_MAJOR}/index.html gcc-${RELEASE_MAJOR}/changes.html" \ @@ -266,6 +267,7 @@ EOF '' | 0* | *[!0-9]*) num_cpus=1;; esac fi +contrib/download_prerequisites contrib/gcc_build -d ${SOURCE_DIRECTORY} -o ${OBJECT_DIRECTORY} \ -c "--enable-generated-files-in-srcdir --disable-multilib" \ -m "-j$num_cpus" build || \ @@ -538,7 +540,8 @@ GIT_SERVER="gcc.gnu.org" GIT_REPOSITORY="/git/gcc.git" # The username to use when connecting to the server. GIT_USERNAME="${USER}" - +# Link to custom repo. +GIT_CUSTOMREPO="" # The machine to which files will be uploaded. GCC_HOSTNAME="gcc.gnu.org" # The name of the account on the machine to which files are uploaded. @@ -623,13 +626,14 @@ TAR="${TAR:-tar}" # Parse the options. -while getopts "d:fr:u:t:p:s:lb:" ARG; do +while getopts "d:fr:u:t:p:g:s:lb" ARG; do case $ARG in d)DESTINATION="${OPTARG}";; r)RELEASE="${OPTARG}";; t)TAG="${OPTARG}";; u)GIT_USERNAME="${OPTARG}";; f)FINAL=1;; +g)GIT_CUSTOMREPO="${OPTARG}";; s)SNAPSHOT=1 BRANCH=${OPTARG%:*} GITBRANCH=${OPTARG#*:} @@ -728,7 +732,9 @@ WORKING_DIRECTORY="${DESTINATION}/gcc-${RELEASE}" SOURCE_DIRECTORY="${WORKING_DIRECTORY}/gcc-${RELEASE}" # Set up GITROOT. -if [ $LOCAL -eq 0 ]; then +if [ -n $GIT_CUSTOMREPO ]; then +GITROOT=${GIT_CUSTOMREPO} +elif [ $LOCAL -eq 0 ]; then GITROOT="git+ssh://${GIT_USERNAME}@${GIT_SERVER}${GIT_REPOSITORY}" else GITROOT="/git/gcc.git" -- 2.25.1
Re: [EXTERNAL] [PATCH] testsuite: Fix up tree-ssa/pr103514.c testcase [PR103514]
Thanks Jakob for the correction. Sadly, I didn’t have any access to any non x86 architecture. But x86 was fully tested and there was no regression. In my spare time I will look at implementation of this for short-circuit targets. Best wishes, Navid. From: Jakub Jelinek Sent: Saturday, January 29, 2022 8:46:09 AM To: Richard Biener ; Jeff Law Cc: Navid Rahimi ; gcc-patches@gcc.gnu.org Subject: [EXTERNAL] [PATCH] testsuite: Fix up tree-ssa/pr103514.c testcase [PR103514] [You don't often get email from ja...@redhat.com. Learn why this is important at http://aka.ms/LearnAboutSenderIdentification.] On Fri, Jan 28, 2022 at 03:14:16PM -0700, Jeff Law via Gcc-patches wrote: > > This patch will add the missed pattern described in bug 103514 [1] to the > > match.pd. [1] includes proof of correctness for the patch too. > > > > PR tree-optimization/103514 > > * match.pd (a & b) ^ (a == b) -> !(a | b): New optimization. > > * match.pd (a & b) == (a ^ b) -> !(a | b): New optimization. > > * gcc.dg/tree-ssa/pr103514.c: Testcase for this optimization. > > > > 1) > > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fbugzilla%2Fshow_bug.cgi%3Fid%3D103514data=04%7C01%7Cnavidrahimi%40microsoft.com%7C712766ef9fc24c7ffeda08d9e346e086%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637790716153978385%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=CGslhJuy%2BTSrPpYbALD9pBh9945Hl6lINeHKmTEWqK0%3Dreserved=0 > Note the bug was filed an fixed during stage3, review just didn't happen in > a reasonable timeframe. > > I'm going to ACK this for the trunk and go ahead and commit it for you. The testcase FAILs on short-circuit targets like powerpc64le-linux. While the first 2 functions are identical, the last two look like: : if (a_5(D) != 0) goto ; [INV] else goto ; [INV] : if (b_6(D) != 0) goto ; [INV] else goto ; [INV] : : # iftmp.1_4 = PHI <1(3), 0(4)> _1 = a_5(D) == b_6(D); _2 = (int) _1; _3 = _2 ^ iftmp.1_4; _9 = _2 != iftmp.1_4; return _9; instead of the expected: : _3 = a_8(D) & b_9(D); _4 = (int) _3; _5 = a_8(D) == b_9(D); _6 = (int) _5; _1 = a_8(D) | b_9(D); _2 = ~_1; _7 = (int) _2; _10 = ~_1; return _10; so no wonder it doesn't match. E.g. x86_64-linux will also use jumps if it isn't just a && b but a && b && c && d (will do a & b and c & d tests and jump based on those. As it is too late to implement this optimization even for the short circuiting targets this late (not even sure which pass would be best), this patch just forces non-short-circuiting for the test. Tested on x86_64-linux -m32/-m64 and powerpc64le-linux, ok for trunk? 2022-01-29 Jakub Jelinek PR tree-optimization/103514 * gcc.dg/tree-ssa/pr103514.c: Add --param logical-op-non-short-circuit=1 to dg-options. --- gcc/testsuite/gcc.dg/tree-ssa/pr103514.c.jj 2022-01-29 11:11:39.338627697 +0100 +++ gcc/testsuite/gcc.dg/tree-ssa/pr103514.c2022-01-29 17:37:18.255237211 +0100 @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O -fdump-tree-optimized" } */ +/* { dg-options "-O --param logical-op-non-short-circuit=1 -fdump-tree-optimized" } */ #include bool @@ -30,4 +30,4 @@ h (bool a, bool b) /* Make sure we have removed "==" and "^" and "&". */ /* { dg-final { scan-tree-dump-not "&" "optimized"} } */ /* { dg-final { scan-tree-dump-not "\\^" "optimized"} } */ -/* { dg-final { scan-tree-dump-not "==" "optimized"} } */ \ No newline at end of file +/* { dg-final { scan-tree-dump-not "==" "optimized"} } */ Jakub
[PATCH] tree-optimization/103514 Missing XOR-EQ-AND Optimization
Hi GCC community, This patch will add the missed pattern described in bug 103514 [1] to the match.pd. [1] includes proof of correctness for the patch too. PR tree-optimization/103514 * match.pd (a & b) ^ (a == b) -> !(a | b): New optimization. * match.pd (a & b) == (a ^ b) -> !(a | b): New optimization. * gcc.dg/tree-ssa/pr103514.c: Testcase for this optimization. 1) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103514 Best wishes, Navid.From 7bc34478cc3a494bb634625281b5f03e43f210a9 Mon Sep 17 00:00:00 2001 From: Navid Rahimi Date: Wed, 1 Dec 2021 00:00:54 -0800 Subject: [PATCH] tree-optimization/103514 Missing XOR-EQ-AND Optimization * match.pd (a & b) ^ (a == b) -> !(a | b): New optimization. * match.pd (a & b) == (a ^ b) -> !(a | b): New optimization. * gcc.dg/tree-ssa/pr103514.c: Testcase for this optimization. --- gcc/match.pd | 8 ++ gcc/testsuite/gcc.dg/tree-ssa/pr103514.c | 33 2 files changed, 41 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr103514.c diff --git a/gcc/match.pd b/gcc/match.pd index 0d7b8dd1334..df55206d2ec 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -1768,6 +1768,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (negate (nop_convert? (bit_not @0))) (plus (view_convert @0) { build_each_one_cst (type); })) +/* (a & b) ^ (a == b) -> !(a | b) */ +/* (a & b) == (a ^ b) -> !(a | b) */ +(for first_op (bit_xor eq) + second_op (eq bit_xor) + (simplify + (first_op:c (bit_and:c truth_valued_p@0 truth_valued_p@1) (second_op:c @0 @1)) +(bit_not (bit_ior @0 @1 + /* Convert ~ (A - 1) or ~ (A + -1) to -A. */ (simplify (bit_not (convert? (minus @0 integer_each_onep))) diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr103514.c b/gcc/testsuite/gcc.dg/tree-ssa/pr103514.c new file mode 100644 index 000..5942ad37bf0 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr103514.c @@ -0,0 +1,33 @@ +/* { dg-do compile } */ +/* { dg-options "-O -fdump-tree-optimized" } */ +#include + +bool +i (bool a, bool b) +{ + return (a & b) ^ (a == b); +} + +bool +j (bool a, bool b) +{ + return (a & b) == (a ^ b); +} + +bool +g (bool a, bool b) +{ +return (a && b) == (a ^ b); +} + +bool +h (bool a, bool b) +{ + return (a && b) ^ (a == b); +} + + +/* Make sure we have removed "==" and "^" and "&". */ +/* { dg-final { scan-tree-dump-not "&" "optimized"} } */ +/* { dg-final { scan-tree-dump-not "\\^" "optimized"} } */ +/* { dg-final { scan-tree-dump-not "==" "optimized"} } */ \ No newline at end of file -- 2.25.1
[COMMITTED] MAINTAINERS: Add myself to write after approval and DCO
MAINTAINERS: Add myself to write after approval and DCO sections. * MAINTAINERS: Adding myself. Best wishes, Navid. 0001-MAINTAINERS-Add-myself-to-write-after-approval-and-D.patch Description: 0001-MAINTAINERS-Add-myself-to-write-after-approval-and-D.patch
Re: [EXTERNAL] Re: [PATCH] tree-optimization/103514 Missing XOR-EQ-AND Optimization
Hi Marc, thanks for clear explanation. Actually I have to withdraw this patch. As you noticed there are some problems with this. I was testing it yesterday, and I did realize I made mistake combining different types in this pattern. The same approach would work only and only if the types of every operand is boolean. But if there are any integer here, then this is not going to work. For looking at example and link to proof in each case [1]. One actual example, with the input values is like this: (a & b) ^ (a == b) -> !(a | b) a = 2 and b = 2, src : (a == b) => true (a & b) => 2 (a & b) ^ (a == b) => (2) ^ (true) => 3 (the compiler will consider true as 1) [2]. tgt: (a | b) => 2 !(a | b) => !(2) => false. Which means the transformation is incorrect for other types. (Same transformation does work for bool types, so I think in my next patch I will keep the approach and will restrict it to bool types only). 1) https://compiler-explorer.com/z/h7hcohY74 2) https://eel.is/c++draft/conv.prom#6 Best wishes, Navid. From: Marc Glisse Sent: Saturday, December 4, 2021 13:22 To: gcc-patches@gcc.gnu.org Cc: Navid Rahimi Subject: [EXTERNAL] Re: [PATCH] tree-optimization/103514 Missing XOR-EQ-AND Optimization [You don't often get email from marc.gli...@inria.fr. Learn why this is important at http://aka.ms/LearnAboutSenderIdentification.] +/* (a & b) ^ (a == b) -> !(a | b) */ +/* (a & b) == (a ^ b) -> !(a | b) */ +(for first_op (bit_xor eq) + second_op (eq bit_xor) + (simplify + (first_op:c (bit_and:c @0 @1) (second_op:c @0 @1)) + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) +&& types_match (TREE_TYPE (@0), TREE_TYPE (@1))) +(convert (bit_not (bit_ior @0 @1)) I don't think you need types_match, if both are operands of bit_and, their types must already match. It isn't clear what the INTEGRAL_TYPE_P test is for. Your 2 transformations don't seem that similar to me. The first one requires that a and b have the same type as the result of ==, so they are boolean-like. The second one makes sense for more general integers, but then it looks like it should produce (a|b)==0. It doesn't look like we have a canonical representation between a^b and a!=b for booleans :-( (sorry for the broken thread, I was automatically unsubscribed because mailman doesn't like greylisting) -- Marc Glisse
[PATCH] tree-optimization/103514 Missing XOR-EQ-AND Optimization
Hi GCC community, This patch will add the missed pattern described in bug 103514 [1] to the match.pd. Tested on x86_64 Linux. tree-optimization/103514 Missing XOR-EQ-AND Optimization * match.pd (a & b) == (a ^ b) -> !(a | b): New optimization. * match.pd (a & b) ^ (a == b) -> !(a | b): New optimization. * gcc.dg/tree-ssa/pr102232.c: Testcase for this optimization. 1) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103514 Best wishes, Navid. 0001-tree-optimization-103514-Missing-XOR-EQ-AND-Opt-v3.patch Description: 0001-tree-optimization-103514-Missing-XOR-EQ-AND-Opt-v3.patch
Re: [EXTERNAL] Re: [PATCH] tree-optimization/98956 Optimizing out boolean left shift
I see. That makes sense. Thanks for the explanation. I was looking at the 64992 and it seems all the implementation right now are only using INTEGER_CST at the moment. But now it makes sense. Best wishes, Navid. From: Andrew Pinski Sent: Tuesday, November 30, 2021 15:18 To: Navid Rahimi Cc: Navid Rahimi via Gcc-patches Subject: Re: [EXTERNAL] Re: [PATCH] tree-optimization/98956 Optimizing out boolean left shift On Tue, Nov 30, 2021 at 3:08 PM Navid Rahimi wrote: > > Hi Andrew, > > Thanks for your detailed comment. There are two problem I wanted to discuss > with you about: > > a) The optimization I have sent patch, does optimize variable length "<<" > too(for example B0 << x, where x is variable). This [1] link shows the actual > optimization and a link for the proof is included in the editor. > > b) I am unable to prove the optimization you are describing for non-constant > length shift. You can take a look at the code example [2] and proof [3]. I am > getting "Transformation doesn't verify!" when I do implement the optimization > you mentioned for non-constant shift. > > The optimization you are describing only works for "(take: (t << 1) != 0) -> > ((t & 0x7fff) != 0)" which only is provable and works for INTEGER_CST. No it works with non constants too: t << y != 0 -> t & (-1u>>y) != 0 When y == 0, you have t != 0. Which is exactly what you think it should be. Which can be further reduced to t != 0 as y >= sizeof(t)*BITS_PER_UNIT is undefined. Thanks, Andrew Pinski > > My understanding might be incorrect here, please don't hesitate to correct me. > > 1) > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcompiler-explorer.com%2Fz%2Fr46znh4Tjdata=04%7C01%7Cnavidrahimi%40microsoft.com%7Caa5443e61a5e4cdc177f08d9b457d03e%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637739111521114457%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=hGkyUkh4Srjb5%2BhvYdT30VLaDLGlkM6jBt3TmfcHFUw%3Dreserved=0 > 2) > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcompiler-explorer.com%2Fz%2FK1so39dbKdata=04%7C01%7Cnavidrahimi%40microsoft.com%7Caa5443e61a5e4cdc177f08d9b457d03e%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637739111521124452%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=EiVD3aIDzds%2BIX5EY3onWVuc%2FdMjoeDSyc5I1B2Xr%2F4%3Dreserved=0 > 3) > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Falive2.llvm.org%2Fce%2Fz%2F-54zZvdata=04%7C01%7Cnavidrahimi%40microsoft.com%7Caa5443e61a5e4cdc177f08d9b457d03e%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637739111521124452%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=vFiA3eWi5Ry3rFwp6iUc61JaVtzoWS6careE4I5rZvk%3Dreserved=0 > > Best wishes, > Navid. > > > From: Andrew Pinski > Sent: Tuesday, November 30, 2021 14:03 > To: Navid Rahimi > Cc: Navid Rahimi via Gcc-patches > Subject: [EXTERNAL] Re: [PATCH] tree-optimization/98956 Optimizing out > boolean left shift > > On Tue, Nov 30, 2021 at 8:35 AM Navid Rahimi via Gcc-patches > wrote: > > > > Hi GCC community, > > > > This patch will add the missed pattern described in bug 98956 [1] to the > > match.pd. The codegen and correctness proof for this pattern is here [2,3] > > in case anyone is curious. Tested on x86_64 Linux. > > > > A better way to optimize this is the following (which I describe in PR 64992): > take: (t << 1) != 0; > > This should be transformed into: > (t & 0x7fff) != 0 > > The rest will just fall out really. That is there is no reason to > special case bool here. > I have most of the patch except for creating the mask part which > should be simple, I just did not want to look up the wi:: functions at > the time I was writing it into the bug report. > > Thanks, > Andrew Pinski > > > > > Tree-optimization/98956: > > > > Adding new optimization to match.pd: > > * match.pd ((B0 << x) cmp 0) -> B0 cmp 0 : New optimization. > > * gcc.dg/tree-ssa/pr98956.c: testcase for this optimization. > > * gcc.dg/tree-ssa/pr98956-2.c: testcase for node with > > side-effect. > > > > 1) > > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fbugzilla%2Fshow_bug.cgi%3Fid%3D98956data=04%7C01%7Cnavidrahimi%40microsoft.com%7Caa5443e61a5e4cdc177f08d9b457d03e%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637739111521124452%7CUnknown%7CTWFpbGZsb3d8e
Re: [EXTERNAL] Re: [PATCH] tree-optimization/98956 Optimizing out boolean left shift
Hi Andrew, Thanks for your detailed comment. There are two problem I wanted to discuss with you about: a) The optimization I have sent patch, does optimize variable length "<<" too(for example B0 << x, where x is variable). This [1] link shows the actual optimization and a link for the proof is included in the editor. b) I am unable to prove the optimization you are describing for non-constant length shift. You can take a look at the code example [2] and proof [3]. I am getting "Transformation doesn't verify!" when I do implement the optimization you mentioned for non-constant shift. The optimization you are describing only works for "(take: (t << 1) != 0) -> ((t & 0x7fff) != 0)" which only is provable and works for INTEGER_CST. My understanding might be incorrect here, please don't hesitate to correct me. 1) https://compiler-explorer.com/z/r46znh4Tj 2) https://compiler-explorer.com/z/K1so39dbK 3) https://alive2.llvm.org/ce/z/-54zZv Best wishes, Navid. From: Andrew Pinski Sent: Tuesday, November 30, 2021 14:03 To: Navid Rahimi Cc: Navid Rahimi via Gcc-patches Subject: [EXTERNAL] Re: [PATCH] tree-optimization/98956 Optimizing out boolean left shift On Tue, Nov 30, 2021 at 8:35 AM Navid Rahimi via Gcc-patches wrote: > > Hi GCC community, > > This patch will add the missed pattern described in bug 98956 [1] to the > match.pd. The codegen and correctness proof for this pattern is here [2,3] in > case anyone is curious. Tested on x86_64 Linux. > A better way to optimize this is the following (which I describe in PR 64992): take: (t << 1) != 0; This should be transformed into: (t & 0x7fff) != 0 The rest will just fall out really. That is there is no reason to special case bool here. I have most of the patch except for creating the mask part which should be simple, I just did not want to look up the wi:: functions at the time I was writing it into the bug report. Thanks, Andrew Pinski > Tree-optimization/98956: > > Adding new optimization to match.pd: > * match.pd ((B0 << x) cmp 0) -> B0 cmp 0 : New optimization. > * gcc.dg/tree-ssa/pr98956.c: testcase for this optimization. > * gcc.dg/tree-ssa/pr98956-2.c: testcase for node with > side-effect. > > 1) > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fbugzilla%2Fshow_bug.cgi%3Fid%3D98956data=04%7C01%7Cnavidrahimi%40microsoft.com%7Cd83f36080fd94b563ab608d9b44d4d1f%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637739066369079450%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=EO7zAIa9sux4JklTDeALImoX3Kcjqeug%2BssU0E%2Fp6mY%3Dreserved=0 > 2) > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcompiler-explorer.com%2Fz%2Fnj4PTrecWdata=04%7C01%7Cnavidrahimi%40microsoft.com%7Cd83f36080fd94b563ab608d9b44d4d1f%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637739066369079450%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=GyivNuda31%2FPXJQQ4Z9tK2cFtj3N9YcvRdtM7rVkhHg%3Dreserved=0 > 3) > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Falive2.llvm.org%2Fce%2Fz%2FjyJAoSdata=04%7C01%7Cnavidrahimi%40microsoft.com%7Cd83f36080fd94b563ab608d9b44d4d1f%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637739066369079450%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=esqOKjKS5JZDbNBmAi0Bwwk0JTTHzInQ2Lgeq%2BPHJ9w%3Dreserved=0 > > Best wishes, > Navid.
[PATCH] tree-optimization/98956 Optimizing out boolean left shift
Hi GCC community, This patch will add the missed pattern described in bug 98956 [1] to the match.pd. The codegen and correctness proof for this pattern is here [2,3] in case anyone is curious. Tested on x86_64 Linux. Tree-optimization/98956: Adding new optimization to match.pd: * match.pd ((B0 << x) cmp 0) -> B0 cmp 0 : New optimization. * gcc.dg/tree-ssa/pr98956.c: testcase for this optimization. * gcc.dg/tree-ssa/pr98956-2.c: testcase for node with side-effect. 1) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98956 2) https://compiler-explorer.com/z/nj4PTrecW 3) https://alive2.llvm.org/ce/z/jyJAoS Best wishes, Navid. 0001-Tree-optimization-98956.patch Description: 0001-Tree-optimization-98956.patch
Re: [EXTERNAL] Re: [PATCH][WIP] PR tree-optimization/101808 Boolean comparison simplification
Jeff, Sorry for bringing back this thread. Quick question, you mentioned checking the TYPE_PRECISION to make sure the type is a canonical Boolean type (and not a fancy signed/unsigned Boolean type from Ada Andrew mentioned). But I noticed that truth_valued_p does already check for: (if (INTEGRAL_TYPE_P (type) && TYPE_PRECISION (type) == 1)) So in this case, there should not any other Boolean type fall into truth_valued_p type [1]? Is that right? 1) https://github.com/gcc-mirror/gcc/blob/master/gcc/match.pd#L1717 Best wishes, Navid. From: Jeff Law Sent: Tuesday, November 23, 2021 12:03 To: Navid Rahimi; Andrew Pinski Cc: Navid Rahimi via Gcc-patches Subject: Re: [EXTERNAL] Re: [PATCH][WIP] PR tree-optimization/101808 Boolean comparison simplification On 11/23/2021 12:55 PM, Navid Rahimi wrote: >> Did you test Ada with this patch as that is where the "odd" boolean >> types show up? > No I haven't tested Ada yet. Since it is work in progress still [WIP]. Quick > question, to prevent applying this optimization to those odd Boolean types in > Ada, there should be a check to check whether it is canonical boolean type or > signed/unsigned, which should prevent messing with odd Boolean types in Ada. IIRC, you should check the type's precision. THere should be examples you can find in one or more of the gimple optimizers. jeff
Re: [EXTERNAL] Re: [PATCH][WIP] PR tree-optimization/101808 Boolean comparison simplification
> In gimple your primary goal should be to reduce the number of > expressions that are evaluated. This patch does the opposite. That is actually a really good point in my opinion. I am hesitant about this patch and wanted to hear gcc-patch opinion about this. Doing something like this in IR level is a little bit counter intuitive to me. I will take a look at LLVM in my spare time to see where they are transferring that pattern and what was the rationale behind it. Best wishes, Navid. From: Jeff Law Sent: Tuesday, November 23, 2021 12:02 To: Navid Rahimi; Navid Rahimi via Gcc-patches Subject: Re: [EXTERNAL] Re: [PATCH][WIP] PR tree-optimization/101808 Boolean comparison simplification On 11/23/2021 12:42 PM, Navid Rahimi wrote: > In case of x86_64. This is the code: > > src_1(bool, bool): > cmp dil, sil > setbal > ret > > tgt_1(bool, bool): > xor edi, 1 > mov eax, edi > and eax, esi > ret > > > Lets look at the latency of the src_1: > cmp: latency of 1: (page 663, table C-17) > setb: latency of 2. They don't report setb latency in intel instruction > manual. But the closest instruction to this setbe does have latency of 2. > > But for tgt_1: > xor: latency 1. > mov: latency 1. (But it seems x86_64 does optimize this instruction and > basically it is latency 0 in this case. In Zero-Latency MOV Instructions > section they explain it [1].) > and: latency 1. > > So even if you consider setb as latency of 1 it is equal. But if it is > latency of 2, it should be a 1 latency win. > > 1) > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.intel.com%2Fcontent%2Fdam%2Fwww%2Fpublic%2Fus%2Fen%2Fdocuments%2Fmanuals%2F64-ia-32-architectures-optimization-manual.pdfdata=04%7C01%7Cnavidrahimi%40microsoft.com%7Cda4bfe80ceaa432a813e08d9aebc33ee%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637732945624565576%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=sopToDx8Y4xfROdI7nRYxYQ%2FCHJPgjIKKGEaWiAXmL4%3Dreserved=0 But these are target issues you've raised -- those should be handled in the RTL pipeline and are not a significant concern for gimple. In gimple your primary goal should be to reduce the number of expressions that are evaluated. This patch does the opposite. jeff
Re: [EXTERNAL] Re: [PATCH][WIP] PR tree-optimization/101808 Boolean comparison simplification
> Did you test Ada with this patch as that is where the "odd" boolean > types show up? No I haven't tested Ada yet. Since it is work in progress still [WIP]. Quick question, to prevent applying this optimization to those odd Boolean types in Ada, there should be a check to check whether it is canonical boolean type or signed/unsigned, which should prevent messing with odd Boolean types in Ada. Best wishes, Navid. From: Andrew Pinski Sent: Tuesday, November 23, 2021 11:33 To: Jeff Law Cc: Navid Rahimi; Navid Rahimi via Gcc-patches Subject: [EXTERNAL] Re: [PATCH][WIP] PR tree-optimization/101808 Boolean comparison simplification [You don't often get email from pins...@gmail.com. Learn why this is important at http://aka.ms/LearnAboutSenderIdentification.] On Tue, Nov 23, 2021 at 11:15 AM Jeff Law via Gcc-patches wrote: > > > > On 11/23/2021 11:34 AM, Navid Rahimi via Gcc-patches wrote: > > Hi GCC community, > > > > I wanted you take a quick look at this patch to solve this bug [1]. This is > > the code example for the optimization [2] which does include a link to > > proof of each different optimization. > > > > I think it should be possible to use simpler approach than what Andrew has > > used here [3]. > > > > P.S. Tested and verified on Linux x86_64. > > > > 1) > > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fbugzilla%2Fshow_bug.cgi%3Fid%3D101808data=04%7C01%7Cnavidrahimi%40microsoft.com%7C7b45fdd017874f287caf08d9aeb836ad%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637732928490579680%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=UsE0BbrZpRhrPZZreF%2Bj2spaYmJZuVLc053sWTFG6Ow%3Dreserved=0 > > 2) > > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcompiler-explorer.com%2Fz%2FGc448eE3zdata=04%7C01%7Cnavidrahimi%40microsoft.com%7C7b45fdd017874f287caf08d9aeb836ad%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637732928490579680%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=vGlVXyBqBeABvP8hQGb6paYj1t078rSlLdpI0t6qDlc%3Dreserved=0 > > 3) > > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fbugzilla%2Fshow_bug.cgi%3Fid%3D101808%23c1data=04%7C01%7Cnavidrahimi%40microsoft.com%7C7b45fdd017874f287caf08d9aeb836ad%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637732928490579680%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=DNAyrNo9uZ05FyJYOdc%2BD85Dc9A3VgP95htSfaxRS40%3Dreserved=0 > Don't those match.pd patterns make things worse? We're taking a single > expression evaluation (the conditional) and turning it into two logicals > AFAICT. > > For the !x expression, obviously if x is a constant, then we can > compute that at compile time and we're going from a single conditional > to a single logical which is probably a win, but that's not the case > with this patch AFAICT. One thing is you could use ! to see if bit_not simplifies down to a constant which is what I did in the bug report. But it might be more useful to use the ^ flag (which I added in a different patch) which says the bit_xor is removed then accept it. Note (bit_not @0) is wrong, it should be (bit_xor @0 { booleantrue; } ) as there are boolean types which are signed and/or > 1 precision which is what I had in my patch. Did you test Ada with this patch as that is where the "odd" boolean types show up? Thanks, Andrew Pinski > > jeff
Re: [EXTERNAL] Re: [PATCH][WIP] PR tree-optimization/101808 Boolean comparison simplification
In case of x86_64. This is the code: src_1(bool, bool): cmp dil, sil setbal ret tgt_1(bool, bool): xor edi, 1 mov eax, edi and eax, esi ret Lets look at the latency of the src_1: cmp: latency of 1: (page 663, table C-17) setb: latency of 2. They don't report setb latency in intel instruction manual. But the closest instruction to this setbe does have latency of 2. But for tgt_1: xor: latency 1. mov: latency 1. (But it seems x86_64 does optimize this instruction and basically it is latency 0 in this case. In Zero-Latency MOV Instructions section they explain it [1].) and: latency 1. So even if you consider setb as latency of 1 it is equal. But if it is latency of 2, it should be a 1 latency win. 1) https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf Best wishes, Navid. From: Jeff Law Sent: Tuesday, November 23, 2021 11:14 To: Navid Rahimi; Navid Rahimi via Gcc-patches Subject: [EXTERNAL] Re: [PATCH][WIP] PR tree-optimization/101808 Boolean comparison simplification On 11/23/2021 11:34 AM, Navid Rahimi via Gcc-patches wrote: > Hi GCC community, > > I wanted you take a quick look at this patch to solve this bug [1]. This is > the code example for the optimization [2] which does include a link to proof > of each different optimization. > > I think it should be possible to use simpler approach than what Andrew has > used here [3]. > > P.S. Tested and verified on Linux x86_64. > > 1) > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fbugzilla%2Fshow_bug.cgi%3Fid%3D101808data=04%7C01%7Cnavidrahimi%40microsoft.com%7C29308ca3ff234b91a31608d9aeb57500%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637732916650766903%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=m%2BIgviZpMo0MT369dcIzefp810oz%2FMU9LC1Mk2FdChk%3Dreserved=0 > 2) > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcompiler-explorer.com%2Fz%2FGc448eE3zdata=04%7C01%7Cnavidrahimi%40microsoft.com%7C29308ca3ff234b91a31608d9aeb57500%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637732916650766903%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=IwNQZsaEUaB1MKRfL8OWkWYvx0ODq86Obt3eFuxZD40%3Dreserved=0 > 3) > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fbugzilla%2Fshow_bug.cgi%3Fid%3D101808%23c1data=04%7C01%7Cnavidrahimi%40microsoft.com%7C29308ca3ff234b91a31608d9aeb57500%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637732916650766903%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=2DB2AlZWbjJ6Fd3Aw%2Bb2Oub3t8d1i7%2FnaUQKUe2m4uQ%3Dreserved=0 Don't those match.pd patterns make things worse? We're taking a single expression evaluation (the conditional) and turning it into two logicals AFAICT. For the !x expression, obviously if x is a constant, then we can compute that at compile time and we're going from a single conditional to a single logical which is probably a win, but that's not the case with this patch AFAICT. jeff
[PATCH][WIP] PR tree-optimization/101808 Boolean comparison simplification
Hi GCC community, I wanted you take a quick look at this patch to solve this bug [1]. This is the code example for the optimization [2] which does include a link to proof of each different optimization. I think it should be possible to use simpler approach than what Andrew has used here [3]. P.S. Tested and verified on Linux x86_64. 1) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101808 2) https://compiler-explorer.com/z/Gc448eE3z 3) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101808#c1 Best wishes, Navid. 0001-PR-tree-optimization-101808.patch Description: 0001-PR-tree-optimization-101808.patch
Re: [EXTERNAL] Re: [PATCH] PR tree-optimization/102232 Adding a missing pattern to match.pd
Thanks Jeff for this too. Best wishes, Navid. From: Jeff Law Sent: Monday, November 22, 2021 19:09 To: Richard Biener; Navid Rahimi Cc: gcc-patches@gcc.gnu.org Subject: Re: [EXTERNAL] Re: [PATCH] PR tree-optimization/102232 Adding a missing pattern to match.pd [You don't often get email from jeffreya...@gmail.com. Learn why this is important at http://aka.ms/LearnAboutSenderIdentification.] On 11/10/2021 1:35 AM, Richard Biener via Gcc-patches wrote: > On Tue, Nov 9, 2021 at 5:25 PM Navid Rahimi wrote: >> Hi Richard, >> >> Thank you so much for your detailed feedback. I am attaching another version >> of the patch which does include all the changes you mentioned. >> >> Bellow you can see my response to your feedbacks: >> >>> the canonical order of the plus is (plus:s (trunc_div ...) integer_onep) as >>> constants come last - you can then remove the 'c' >> Fixed. I was not aware of the canonical order. >> >>> you can use INTEGRAL_TYPE_P (type). >> Fixed. Didn't know about "type" either. >> >>> this test is not necessary >> Fixed. >> >>> But should we also optimize x * (2 + y/x) - y -> 2*x - y % x? So >>> it looks like a conflict with the x * (1 + b) <-> x + x * b transform >>> (fold_plusminus_mult)? That said, the special case of one >>> definitely makes the result cheaper (one less operation). >> For this special case, it does remove operation indeed. But I was not able >> to reproduce it for any other constant [1]. If it was possible to do it with >> other constants I would've changed the pattern to have be more general like >> "x * (C + y/x) - y -> C*x - y % x". Basically anything other than 1 wasn't a >> win. Regarding the "x * (1 + b) <-> x + x * b" transformation, it appears to >> me when there is a "- y" at the end "x * (1 + b)", there is opportunity to >> optimize. Without that "- y" I was not able to make any operation more >> performant. Either direction, looks like same amount of computation. >> >> 1) >> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcompiler-explorer.com%2Fz%2FdWsq7Tzf4data=04%7C01%7Cnavidrahimi%40microsoft.com%7C1ddd355d86ee4a5d645808d9ae2e9e73%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637732337521412439%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=K4ecZcigYkh8%2BSiqDegRxm72gkL%2FnsbuDDp5nsM3ZqA%3Dreserved=0 >> >>> Please move the pattern next to the most relatest which I think is >> Fixed. >> >>> the return value of 1 is an unreliable way to fail, please instead call >>> __builtin_abort (); >> Fixed. >> >>> do we really need -O3 for this? I think that -O should be enough here? >> We don't specifically need that. But I realized that the optimization can >> happen in two different level at the compiler. It seems if you spread the >> computation over multiple statement like: >>int c = a/b; >>int y = b * (1 + c); >>return y - a; >> >> instead of : >>return b * (1 + a / b) - a; >> >> Then you have to have at least -O1 to have it optimized. Granted, I am not >> doing that in the testcase. In the new patch I am changing it to -O. Let me >> know if you have any suggestions. > -O is fine, generally at -O0 we shouldn't expect such transforms to > happen (but they still do, of course). > > The patch looks OK now. I've pushed this to the trunk for Navid. jeff
Re: [EXTERNAL] Re: [PATCH] PR tree-optimization/96779 Adding a missing pattern to match.pd
Thanks Jeff. Best wishes, Navid. From: Jeff Law Sent: Monday, November 22, 2021 16:48 To: Richard Biener; Navid Rahimi Cc: Navid Rahimi via Gcc-patches Subject: Re: [EXTERNAL] Re: [PATCH] PR tree-optimization/96779 Adding a missing pattern to match.pd [You don't often get email from jeffreya...@gmail.com. Learn why this is important at http://aka.ms/LearnAboutSenderIdentification.] On 11/22/2021 1:45 AM, Richard Biener via Gcc-patches wrote: > On Fri, Nov 19, 2021 at 11:33 PM Navid Rahimi > wrote: >> Hi Richard, >> >> Thanks for the detailed comment. I am attaching a newer version of the patch >> which does have required fixes included. Bellow you can see my response to >> your feedbacks: >> >>> you need to check TYPE_OVERFLOW_WRAPS on TREE_TYPE (@0), >>> otherwise you check on boolean. >> Fixed it. >> >>> no need for :c on the result pattern. Otherwise it looks OK, but how >>> did you check the patch? >> Fixed it. For checking the patch, I have script which builds and runs make >> check for 1) trunk and 2) trunk+patch in a separate directory and diffs the >> test results from each directory. My test script did had a subtle problem. >> The bug was, because of a typo in the path I introduced few days ago, it was >> diffing same trunk+patch test results against trunk+patch test results. > OK, please indicate that in the future, like with "Bootstrapped and > tested on x86_64-linux" or so. > >> That was a good reminder to setup an account for myself here asap [1]. >> >> 1) >> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fwiki%2FCompileFarmdata=04%7C01%7Cnavidrahimi%40microsoft.com%7C02e3c84f12dc48c34c7e08d9ae1b030e%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637732253312568135%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=8virYthoelGhRKtjTo1g%2F9GV67I4rze5iv07XnpCpNk%3Dreserved=0 > The updated patch is OK. I don't think Navid has commit privs, so I fixed up the commit message and committed the patch for Navid. jeff
Re: [EXTERNAL] Re: [PATCH] PR tree-optimization/96779 Adding a missing pattern to match.pd
Hi Richard, Thanks for the detailed comment. I am attaching a newer version of the patch which does have required fixes included. Bellow you can see my response to your feedbacks: > you need to check TYPE_OVERFLOW_WRAPS on TREE_TYPE (@0), > otherwise you check on boolean. Fixed it. > no need for :c on the result pattern. Otherwise it looks OK, but how > did you check the patch? Fixed it. For checking the patch, I have script which builds and runs make check for 1) trunk and 2) trunk+patch in a separate directory and diffs the test results from each directory. My test script did had a subtle problem. The bug was, because of a typo in the path I introduced few days ago, it was diffing same trunk+patch test results against trunk+patch test results. That was a good reminder to setup an account for myself here asap [1]. 1) https://gcc.gnu.org/wiki/CompileFarm Best wishes, Navid. From: Richard Biener Sent: Friday, November 19, 2021 03:43 To: Navid Rahimi Cc: Navid Rahimi via Gcc-patches Subject: [EXTERNAL] Re: [PATCH] PR tree-optimization/96779 Adding a missing pattern to match.pd [You don't often get email from richard.guent...@gmail.com. Learn why this is important at http://aka.ms/LearnAboutSenderIdentification.] On Tue, Nov 16, 2021 at 11:51 PM Navid Rahimi via Gcc-patches wrote: > > Hi GCC community, > > This patch will add the missed pattern described in bug 102232 [1] to the > match.pd. > > Tree-optimization/96779: Adding new optimization to match.pd: > > * match.pd (-x == x) -> (x == 0): New optimization. > * gcc.dg/tree-ssa/pr96779.c: testcase for this optimization. > * gcc.dg/tree-ssa/pr96779-disabled.c: testcase for this > optimization when -fwrapv passed. > > 1) > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fbugzilla%2Fshow_bug.cgi%3Fid%3D96779data=04%7C01%7Cnavidrahimi%40microsoft.com%7C11c3214ef8164af4d50008d9ab51d9bc%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637729190792397989%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=mxYBk6rex%2Bq5UMot%2BWfJqXeTYEYuM16hrvLGyp4PGeI%3Dreserved=0 +/* -x == x -> x == 0 */ +(for cmp (eq ne) + (simplify + (cmp:c @0 (negate @0)) + (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0)) +&& !TYPE_OVERFLOW_WRAPS (type)) you need to check TYPE_OVERFLOW_WRAPS on TREE_TYPE (@0), otherwise you check on boolean. +(cmp:c @0 { build_zero_cst (TREE_TYPE(@0)); } + no need for :c on the result pattern. Otherwise it looks OK, but how did you check the patch? Thanks, Richard. > Best wishes, > Navid. 0001-tree-optimization-96779-v2.patch Description: 0001-tree-optimization-96779-v2.patch
[PATCH] PR tree-optimization/96779 Adding a missing pattern to match.pd
Hi GCC community, This patch will add the missed pattern described in bug 102232 [1] to the match.pd. Tree-optimization/96779: Adding new optimization to match.pd: * match.pd (-x == x) -> (x == 0): New optimization. * gcc.dg/tree-ssa/pr96779.c: testcase for this optimization. * gcc.dg/tree-ssa/pr96779-disabled.c: testcase for this optimization when -fwrapv passed. 1) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96779 Best wishes, Navid. 0001-tree-optimization-96779.patch Description: 0001-tree-optimization-96779.patch
Re: [EXTERNAL] Re: [PATCH] PR tree-optimization/102232 Adding a missing pattern to match.pd
Hi Richard, Thank you so much for your detailed feedback. I am attaching another version of the patch which does include all the changes you mentioned. Bellow you can see my response to your feedbacks: > the canonical order of the plus is (plus:s (trunc_div ...) integer_onep) as > constants come last - you can then remove the 'c' Fixed. I was not aware of the canonical order. > you can use INTEGRAL_TYPE_P (type). Fixed. Didn't know about "type" either. > this test is not necessary Fixed. > But should we also optimize x * (2 + y/x) - y -> 2*x - y % x? So > it looks like a conflict with the x * (1 + b) <-> x + x * b transform > (fold_plusminus_mult)? That said, the special case of one > definitely makes the result cheaper (one less operation). For this special case, it does remove operation indeed. But I was not able to reproduce it for any other constant [1]. If it was possible to do it with other constants I would've changed the pattern to have be more general like "x * (C + y/x) - y -> C*x - y % x". Basically anything other than 1 wasn't a win. Regarding the "x * (1 + b) <-> x + x * b" transformation, it appears to me when there is a "- y" at the end "x * (1 + b)", there is opportunity to optimize. Without that "- y" I was not able to make any operation more performant. Either direction, looks like same amount of computation. 1) https://compiler-explorer.com/z/dWsq7Tzf4 > Please move the pattern next to the most relatest which I think is Fixed. > the return value of 1 is an unreliable way to fail, please instead call > __builtin_abort (); Fixed. > do we really need -O3 for this? I think that -O should be enough here? We don't specifically need that. But I realized that the optimization can happen in two different level at the compiler. It seems if you spread the computation over multiple statement like: int c = a/b; int y = b * (1 + c); return y - a; instead of : return b * (1 + a / b) - a; Then you have to have at least -O1 to have it optimized. Granted, I am not doing that in the testcase. In the new patch I am changing it to -O. Let me know if you have any suggestions. Best wishes, Navid. From: Richard Biener Sent: Tuesday, November 9, 2021 02:36 To: Navid Rahimi Cc: gcc-patches@gcc.gnu.org Subject: [EXTERNAL] Re: [PATCH] PR tree-optimization/102232 Adding a missing pattern to match.pd On Tue, Nov 9, 2021 at 5:12 AM Navid Rahimi via Gcc-patches wrote: > > Hi GCC community, > > This patch will add the missed pattern described in bug 102232 [1] to the > match.pd. The testcase will test whether the multiplication and division has > been removed from the code or not. The correctness proof for this pattern is > here [2] in case anyone is curious. > > PR tree-optimization/102232 > * match.pd (x * (1 + y / x) - y) -> (x - y % x): New optimization. +/* x * (1 + y / x) - y -> x - y % x */ +(simplify + (minus (mult:cs @0 (plus:cs integer_onep (trunc_div:s @1 @0))) @1) the canonical order of the plus is (plus:s (trunc_div ...) integer_onep) as constants come last - you can then remove the 'c' + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) you can use INTEGRAL_TYPE_P (type). + && types_match (@0, @1)) this test is not necessary + (minus @0 (trunc_mod @1 @0 But should we also optimize x * (2 + y/x) - y -> 2*x - y % x? So it looks like a conflict with the x * (1 + b) <-> x + x * b transform (fold_plusminus_mult)? That said, the special case of one definitely makes the result cheaper (one less operation). Please move the pattern next to the most relatest which I think is /* X - (X / Y) * Y is the same as X % Y. */ (simplify (minus (convert1? @0) (convert2? (mult:c (trunc_div @@0 @@1) @1))) (if (INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type)) (convert (trunc_mod @0 @1 +int +main (void) +{ + // few randomly generated test cases + if (foo (71856034, 238) != 212) +{ + return 1; the return value of 1 is an unreliable way to fail, please instead call __builtin_abort (); +/* { dg-options "-O3 -fdump-tree-optimized" } */ do we really need -O3 for this? I think that -O should be enough here? Thanks, Richard. > * gcc.dg/tree-ssa/pr102232.c: testcase for this optimization. > > > 1) > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fbugzilla%2Fshow_bug.cgi%3Fid%3D102232data=04%7C01%7Cnavidrahimi%40microsoft.com%7Cbc89643c6a14411d11ac08d9a36ce282%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637720510334697749%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=Br7B1ShCT4ynLwypyM29LqaNF9GBJF3%2BIR6sZrDRTKM%3Dreserved=0 > 2) > https://nam06.safelinks.protection.outlook.com/?url=
Re: [PATCH] PR tree-optimization/102232 Adding a missing pattern to match.pd
Sorry, a last minute update via my editor changed the MIME type to application/octet-stream. You can find the patch with correct MIME (text/x-diff) attached to this email. (Same file and content, fixed MIME). Best wishes, Navid. From: Gcc-patches on behalf of Navid Rahimi via Gcc-patches Sent: Monday, November 8, 2021 20:11 To: gcc-patches@gcc.gnu.org Subject: [EXTERNAL] [PATCH] PR tree-optimization/102232 Adding a missing pattern to match.pd Hi GCC community, This patch will add the missed pattern described in bug 102232 [1] to the match.pd. The testcase will test whether the multiplication and division has been removed from the code or not. The correctness proof for this pattern is here [2] in case anyone is curious. PR tree-optimization/102232 * match.pd (x * (1 + y / x) - y) -> (x - y % x): New optimization. * gcc.dg/tree-ssa/pr102232.c: testcase for this optimization. 1) https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fbugzilla%2Fshow_bug.cgi%3Fid%3D102232data=04%7C01%7Cnavidrahimi%40microsoft.com%7Cded0a3258b594204129608d9a3371763%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637720279311390932%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=AaXdpyn%2BNQ2BizC6FILJfL3B8WV84v5FuLuU1d8MJxU%3Dreserved=0 2) https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Falive2.llvm.org%2Fce%2Fz%2F2VScjDdata=04%7C01%7Cnavidrahimi%40microsoft.com%7Cded0a3258b594204129608d9a3371763%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637720279311400927%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=3zM4awjl2LxhsfMd1VAOjH4JQ9621Q7Mg61gEB2Su%2Bk%3Dreserved=0 Best wishes, Navid. 0001-PR-tree-optimization-102232.patch Description: 0001-PR-tree-optimization-102232.patch
[PATCH] PR tree-optimization/102232 Adding a missing pattern to match.pd
Hi GCC community, This patch will add the missed pattern described in bug 102232 [1] to the match.pd. The testcase will test whether the multiplication and division has been removed from the code or not. The correctness proof for this pattern is here [2] in case anyone is curious. PR tree-optimization/102232 * match.pd (x * (1 + y / x) - y) -> (x - y % x): New optimization. * gcc.dg/tree-ssa/pr102232.c: testcase for this optimization. 1) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102232 2) https://alive2.llvm.org/ce/z/2VScjD Best wishes, Navid. 0001-PR-tree-optimization-102232.patch Description: 0001-PR-tree-optimization-102232.patch