[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-06-17 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834

--- Comment #21 from kugan at gcc dot gnu.org ---
(In reply to Christophe Lyon from comment #20)
> Hi Kugan,
> 
> The new test fails with -mabi=ilp32:
> FAIL: gcc.target/aarch64/pr88834.c scan-assembler-times \\tld2w\\t{z[0-9]+.s
> - z[0-9]+.s}, p[0-7]/z, \\[x[0-9]+, x[0-9]+, lsl 2\\]\\n 2
> FAIL: gcc.target/aarch64/pr88834.c scan-assembler-times \\tst2w\\t{z[0-9]+.s
> - z[0-9]+.s}, p[0-7], \\[x[0-9]+, x[0-9]+, lsl 2\\]\\n 1

Thanks Christophe. In the back-end, when we use ILP32, we don't accept SImode
operands like:

(plus:SI (mult:SI (reg:SI 91)
(const_int 4 [0x4]))
(reg:SI 90))

whereas we would accept the same address in Pmode. My question is: should we
care about ILP32 for SVE? If so, we need to fix this. Otherwise, we can run the
test for LP64 only.
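
As a rough illustration of why the mode of the address arithmetic matters, the
C sketch below contrasts the same base-plus-scaled-index computation evaluated
in 32 bits (as an SImode expression would be) and in 64 bits (as a DImode
expression would be). The function names are hypothetical, and this only models
the arithmetic; it is not the back-end check itself.

```c
#include <stdint.h>

/* Base + index * 4 evaluated in 64 bits, like a DImode address.  */
uint64_t addr_in_64_bits(uint64_t base, uint64_t index)
{
  return base + index * 4;
}

/* The same expression evaluated in 32 bits, like an SImode value.
   The sum wraps modulo 2^32 before it is widened to an address.  */
uint64_t addr_in_32_bits(uint32_t base, uint32_t index)
{
  uint32_t sum = base + index * 4u;
  return (uint64_t) sum;
}
```

The two functions agree for small inputs and differ once the 32-bit sum wraps,
which is one reason a 32-bit address expression cannot simply be treated as a
64-bit one.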

[Bug target/88838] [SVE] Use 32-bit WHILELO in LP64 mode

2019-06-12 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88838

--- Comment #6 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Thu Jun 13 03:34:28 2019
New Revision: 272233

URL: https://gcc.gnu.org/viewcvs?rev=272233&root=gcc&view=rev
Log:

gcc/ChangeLog:

2019-06-13  Kugan Vivekanandarajah  

PR target/88838
* tree-vect-loop-manip.c (vect_set_loop_masks_directly): If the
compare_type is not with Pmode size, we will create an IV with
Pmode size with truncated use (i.e. converted to the correct type).
* tree-vect-loop.c (vect_verify_full_masking): Find IV type.
(vect_iv_limit_for_full_masking): New. Factored out of
vect_set_loop_condition_masked.
* tree-vectorizer.h (LOOP_VINFO_MASK_IV_TYPE): New.
(vect_iv_limit_for_full_masking): Declare.

gcc/testsuite/ChangeLog:

2019-06-13  Kugan Vivekanandarajah  

PR target/88838
* gcc.target/aarch64/pr88838.c: New test.
* gcc.target/aarch64/sve/while_1.c: Adjust.

Added:
trunk/gcc/testsuite/gcc.target/aarch64/pr88838.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/aarch64/sve/while_1.c
trunk/gcc/tree-vect-loop-manip.c
trunk/gcc/tree-vect-loop.c
trunk/gcc/tree-vectorizer.h
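
The first ChangeLog entry above can be pictured with a scalar analogy: when the
comparison type is narrower than a pointer, the loop keeps one pointer-sized
induction variable and truncates it only where the narrower value is needed.
The C below is a hand-written sketch of that idea, not vectorizer output;
`sum_with_wide_iv` is a hypothetical name and `uint64_t` stands in for a
Pmode-sized type on LP64.

```c
#include <stdint.h>

uint32_t sum_with_wide_iv(const uint32_t *a, uint32_t n)
{
  uint32_t sum = 0;
  /* One 64-bit induction variable drives the addressing; the limit test
     uses a truncated copy of it in the 32-bit compare type.  */
  for (uint64_t iv = 0; (uint32_t) iv < n; iv++)
    sum += a[iv];
  return sum;
}
```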

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-06-12 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834

--- Comment #19 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Thu Jun 13 03:18:54 2019
New Revision: 272232

URL: https://gcc.gnu.org/viewcvs?rev=272232&root=gcc&view=rev
Log:

gcc/ChangeLog:

2019-06-13  Kugan Vivekanandarajah  

PR target/88834
* tree-ssa-loop-ivopts.c (get_mem_type_for_internal_fn): Handle
IFN_MASK_LOAD_LANES and IFN_MASK_STORE_LANES.
(get_alias_ptr_type_for_ptr_address): Likewise.
(add_iv_candidate_for_use): Add scaled index candidate if useful.
* tree-ssa-address.c (preferred_mem_scale_factor): New.
* config/aarch64/aarch64.c (aarch64_classify_address): Relax
allow_reg_index_p.

gcc/testsuite/ChangeLog:

2019-06-13  Kugan Vivekanandarajah  

PR target/88834
* gcc.target/aarch64/pr88834.c: New test.
* gcc.target/aarch64/sve/struct_vect_1.c: Adjust.
* gcc.target/aarch64/sve/struct_vect_14.c: Likewise.
* gcc.target/aarch64/sve/struct_vect_15.c: Likewise.
* gcc.target/aarch64/sve/struct_vect_16.c: Likewise.
* gcc.target/aarch64/sve/struct_vect_17.c: Likewise.
* gcc.target/aarch64/sve/struct_vect_7.c: Likewise.


Added:
trunk/gcc/testsuite/gcc.target/aarch64/pr88834.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/aarch64/aarch64.c
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/aarch64/sve/struct_vect_1.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/struct_vect_14.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/struct_vect_15.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/struct_vect_16.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/struct_vect_17.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/struct_vect_7.c
trunk/gcc/tree-ssa-address.c
trunk/gcc/tree-ssa-address.h
trunk/gcc/tree-ssa-loop-ivopts.c

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-04-09 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834

--- Comment #17 from kugan at gcc dot gnu.org ---
(In reply to Wilco from comment #16)
> (In reply to kugan from comment #15)
> > (In reply to Wilco from comment #11)
> > > There is also something odd with the way the loop iterates, this doesn't
> > > look right:
> > > 
> > > whilelo p0.s, x3, x4
> > > incw x3
> > > ptest   p1, p0.b
> > > bne .L3
> > 
> > I am not sure I understand this. I tried with qemu using an execution
> > testcase and it seems to work.
> > 
> > whilelo p0.s, x4, x5
> > incw x4
> > ptest   p1, p0.b
> > bne .L3
> > In my case I have the above (register allocation difference only). Isn't
> > incw correct, considering two vector word registers? Am I missing something
> > here?
> 
> I'm talking about the completely redundant ptest, where does that come from?

It is https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88836

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-04-08 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834

--- Comment #15 from kugan at gcc dot gnu.org ---
(In reply to Wilco from comment #11)
> There is also something odd with the way the loop iterates, this doesn't
> look right:
> 
> whilelo p0.s, x3, x4
> incw x3
> ptest   p1, p0.b
> bne .L3

I am not sure I understand this. I tried with qemu using an execution testcase
and it seems to work.

whilelo p0.s, x4, x5
incw x4
ptest   p1, p0.b
bne .L3
In my case I have the above (register allocation difference only). Isn't incw
correct, considering two vector word registers? Am I missing something here?
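
A scalar model may make the two counters easier to tell apart. The sketch below
assumes a fixed vector length of `VL_WORDS` 32-bit lanes (a made-up constant;
real SVE hardware chooses the length) and assumes the loop adds even elements
and subtracts odd ones, as the add/sub pair in the assembly from comment #8
suggests. The WHILELO counter counts element pairs, so stepping it by the
number of words in one vector (incw) covers two vectors' worth of data, while
the element index steps by twice that.

```c
#include <stddef.h>

enum { VL_WORDS = 4 };  /* assumed vector length in 32-bit lanes */

/* Returns the number of vector iterations executed.  */
size_t f_model(int *a, const int *b, const int *c, int n)
{
  if (n <= 0)
    return 0;

  size_t pairs = (((unsigned) n - 1) >> 1) + 1;  /* WHILELO limit */
  size_t done = 0;    /* pairs handled so far; stepped like incw */
  size_t index = 0;   /* element index used in the addresses */
  size_t iterations = 0;

  while (done < pairs)
    {
      for (size_t lane = 0; lane < VL_WORDS; lane++)
        if (done + lane < pairs)   /* lane is active in the predicate */
          {
            size_t e = index + 2 * lane;
            a[e] = b[e] + c[e];
            a[e + 1] = b[e + 1] - c[e + 1];
          }
      done += VL_WORDS;        /* one vector of words per iteration */
      index += 2 * VL_WORDS;   /* two vectors of elements per iteration */
      iterations++;
    }
  return iterations;
}
```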

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-04-08 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834

--- Comment #14 from kugan at gcc dot gnu.org ---
Created attachment 46104
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46104&action=edit
testcase

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-04-08 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834

kugan at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #46040|0                           |1
        is obsolete|                            |

--- Comment #13 from kugan at gcc dot gnu.org ---
Created attachment 46103
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46103&action=edit
ivopt changes alone

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-04-08 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834

--- Comment #12 from kugan at gcc dot gnu.org ---
(In reply to rsand...@gcc.gnu.org from comment #10)
> (In reply to kugan from comment #9)
> > Created attachment 46040 [details]
> > patch
> 
> Wasn't sure whether this patch was WIP or the final version
> for review, but we need to do something more generic than
> dividing by 4.  I think the test will still fail with "int"
> changed to "short" for example.
> 
> I also don't think the new candidate should be tied to the
> mask/load store functions.  Maybe one approach would be to
> check when adding a zero-based candidate for a use in:
> 
>   /* Record common candidate with initial value zero.  */
>   basetype = TREE_TYPE (iv->base);
>   if (POINTER_TYPE_P (basetype))
> basetype = sizetype;
>   record_common_cand (data, build_int_cst (basetype, 0), iv->step, use);
> 
> whether the use actually benefits from this unscaled iv.
> If the use is USE_REF_ADDRESS, we could compare the cost
> of an address with an unscaled index with the cost of an address
> with a scaled index.  I think the natural scale value to try
> would be GET_MODE_INNER (TYPE_MODE (mem_type)).

Thanks for the comments. I agree this is the right place. But I am not sure
that checking the cost at this point is what IV-opt generally does. In general,
IV-opt adds candidates that may be helpful and later decides the optimal set.

If we are to use get_computation_cost to see the costs, we have to create an
iv_cand and then discard it. Since we are adding only one candidate, and only
for SVE-like targets, I think that is OK. If you still prefer to check the
cost, I will change that.

I have attached the patch (only the ivopt changes) and a testcase.
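
To make the two kinds of candidate concrete, here is a small C sketch of the
same loop written once with a byte-offset induction variable (an unscaled
index) and once with an element index that the address scales by the element
size. It uses `short` to echo the point that the scale follows the element
size rather than a fixed 4. The function names are hypothetical and this is an
analogy for the address forms, not ivopts code.

```c
#include <stddef.h>

/* Unscaled candidate: the IV is a byte offset and steps by the element
   size; the address is base + off.  */
long sum_with_byte_offset(const short *base, size_t n)
{
  long sum = 0;
  for (size_t off = 0; off < n * sizeof (short); off += sizeof (short))
    sum += *(const short *) ((const char *) base + off);
  return sum;
}

/* Scaled candidate: the IV is an element index and steps by 1; the
   address is base + i * sizeof (short).  */
long sum_with_scaled_index(const short *base, size_t n)
{
  long sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += base[i];
  return sum;
}
```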

[Bug rtl-optimization/89862] LTO bootstrap fails for ARM

2019-03-29 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89862

--- Comment #4 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Sat Mar 30 04:28:51 2019
New Revision: 270031

URL: https://gcc.gnu.org/viewcvs?rev=270031&root=gcc&view=rev
Log:

2019-03-29  Kugan Vivekanandarajah  

Backport from mainline
2019-03-29  Kugan Vivekanandarajah  
Eric Botcazou  

PR rtl-optimization/89862
* rtl.h (word_register_operation_p): Exclude CONST_INT from operations
that operates on the full registers for WORD_REGISTER_OPERATIONS
architectures.


Modified:
branches/gcc-8-branch/gcc/ChangeLog
branches/gcc-8-branch/gcc/rtl.h

[Bug rtl-optimization/89862] LTO bootstrap fails for ARM

2019-03-29 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89862

--- Comment #3 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Sat Mar 30 04:24:22 2019
New Revision: 270030

URL: https://gcc.gnu.org/viewcvs?rev=270030&root=gcc&view=rev
Log:

2019-03-29  Kugan Vivekanandarajah  
Eric Botcazou  

PR rtl-optimization/89862
* rtl.h (word_register_operation_p): Exclude CONST_INT from operations
that operates on the full registers for WORD_REGISTER_OPERATIONS
architectures.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/rtl.h

[Bug rtl-optimization/89862] LTO bootstrap fails for ARM

2019-03-28 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89862

--- Comment #2 from kugan at gcc dot gnu.org ---
(In reply to Eric Botcazou from comment #1)
> Can you try this instead?
> 
> Index: rtl.h
> ===
> --- rtl.h   (revision 269886)
> +++ rtl.h   (working copy)
> @@ -4401,6 +4401,7 @@ word_register_operation_p (const_rtx x)
>  {
>switch (GET_CODE (x))
>  {
> +case CONST_INT:
>  case ROTATE:
>  case ROTATERT:
>  case SIGN_EXTRACT:
Thanks for looking into it. Disallowing all the CONST_INT works for me. I have
verified that lto-bootstrap works with the above changes. I will test for
regression and post it to gcc-patches.
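
The effect of the one-line change can be sketched with a toy predicate. The
enum below is hypothetical and stands in for GCC's rtx codes; only the codes
visible in the diff context are listed, plus one ordinary operation for
contrast. The return values are an assumption based on the ChangeLog wording
("Exclude CONST_INT from operations that operates on the full registers").

```c
#include <stdbool.h>

/* Hypothetical stand-ins for rtx codes.  */
enum toy_code
{
  TOY_CONST_INT,
  TOY_ROTATE,
  TOY_ROTATERT,
  TOY_SIGN_EXTRACT,
  TOY_PLUS
};

/* Toy version of the predicate: listed codes are excluded, everything
   else is treated as operating on the full register.  */
bool toy_word_register_operation_p(enum toy_code code)
{
  switch (code)
    {
    case TOY_CONST_INT:   /* the case added by the proposed change */
    case TOY_ROTATE:
    case TOY_ROTATERT:
    case TOY_SIGN_EXTRACT:
      return false;
    default:
      return true;
    }
}
```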

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-03-27 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834

kugan at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #45686|0                           |1
        is obsolete|                            |

--- Comment #9 from kugan at gcc dot gnu.org ---
Created attachment 46040
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46040&action=edit
patch

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-03-27 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834

--- Comment #8 from kugan at gcc dot gnu.org ---
(In reply to rsand...@gcc.gnu.org from comment #7)
> Thanks for looking at this.
> 
> (In reply to kugan from comment #6)
> > cmp w3, 0
> > ble .L1
> > sub w3, w3, #1
> > mov x4, 0
> > cntw x5
> > ptrue   p1.s, all
> > lsr w3, w3, 1
> > add w3, w3, 1
> > whilelo p0.s, xzr, x3
> > .p2align 3,,7
> > .L3:
> > ld2w {z4.s - z5.s}, p0/z, [x1, x4, lsl 2]
> > ld2w {z2.s - z3.s}, p0/z, [x2, x4, lsl 2]
> > add z0.s, z4.s, z2.s
> > sub z1.s, z5.s, z3.s
> > st2w {z0.s - z1.s}, p0, [x0, x4, lsl 2]
> > whilelo p0.s, x5, x3
> > incb x4, all, mul #2
> > incw x5
> > ptest   p1, p0.b
> > bne .L3
> > .L1:
> > ret
> > .cfi_endproc
> 
> This doesn't look right.  x4 is an index, so it should be
> incremented by the number of words in two vectors, rather than
> the number of bytes in two vectors.

Thanks for the comments. Fixed it; with the attached patch it generates:

f:
.LFB0:
.cfi_startproc
cmp w3, 0
ble .L1
sub w5, w3, #1
cntw x4
mov x3, 0
ptrue   p1.s, all
lsr w5, w5, 1
add w5, w5, 1
whilelo p0.s, xzr, x5
.p2align 3,,7
.L3:
ld2w {z4.s - z5.s}, p0/z, [x1, x3, lsl 2]
ld2w {z2.s - z3.s}, p0/z, [x2, x3, lsl 2]
add z0.s, z4.s, z2.s
sub z1.s, z5.s, z3.s
st2w {z0.s - z1.s}, p0, [x0, x3, lsl 2]
whilelo p0.s, x4, x5
inch x3
incw x4
ptest   p1, p0.b
bne .L3
.L1:
ret
.cfi_endproc
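
For orientation, a loop of roughly the following shape would be consistent with
the assembly above: two interleaved loads, an add on the even elements, a
subtract on the odd elements, and an interleaved store. This is reconstructed
from the generated code as an assumption; the attached testcase itself is not
reproduced in this thread, and the parameter names are made up.

```c
/* Assumed source shape: a, b, c and n correspond to x0, x1, x2 and w3.  */
void f(int *restrict a, const int *restrict b, const int *restrict c, int n)
{
  for (int i = 0; i < n; i += 2)
    {
      a[i] = b[i] + c[i];
      a[i + 1] = b[i + 1] - c[i + 1];
    }
}
```

Because the address uses the index scaled by 4 (lsl 2), the index has to
advance by the number of 32-bit elements covered per iteration, which is the
number of words in two vectors. That equals the number of halfwords in one
vector, hence the halfword-count increment in place of a byte-count increment
multiplied by 2.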

[Bug rtl-optimization/89862] New: LTO bootstrap fails for ARM

2019-03-27 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89862

Bug ID: 89862
   Summary: LTO bootstrap fails for ARM
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kugan at gcc dot gnu.org
  Target Milestone: ---

Created attachment 46039
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46039&action=edit
patch

With the commit:
commit 67c18bce7054934528ff5930cca283b4ac967dca
Author: ebotcazou 
Date:   Wed Jan 31 10:03:06 2018 +

PR rtl-optimization/84071
* combine.c (record_dead_and_set_regs_1): Record the source unmodified
for a paradoxical SUBREG on a WORD_REGISTER_OPERATIONS target.

LTO bootstrap fails for arm (possibly for other WORD_REGISTER_OPERATIONS
targets).

There is an internal compiler error: in operator+=, at profile-count.h:792. It
looks like the profile_count is set incorrectly.

Commit 67c18bce7054934528ff5930cca283b4ac967dca skips generating gen_lowpart
for
(set (subreg:SI (reg:QI 1434) 0)
(const_int 224 [0xe0])) and the like. This seems to be the reason for the
error.

The attached patch fixes this. Does this look reasonable?

[Bug target/88838] [SVE] Use 32-bit WHILELO in LP64 mode

2019-03-20 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88838

--- Comment #5 from kugan at gcc dot gnu.org ---
Created attachment 46000
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46000&action=edit
RFC patch

RFC patch fixes this for review.

[Bug target/88836] [SVE] Redundant PTEST in loop test

2019-02-21 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88836

--- Comment #2 from kugan at gcc dot gnu.org ---
Created attachment 45795
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45795&action=edit
RFC patch

AFAIK, we need to:
1. Change the whilelo pattern in the backend
2. Change RTL CSE
- Add support for VEC_DUPLICATE
- When handling a PARALLEL rtx, we may kill the CSE defined in the first set so
that it doesn't reach

The attached patch fixes this. With the patch I now have:
.LFB0:
.cfi_startproc
cmp w3, 0
ble .L1
sub w4, w3, #1
cntw x3
lsr w4, w4, 1
add w4, w4, 1
whilelo p0.s, xzr, x4
.p2align 3,,7
.L3:
ld2w {z4.s - z5.s}, p0/z, [x1]
ld2w {z2.s - z3.s}, p0/z, [x2]
add z0.s, z4.s, z2.s
sub z1.s, z5.s, z3.s
st2w {z0.s - z1.s}, p0, [x0]
incb x1, all, mul #2
whilelo p0.s, x3, x4
incb x0, all, mul #2
incw x3
incb x2, all, mul #2
bne .L3
.L1:
ret
.cfi_endproc

[Bug target/88838] [SVE] Use 32-bit WHILELO in LP64 mode

2019-02-21 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88838

--- Comment #4 from kugan at gcc dot gnu.org ---
(In reply to kugan from comment #3)
> Created attachment 45794 [details]
> RFC patch

Oops, wrong place. It should be for
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88836

[Bug target/88838] [SVE] Use 32-bit WHILELO in LP64 mode

2019-02-21 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88838

--- Comment #3 from kugan at gcc dot gnu.org ---
Created attachment 45794
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45794&action=edit
RFC patch

[Bug target/88838] [SVE] Use 32-bit WHILELO in LP64 mode

2019-02-21 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88838

kugan at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kugan at gcc dot gnu.org

--- Comment #2 from kugan at gcc dot gnu.org ---
AFAIK, we need to:
1. Change the whilelo pattern in the backend
2. Change RTL CSE
- Add support for VEC_DUPLICATE
- When handling a PARALLEL rtx, we may kill the CSE defined in the first set so
that it doesn't reach

The attached patch fixes this. With the patch I now have:
.LFB0:
.cfi_startproc
cmp w3, 0
ble .L1
sub w4, w3, #1
cntw x3
lsr w4, w4, 1
add w4, w4, 1
whilelo p0.s, xzr, x4
.p2align 3,,7
.L3:
ld2w {z4.s - z5.s}, p0/z, [x1]
ld2w {z2.s - z3.s}, p0/z, [x2]
add z0.s, z4.s, z2.s
sub z1.s, z5.s, z3.s
st2w {z0.s - z1.s}, p0, [x0]
incb x1, all, mul #2
whilelo p0.s, x3, x4
incb x0, all, mul #2
incw x3
incb x2, all, mul #2
bne .L3
.L1:
ret
.cfi_endproc

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-02-12 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834

--- Comment #6 from kugan at gcc dot gnu.org ---

> 
> Note the difference in mode for aarch64_classify_address. Not sure if this
> is because of the way my patch changes ivopt.

Yes, it was my mistake in iv-use. With the attached patch, I now get:
cmp w3, 0
ble .L1
sub w3, w3, #1
mov x4, 0
cntw x5
ptrue   p1.s, all
lsr w3, w3, 1
add w3, w3, 1
whilelo p0.s, xzr, x3
.p2align 3,,7
.L3:
ld2w {z4.s - z5.s}, p0/z, [x1, x4, lsl 2]
ld2w {z2.s - z3.s}, p0/z, [x2, x4, lsl 2]
add z0.s, z4.s, z2.s
sub z1.s, z5.s, z3.s
st2w {z0.s - z1.s}, p0, [x0, x4, lsl 2]
whilelo p0.s, x5, x3
incb x4, all, mul #2
incw x5
ptest   p1, p0.b
bne .L3
.L1:
ret
.cfi_endproc

I will post the patch for review after stage-1 opens. In the meantime, any
review is appreciated, especially of the part where iv-use is set up and of
get_alias_ptr_type_for_ptr_address.

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-02-12 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834

kugan at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #45661|0                           |1
        is obsolete|                            |

--- Comment #5 from kugan at gcc dot gnu.org ---
Created attachment 45686
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45686&action=edit
ivopt patch v2

[Bug tree-optimization/89296] New: tree copy-header masking uninitialized warning

2019-02-11 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89296

Bug ID: 89296
   Summary: tree copy-header masking uninitialized warning
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kugan at gcc dot gnu.org
  Target Milestone: ---

extern int get_a_value(void);
extern int printk(const char *fmt, ...);

void test_func(void) {
  int loop;  // uninitialized and "garbage"
  while (!loop) {
    loop = get_a_value();  // <- must be for this test
    printk("...");
  }
}

from Linaro bug report https://bugs.linaro.org/show_bug.cgi?id=4134
-fno-tree-ch gets the required warning

diff --git a/gcc/tree-ssa-loop-ch.c b/gcc/tree-ssa-loop-ch.c
index c876d62..d405d00 100644
--- a/gcc/tree-ssa-loop-ch.c
+++ b/gcc/tree-ssa-loop-ch.c
@@ -393,7 +393,7 @@ ch_base::copy_headers (function *fun)
{
  gimple *stmt = gsi_stmt (bsi);
  if (gimple_code (stmt) == GIMPLE_COND)
-   gimple_set_no_warning (stmt, true);
+   ;//gimple_set_no_warning (stmt, true);
  else if (is_gimple_assign (stmt))
{
  enum tree_code rhs_code = gimple_assign_rhs_code (stmt);

also gets the required warning. Looking into it.

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-02-11 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834

--- Comment #4 from kugan at gcc dot gnu.org ---
Created attachment 45661
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45661&action=edit
ivopt patch v1

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-02-11 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834

--- Comment #3 from kugan at gcc dot gnu.org ---
I added iv-use for MASKED_LOAD_LANE and the result is
cmp w3, 0
ble .L1
sub w5, w3, #1
mov x4, 0
lsr w5, w5, 1
add w5, w5, 1
whilelo p0.s, xzr, x5
.p2align 3,,7
.L3:
lsl x3, x4, 3
incw x4
add x7, x1, x3
add x6, x2, x3
ld2w {z4.s - z5.s}, p0/z, [x7]
ld2w {z2.s - z3.s}, p0/z, [x6]
add x3, x0, x3
add z0.s, z4.s, z2.s
sub z1.s, z5.s, z3.s
st2w {z0.s - z1.s}, p0, [x3]
whilelo p0.s, x4, x5
bne .L3
.L1:
ret

There is no base plus scaled index addressing mode here. This is because of how
the address is classified at two different points.

When called from ivopt:
Breakpoint 4, aarch64_classify_address (info=0x7fffcba0, x=0x76c44f30,
mode=E_DImode, strict_p=false, type=ADDR_QUERY_M)
at
/home/kugan/work/abe/snapshots/gcc.git~origin~aarch64~sve-acle-branch/gcc/config/aarch64/aarch64.c:5689
5689    {
(gdb) p debug_rtx (x)
(plus:DI (mult:DI (reg:DI 91)
(const_int 8 [0x8]))
(reg:DI 90))

it accepts it.

When in cfgexpand:
Breakpoint 5, aarch64_classify_address (info=0x7fffcca0, x=0x76c5b840,
mode=E_VNx8SImode, strict_p=false, type=ADDR_QUERY_M)
at
/home/kugan/work/abe/snapshots/gcc.git~origin~aarch64~sve-acle-branch/gcc/config/aarch64/aarch64.c:5689
5689    {
(gdb) p debug_rtx (x)
(plus:DI (mult:DI (reg:DI 92 [ ivtmp_28 ])
(const_int 8 [0x8]))
(reg/v/f:DI 110 [ y ]))


This is not accepted because of aarch64_classify_index (info, op1, mode,
strict_p) failing (as it should).

Note the difference in mode for aarch64_classify_address. Not sure if this is
because of the way my patch changes ivopt.

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-02-03 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834

kugan at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kugan at gcc dot gnu.org

--- Comment #2 from kugan at gcc dot gnu.org ---
I'll assign it to myself unless it is being looked at by someone else.

[Bug sanitizer/88333] [9 Regression] ice in asan_emit_stack_protection, at asan.c:1574

2018-12-06 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88333

kugan at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kugan at gcc dot gnu.org

--- Comment #7 from kugan at gcc dot gnu.org ---
*** Bug 88350 has been marked as a duplicate of this bug. ***

[Bug sanitizer/88350] Linux kernel build ICE with allyesconfig for aarch64

2018-12-06 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88350

kugan at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |DUPLICATE

--- Comment #3 from kugan at gcc dot gnu.org ---
Duplicate

*** This bug has been marked as a duplicate of bug 88333 ***

[Bug sanitizer/88350] Linux kernel build ICE with allyesconfig for aarch64

2018-12-06 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88350

kugan at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Alias|PR88333                     |

--- Comment #2 from kugan at gcc dot gnu.org ---
Dup of PR88333 and fixed.

[Bug sanitizer/88350] New: Linux kernel build ICE with allyesconfig for aarch64

2018-12-04 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88350

Bug ID: 88350
   Summary: Linux kernel build ICE with allyesconfig for aarch64
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: sanitizer
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kugan at gcc dot gnu.org
CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org,
jakub at gcc dot gnu.org, kcc at gcc dot gnu.org, marxin at 
gcc dot gnu.org
  Target Milestone: ---

When the Linux kernel is built (allyesconfig) with trunk, the following ICE
occurs:


++ make
CC=/home/tcwg-buildslave/workspace/tcwg_kernel-bisect-gnu_0/bin/aarch64-cc
ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- HOSTCC=gcc -j32 -s -k
:1335:2: warning: #warning syscall rseq not implemented [-Wcpp]
*** WARNING *** there are active plugins, do not report this as a bug unless
you can reproduce it without enabling any plugins.
Event| Plugins
PLUGIN_FINISH_TYPE   | randomize_layout_plugin structleak_plugin
PLUGIN_FINISH_DECL   | randomize_layout_plugin
PLUGIN_ATTRIBUTES| randomize_layout_plugin
latent_entropy_plugin structleak_plugin
PLUGIN_START_UNIT| latent_entropy_plugin
PLUGIN_ALL_IPA_PASSES_START  | randomize_layout_plugin
during RTL pass: expand
arch/arm64/mm/flush.c: In function '__sync_icache_dcache':
arch/arm64/mm/flush.c:61:6: internal compiler error: in
asan_emit_stack_protection, at asan.c:1574
   61 | void __sync_icache_dcache(pte_t pte)
  |  ^~~~


Full build Log can be found in:
https://ci.linaro.org/job/tcwg_kernel-bisect-gnu-master-aarch64-stable-allyesconfig/11/artifact/artifacts/build-1d89613e77d7db420b13ce3ad8b98f07aaf474e8/console.log


The commit that seems to trigger this is:
Author: marxin 
Date:   Fri Nov 30 14:25:15 2018 +

Make red zone size more flexible for stack variables (PR sanitizer/81715).

2018-11-30  Martin Liska  

PR sanitizer/81715
* asan.c (asan_shadow_cst): Remove, partially transform
into flush_redzone_payload.
(RZ_BUFFER_SIZE): New.
(struct asan_redzone_buffer): New.
(asan_redzone_buffer::emit_redzone_byte): Likewise.
(asan_redzone_buffer::flush_redzone_payload): Likewise.
(asan_redzone_buffer::flush_if_full): Likewise.
(asan_emit_stack_protection): Use asan_redzone_buffer class
that is responsible for proper aligned stores and flushing
of shadow memory payload.
* asan.h (ASAN_MIN_RED_ZONE_SIZE): New.
(asan_var_and_redzone_size): Likewise.
* cfgexpand.c (expand_stack_vars): Use smaller alignment
(ASAN_MIN_RED_ZONE_SIZE) in order to make shadow memory
for automatic variables more compact.
2018-11-30  Martin Liska  

PR sanitizer/81715
* c-c++-common/asan/asan-stack-small.c: New test.


git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@24
138bc75d-0d04-0410-961f-82ee72b054a4

[Bug rtl-optimization/88212] New: IRA Register Coalescing not working for the testcase

2018-11-26 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88212

Bug ID: 88212
   Summary: IRA Register Coalescing not working for the testcase
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kugan at gcc dot gnu.org
  Target Milestone: ---

When compiling the following on aarch64 with -O2:
#include <arm_neon.h>
void g(int32_t *p, int32x2x2_t val, int x)
{
 vst2_lane_s32(p,val,0);
}

generates:
.cfi_startproc
mov v2.8b, v0.8b
mov v3.8b, v1.8b
st2 {v2.s - v3.s}[0], [x0]
ret

clang produces:
st2 { v0.s, v1.s }[0], [x0]
ret

Essentially the problem is that access to part-registers doesn't get
coalesced, so IRA generates moves which aren't actually required.

[Bug target/86677] popcount builtin detection is breaking some kernel build

2018-11-12 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86677

--- Comment #13 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Mon Nov 12 23:43:56 2018
New Revision: 266039

URL: https://gcc.gnu.org/viewcvs?rev=266039&root=gcc&view=rev
Log:
gcc/ChangeLog:

2018-11-13  Kugan Vivekanandarajah  

PR middle-end/86677
PR middle-end/87528
* tree-scalar-evolution.c (expression_expensive_p): Make BUILTIN POPCOUNT
as expensive when backend does not define it.

gcc/testsuite/ChangeLog:

2018-11-13  Kugan Vivekanandarajah  

PR middle-end/86677
PR middle-end/87528
* g++.dg/tree-ssa/pr86544.C: Run only for target supporting popcount
pattern.
* gcc.dg/tree-ssa/popcount.c: Likewise.
* gcc.dg/tree-ssa/popcount2.c: Likewise.
* gcc.dg/tree-ssa/popcount3.c: Likewise.
* gcc.target/aarch64/popcount4.c: New test.
* lib/target-supports.exp (check_effective_target_popcountl): New.


Added:
trunk/gcc/testsuite/gcc.target/aarch64/popcount4.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/g++.dg/tree-ssa/pr86544.C
trunk/gcc/testsuite/gcc.dg/tree-ssa/popcount.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/popcount2.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/popcount3.c
trunk/gcc/testsuite/lib/target-supports.exp
trunk/gcc/tree-scalar-evolution.c

[Bug middle-end/87528] Popcount changes caused 531.deepsjeng_r run-time regression on Skylake

2018-11-12 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87528

--- Comment #7 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Mon Nov 12 23:43:56 2018
New Revision: 266039

URL: https://gcc.gnu.org/viewcvs?rev=266039&root=gcc&view=rev
Log:
gcc/ChangeLog:

2018-11-13  Kugan Vivekanandarajah  

PR middle-end/86677
PR middle-end/87528
* tree-scalar-evolution.c (expression_expensive_p): Make BUILTIN POPCOUNT
as expensive when backend does not define it.

gcc/testsuite/ChangeLog:

2018-11-13  Kugan Vivekanandarajah  

PR middle-end/86677
PR middle-end/87528
* g++.dg/tree-ssa/pr86544.C: Run only for target supporting popcount
pattern.
* gcc.dg/tree-ssa/popcount.c: Likewise.
* gcc.dg/tree-ssa/popcount2.c: Likewise.
* gcc.dg/tree-ssa/popcount3.c: Likewise.
* gcc.target/aarch64/popcount4.c: New test.
* lib/target-supports.exp (check_effective_target_popcountl): New.


Added:
trunk/gcc/testsuite/gcc.target/aarch64/popcount4.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/g++.dg/tree-ssa/pr86544.C
trunk/gcc/testsuite/gcc.dg/tree-ssa/popcount.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/popcount2.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/popcount3.c
trunk/gcc/testsuite/lib/target-supports.exp
trunk/gcc/tree-scalar-evolution.c

[Bug c++/87469] [9 Regression] ice in record_estimate, at tree-ssa-loop-niter.c:3271

2018-10-29 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87469

--- Comment #5 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Mon Oct 29 22:02:45 2018
New Revision: 265605

URL: https://gcc.gnu.org/viewcvs?rev=265605&root=gcc&view=rev
Log:
gcc/testsuite/ChangeLog:

2018-10-29  Kugan Vivekanandarajah  

PR middle-end/87469
* g++.dg/pr87469.C: New test.

gcc/ChangeLog:

2018-10-29  Kugan Vivekanandarajah  

PR middle-end/87469
* tree-ssa-loop-niter.c (number_of_iterations_popcount): Fix niter
max value.



Added:
trunk/gcc/testsuite/g++.dg/pr87469.C
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-loop-niter.c

[Bug c++/87469] [9 Regression] ice in record_estimate, at tree-ssa-loop-niter.c:3271

2018-10-17 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87469

--- Comment #4 from kugan at gcc dot gnu.org ---
In the loop here, the value defined in the loop (e) is used outside the loop,
so this should not be detected as popcount (AFAIK). I will have a look at
fixing this.

[Bug target/87253] New: Python test_ctypes fails when built with gcc 8.2

2018-09-08 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87253

Bug ID: 87253
   Summary: Python test_ctypes fails when built with gcc 8.2
   Product: gcc
   Version: 8.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kugan at gcc dot gnu.org
  Target Milestone: ---

Python-2.7.15

Steps to reproduce error
In Python src directory:
./configure
make
./python Lib/test/regrtest.py -v test_ctypes

==
FAIL: test_struct_by_value (ctypes.test.test_win32.Structures)
--
Traceback (most recent call last):
  File
"/home/kugan.vivekanandarajah/Python-2.7.15/Lib/ctypes/test/test_win32.py",
line 113, in test_struct_by_value
self.assertEqual(ret.left, left.value)
AssertionError: -200 != 10



gdb ./python
b ReturnRect
r Lib/test/regrtest.py -v test_ctypes

(gdb) p cp
$9 = {x = 15, y = 25}
(gdb) p fp
$10 = {x = 548534164448, y = 9890688}

cp and fp are the same, as can be seen below:

vi /home/kugan.vivekanandarajah/Python-2.7.15/Lib/ctypes/test/test_win32.py +112

pt = POINT(15, 25)
...
ReturnRect = dll.ReturnRect
ReturnRect.argtypes = [c_int, RECT, POINTER(RECT), POINT, RECT,
  POINTER(RECT), POINT, RECT]


ret = ReturnRect(i, rect, pointer(rect), pt, rect,
 byref(rect), pt, rect)
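
On the C side, a callee with this argument list receives two POINT structures
by value among larger RECT structures. The sketch below is a hypothetical
stand-in that only mirrors the argtypes list above; it is not the ReturnRect
helper from the Python sources, and the struct layouts (two and four `long`
fields) are assumptions. With a correct calling sequence, `cp` and `fp` should
arrive with identical fields when the caller passes the same point twice.

```c
typedef struct { long x, y; } POINT;                     /* assumed layout */
typedef struct { long left, top, right, bottom; } RECT;  /* assumed layout */

/* Returns 1 when the two by-value POINT arguments arrive with equal
   fields, 0 otherwise.  The other arguments are only there to reproduce
   the argument order.  */
int points_match(int i, RECT ar, RECT *br, POINT cp,
                 RECT dr, RECT *er, POINT fp, RECT gr)
{
  (void) i; (void) ar; (void) br; (void) dr; (void) er; (void) gr;
  return cp.x == fp.x && cp.y == fp.y;
}
```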


gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/home/kugan.vivekanandarajah/install/usr/local/bin/../libexec/gcc/aarch64-unknown-linux-gnu/8.2.1/lto-wrapper
Target: aarch64-unknown-linux-gnu
Configured with: ../gcc/configure --disable-bootstrap
Thread model: posix
gcc version 8.2.1 20180907 (GCC)

[Bug target/86677] popcount builtin detection is breaking some kernel build

2018-07-26 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86677

--- Comment #2 from kugan at gcc dot gnu.org ---
(In reply to Richard Biener from comment #1)
> The kernel simply has to provide __popcount{s,d}i2 like it provides other
> libgcc functions if it chooses to not link against libgcc.

Yes, I created this bug just so that I can point it to the kernel people. I
will raise it with the kernel people internally and see what I can do. Thanks.
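
For reference, a helper of the kind Richard describes could look roughly like the sketch below. The name sketch_popcount64 is hypothetical; a real replacement would have to be named __popcountdi2 (and a 32-bit sibling __popcountsi2) to satisfy the calls GCC emits. It is written as a branch-free bit-twiddling routine rather than the `b &= b - 1` loop, since that loop is the very pattern being detected.

```c
#include <stdint.h>

/* Hypothetical name; see the note above about __popcountdi2.
   Counts set bits by summing adjacent bit fields of growing width.  */
int sketch_popcount64(uint64_t x)
{
  /* 2-bit fields each hold the count of their two bits.  */
  x = x - ((x >> 1) & 0x5555555555555555ULL);
  /* 4-bit fields each hold the count of their four bits.  */
  x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
  /* Each byte holds the count of its eight bits.  */
  x = (x + (x >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
  /* The multiply sums all eight bytes into the top byte.  */
  return (int) ((x * 0x0101010101010101ULL) >> 56);
}
```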

[Bug target/86677] New: popcount builtin detection is breaking some kernel build

2018-07-25 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86677

Bug ID: 86677
   Summary: popcount builtin detection is breaking some kernel
build
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kugan at gcc dot gnu.org
  Target Milestone: ---

The Linux kernel build breaks for arm/aarch64 (and possibly other targets) when
the back end does not provide appropriate popcount patterns.

For aarch64, this happens because the kernel is built with -mgeneral-regs-only.

Also discussed in:
https://gcc.gnu.org/ml/gcc-patches/2018-07/msg00489.html

[Bug tree-optimization/86544] Popcount detection generates different code on C and C++

2018-07-18 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86544

--- Comment #4 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Wed Jul 18 22:11:24 2018
New Revision: 262864

URL: https://gcc.gnu.org/viewcvs?rev=262864&root=gcc&view=rev
Log:
gcc/ChangeLog:

2018-07-18  Kugan Vivekanandarajah  

PR middle-end/86544
* tree-ssa-phiopt.c (cond_removal_in_popcount_pattern): Handle
comparison with EQ_EXPR in last stmt.

gcc/testsuite/ChangeLog:

2018-07-18  Kugan Vivekanandarajah  

PR middle-end/86544
* g++.dg/tree-ssa/pr86544.C: New test.


Added:
trunk/gcc/testsuite/g++.dg/tree-ssa/pr86544.C
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-phiopt.c

[Bug tree-optimization/86544] Popcount detection generates different code on C and C++

2018-07-17 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86544

--- Comment #2 from kugan at gcc dot gnu.org ---
Patch posted at https://gcc.gnu.org/ml/gcc-patches/2018-07/msg00975.html

[Bug tree-optimization/86544] Popcount detection generates different code on C and C++

2018-07-17 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86544

--- Comment #1 from kugan at gcc dot gnu.org ---
(In reply to ktkachov from comment #0)
> Great to see that GCC now detects the popcount loop in PR 82479!
> I am seeing some curious differences between gcc and g++ though.
> int
> pc (unsigned long long b)
> {
> int c = 0;
> 
> while (b) {
> b &= b - 1;
> c++;
> }
> 
> return c;
> }
> 
> If compiled with gcc -O3 on aarch64 this gives:
> pc:
> fmov d0, x0
> cnt v0.8b, v0.8b
> addv b0, v0.8b
> umov w0, v0.b[0]
> ret
> 
> whereas if compiled with g++ -O3 it gives:
> _Z2pcy:
> .LFB0:
> .cfi_startproc
> fmov d0, x0
> cmp x0, 0
> cnt v0.8b, v0.8b
> addv b0, v0.8b
> umov w0, v0.b[0]
> and x0, x0, 255
> csel w0, w0, wzr, ne
> ret
> 
> which is suboptimal. It seems that phiopt3 manages to optimise the C version
> better. The GIMPLE dumps just before the phiopt pass are:
> For the C (good version):
> 
>   int c;
>   int _7;
> 
>   <bb 2> [local count: 118111601]:
>   if (b_4(D) != 0)
>     goto <bb 3>; [89.00%]
>   else
>     goto <bb 4>; [11.00%]
> 
>   <bb 3> [local count: 105119324]:
>   _7 = __builtin_popcountl (b_4(D));
> 
>   <bb 4> [local count: 118111601]:
>   # c_12 = PHI <0(2), _7(3)>
>   return c_12;
> 
> 
> For the C++ (bad version):
> 
>   int c;
>   int _7;
> 
>   <bb 2> [local count: 118111601]:
>   if (b_4(D) == 0)
>     goto <bb 4>; [11.00%]
>   else
>     goto <bb 3>; [89.00%]
> 
>   <bb 3> [local count: 105119324]:
>   _7 = __builtin_popcountl (b_4(D));
> 
>   <bb 4> [local count: 118111601]:
>   # c_12 = PHI <0(2), _7(3)>
>   return c_12;
> 
> As you can see the order of the gotos and the jump conditions is inverted.
> 
> It seems to me that the two are equivalent and GCC could be doing a better
> job of optimising.
> 
> Can we improve phiopt to handle this more effectively?

Thanks for the test case. I will look at it.
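
To make the equivalence concrete, the two GIMPLE shapes quoted above can be written back as C roughly as follows. This is a sketch; the function names are hypothetical, and __builtin_popcountll is used here because the argument is unsigned long long (the dump shows __builtin_popcountl, which has the same width on LP64). Since the popcount of 0 is 0, the zero guard is redundant whichever way the comparison is written.

```c
/* Shape of the C dump: guard tests b != 0.  */
int pc_guard_ne(unsigned long long b)
{
  return b != 0 ? __builtin_popcountll(b) : 0;
}

/* Shape of the C++ dump: guard tests b == 0, arms swapped.  */
int pc_guard_eq(unsigned long long b)
{
  return b == 0 ? 0 : __builtin_popcountll(b);
}

/* What both forms could be simplified to, because
   __builtin_popcountll(0) is 0.  */
int pc_unguarded(unsigned long long b)
{
  return __builtin_popcountll(b);
}
```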

[Bug tree-optimization/86489] ICE in gimple_phi_arg starting with r261682 when building 531.deepsjeng_r with FDO + LTO

2018-07-12 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86489

--- Comment #7 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Fri Jul 13 05:25:47 2018
New Revision: 262622

URL: https://gcc.gnu.org/viewcvs?rev=262622&root=gcc&view=rev
Log:
gcc/ChangeLog:

2018-07-13  Kugan Vivekanandarajah  
Richard Biener  

PR middle-end/86489
* tree-ssa-loop-niter.c (number_of_iterations_popcount): Check
that the loop latch destination is where the phi is defined.

gcc/testsuite/ChangeLog:

2018-07-13  Kugan Vivekanandarajah  

PR middle-end/86489
* gcc.dg/pr86489.c: New test.


Added:
trunk/gcc/testsuite/gcc.dg/pr86489.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-loop-niter.c

[Bug tree-optimization/86489] ICE in gimple_phi_arg starting with r261682 when building 531.deepsjeng_r with FDO + LTO

2018-07-12 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86489

--- Comment #3 from kugan at gcc dot gnu.org ---
(In reply to Richard Biener from comment #2)
>   gimple *phi = SSA_NAME_DEF_STMT (b_11);
>   if (gimple_code (phi) != GIMPLE_PHI
>   || (gimple_assign_lhs (and_stmt)
>   != gimple_phi_arg_def (phi, loop_latch_edge (loop)->dest_idx)))
> return false;
> 
> this may fail if the PHI in question is not the correct one in which case
> it may not have the argument at the latch dest_idx.  Try first verifying
> that the loop latch destination is indeed gimple_bb (phi).

Yes, thanks for spotting that. I am testing the following patch:

diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index f6fa2f7..fbdf838 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -2555,6 +2555,7 @@ number_of_iterations_popcount (loop_p loop, edge exit,
... = PHI .  */
   gimple *phi = SSA_NAME_DEF_STMT (b_11);
   if (gimple_code (phi) != GIMPLE_PHI
+  || (gimple_bb (phi) != loop_latch_edge (loop)->dest)
   || (gimple_assign_lhs (and_stmt)
  != gimple_phi_arg_def (phi, loop_latch_edge (loop)->dest_idx)))
 return false;

Is checking that there is an argument at the latch dest_idx (argument count of
PHI) still necessary?

[Bug tree-optimization/86489] ICE in gimple_phi_arg starting with r261682 when building 531.deepsjeng_r with FDO + LTO

2018-07-11 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86489

--- Comment #1 from kugan at gcc dot gnu.org ---
Sorry about the breakage. I am trying to reproduce it on x86-64. Please let me
know if you have a testcase.

[Bug middle-end/82479] missing popcount builtin detection

2018-06-16 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82479

--- Comment #13 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Sat Jun 16 21:39:31 2018
New Revision: 261682

URL: https://gcc.gnu.org/viewcvs?rev=261682&root=gcc&view=rev
Log:
gcc/ChangeLog:

2018-06-16  Kugan Vivekanandarajah  

PR middle-end/82479
* ipa-fnsummary.c (will_be_nonconstant_expr_predicate): Handle
CALL_EXPR.
* tree-scalar-evolution.c (interpret_expr): Likewise.
(expression_expensive_p): Likewise.
* tree-ssa-loop-ivopts.c (contains_abnormal_ssa_name_p): Likewise.
* tree-ssa-loop-niter.c (number_of_iterations_popcount): New.
(number_of_iterations_exit_assumptions): Use
number_of_iterations_popcount.
(ssa_defined_by_minus_one_stmt_p): New.

gcc/testsuite/ChangeLog:

2018-06-16  Kugan Vivekanandarajah  

PR middle-end/82479
* gcc.dg/tree-ssa/popcount.c: New test.
* gcc.dg/tree-ssa/popcount2.c: New test.


Added:
trunk/gcc/testsuite/gcc.dg/tree-ssa/popcount.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/popcount2.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/ipa-fnsummary.c
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-scalar-evolution.c
trunk/gcc/tree-ssa-loop-ivopts.c
trunk/gcc/tree-ssa-loop-niter.c

[Bug tree-optimization/64946] [AArch64] gcc.target/aarch64/vect-abs-compile.c - "abs" vectorization fails for char/short types

2018-06-16 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64946

--- Comment #24 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Sat Jun 16 21:34:29 2018
New Revision: 261681

URL: https://gcc.gnu.org/viewcvs?rev=261681&root=gcc&view=rev
Log:
gcc/ChangeLog:

2018-06-16  Kugan Vivekanandarajah  

PR middle-end/64946
* cfgexpand.c (expand_debug_expr): Handle ABSU_EXPR.
* config/i386/i386.c (ix86_add_stmt_cost): Likewise.
* dojump.c (do_jump): Likewise.
* expr.c (expand_expr_real_2): Check operand type's sign.
* fold-const.c (const_unop): Handle ABSU_EXPR.
(fold_abs_const): Likewise.
* gimple-pretty-print.c (dump_unary_rhs): Likewise.
* gimple-ssa-backprop.c (backprop::process_assign_use): Likewise.
(strip_sign_op_1): Likewise.
* match.pd: Add new pattern to generate ABSU_EXPR.
* optabs-tree.c (optab_for_tree_code): Handle ABSU_EXPR.
* tree-cfg.c (verify_gimple_assign_unary): Likewise.
* tree-eh.c (operation_could_trap_helper_p): Likewise.
* tree-inline.c (estimate_operator_cost): Likewise.
* tree-pretty-print.c (dump_generic_node): Likewise.
* tree-vect-patterns.c (vect_recog_sad_pattern): Likewise.
* tree.def (ABSU_EXPR): New.

gcc/c-family/ChangeLog:

2018-06-16  Kugan Vivekanandarajah  

* c-common.c (c_common_truthvalue_conversion): Handle ABSU_EXPR.

gcc/c/ChangeLog:

2018-06-16  Kugan Vivekanandarajah  

* c-typeck.c (build_unary_op): Handle ABSU_EXPR;
* gimple-parser.c (c_parser_gimple_statement): Likewise.
(c_parser_gimple_unary_expression): Likewise.

gcc/cp/ChangeLog:

2018-06-16  Kugan Vivekanandarajah  

* constexpr.c (potential_constant_expression_1): Handle ABSU_EXPR.
* cp-gimplify.c (cp_fold): Likewise.

gcc/testsuite/ChangeLog:

2018-06-16  Kugan Vivekanandarajah  

PR middle-end/64946
* gcc.dg/absu.c: New test.
* gcc.dg/gimplefe-29.c: New test.
* gcc.target/aarch64/pr64946.c: New test.


Added:
trunk/gcc/testsuite/gcc.dg/absu.c
trunk/gcc/testsuite/gcc.dg/gimplefe-29.c
trunk/gcc/testsuite/gcc.target/aarch64/pr64946.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/c-family/ChangeLog
trunk/gcc/c-family/c-common.c
trunk/gcc/c/ChangeLog
trunk/gcc/c/c-typeck.c
trunk/gcc/c/gimple-parser.c
trunk/gcc/cfgexpand.c
trunk/gcc/config/i386/i386.c
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/constexpr.c
trunk/gcc/cp/cp-gimplify.c
trunk/gcc/dojump.c
trunk/gcc/expr.c
trunk/gcc/fold-const.c
trunk/gcc/gimple-pretty-print.c
trunk/gcc/gimple-ssa-backprop.c
trunk/gcc/match.pd
trunk/gcc/optabs-tree.c
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-cfg.c
trunk/gcc/tree-eh.c
trunk/gcc/tree-inline.c
trunk/gcc/tree-pretty-print.c
trunk/gcc/tree-vect-patterns.c
trunk/gcc/tree.def

[Bug fortran/78387] OpenMP segfault/stack size exceeded writing to internal file

2017-10-15 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78387

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 CC||kugan at gcc dot gnu.org

--- Comment #17 from kugan at gcc dot gnu.org ---
*** Bug 82555 has been marked as a duplicate of this bug. ***

[Bug libfortran/82555] SPECcpu201 Wrf_s deadlock

2017-10-15 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82555

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #6 from kugan at gcc dot gnu.org ---


*** This bug has been marked as a duplicate of bug 78387 ***

[Bug libfortran/82555] SPECcpu201 Wrf_s deadlock

2017-10-14 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82555

--- Comment #5 from kugan at gcc dot gnu.org ---
(In reply to Andrew Pinski from comment #4)
> Actually PR 78387 seems exactly this issue.  Please test with a newer
> version of gfortran.

Thanks Andrew. Looks like this is the issue. So far, current trunk is
continuing without error.

[Bug libgomp/82555] SPECcpu201 Wrf_s deadlock

2017-10-14 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82555

--- Comment #1 from kugan at gcc dot gnu.org ---
My gcc is slightly old. 
gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/home/kugan.vivekanandarajah/install/test/usr/local/bin/../libexec/gcc/aarch64-unknown-linux-gnu/8.0.0/lto-wrapper
Target: aarch64-unknown-linux-gnu
Configured with: ../gcc-exp2/configure : (reconfigured) ../gcc-exp2/configure
--enable-languages=c,c++,fortran,lto,objc --no-create --no-recursion
Thread model: posix
gcc version 8.0.0 20170822 (experimental) (GCC)

I will try with the latest version.

[Bug libgomp/82555] New: SPECcpu201 Wrf_s deadlock

2017-10-14 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82555

Bug ID: 82555
   Summary: SPECcpu201 Wrf_s deadlock
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libgomp
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kugan at gcc dot gnu.org
CC: jakub at gcc dot gnu.org
  Target Milestone: ---

Wrf_s hangs or deadlocks when run on 48 threads (cores). It doesn't always
happen; I have to run with --iterations=111 and it eventually happens,
sometimes in the 2nd iteration and sometimes much later.

I attached gdb to the process and the backtrace is:
(gdb) bt
#0  0x01019924 in __lll_lock_wait (futex=futex@entry=0x2c3b1e0
<_gfortrani_unit_lock>, private=0) at lowlevellock.c:43
#1  0x01012cbc in __pthread_mutex_lock (mutex=0x2c3b1e0
<_gfortrani_unit_lock>) at pthread_mutex_lock.c:80
#2  0x00fd20ac in __gthread_mutex_lock (__mutex=0x2c3b1e0
<_gfortrani_unit_lock>) at ../libgcc/gthr-default.h:748
#3  _gfortrani_close_units () at ../../../gcc-exp2/libgfortran/io/unit.c:835
#4  0x0103950c in __libc_csu_fini ()
#5  0x0103f068 in __run_exit_handlers ()
#6  0x0103f0b0 in exit ()
#7  0x00fc6e60 in _gfortrani_exit_error (status=1, status@entry=3) at
../../../gcc-exp2/libgfortran/runtime/error.c:196
#8  0x00fc7314 in _gfortrani_internal_error
(cmp=cmp@entry=0xcdf23d00, 
message=message@entry=0x11548a8 "stash_internal_unit(): Stack Size
Exceeded") at ../../../gcc-exp2/libgfortran/runtime/error.c:422
#9  0x00fd1a84 in _gfortrani_stash_internal_unit (dtp=0xcdf23d00)
at ../../../gcc-exp2/libgfortran/io/unit.c:549
#10 0x00fd0f6c in _gfortran_st_write_done (dtp=0xcdf23d00) at
../../../gcc-exp2/libgfortran/io/transfer.c:4168
#11 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#12 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#13 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#14 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#15 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#16 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#17 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#18 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#19 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#20 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#21 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#22 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#23 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#24 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#25 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#26 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#27 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#28 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#29 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#30 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#31 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#32 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#33 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#34 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#35 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#36 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#37 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#38 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#39 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#40 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#41 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#42 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#43 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#44 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#45 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()
#46 0x00db933c in __module_ra_rrtm_MOD_rrtmlwrad ()

I am running this on AArch64 but I don't think this is an AArch64-specific
issue. Is anyone else seeing this?

[Bug middle-end/82479] missing popcount builtin detection

2017-10-08 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82479

--- Comment #4 from kugan at gcc dot gnu.org ---
(In reply to Andrew Pinski from comment #2)
> Confirmed. How useful this optimization is questionable.

This code is part of spec2017/deepsjeng. There is some gain if we can detect it.

> 
> Gcc has __builtin_popcount which can be used.

I agree.

[Bug middle-end/82479] missing popcount builtin detection

2017-10-08 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82479

--- Comment #1 from kugan at gcc dot gnu.org ---
gcc trunk generates:
PopCount:
mov w2, 0
cbz x0, .L1
.p2align 3
.L3:
sub x1, x0, #1
add w2, w2, 1
ands x0, x0, x1
bne .L3
.L1:
mov w0, w2
ret

[Bug middle-end/82479] New: missing popcount builtin detection

2017-10-08 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82479

Bug ID: 82479
   Summary: missing popcount builtin detection
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kugan at gcc dot gnu.org
  Target Milestone: ---

gcc does not have support for detecting the popcount builtin pattern. As a
result, gcc generates bad code for:

int PopCount (long b) {
int c = 0;

while (b) {
b &= b - 1;
c++;
}
return c;
}

clang seems to do that and generates (for aarch64):

_Z8PopCounty:
fmov d0, x0
cnt  v0.8b, v0.8b
uaddlv  h0, v0.8b
fmov w0, s0
ret
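
The reason the loop can be replaced is that each `b &= b - 1` step clears exactly one set bit, so the iteration count equals the number of set bits. A small sketch of that equivalence follows. The function names are hypothetical, and unsigned long is used instead of long so that `b - 1` cannot overflow; __builtin_popcountl is the existing GCC/Clang builtin.

```c
/* Loop form: one iteration per set bit, because b & (b - 1)
   clears the lowest set bit of b.  */
int popcount_loop(unsigned long b)
{
  int c = 0;

  while (b) {
    b &= b - 1;
    c++;
  }
  return c;
}

/* Builtin form that the loop could be replaced with.  */
int popcount_builtin(unsigned long b)
{
  return __builtin_popcountl(b);
}
```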

[Bug tree-optimization/81558] Loop not vectorized

2017-07-26 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81558

--- Comment #2 from kugan at gcc dot gnu.org ---

> Does LLVM do a runtime alias check here?  For foo1 GCC adds a runtime alias
> check
> (BB vectorization cannot version for aliasing).

Yes. LLVM does not seem to be unrolling the inner loop. As you said, when
disabling cunrolli it works. cunroll pass will unroll after loop vectorisation.
Can anything be done with the heuristics for this case? Thanks.

[Bug middle-end/81558] New: Loop not vectorized

2017-07-26 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81558

Bug ID: 81558
   Summary: Loop not vectorized
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kugan at gcc dot gnu.org
  Target Milestone: ---

For the testcase:

struct I
{
  int opix_x;
  int opix_y;
};

//#define R 
#define R __restrict__
extern struct I * R img;
extern unsigned short ** R imgY_org;
extern unsigned short orig_blocks[256];

void foo1 (int n)
{
  int x = 1, y = 1;
  unsigned short *orgptr=orig_blocks;
  // Vectorized
  for (y = 0; y < img->opix_y; y++)
    for (x = 0; x < img->opix_x; x++)
      *orgptr++ = imgY_org [y][x];
}

void foo2 (int n)
{
  int x = 1, y = 1;
  unsigned short *orgptr=orig_blocks;
  // Not vectorized
  for (y = img->opix_y; y < img->opix_y+16; y++)
    for (x = img->opix_x; x < img->opix_x+16; x++)
      *orgptr++ = imgY_org [y][x];
}

Loop in foo2 is not vectorized.

In the *.156t.vect, I see:
Creating dr for *_40
analyze_innermost: failed: evolution of base is not affine.
base_address: 
offset from base address: 
constant offset from base address: 
step: 
aligned to: 
base_object: *_40


LLVM seems to be able to vectorize this.

[Bug tree-optimization/80612] [7/8 Regression] ICE in get_range_info, at tree-ssanames.c:375

2017-05-03 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80612

--- Comment #5 from kugan at gcc dot gnu.org ---
(In reply to Marek Polacek from comment #4)
> This should fix it:
> 
> --- a/gcc/calls.c
> +++ b/gcc/calls.c
> @@ -1270,7 +1270,7 @@ get_size_range (tree exp, tree range[2])
>  
>wide_int min, max;
>enum value_range_type range_type
> -= (TREE_CODE (exp) == SSA_NAME
> += ((TREE_CODE (exp) == SSA_NAME && INTEGRAL_TYPE_P (TREE_TYPE (exp)))
> ? get_range_info (exp, &min, &max) : VR_VARYING);
>  
>if (range_type == VR_VARYING)

Looked at the other uses of get_range_info too. There are uses of this in
gcc/gimple-ssa-warn-alloca.c without the check for INTEGRAL_TYPE_P but I think
it is intentional.

[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2017-01-22 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 CC||kugan at gcc dot gnu.org

--- Comment #26 from kugan at gcc dot gnu.org ---
(In reply to Richard Biener from comment #20)
> Look at tree-ssanames.c:range_info_def for "tricks" (make them variable
> size):
> 
> /* Value range information for SSA_NAMEs representing non-pointer variables.
> */
> 
> struct GTY ((variable_size)) range_info_def {
>   /* Minimum, maximum and nonzero bits.  */
>   TRAILING_WIDE_INT_ACCESSOR (min, ints, 0)
>   TRAILING_WIDE_INT_ACCESSOR (max, ints, 1)
>   TRAILING_WIDE_INT_ACCESSOR (nonzero_bits, ints, 2)
>   trailing_wide_ints <3> ints;
> };

I am working on a patch to change ipa vrp based on the above.

[Bug tree-optimization/78721] [7 Regression] ICE on valid code at -O2 and -O3 on x86_64-linux-gnu: in set_value_range, at tree-vrp.c:371

2016-12-09 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78721

--- Comment #4 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Fri Dec  9 19:47:10 2016
New Revision: 243501

URL: https://gcc.gnu.org/viewcvs?rev=243501&root=gcc&view=rev
Log:
gcc/testsuite/ChangeLog:

2016-12-09  Kugan Vivekanandarajah  <kug...@linaro.org>

PR ipa/78721
* gcc.dg/pr78721.c: New test.

gcc/ChangeLog:

2016-12-09  Kugan Vivekanandarajah  <kug...@linaro.org>

PR ipa/78721
* ipa-cp.c (propagate_vr_accross_jump_function): drop_tree_overflow
after fold_convert.


Added:
trunk/gcc/testsuite/gcc.dg/pr78721.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/ipa-cp.c
trunk/gcc/testsuite/ChangeLog

[Bug tree-optimization/78721] [7 Regression] ICE on valid code at -O2 and -O3 on x86_64-linux-gnu: in set_value_range, at tree-vrp.c:371

2016-12-08 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78721

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 CC||kugan at gcc dot gnu.org

--- Comment #3 from kugan at gcc dot gnu.org ---
Created attachment 40280
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40280&action=edit
untested patch

[Bug tree-optimization/77862] [7 Regression] ice in add_equivalence

2016-12-07 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77862

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #8 from kugan at gcc dot gnu.org ---
Fixed in trunk.

[Bug tree-optimization/72835] [7 Regression] Incorrect arithmetic optimization involving bitfield arguments

2016-11-21 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72835

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from kugan at gcc dot gnu.org ---
Fixed in trunk.

[Bug tree-optimization/71408] [7 Regression] wrong code at -Os and above on x86_64-linux-gnu

2016-11-21 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71408

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #7 from kugan at gcc dot gnu.org ---
Fixed in trunk.

[Bug tree-optimization/40921] missed optimization: x + (-y * z * z) => x - y * z * z

2016-11-21 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40921

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 CC||kugan at gcc dot gnu.org
 Resolution|--- |FIXED

--- Comment #6 from kugan at gcc dot gnu.org ---
Fixed in trunk.

[Bug ipa/78296] [7 regression] test case gcc.dg/ipa/vrp7.c fails starting with r242032

2016-11-17 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78296

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from kugan at gcc dot gnu.org ---
Fixed with r242368.

[Bug c/78365] [7 Regression] ICE in determine_value_range, at tree-ssa-loop-niter.c:413

2016-11-17 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78365

--- Comment #6 from kugan at gcc dot gnu.org ---
(In reply to Richard Biener from comment #5)
> IPA has to deal with argument mismatches (I think I've said this elsewhere).

As I understand it, this is along the lines of what you found earlier, but a different issue. I
posted a patch at https://gcc.gnu.org/ml/gcc-patches/2016-11/msg01878.html for
review.

[Bug c/78365] [7 Regression] ICE in determine_value_range, at tree-ssa-loop-niter.c:413

2016-11-15 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78365

--- Comment #4 from kugan at gcc dot gnu.org ---
bug320.c also has the same issue:

static void finddpos (coord *,int,int,int,int);

bug320.c +10093 has:
static void
finddpos(cc, xl,yl,xh,yh)
coord *cc;
xchar xl,yl,xh,yh;

[Bug c/78365] [7 Regression] ICE in determine_value_range, at tree-ssa-loop-niter.c:413

2016-11-15 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78365

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 CC||kugan at gcc dot gnu.org

--- Comment #3 from kugan at gcc dot gnu.org ---
The reduced testcase looks invalid:

a, b, c;
char d;
static fn1(int *, int);
fn1(cc, yh) int *cc;
char yh;
{
  char y;
  a = fn2(c - b + 1);
  for (; y <= yh; y++)
    ;
}
fn3() {
  fn1(fn3, 1);
  fn1(fn3, d - 1);
}


static fn1(int *, int); is the prototype
and then we have

fn1(cc, yh) int *cc;
char yh;

The second argument is now char. I think the FE should reject this.

[Bug ipa/78258] [7 Regression] ICE in compare_values_warnv, at tree-vrp.c:1218

2016-11-14 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78258

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #7 from kugan at gcc dot gnu.org ---
Duplicate and fixed.

*** This bug has been marked as a duplicate of bug 78121 ***

[Bug tree-optimization/78121] [7 Regression] ice in set_value_range

2016-11-14 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78121

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 CC||gerhard.steinmetz.fortran@t-online.de

--- Comment #9 from kugan at gcc dot gnu.org ---
*** Bug 78258 has been marked as a duplicate of this bug. ***

[Bug ipa/78258] [7 Regression] ICE in compare_values_warnv, at tree-vrp.c:1218

2016-11-13 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78258

--- Comment #5 from kugan at gcc dot gnu.org ---
Looks like a dup of PR78121, which is fixed. z1.f90 is now working.

[Bug ipa/78296] [7 regression] test case gcc.dg/ipa/vrp7.c fails starting with r242032

2016-11-11 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78296

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 CC||amker at gcc dot gnu.org

--- Comment #2 from kugan at gcc dot gnu.org ---
*** Bug 78316 has been marked as a duplicate of this bug. ***

[Bug ipa/78316] FAIL: gcc.dg/ipa/vrp7.c scan-ipa-dump-times cp "Setting value range of param 0 \\[-10, 9\\]" 1

2016-11-11 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78316

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 CC||kugan at gcc dot gnu.org
 Resolution|--- |DUPLICATE

--- Comment #3 from kugan at gcc dot gnu.org ---
duplicate.

*** This bug has been marked as a duplicate of bug 78296 ***

[Bug ipa/78296] [7 regression] test case gcc.dg/ipa/vrp7.c fails starting with r242032

2016-11-10 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78296

kugan at gcc dot gnu.org changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |kugan at gcc dot gnu.org

--- Comment #1 from kugan at gcc dot gnu.org ---
(In reply to Bill Seurer from comment #0)
> spawn /home/seurer/gcc/build/gcc-test2/gcc/xgcc
> -B/home/seurer/gcc/build/gcc-test2/gcc/
> /home/seurer/gcc/gcc-test2/gcc/testsuite/gcc.dg/ipa/vrp7.c
> -fno-diagnostics-show-caret -fdiagnostics-color=never -O2
> -fdump-ipa-cp-details -S -o vrp7.s
> PASS: gcc.dg/ipa/vrp7.c (test for excess errors)
> FAIL: gcc.dg/ipa/vrp7.c scan-ipa-dump-times cp "Setting value range of param
> 0 \\[-10, 9\\]" 1
> 
>   === gcc Summary ===
> 
> # of expected passes  1
> # of unexpected failures  1

Thanks for the report. This is expected as I have reverted r241990 which does
this optimization. I will repost r241990 as soon as I have fixed the bootstrap
comparison issue.

[Bug ipa/78268] [7 Regression] internal compiler error: Segmentation fault

2016-11-09 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78268

--- Comment #1 from kugan at gcc dot gnu.org ---
(In reply to Markus Trippelsdorf from comment #0)
> Either r241990 or r241989 causes a new ICE during Firefox build:
> 
> /home/trippels/gecko-dev/rdf/base/rdfutil.cpp:111:1: internal compiler
> error: Segmentation fault
>  }
>  ^
> 0x10b6b1d3 crash_signal
> ../../gcc/gcc/toplev.c:338
> 0x108308dc unshare_expr_without_location(tree_node*)
> ../../gcc/gcc/gimplify.c:978
> 0x10903163 ipa_set_jf_arith_pass_through
> ../../gcc/gcc/ipa-prop.c:468
> 0x10903163 update_jump_functions_after_inlining
> ../../gcc/gcc/ipa-prop.c:2645
> 0x10915f8b propagate_info_to_inlined_callees
> ../../gcc/gcc/ipa-prop.c:3409
> 0x10917c1f ipa_propagate_indirect_call_infos(cgraph_edge*, vec<cgraph_edge*,
> va_heap, vl_ptr>*)
> ../../gcc/gcc/ipa-prop.c:3561
> 0x113c401b inline_call(cgraph_edge*, bool, vec<cgraph_edge*, va_heap,
> vl_ptr>*, int*, bool, bool*)
> ../../gcc/gcc/ipa-inline-transform.c:447
> 0x113b973b inline_small_functions
> ../../gcc/gcc/ipa-inline.c:2029
> 0x113b973b ipa_inline
> ../../gcc/gcc/ipa-inline.c:2439
> 0x113b973b execute
> ../../gcc/gcc/ipa-inline.c:2850
> 
> 
> Reducing.

Sorry about the breakage. Can you please attach the preprocessed source file so
that I can reproduce this?

[Bug tree-optimization/78121] [7 Regression] ice in set_value_range

2016-11-08 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78121

--- Comment #7 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Wed Nov  9 01:41:26 2016
New Revision: 241989

URL: https://gcc.gnu.org/viewcvs?rev=241989&root=gcc&view=rev
Log:
Fix ice in set_value_range
gcc/ChangeLog:

2016-11-09  Kugan Vivekanandarajah  <kug...@linaro.org>

PR ipa/78121
* ipa-cp.c (propagate_vr_accross_jump_function): Pass param type.
Also fold constant passed as argument while computing value range.
(propagate_constants_accross_call): Pass param type.
* ipa-prop.c: export ipa_get_callee_param_type.
* ipa-prop.h: export ipa_get_callee_param_type.

gcc/testsuite/ChangeLog:

2016-11-09  Kugan Vivekanandarajah  <kug...@linaro.org>

PR ipa/78121
* gcc.dg/ipa/pr78121.c: New test.



Added:
trunk/gcc/testsuite/gcc.dg/ipa/pr78121.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/ipa-cp.c
trunk/gcc/ipa-prop.c
trunk/gcc/ipa-prop.h
trunk/gcc/testsuite/ChangeLog

[Bug tree-optimization/78121] [7 Regression] ice in set_value_range

2016-11-05 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78121

--- Comment #6 from kugan at gcc dot gnu.org ---
(In reply to David Binderman from comment #5)
> (In reply to kugan from comment #4)
> > Created attachment 39904 [details]
> > untested patch
> > 
> > testing this patch
> 
> patch any good ?

Posted at https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02309.html and
waiting for Honza's approval.

[Bug tree-optimization/78121] [7 Regression] ice in set_value_range

2016-10-26 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78121

--- Comment #4 from kugan at gcc dot gnu.org ---
Created attachment 39904
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39904=edit
untested patch

testing this patch

[Bug tree-optimization/78121] [7 Regression] ice in set_value_range

2016-10-26 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78121

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 CC||kugan at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |kugan at gcc dot gnu.org

--- Comment #3 from kugan at gcc dot gnu.org ---
Looks like an ipa-vrp issue. I will have a look.

[Bug tree-optimization/77921] [7 Regression] tree-ssanames.c miscompiled during PGO bootstrap

2016-10-10 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77921

--- Comment #4 from kugan at gcc dot gnu.org ---
Sorry about the breakage. I will try to reproduce it.

(In reply to Markus Trippelsdorf from comment #1)
> gcc version 7.0.0 20161007 was fine
Are you saying that this issue has gone latent? 20161007 should have
early-vrp and ipa-vrp.

[Bug tree-optimization/77862] [7 Regression] ice in add_equivalence

2016-10-06 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77862

--- Comment #5 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Thu Oct  6 19:58:46 2016
New Revision: 240842

URL: https://gcc.gnu.org/viewcvs?rev=240842=gcc=rev
Log:
Fix PR77862
gcc/testsuite/ChangeLog:

2016-10-06  Kugan Vivekanandarajah  <kug...@linaro.org>

PR tree-optimization/77862
* gcc.dg/pr77862.c: New test.

gcc/ChangeLog:

2016-10-06  Kugan Vivekanandarajah  <kug...@linaro.org>

PR tree-optimization/77862
* tree-vrp.c (add_equivalence): Use get_value_range so that
num_vr_values is checked before accessing vr_values.



Added:
trunk/gcc/testsuite/gcc.dg/pr77862.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-vrp.c
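
The idea behind this fix, going through a checked accessor instead of indexing the table directly, can be sketched in plain C. This is only a toy illustration: toy_range, toy_values, toy_get_range and toy_range_width are hypothetical names, not GCC's data structures, and returning NULL for an out-of-range version is a simplification chosen for the sketch rather than what get_value_range does.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins (hypothetical) for a table of per-SSA-name ranges.  */
struct toy_range { int min, max; };

enum { TOY_NUM_VALUES = 4 };
static struct toy_range toy_values[TOY_NUM_VALUES]
  = { {0, 1}, {2, 3}, {4, 5}, {6, 7} };

/* Checked accessor: names created after the table was sized have no
   entry, so the bound is tested before the table is touched.  */
static struct toy_range *toy_get_range (unsigned version)
{
  if (version >= TOY_NUM_VALUES)
    return NULL;
  return &toy_values[version];
}

/* A caller that tolerates a missing entry instead of reading past the
   end of the table.  Returns -1 when no range is recorded.  */
static int toy_range_width (unsigned version)
{
  struct toy_range *r = toy_get_range (version);
  if (r == NULL)
    return -1;
  assert (r->min <= r->max);
  return r->max - r->min;
}
```

The point is only that every access goes through one function that owns the bounds check, so a caller cannot forget it.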

[Bug tree-optimization/77862] [7 Regression] ice in add_equivalence

2016-10-05 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77862

--- Comment #4 from kugan at gcc dot gnu.org ---
Patch posted for review at:
https://gcc.gnu.org/ml/gcc-patches/2016-10/msg00349.html

[Bug tree-optimization/77677] [7 Regression] ICE at -O1 and above in both 32-bit and 64-bit modes on x86_64-linux-gnu (internal compiler error: in set_value_range, at tree-vrp.c:361)

2016-09-26 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77677

--- Comment #11 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Tue Sep 27 03:41:14 2016
New Revision: 240517

URL: https://gcc.gnu.org/viewcvs?rev=240517=gcc=rev
Log:
Fix ipa-vrp convert value_range

gcc/ChangeLog:

2016-09-27  Kugan Vivekanandarajah  <kug...@linaro.org>

PR ipa/77677
* ipa-prop.c (ipa_compute_jump_functions_for_edge): Use
extract_range_from_unary_expr to convert value_range.
* tree-vrp.c (extract_range_from_unary_expr_1): Rename to.
(extract_range_from_unary_expr): This.
* tree-vrp.h (extract_range_from_unary_expr): Declare.

gcc/testsuite/ChangeLog:

2016-09-27  Kugan Vivekanandarajah  <kug...@linaro.org>

PR ipa/77677
* gcc.dg/torture/pr77677-2.c: New test.


Added:
trunk/gcc/testsuite/gcc.dg/torture/pr77677-2.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/ipa-prop.c
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-vrp.c
trunk/gcc/tree-vrp.h

[Bug tree-optimization/77719] [7 Regression] ICE in pp_string, at pretty-print.c:955

2016-09-26 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77719

--- Comment #7 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Mon Sep 26 18:16:23 2016
New Revision: 240505

URL: https://gcc.gnu.org/viewcvs?rev=240505=gcc=rev
Log:
Fix PR77719
gcc/testsuite/ChangeLog:

2016-09-26  Kugan Vivekanandarajah  <kug...@linaro.org>

PR middle-end/77719
* gfortran.dg/pr77719.f90: New test.

gcc/ChangeLog:

2016-09-26  Kugan Vivekanandarajah  <kug...@linaro.org>

PR middle-end/77719
* tree-ssa-reassoc.c (make_new_ssa_for_def): Use gimple_get_lhs to get
lhs instead of gimple_assign_lhs as stmt can be builtins too.



Added:
trunk/gcc/testsuite/gfortran.dg/pr77719.f90
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-reassoc.c

[Bug tree-optimization/77719] [7 Regression] ICE in pp_string, at pretty-print.c:955

2016-09-24 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77719

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 CC||kugan at gcc dot gnu.org

--- Comment #5 from kugan at gcc dot gnu.org ---
(In reply to Joost VandeVondele from comment #0)
> recent trunk regression:
> 
> > cat bug.f90
> SUBROUTINE urep_egr(erep,derep,surr)
>   INTEGER, PARAMETER :: dp=8
>   REAL(dp), INTENT(inout)  :: erep, derep(3)
>   REAL(dp), INTENT(in) :: surr(2)
>   REAL(dp) :: de_z, rz
>   IF (n_urpoly > 0) THEN
> IF (r < spxr(1,1)) THEN
>   ispg: DO isp = 1,spdim ! condition ca)
> IF (isp /= spdim) THEN
>   nsp = 5 ! condition cb
>   DO jsp = 0,nsp
> IF( jsp <= 3 ) THEN
> ELSE
>   erep = erep + surr(jsp-3)*rz**(jsp)
> ENDIF
>   END DO
> END IF
>   END DO ispg
> END IF
>   END IF
> END SUBROUTINE urep_egr
> 
> > gfortran  -c -O3 -ffast-math bug.f90
> [...]
> in pp_string, at pretty-print.c:955
> 0x14506c6 pp_string
>   ../../gcc/gcc/pretty-print.c:955
> 0x14506c6 pp_string(pretty_printer*, char const*)
>   ../../gcc/gcc/pretty-print.c:953
> 0x14514e9 pp_format(pretty_printer*, text_info*)
>   ../../gcc/gcc/pretty-print.c:597
> 0x14445f1 diagnostic_report_diagnostic(diagnostic_context*, diagnostic_info*)
>   ../../gcc/gcc/diagnostic.c:941
> 0x1444e48 diagnostic_impl
>   ../../gcc/gcc/diagnostic.c:1064
> 0x1444f74 internal_error(char const*, ...)
>   ../../gcc/gcc/diagnostic.c:1349
> 0x9130f8 gimple_check_failed(gimple const*, char const*, int, char const*,
> gimple_code, tree_code)
>   ../../gcc/gcc/gimple.c:1177
> 0xd992d7 GIMPLE_CHECK2
>   ../../gcc/gcc/gimple.h:73
> 0xd8a037 gimple_phi_arg
>   ../../gcc/gcc/tree-phinodes.h:37
> 0xd8a037 gimple_phi_arg_imm_use_ptr
>   ../../gcc/gcc/tree-phinodes.h:37
> 0xd8a037 op_iter_next_use
>   ../../gcc/gcc/ssa-iterators.h:490
> 0xd8a037 link_use_stmts_after
>   ../../gcc/gcc/ssa-iterators.h:902
> 0xd8a037 next_imm_use_stmt
>   ../../gcc/gcc/ssa-iterators.h:955
> 0xd8a037 make_new_ssa_for_def
>   ../../gcc/gcc/tree-ssa-reassoc.c:1167
> 0xd8d908 make_new_ssa_for_all_defs
>   ../../gcc/gcc/tree-ssa-reassoc.c:1194
> 0xd8d908 zero_one_operation
>   ../../gcc/gcc/tree-ssa-reassoc.c:1338
> 0xd95430 undistribute_ops_list
>   ../../gcc/gcc/tree-ssa-reassoc.c:1684
> 0xd96178 reassociate_bb
>   ../../gcc/gcc/tree-ssa-reassoc.c:5393
> 0xd95fa7 reassociate_bb
>   ../../gcc/gcc/tree-ssa-reassoc.c:5528
> 0xd95fa7 reassociate_bb
>   ../../gcc/gcc/tree-ssa-reassoc.c:5528
> Please submit a full bug report,
> 
> > gfortran -v
> Using built-in specs.
> COLLECT_GCC=gfortran
> COLLECT_LTO_WRAPPER=/data/vjoost/gnu/gcc_trunk/install/libexec/gcc/x86_64-pc-
> linux-gnu/7.0.0/lto-wrapper
> Target: x86_64-pc-linux-gnu
> Configured with: ../gcc/configure
> --prefix=/data/vjoost/gnu/gcc_trunk/install --enable-languages=c,c++,fortran
> --disable-multilib --enable-plugins --enable-lto --disable-bootstrap
> Thread model: posix
> gcc version 7.0.0 20160924 (experimental) [trunk revision 240461] (GCC)

Sorry for the breakage. Sent a patch to fix this at
https://gcc.gnu.org/ml/gcc-patches/2016-09/msg01760.html.

[Bug tree-optimization/77677] [7 Regression] ICE at -O1 and above in both 32-bit and 64-bit modes on x86_64-linux-gnu (internal compiler error: in set_value_range, at tree-vrp.c:361)

2016-09-23 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77677

--- Comment #10 from kugan at gcc dot gnu.org ---
(In reply to Pat Haugen from comment #9)
> (In reply to kugan from comment #8)
> > Author: kugan
> > Date: Fri Sep 23 10:25:09 2016
> > New Revision: 240420
> > 
> > URL: https://gcc.gnu.org/viewcvs?rev=240420=gcc=rev
> > Log:
> > Drop TREE_OVERFLOW
> > 
> > gcc/ChangeLog:
> > 
> > 2016-09-23  Kugan Vivekanandarajah  <kug...@linaro.org>
> > 
> > PR ipa/77677
> > * ipa-cp.c (propagate_vr_accross_jump_function): Drop TREE_OVERFLOW
> > from constant while creating value range.
> 
> 
> Unfortunately this does not fix the problem building 176.gcc on powerpc.
> Following is reduced testcase. Failure occurs with 'gcc -O2'.
> 
> enum machine_mode { MAX_MACHINE_MODE };
> struct {
>   int mode : 8
> } a;
> b;
> static fn1();
> fn2() { fn1(a, a.mode); }
> 
> fn1(x, mode) enum machine_mode mode;
> { int c = b = c; }

Sorry for the breakage. Posted a patch at
https://gcc.gnu.org/ml/gcc-patches/2016-09/msg01746.html for this.

[Bug tree-optimization/77677] [7 Regression] ICE at -O1 and above in both 32-bit and 64-bit modes on x86_64-linux-gnu (internal compiler error: in set_value_range, at tree-vrp.c:361)

2016-09-23 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77677

--- Comment #8 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Fri Sep 23 10:25:09 2016
New Revision: 240420

URL: https://gcc.gnu.org/viewcvs?rev=240420=gcc=rev
Log:
Drop TREE_OVERFLOW

gcc/ChangeLog:

2016-09-23  Kugan Vivekanandarajah  <kug...@linaro.org>

PR ipa/77677
* ipa-cp.c (propagate_vr_accross_jump_function): Drop TREE_OVERFLOW
from constant while creating value range.

gcc/testsuite/ChangeLog:

2016-09-23  Kugan Vivekanandarajah  <kug...@linaro.org>

PR ipa/77677
* gcc.dg/torture/pr77677.c: New test.



Added:
trunk/gcc/testsuite/gcc.dg/torture/pr77677.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/ipa-cp.c
trunk/gcc/testsuite/ChangeLog

[Bug tree-optimization/77677] [7 Regression] ICE at -O1 and above in both 32-bit and 64-bit modes on x86_64-linux-gnu (internal compiler error: in set_value_range, at tree-vrp.c:361)

2016-09-22 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77677

--- Comment #7 from kugan at gcc dot gnu.org ---
(In reply to Bill Seurer from comment #6)
> The test case 176.gcc in the spec2000 benchmarks still fails apparently with
> the same error even after 240352
> 
> (this is with 240383)
> 
> loop.c: At top level:
> loop.c:6648:1: internal compiler error: in set_value_range, at tree-vrp.c:367
>  }
>  ^
> 0x10bf4f6f set_value_range
>   /home/seurer/gcc/gcc-test/gcc/tree-vrp.c:367
> 0x10bf9067 vrp_meet_1
>   /home/seurer/gcc/gcc-test/gcc/tree-vrp.c:8639
> 0x10bf9067 vrp_meet(value_range*, value_range const*)
>   /home/seurer/gcc/gcc-test/gcc/tree-vrp.c:8716
> 0x110a14c3 ipcp_vr_lattice::meet_with_1(value_range const*)
>   /home/seurer/gcc/gcc-test/gcc/ipa-cp.c:987
> 0x110a4c9f ipcp_vr_lattice::meet_with(value_range const*)
>   /home/seurer/gcc/gcc-test/gcc/ipa-cp.c:969
> 0x110a4c9f propagate_vr_accross_jump_function
>   /home/seurer/gcc/gcc-test/gcc/ipa-cp.c:2077
> 0x110a4c9f propagate_constants_accross_call
>   /home/seurer/gcc/gcc-test/gcc/ipa-cp.c:2435
> 0x110abdc7 propagate_constants_topo
>   /home/seurer/gcc/gcc-test/gcc/ipa-cp.c:3329
> 0x110abdc7 ipcp_propagate_stage
>   /home/seurer/gcc/gcc-test/gcc/ipa-cp.c:3439
> 0x110acf13 ipcp_driver
>   /home/seurer/gcc/gcc-test/gcc/ipa-cp.c:5246
> 0x110acf13 execute
>   /home/seurer/gcc/gcc-test/gcc/ipa-cp.c:5342
> Please submit a full bug report,
> with preprocessed source if appropriate.
> Please include the complete backtrace with any bug report.
> See <http://gcc.gnu.org/bugs.html> for instructions.
> specmake: *** [loop.o] Error 1
> specmake: *** Waiting for unfinished jobs
> specmake -j20 options 2> options.err | tee options.out
> COMP: /home/seurer/gcc/install/gcc-test/bin/gcc -c -o options.o 
> -fno-strict-aliasing -m32 -DHOST_WORDS_BIG_ENDIAN -DSPEC_CPU2000_LINUX_PPC32
> -O3 -mcpu=power7 -fpeel-loops -funroll-loops -ffast-math -fvect-cost-model
> -mpopcntd -mrecip=rsqrt  
> LINK: /home/seurer/gcc/install/gcc-test/bin/gcc -m32 -Wl,-q
> -Wl,-rpath=/home/seurer/gcc/install/gcc-test/lib  -O3 -mcpu=power7
> -fpeel-loops -funroll-loops -ffast-math -fvect-cost-model -mpopcntd
> -mrecip=rsqrt  -lm  -o options
>   Some files did not appear to be built: cc1
> *** Error building 176.gcc

Sorry about the breakage. As Richard pointed out, it could come from other
places too. I have posted a patch at
https://gcc.gnu.org/ml/gcc-patches/2016-09/msg01629.html. I haven’t tested the
patch with spec2000 yet.

[Bug tree-optimization/72835] [7 Regression] Incorrect arithmetic optimization involving bitfield arguments

2016-09-20 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72835

--- Comment #5 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Wed Sep 21 03:28:24 2016
New Revision: 240299

URL: https://gcc.gnu.org/viewcvs?rev=240299=gcc=rev
Log:
Incorrect arithmetic optimization involving bitfield arguments

gcc/ChangeLog:

2016-09-21  Kugan Vivekanandarajah  <kug...@linaro.org>

PR tree-optimization/72835
* tree-ssa-reassoc.c (make_new_ssa_for_def): New.
(make_new_ssa_for_all_defs): Likewise.
(zero_one_operation): Replace all SSA_NAMEs defined in the chain.


gcc/testsuite/ChangeLog:

2016-09-21  Kugan Vivekanandarajah  <kug...@linaro.org>

PR tree-optimization/72835
* gcc.dg/tree-ssa/pr72835.c: New test.


Added:
trunk/gcc/testsuite/gcc.dg/tree-ssa/pr72835.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-reassoc.c

[Bug rtl-optimization/77499] [7 Regression] Regression after code-hoisting, due to combine pass failing to evaluate known value range

2016-09-13 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77499

--- Comment #13 from kugan at gcc dot gnu.org ---
(In reply to avieira from comment #12)
> I heard Kugan was working on getting rid of superfluous zero_extends. Adding
> him to the watch list.
> 
> @Kugan: Could your work help this case? And when do you plan to have it
> submitted?

Thanks for the testcase.
With my type promotion pass, I am getting:
        cmp     r1, r2
        ble     .L10
-       push    {r4, r5, r6, r7}
-       ldr     r7, .L14
+       push    {r4, r5, r6, lr}
+       ldr     r5, .L14
        movw    r6, #45345
 .L4:
-       smull   r5, r4, r7, r1
+       smull   lr, r4, r5, r1
        lsrs    r0, r0, #1
        sub     r4, r4, r1, asr #31
-       eor     r5, r0, r6
        add     r4, r4, r4, lsl #1
        cmp     r1, r4
        sub     r1, r1, r3
        it      ne
-       uxthne  r0, r5
+       eorne   r0, r0, r6
        cmp     r2, r1
        blt     .L4
-       pop     {r4, r5, r6, r7}
-       bx      lr
+       uxth    r0, r0
+       pop     {r4, r5, r6, pc}
 .L10:
+       uxth    r0, r0
        bx      lr
 .L15:
        .align  2
 .L14:
        .word   1431655766
        .size   foo, .-foo

Even though the extension is removed from the loop, there is an extra uxth. I
have some cases like this to look at before I post the patch again. I am afraid
I will not be able to do it during this stage1.

[Bug tree-optimization/77387] Value range not computed in some cases for ABS_EXPR

2016-08-25 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77387

--- Comment #1 from kugan at gcc dot gnu.org ---
With:

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index e4d789b..2d1f4c8 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -3416,6 +3416,17 @@ extract_range_from_unary_expr_1 (value_range *vr,
  return;
}

+  /* If SIGNED and VARYING set [0, TYPE_MAX].  */
+  if (!TYPE_UNSIGNED (type)
+ && vr0.type == VR_VARYING)
+   {
+ set_value_range (vr, VR_RANGE,
+  build_int_cst (type, 0),
+  vrp_val_max (type),
+  NULL);
+ return;
+   }
+
   /* For the remaining varying or symbolic ranges we can't do anything
 useful.  */
   if (vr0.type == VR_VARYING


vrp1 dump becomes:
Value ranges after VRP:

_1: [0, 127]  EQUIVALENCES: { x_4 } (1 elements)
i_2(D): VARYING
x_3: [0, +INF]
x_4: [0, 127]
x_7: [0, 127]  EQUIVALENCES: { x_4 } (1 elements)


Folding predicate x_4 > 256 to 0
Removing basic block 5
Merging blocks 2 and 3
Merging blocks 2 and 4
foo (int i)
{
  int x;

  :
  x_3 = ABS_EXPR <i_2(D)>;
  x_4 = x_3 >> 24;
  return x_4;

}
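
As a sanity check on the [0, 127] range shown for x_4, here is a small C sketch (not part of the patch). It assumes a 32-bit int and excludes INT_MIN, where abs overflows; abs_shift and abs_shift_in_range are names made up for the illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Mirrors the testcase body: x = abs (i); x >>= 24.
   INT_MIN is excluded because abs (INT_MIN) overflows.  */
static int abs_shift (int i)
{
  assert (i != INT_MIN);
  int x = abs (i);
  return x >> 24;
}

/* Returns 1 if abs_shift stays within [0, 127] for a spread of inputs,
   including both extremes of the valid input range.  */
static int abs_shift_in_range (void)
{
  const int samples[] = { 0, 1, -1, 1 << 24, -(1 << 24), INT_MAX, -INT_MAX };
  for (unsigned k = 0; k < sizeof samples / sizeof samples[0]; k++)
    {
      int r = abs_shift (samples[k]);
      if (r < 0 || r > 127)
        return 0;
    }
  return 1;
}
```

Because the shifted value can never exceed 127, the comparison x > 256 in the testcase is always false, which is what lets VRP fold the predicate.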

[Bug tree-optimization/77387] New: Value range not computed in some cases for ABS_EXPR

2016-08-25 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77387

Bug ID: 77387
   Summary: Value range not computed in some cases for ABS_EXPR
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kugan at gcc dot gnu.org
  Target Milestone: ---

For testcase:

int foo (int i)
{
  int x = i;
  x = __builtin_abs (i);
  x >>= 24;
  if (x > 256)
return 0;
  return x;
}

vrp1 dump is:
Value ranges after VRP:

_1: [-INF, 256]
i_2(D): VARYING
x_3: [0, +INF(OVF)]
x_4: VARYING
x_6: [257, +INF]  EQUIVALENCES: { x_4 } (1 elements)
x_7: [-INF, 256]  EQUIVALENCES: { x_4 } (1 elements)


Removing basic block 3
foo (int i)
{
  int x;
  int _1;

  :
  x_3 = ABS_EXPR <i_2(D)>;
  x_4 = x_3 >> 24;
  if (x_4 > 256)
goto ;
  else
goto ;

  :

  :
  # _1 = PHI <0(3), x_4(2)>
  return _1;

}


Note:
x_3: [0, +INF(OVF)]
x_4: VARYING

[Bug tree-optimization/61839] More optimize opportunity for VRP

2016-08-19 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61839

--- Comment #3 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Sat Aug 20 01:18:09 2016
New Revision: 239637

URL: https://gcc.gnu.org/viewcvs?rev=239637=gcc=rev
Log:

gcc/testsuite/ChangeLog:

2016-08-20  Kugan Vivekanandarajah  <kug...@linaro.org>

PR tree-optimization/61839
* gcc.dg/tree-ssa/pr61839_1.c: New test.
* gcc.dg/tree-ssa/pr61839_2.c: New test.
* gcc.dg/tree-ssa/pr61839_3.c: New test.
* gcc.dg/tree-ssa/pr61839_4.c: New test.

gcc/ChangeLog:

2016-08-20  Kugan Vivekanandarajah  <kug...@linaro.org>

PR tree-optimization/61839
* tree-vrp.c (two_valued_val_range_p): New.
(simplify_stmt_using_ranges): Convert CST BINOP VAR where VAR is
two-valued to VAR == VAL1 ? (CST BINOP VAL1) : (CST BINOP VAL2).
Also Convert VAR BINOP CST where VAR is two-valued to
VAR == VAL1 ? (VAL1 BINOP CST) : (VAL2 BINOP CST).


Added:
trunk/gcc/testsuite/gcc.dg/tree-ssa/pr61839_1.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/pr61839_2.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/pr61839_3.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/pr61839_4.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-vrp.c

[Bug tree-optimization/72835] [7 Regression] Incorrect arithmetic optimization involving bitfield arguments

2016-08-09 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72835

--- Comment #4 from kugan at gcc dot gnu.org ---
Looks like it was a latent issue. In rewrite_expr_tree, when we re-associate
operands, we should reset range_info for the LHS. We don’t do that now.
The following patch fixes the test case.


diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
index 7fd7550..6272d98 100644
--- a/gcc/tree-ssa-reassoc.c
+++ b/gcc/tree-ssa-reassoc.c
@@ -3945,6 +3945,7 @@ rewrite_expr_tree (gimple *stmt, unsigned int opindex,
  gimple_assign_set_rhs1 (stmt, oe1->op);
  gimple_assign_set_rhs2 (stmt, oe2->op);
  update_stmt (stmt);
+ reset_flow_sensitive_info (lhs);
}

  if (rhs1 != oe1->op && rhs1 != oe2->op)


I think we also need to do the same in rewrite_expr_tree_parallel.
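
To make the reasoning concrete: a range recorded for the LHS is only valid for the operands it was computed from, so changing the operands must invalidate it. The toy model below is a sketch under that assumption. toy_name, toy_reset_range and toy_rewrite_operands are hypothetical names and do not reflect GCC's real SSA data structures.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an SSA name carrying cached range info.  */
struct toy_name
{
  int rhs1, rhs2;   /* operand ids of the defining statement */
  bool has_range;   /* true while [min, max] can be trusted */
  int min, max;
};

/* Analogue of reset_flow_sensitive_info: forget the cached range.  */
static void toy_reset_range (struct toy_name *lhs)
{
  assert (lhs);
  lhs->has_range = false;
}

/* Analogue of the patched path: install the new operands and drop the
   range that was computed for the old ones.  */
static void toy_rewrite_operands (struct toy_name *lhs, int op1, int op2)
{
  assert (lhs);
  lhs->rhs1 = op1;
  lhs->rhs2 = op2;
  toy_reset_range (lhs);
}
```

Without the reset, a later consumer would still see the old [min, max] and could fold a comparison against it incorrectly.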

[Bug tree-optimization/72835] [7 Regression] Incorrect arithmetic optimization involving bitfield arguments

2016-08-09 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72835

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 CC||kugan at gcc dot gnu.org

--- Comment #3 from kugan at gcc dot gnu.org ---
Looking into it. 

The diff of .115t.dse2 and .116t.reassoc1 for the C++ testcase is:

+  unsigned int _16;
+  unsigned int _17;
+  unsigned int _18;

   :
   _1 = s1.m2;
   _2 = (unsigned int) _1;
   _3 = s1.m3;
   _4 = (unsigned int) _3;
-  _5 = -_4;
-  _6 = _2 * _5;
+  _5 = _4;
+  _6 = _5 * _2;
   var_32.0_7 = var_32;
   _8 = (unsigned int) var_32.0_7;
   _9 = s1.m1;
   _10 = (unsigned int) _9;
-  _11 = -_10;
-  _12 = _8 * _11;
-  c_14 = _6 + _12;
+  _11 = _10;
+  _12 = _11 * _8;
+  _16 = _12 + _6;
+  _18 = _16;
+  _17 = -_18;
+  c_14 = _17;
   if (c_14 != 4098873984)


The testcase also works with -fno-tree-vrp.

[Bug rtl-optimization/68217] Wrong constant folding

2016-07-28 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68217

--- Comment #4 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Fri Jul 29 00:35:23 2016
New Revision: 238846

URL: https://gcc.gnu.org/viewcvs?rev=238846=gcc=rev
Log:
gcc/ChangeLog:

2016-07-29  Kugan Vivekanandarajah  <kug...@linaro.org>

PR middle-end/68217
* tree-vrp.c (extract_range_from_binary_expr_1): In case of signed
& sign-bit-CST, generate [-INF, 0] instead of [-INF, INF].


gcc/testsuite/ChangeLog:

2016-07-29  Kugan Vivekanandarajah  <kug...@linaro.org>

PR middle-end/68217
* gcc.dg/pr68217.c: New test.

Added:
trunk/gcc/testsuite/gcc.dg/pr68217.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-vrp.c
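
The [-INF, 0] range in the ChangeLog entry follows from the fact that masking a signed value with the sign-bit constant leaves either 0 or the most negative value. A small C sketch of that observation, assuming a two's complement int (mask_sign_bit and in_nonpositive_range are illustrative names, not from the patch):

```c
#include <assert.h>
#include <limits.h>

/* x & sign-bit constant: only the sign bit can survive the mask.  */
static int mask_sign_bit (int x)
{
  return x & INT_MIN;
}

/* Returns 1 if the masked value lies in [INT_MIN, 0], i.e. [-INF, 0].  */
static int in_nonpositive_range (int x)
{
  int r = mask_sign_bit (x);
  assert (r == 0 || r == INT_MIN);
  return r <= 0;
}
```

Since the result is never positive, [-INF, 0] is a tighter range than [-INF, INF].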

[Bug tree-optimization/71994] [7 Regression] ICE: verify_gimple failed

2016-07-27 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71994

--- Comment #5 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Wed Jul 27 23:02:44 2016
New Revision: 238803

URL: https://gcc.gnu.org/viewcvs?rev=238803=gcc=rev
Log:
gcc/testsuite/ChangeLog:

2016-07-28  Kugan Vivekanandarajah  <kug...@linaro.org>

PR middle-end/71994
* gcc.dg/torture/pr71994.c: New test.


Added:
trunk/gcc/testsuite/gcc.dg/torture/pr71994.c

[Bug tree-optimization/71994] [7 Regression] ICE: verify_gimple failed

2016-07-27 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71994

--- Comment #4 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Wed Jul 27 22:45:46 2016
New Revision: 238802

URL: https://gcc.gnu.org/viewcvs?rev=238802=gcc=rev
Log:
gcc/testsuite/ChangeLog:

2016-07-28  Kugan Vivekanandarajah  <kug...@linaro.org>

PR middle-end/71994
* gcc.dg/torture/pr71994.c: New test.

gcc/ChangeLog:

2016-07-28  Kugan Vivekanandarajah  <kug...@linaro.org>

PR middle-end/71994
* tree-ssa-reassoc.c (maybe_optimize_range_tests): Check tcc_comparison
 before calling get_ops.



Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-reassoc.c

[Bug tree-optimization/71994] [7 Regression] ICE: verify_gimple failed

2016-07-25 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71994

--- Comment #2 from kugan at gcc dot gnu.org ---
Patch to fix this is posted for review at
https://gcc.gnu.org/ml/gcc-patches/2016-07/msg01680.html

[Bug tree-optimization/66726] missed optimization, factor conversion out of COND_EXPR

2016-07-24 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66726

--- Comment #19 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Sun Jul 24 12:47:29 2016
New Revision: 238695

URL: https://gcc.gnu.org/viewcvs?rev=238695=gcc=rev
Log:
gcc/ChangeLog:

2016-07-24  Kugan Vivekanandarajah  <kug...@linaro.org>

PR middle-end/66726
* tree-ssa-reassoc.c (optimize_vec_cond_expr): Handle tcc_compare stmt
whose result is used in PHI.
(final_range_test_p): Likewise.
(maybe_optimize_range_tests): Likewise.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/tree-ssa-reassoc.c

[Bug tree-optimization/71170] [7 Regression] ICE in rewrite_expr_tree, at tree-ssa-reassoc.c:3898

2016-06-11 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71170

--- Comment #17 from kugan at gcc dot gnu.org ---
(In reply to kugan from comment #15)
> (In reply to David Binderman from comment #14)
> > (In reply to Jakub Jelinek from comment #12)
> > > Is it still broken?
> > 
> > I think so. Attachment seems to break svn 237286, dated today.
> 
> The issue with tree-reassoc is fixed now. Attached code ICE with
> -fno-tree-reassoc also. Here is reduced testcase that shows the current ICE.
> I think we should close this PR and create a new one for this.
> 
> cat b.c
> struct {
>   int error;
> } *a;
> 
> extern int fz_push_try ();
> int pdf_page_render() { return fz_push_try() && (a->error = __sigsetjmp()); }
> 
> ./gcc/cc1 -O2 b.c -fno-tree-reassoc
>  pdf_page_render
> b.c: In function ‘pdf_page_render’:
> b.c:6:61: warning: implicit declaration of function ‘__sigsetjmp’
> [-Wimplicit-function-declaration]
>  int pdf_page_render() { return fz_push_try() && (a->error = __sigsetjmp());
> }
>  ^~~
> 
> Analyzing compilation unit
> Performing interprocedural optimizations
>  <*free_lang_data>  b.c:6:1: error: definition
> in block 4 does not dominate use in block 5
>  int pdf_page_render() { return fz_push_try() && (a->error = __sigsetjmp());
> }
>  ^~~
> for SSA_NAME: a.1_2 in statement:
> # .MEM_14 = VDEF <.MEM_13>
> a.1_2->error = _3;
> b.c:6:1: internal compiler error: verify_ssa failed
> 0xdacfab verify_ssa(bool, bool)
>   ../../test/gcc/tree-ssa.c:1039
> 0xac3c87 execute_function_todo
>   ../../test/gcc/passes.c:1971
> 0xac2ce7 do_per_function
>   ../../test/gcc/passes.c:1648
> 0xac3e1c execute_todo
>   ../../test/gcc/passes.c:2016
> Please submit a full bug report,
> with preprocessed source if appropriate.
> Please include the complete backtrace with any bug report.
> See <http://gcc.gnu.org/bugs.html> for instructions.

This new issue is a duplicate of PR71104 and started with r235817. This PR,
which is about tree-reassoc, is fixed and can be closed.
