[Bug rtl-optimization/111673] New: assign_hard_reg() routine should scale save/restore costs of callee save registers with basic block frequency

2023-10-03 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111673 Bug ID: 111673 Summary: assign_hard_reg() routine should scale save/restore costs of callee save registers with basic block frequency Product: gcc Version:

[Bug debug/105041] '-fcompare-debug' failure w/ -mcpu=power6 -O2 -fharden-compares -frename-registers

2022-04-06 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105041 --- Comment #6 from Surya Kumari Jangala --- I will be debugging the issue to figure out the root cause.

[Bug debug/105586] [11/12/13 Regression] -fcompare-debug failure (length) with -O2 -fno-if-conversion -mtune=power4 -fno-guess-branch-probability

2022-05-19 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105586 Surya Kumari Jangala changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |jskumari at gcc dot

[Bug rtl-optimization/105041] '-fcompare-debug' failure w/ -mcpu=power6 -O2 -fharden-compares -frename-registers

2022-06-15 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105041 Surya Kumari Jangala changed: What|Removed |Added Resolution|FIXED |--- Status|RESOLVED

[Bug rtl-optimization/105041] '-fcompare-debug' failure w/ -mcpu=power6 -O2 -fharden-compares -frename-registers

2022-06-14 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105041 Surya Kumari Jangala changed: What|Removed |Added Resolution|--- |FIXED Status|NEW

[Bug target/100799] Stackoverflow in optimized code on PPC

2022-10-17 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799 --- Comment #18 from Surya Kumari Jangala --- I git cloned and built flexiblas to see what is the frame size and what is the assembly code generated for the flexiblas C wrapper routine for dgebal. The important assembly code snippets for

[Bug target/100799] Stackoverflow in optimized code on PPC

2022-10-17 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799 --- Comment #17 from Surya Kumari Jangala --- I analysed the reduced test case specified in comment 15. In the .s file, the callee decrements r1 by 224, i.e., callee's frame size is 224. But there is an instruction in the callee that accesses

[Bug target/100799] Stackoverflow in optimized code on PPC

2022-10-17 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799 --- Comment #19 from Surya Kumari Jangala --- There is a keyword called BIND(C) which can be specified on a Fortran procedure to make it interoperable. I tried this keyword on the DGEBAL Fortran routine, which is part of the OpenBLAS library and

[Bug target/100799] Stackoverflow in optimized code on PPC

2022-09-18 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799 --- Comment #15 from Surya Kumari Jangala --- (In reply to Segher Boessenkool from comment #14) > What is the exact command line (and relevant configuration!) required to > reproduce this? The reduced testcase is: SUBROUTINE DGEBAL(

[Bug target/103784] suboptimal code for returning bool value on target ppc

2023-01-05 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103784 --- Comment #8 from Surya Kumari Jangala --- Using -O3 with gcc13, I got (with the test in comment 2): For P8: cmpwi 0,3,2 bgt 0,.L3 subfic 4,4,9 srdi 3,4,63 xori 3,3,0x1 rldicl 3,3,0,63

[Bug middle-end/108073] [rs6000] sub-optimal float member accessing on struct parameter

2022-12-20 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108073 Surya Kumari Jangala changed: What|Removed |Added CC||jskumari at gcc dot gnu.org ---

[Bug rtl-optimization/106418] '-fcompare-debug' failure w/ -mcpu=e500mc -O2 -fnon-call-exceptions -fsched-stalled-insns -fno-reorder-blocks -fno-thread-jumps -fno-tree-dce

2022-11-22 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106418 Surya Kumari Jangala changed: What|Removed |Added Resolution|--- |FIXED

[Bug testsuite/107171] New test case gcc.target/powerpc/pr105586.c fails after its introduction in r13-2525-gbec35caafae8db

2022-11-29 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107171 Surya Kumari Jangala changed: What|Removed |Added Resolution|--- |FIXED

[Bug rtl-optimization/105586] [11/12 Regression] -fcompare-debug failure (length) with -O2 -fno-if-conversion -mtune=power4 -fno-guess-branch-probability

2022-11-08 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105586 --- Comment #12 from Surya Kumari Jangala --- Richard has clarified here (https://gcc.gnu.org/pipermail/gcc-patches/2022-November/605386.html) that backporting is not required.

[Bug rtl-optimization/105586] [11/12 Regression] -fcompare-debug failure (length) with -O2 -fno-if-conversion -mtune=power4 -fno-guess-branch-probability

2022-11-08 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105586 --- Comment #10 from Surya Kumari Jangala --- (In reply to Segher Boessenkool from comment #9) > I read > as approval to backport, fwiw :-) I read that as: Since it is

[Bug rtl-optimization/105586] [11/12 Regression] -fcompare-debug failure (length) with -O2 -fno-if-conversion -mtune=power4 -fno-guess-branch-probability

2022-11-09 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105586 Surya Kumari Jangala changed: What|Removed |Added Resolution|--- |FIXED

[Bug target/100799] Stackoverflow in optimized code on PPC

2022-11-09 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799 Surya Kumari Jangala changed: What|Removed |Added Status|ASSIGNED|WAITING --- Comment #21 from

[Bug target/106770] powerpc64le: Unnecessary xxpermdi before mfvsrd

2023-03-02 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106770 --- Comment #10 from Surya Kumari Jangala --- The swap pass analyzes vector computations and removes unnecessary doubleword swaps (xxswapdi instructions). The swap pass first constructs webs and removes swap instructions if possible. If the web

[Bug target/106770] powerpc64le: Unnecessary xxpermdi before mfvsrd

2023-03-02 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106770 --- Comment #12 from Surya Kumari Jangala --- (In reply to Jens Seifert from comment #6) > The left part of VSX registers overlaps with floating point registers, that > is why no register xxpermdi is required and mfvsrd can access all (left) >

[Bug rtl-optimization/109009] New: Shrink Wrap missed opportunity

2023-03-03 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109009 Bug ID: 109009 Summary: Shrink Wrap missed opportunity Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component:

[Bug target/106770] powerpc64le: Unnecessary xxpermdi before mfvsrd

2023-03-01 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106770 --- Comment #9 from Surya Kumari Jangala --- RTL after dfinit pass for the vec_sub() and the vec_extract(): (insn 13 12 14 2 (set (reg:V2DI 132 [ vrD.3952 ]) (minus:V2DI (subreg:V2DI (reg:V2DF 117 [ _1 ]) 0) (subreg:V2DI

[Bug rtl-optimization/109009] Shrink Wrap missed opportunity

2023-03-04 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109009 --- Comment #2 from Surya Kumari Jangala --- For the working testcase: long foo (long i, long cond) { if (cond) bar (); return i; } The input RTL to the shrink wrap pass is: BB2: set r100, compare(r4, 0) if r100 jump BB4 else

[Bug rtl-optimization/109009] Shrink Wrap missed opportunity

2023-03-05 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109009 --- Comment #3 from Surya Kumari Jangala --- For the working case: * Input RTL to the IRA pass: BB2: set r123, r4 set r122, r3 set r120, compare(r123, 0) set r118, r122 if r120 jump BB4 else jump BB3 BB3: call bar() BB4: set r3,

[Bug target/103784] suboptimal code for returning bool value on target ppc

2023-03-05 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103784 --- Comment #10 from Surya Kumari Jangala --- After the expand pass, we have a single return bb which first zero extends r117 (this reg holds the return value which has been set by predecessor blocks). Zero extension is done because r117 is of

[Bug target/103784] suboptimal code for returning bool value on target ppc

2023-03-06 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103784 --- Comment #13 from Surya Kumari Jangala --- Thanks David and Segher for your comments. I wanted to note down my analysis and thoughts from when I had worked on this bug in January. Ajit is looking into it now.

[Bug target/106770] powerpc64le: Unnecessary xxpermdi before mfvsrd

2023-03-01 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106770 --- Comment #8 from Surya Kumari Jangala --- While the first two xxpermdi's are fine, the 3rd one is a bug. It is incorrect. Here is the C code inlined into assembly: _Z4cmp2dd: .LFB1: .cfi_startproc // vector double va =

[Bug target/103784] suboptimal code for returning bool value on target ppc

2023-02-28 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103784 --- Comment #9 from Surya Kumari Jangala --- The same issue of unnecessary rldicl instruction is there if we change return value from bool to int. int foo (int a, int b) { if (a > 2) return 0; if (b < 10) return 1; return 0; }

[Bug rtl-optimization/109009] Shrink Wrap missed opportunity

2023-04-14 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109009 --- Comment #5 from Surya Kumari Jangala --- I was analysing and comparing the following test cases: Test1 (shrink wrapped) long foo (long i, long cond) { i = i + 1; if (cond) bar (); return i; } Test2 (not shrink wrapped) long

[Bug rtl-optimization/110254] New: improve_allocation() routine does not update allocated_hardreg_p[] array

2023-06-14 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110254 Bug ID: 110254 Summary: improve_allocation() routine does not update allocated_hardreg_p[] array Product: gcc Version: unknown Status: UNCONFIRMED Severity:

[Bug target/103784] suboptimal code for returning bool value on target ppc

2023-07-20 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103784 --- Comment #15 from Surya Kumari Jangala --- This is another test which has unnecessary zero extension: #include bool glob1; bool glob2; bool foo (int a, bool d) { bool c; if (a > 2) c = glob1 & glob2; else c = glob1 | glob2;

[Bug rtl-optimization/110071] New: improve_allocation() routine should consider save/restore cost of callee-save registers

2023-06-01 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110071 Bug ID: 110071 Summary: improve_allocation() routine should consider save/restore cost of callee-save registers Product: gcc Version: unknown Status: UNCONFIRMED

[Bug rtl-optimization/109009] Shrink Wrap missed opportunity

2023-06-23 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109009 --- Comment #8 from Surya Kumari Jangala --- (In reply to Surya Kumari Jangala from comment #7) > There are a couple of issues in IRA: > > 1. In improve_allocation() routine, we are not considering save/restore cost > of using a callee save

[Bug rtl-optimization/109009] Shrink Wrap missed opportunity

2023-06-27 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109009 --- Comment #12 from Surya Kumari Jangala --- (In reply to Peter Bergner from comment #10) > (In reply to Peter Bergner from comment #9) > > Yes, you'll need to factor in the BB frequency. Since the save/restore code > > will go into (at this

[Bug rtl-optimization/109009] Shrink Wrap missed opportunity

2023-06-27 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109009 --- Comment #11 from Surya Kumari Jangala --- (In reply to Peter Bergner from comment #9) > (In reply to Surya Kumari Jangala from comment #8) > > However, while computing the save/restore cost, we are considering only the > > memory move cost

[Bug rtl-optimization/109009] Shrink Wrap missed opportunity

2023-05-10 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109009 --- Comment #6 from Surya Kumari Jangala --- Continuing with the analysis of the test cases specified in comment 5, here are some findings: After graph colouring, when we do improve_allocation(), we find that in the failing test case, the

[Bug rtl-optimization/109009] Shrink Wrap missed opportunity

2023-05-11 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109009 --- Comment #7 from Surya Kumari Jangala --- There are a couple of issues in IRA: 1. In improve_allocation() routine, we are not considering save/restore cost of using a callee save register (r31 in the failing case). Due to this, r31 is being

[Bug target/96017] Powerpc suboptimal register spill in likely path

2023-11-24 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96017 --- Comment #14 from Surya Kumari Jangala --- Instead of using a non-volatile register to hold the value of foo, a volatile register (r9) is assigned to hold foo. This avoids setting up the stack frame in the fast path.

[Bug target/96017] Powerpc suboptimal register spill in likely path

2023-11-24 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96017 --- Comment #13 from Surya Kumari Jangala --- With the patch at https://gcc.gnu.org/pipermail/gcc-patches/2023-October/631849.html, the testcase gets shrink wrapped. This is the assembly produced: addis 2,12,.TOC.-.LCF0@ha addi

[Bug rtl-optimization/110071] improve_allocation() routine should consider save/restore cost of callee-save registers

2024-02-01 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110071 Surya Kumari Jangala changed: What|Removed |Added Last reconfirmed||2024-02-01

[Bug rtl-optimization/110071] improve_allocation() routine should consider save/restore cost of callee-save registers

2024-02-01 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110071 Surya Kumari Jangala changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug target/114004] GCC emits a superfluous instruction for simple test case on ppc

2024-02-27 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114004 Surya Kumari Jangala changed: What|Removed |Added Status|NEW |ASSIGNED