from:"ramana at gcc dot gnu.org"

[Bug target/104689] aarch64: libgcc: DW_CFA_val_expression is not supported for RA_SIGN_STATE register

2022-11-20 Thread ramana at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104689

Ramana Radhakrishnan  changed:

   What|Removed |Added

 CC||ramana at gcc dot gnu.org

--- Comment #3 from Ramana Radhakrishnan  ---
(In reply to nsz from comment #2)
> fixed for gcc-13

In an AArch64 Ubuntu 22.04 VM on my Apple Silicon M1 I see:

Features: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp
asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 asimddp sha512 asimdfhm dit
uscat ilrcpc flagm ssbs sb paca pacg dcpodp flagm2 frint


i.e. PACA / PACG are present.

I see this test failing with trunk as of
136029059686fed2d99c755baf35f98553fc0232, simply bootstrapped with
$srcdir/configure --enable-languages=c,c++.

I'll see if I can pull something out when I have some time.

Ramana
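A Features line like the one quoted above can also be checked programmatically. The following is only a minimal sketch, assuming the line has already been read into a string (for example from /proc/cpuinfo); the helper name has_cpu_feature is invented for illustration and is not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return 1 if NAME occurs in FEATURES as a whole
   whitespace-separated token, 0 otherwise.  */
static int
has_cpu_feature (const char *features, const char *name)
{
  size_t len = strlen (name);
  const char *p = features;

  assert (len > 0);
  while ((p = strstr (p, name)) != NULL)
    {
      /* The match must start at the beginning or after whitespace ...  */
      int starts_token = (p == features
                          || p[-1] == ' ' || p[-1] == '\t' || p[-1] == '\n');
      /* ... and must be followed by whitespace or the end of the string.  */
      char end = p[len];
      int ends_token = (end == '\0'
                        || end == ' ' || end == '\t' || end == '\n');

      if (starts_token && ends_token)
        return 1;
      p++;
    }
  return 0;
}
```

Matching whole tokens matters here because names such as flagm and flagm2 share a prefix.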

[Bug other/107620] Build errors when using sphinx

2022-11-13 Thread ramana at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107620

Ramana Radhakrishnan  changed:

   What|Removed |Added

 CC||ramana at gcc dot gnu.org

--- Comment #3 from Ramana Radhakrishnan  ---
I worked around this by installing everything in $SRCDIR/doc/requirements.txt.

pip3 install -r requirements.txt 


I found that this allowed me to build the HTML documentation. However, I
seemed to need a whole bunch of additional packages to build the LaTeX and PDF
documentation.

Ramana

[Bug tree-optimization/107326] [13 Regression] ICE: verify_gimple failed (error: type mismatch in binary expression) since r13-3219-g25413fdb2ac249

2022-11-13 Thread ramana at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107326

Ramana Radhakrishnan  changed:

   What|Removed |Added

 CC||ramana at gcc dot gnu.org

--- Comment #4 from Ramana Radhakrishnan  ---
Fixed ?

[Bug target/92999] [armhf] struct with adjacent __fp16's copies wrongly

2022-11-07 Thread ramana at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92999

Ramana Radhakrishnan  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |ramana at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #3 from Ramana Radhakrishnan  ---
I think I have a patch but might need some help testing it.

[Bug target/105929] [AArch64] armv8.4-a allows atomic stp. 64-bit constants can use 2 32-bit halves with _Atomic or volatile

2022-11-05 Thread ramana at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105929

Ramana Radhakrishnan  changed:

   What|Removed |Added

 CC||ramana at gcc dot gnu.org
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2022-11-05

--- Comment #1 from Ramana Radhakrishnan  ---
Confirmed, as an 8-byte-aligned 8-byte store is always going to lie fully
within a single 16-byte block that is itself 16-byte aligned.
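The reasoning can be checked with a little address arithmetic. This is a minimal sketch, with the helper name within_one_16_byte_block invented for illustration: an access of SIZE bytes at ADDR stays inside one 16-byte-aligned block exactly when its first and last byte have the same block index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: 1 if the SIZE bytes starting at ADDR all lie in
   the same 16-byte-aligned block, 0 if the access straddles two blocks.  */
static int
within_one_16_byte_block (uintptr_t addr, size_t size)
{
  assert (size > 0);
  return (addr / 16) == ((addr + size - 1) / 16);
}
```

An 8-byte store at an 8-byte-aligned address starts at offset 0 or 8 within its block, so it ends at offset 7 or 15 and never crosses the boundary; a store at offset 12, by contrast, would.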

[Bug debug/53135] Duplicates cause size explosion (vta/dwarf)

2022-11-05 Thread ramana at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53135

--- Comment #20 from Ramana Radhakrishnan  ---
(In reply to Jeffrey A. Law from comment #19)
> I think it's just a workaround that got installed in 2012, not a real fix.
> Of course, 10 years later one could ask if the workaround has become the
> "real fix".

That is of course a jolly good question :P 

Ramana

[Bug target/107533] New: Inefficient code sequence for fp16 testcase on aarch64

2022-11-05 Thread ramana at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107533

            Bug ID: 107533
           Summary: Inefficient code sequence for fp16 testcase on aarch64
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ramana at gcc dot gnu.org
  Target Milestone: ---

Derived from PR92999 



struct phalf {
    __fp16 first;
    __fp16 second;
};

struct phalf phalf_copy(struct phalf* src) __attribute__((noinline));
struct phalf phalf_copy(struct phalf* src) {
    return *src;
}

Compiling for AArch64 with a recent enough compiler produces:

phalf_copy:
        ldr     w0, [x0]
        ubfx    x1, x0, 0, 16
        lsr     w0, w0, 16
        dup     v0.4h, w1
        dup     v1.4h, w0
        ret


Couldn't it just be:

        ldr     h0, [x0]
        ldr     h1, [x0, 2]

IIRC this is in base v8 rather than v8.2 


regards
Ramana

[Bug target/92999] [armhf] struct with adjacent __fp16's copies wrongly

2022-11-05 Thread ramana at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92999

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 CC||ramana at gcc dot gnu.org
 Ever confirmed|0   |1
  Known to fail||13.0
   Last reconfirmed||2022-11-05

--- Comment #2 from Ramana Radhakrishnan  ---
confirmed on trunk.

I think this is an issue in the way the registers are allocated for the VFP
PCS.

[Bug debug/100523] [11/12/13 Regression] armv8.1-m.main -fcompare-debug failure with -O -fmodulo-sched -mtune=cortex-a53

2022-11-04 Thread ramana at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100523

Ramana Radhakrishnan  changed:

   What|Removed |Added

 CC||ramana at gcc dot gnu.org

--- Comment #3 from Ramana Radhakrishnan  ---
It does fall over, but isn't this a bit of an undefined testcase, given that
crc is an uninitialised local variable used before initialisation ? 

What am I missing ? 

Ramana

[Bug target/94604] support for the ETSI basic operations

2022-11-04 Thread ramana at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94604

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |WAITING
   Last reconfirmed||2022-11-04
 CC||ramana at gcc dot gnu.org

--- Comment #3 from Ramana Radhakrishnan  ---
I'd suggest moving this to Waiting given the time without a response and the
correct links to documentation and that this should be covered really by
arm_acle.h . 

Ramana

[Bug debug/53135] Duplicates cause size explosion (vta/dwarf)

2022-11-04 Thread ramana at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53135

--- Comment #18 from Ramana Radhakrishnan  ---
Since the fix got installed in 2012 this really should have been fixed from
4.8.0 onwards. 

Should we really still keep this open, or can we close it out ? 

Ramana

[Bug target/97726] simd intrinsics tests fail on armeb

2021-04-09 Thread ramana at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97726

Ramana Radhakrishnan  changed:

   What|Removed |Added

 CC||ramana at gcc dot gnu.org

--- Comment #2 from Ramana Radhakrishnan  ---
Fixed now ?

[Bug target/96372] [11 regression] arm/ivopts.c fails since r11-2012

2021-04-09 Thread ramana at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96372

Ramana Radhakrishnan  changed:

   What|Removed |Added

 CC||ramana at gcc dot gnu.org

--- Comment #4 from Ramana Radhakrishnan  ---
Is this now fixed ?

[Bug tree-optimization/88709] Improve store-merging

2019-05-10 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88709

Ramana Radhakrishnan  changed:

   What|Removed |Added

 CC||ramana at gcc dot gnu.org

--- Comment #13 from Ramana Radhakrishnan  ---
(In reply to Richard Earnshaw from comment #9)
> (In reply to Jakub Jelinek from comment #7)
> > (In reply to Christophe Lyon from comment #6)
> > > I've noticed that the new test store_merging_29.c fails on
> > > arm-none-eabi --with-cpu cortex-a9
> > > FAIL: gcc.dg/store_merging_29.c scan-tree-dump store-merging "New sequence
> > > of 3 stores to replace old one of 6 stores"
> > 
> > That is because target-supports.exp lies on arm, even for -mcpu=cortex-a9
> > STRICT_ALIGNMENT is 1 on arm (it is 1 unconditionally), but
> > check_effective_target_non_strict_align returns true on arm anyway.
> > I've already added hacks for it in r256783 in another testcase though, guess
> > I'll do something similar now, but I must say I'm not very excited about
> > that.
> 
> Support for misaligned accesses is a three(.5!)-valued problem:
> 
> 1) There's no support in the architecture at all
> 2) There's some support with a limited set of instructions
> 3) There's full support: any memory access can handle any alignment.
> 3.5) There's full support: but some accesses may be very slow
> 
> I would think that these days most CPU architectures actually fall into
> either 1 or 2.  Many architectures have limitations, for example on atomic
> accesses that are unaligned.
> 
> STRICT_ALIGNMENT only covers, in reality case 3.  I'm not even sure if it
> would be defined on a machine with case 3.5.
> 
> I think the real problem here is that it's not clear what question this
> target-supports macro is really asking - does the CPU have the capability to
> do (some) unaligned accesses?  or can it arbitrarily support casts from
> unaligned pointers to standard types?

I agree - it sounds like this should be put in a comment next to the
target-supports query.
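To make the two questions in the last quoted paragraph concrete, here is a minimal sketch with an invented function name. Reading through memcpy is valid C for any alignment; casting an unaligned pointer to uint32_t * and dereferencing it is not, whatever the CPU can do. Whether the compiler then lowers the memcpy to a single load instruction is the separate capability question, and depends on the target.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: load a 32-bit value from P, which may have any
   alignment.  The copy goes through memcpy so that no unaligned pointer
   is ever dereferenced as a uint32_t.  */
static uint32_t
load_u32_any_alignment (const void *p)
{
  uint32_t v;

  assert (p != NULL);
  memcpy (&v, p, sizeof v);
  return v;
}
```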

[Bug target/89400] [7/8/9/10 Regression] ICE: output_operand: invalid %-code with -march=armv6kz -mthumb -munaligned-access

2019-05-02 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89400

Ramana Radhakrishnan  changed:

   What|Removed |Added

 CC||marxin at gcc dot gnu.org

--- Comment #4 from Ramana Radhakrishnan  ---
*** Bug 90308 has been marked as a duplicate of this bug. ***

[Bug target/90308] ICE in output_operand: invalid %-code

2019-05-02 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90308

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #2 from Ramana Radhakrishnan  ---
Certainly looks like it.

*** This bug has been marked as a duplicate of bug 89400 ***

[Bug target/86538] GCC should define a macro to specify if LSE is enabled or not

2019-05-01 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86538

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #8 from Ramana Radhakrishnan  ---
Now fixed on trunk and all release branches.

[Bug target/86538] GCC should define a macro to specify if LSE is enabled or not

2019-05-01 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86538

--- Comment #7 from Ramana Radhakrishnan  ---
Author: ramana
Date: Wed May  1 15:27:40 2019
New Revision: 270770

URL: https://gcc.gnu.org/viewcvs?rev=270770&root=gcc&view=rev
Log:
[Patch AArch64] Add __ARM_FEATURE_ATOMICS



This keeps coming up repeatedly and the ACLE has finally added
__ARM_FEATURE_ATOMICS for the LSE feature in GCC. This is now part of
the latest ACLE release
(https://developer.arm.com/docs/101028/latest/5-feature-test-macros)

I know it's late for GCC-9 but this is a simple macro which need not
wait  for another year.

Ok for trunk and to backport to all release branches ?

Tested with a simple build and a smoke test.

Backport from mainline.
PR target/86538
* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Define
__ARM_FEATURE_ATOMICS

Modified:
branches/gcc-7-branch/gcc/ChangeLog
branches/gcc-7-branch/gcc/config/aarch64/aarch64-c.c
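For readers wondering how the macro is meant to be consumed, the following is a minimal sketch rather than anything taken from the patch; both function names are invented for illustration. The macro only reports whether LSE was enabled for the translation unit; ordinary C11 atomics stay source-identical either way, and only the generated instructions differ.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical helper: 1 if the compiler defined the ACLE macro (LSE
   enabled for this translation unit), 0 otherwise.  On non-AArch64
   targets the macro is simply not defined.  */
static int
lse_enabled_at_compile_time (void)
{
#ifdef __ARM_FEATURE_ATOMICS
  return 1;
#else
  return 0;
#endif
}

/* Ordinary C11 atomic increment; returns the previous value.  */
static int
fetch_add_one (atomic_int *counter)
{
  assert (counter != NULL);
  return atomic_fetch_add (counter, 1);
}
```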

[Bug target/86538] GCC should define a macro to specify if LSE is enabled or not

2019-04-30 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86538

--- Comment #6 from Ramana Radhakrishnan  ---
Author: ramana
Date: Tue Apr 30 14:57:50 2019
New Revision: 270702

URL: https://gcc.gnu.org/viewcvs?rev=270702&root=gcc&view=rev
Log:
[Patch AArch64] Add __ARM_FEATURE_ATOMICS



This keeps coming up repeatedly and the ACLE has finally added
__ARM_FEATURE_ATOMICS for the LSE feature in GCC. This is now part of
the latest ACLE release
(https://developer.arm.com/docs/101028/latest/5-feature-test-macros)

I know it's late for GCC-9 but this is a simple macro which need not
wait  for another year.

Ok for trunk and to backport to all release branches ?

Tested with a simple build and a smoke test.

Backport from mainline.
PR target/86538
* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Define
__ARM_FEATURE_ATOMICS

Modified:
branches/gcc-8-branch/gcc/ChangeLog
branches/gcc-8-branch/gcc/config/aarch64/aarch64-c.c

[Bug target/86538] GCC should define a macro to specify if LSE is enabled or not

2019-04-30 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86538

--- Comment #5 from Ramana Radhakrishnan  ---
Author: ramana
Date: Tue Apr 30 12:02:30 2019
New Revision: 270689

URL: https://gcc.gnu.org/viewcvs?rev=270689&root=gcc&view=rev
Log:


[Patch AArch64] Add __ARM_FEATURE_ATOMICS



This keeps coming up repeatedly and the ACLE has finally added
__ARM_FEATURE_ATOMICS for the LSE feature in GCC. This is now part of
the latest ACLE release
(https://developer.arm.com/docs/101028/latest/5-feature-test-macros)

I know it's late for GCC-9 but this is a simple macro which need not
wait  for another year.

Ok for trunk and to backport to all release branches ?

Tested with a simple build and a smoke test.

Backport from mainline.
PR target/86538
* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Define
__ARM_FEATURE_ATOMICS

Modified:
branches/gcc-9-branch/gcc/ChangeLog
branches/gcc-9-branch/gcc/config/aarch64/aarch64-c.c

[Bug target/86538] GCC should define a macro to specify if LSE is enabled or not

2019-04-30 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86538

--- Comment #4 from Ramana Radhakrishnan  ---
Author: ramana
Date: Tue Apr 30 11:22:11 2019
New Revision: 270686

URL: https://gcc.gnu.org/viewcvs?rev=270686&root=gcc&view=rev
Log:
[Patch AArch64] Add __ARM_FEATURE_ATOMICS


This keeps coming up repeatedly and the ACLE has finally added
__ARM_FEATURE_ATOMICS for the LSE feature in GCC. This is now part of
the latest ACLE release
(https://developer.arm.com/docs/101028/latest/5-feature-test-macros)

I know it's late for GCC-9 but this is a simple macro which need not
wait  for another year.

Ok for trunk and to backport to all release branches ?

Tested with a simple build and a smoke test.

PR target/86538
* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Define
__ARM_FEATURE_ATOMICS

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/aarch64/aarch64-c.c

[Bug target/86538] GCC should define a macro to specify if LSE is enabled or not

2019-04-30 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86538

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Status|RESOLVED|ASSIGNED
   Last reconfirmed||2019-04-30
 CC||ramana at gcc dot gnu.org
 Resolution|WONTFIX |---
   Assignee|unassigned at gcc dot gnu.org  |ramana at gcc dot gnu.org
   Target Milestone|--- |7.5
 Ever confirmed|0   |1

--- Comment #3 from Ramana Radhakrishnan  ---
reopening and taking.

[Bug middle-end/90075] [7/8 Regression] [AArch64] ICE during RTL pass when member of union passed to copysignf

2019-04-23 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90075

Ramana Radhakrishnan  changed:

   What|Removed |Added

 CC||rearnsha at gcc dot gnu.org

--- Comment #3 from Ramana Radhakrishnan  ---
Seems to have been "fixed" by the commit to fix PR87369,

Richard, is this something to backport ? Prima-facie , it appears not and we
will need an appropriate fix for the release branches.

[Bug middle-end/90075] [7/8 Regression] [AArch64] ICE during RTL pass when member of union passed to copysignf

2019-04-23 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90075

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
 CC||ramana at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |ramana at gcc dot gnu.org

--- Comment #2 from Ramana Radhakrishnan  ---
I'll take a look.

[Bug target/89093] [9 Regression] C++ exception handling clobbers d8 VFP register

2019-04-12 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89093

--- Comment #45 from Ramana Radhakrishnan  ---
(In reply to Jakub Jelinek from comment #42)
> Thanks for the explanation.
> In that case, I think it would be better to just add
> __attribute__((target("general-regs-only")))
> to the 
> #ifdef __ARM_EABI_UNWINDER__
> _Unwind_Reason_Code
> PERSONALITY_FUNCTION (_Unwind_State, struct _Unwind_Exception *,
>   struct _Unwind_Context *);
> decl in unwind-c.c and similarly for eh_personality.cc and to other
> personality routines that use CONTINUE_UNWINDING as well (plus to
> unwind-arm.c and pr-support.c using pragma for everything).

Thanks for all the analysis, this is what I had. I've been swamped this week
on a few other things; let me get this wrapped up soonish (read it as during
next week).

(In reply to Bernd Edlinger from comment #44)
> Comment on attachment 46013 [details]
> updated patch.
> 
> @@ -122,12 +122,21 @@ extern tree arm_fp16_type_node;
>  #define TARGET_32BIT_P(flags)  (TARGET_ARM_P (flags) || TARGET_THUMB2_P (flags))
>  
>  /* Run-time Target Specification.  */
> -/* Use hardware floating point instructions. */
> +/* Use hardware floating point instructions. -mgeneral-regs-only prevents
> +the use of floating point instructions and registers but does not prevent
> +emission of floating point pcs attributes.  */
>  #define TARGET_HARD_FLOAT      (arm_float_abi != ARM_FLOAT_ABI_SOFT \
> +                                && bitmap_bit_p (arm_active_target.isa, \
> +                                                 isa_bit_vfpv2) \
> +                                && TARGET_32BIT \
> +                                && !TARGET_GENERAL_REGS_ONLY)
> +
> +#define TARGET_HARD_FLOAT_SUB  (arm_float_abi != ARM_FLOAT_ABI_SOFT \
>                                  && bitmap_bit_p (arm_active_target.isa, \
>                                                   isa_bit_vfpv2) \
>                                  && TARGET_32BIT)
> 
> 
> BTW, you could define TARGET_HARD_FLOAT in terms of TARGET_HARD_FLOAT_SUB and
> !TARGET_GENERAL_REGS_ONLY.

Yep, I could. I've been traveling quite a lot and haven't managed to find
someone else to catch this; I will pick this up next week.

My fault, apologies.

Ramana
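The suggested refactoring can be sketched as follows. This is not the patch: the variables below are stand-ins invented so that the fragment compiles on its own, replacing GCC-internal state such as arm_active_target.isa and the real option flags.

```c
#include <assert.h>

/* Stand-ins for GCC-internal state, invented for this sketch only.  */
enum { ARM_FLOAT_ABI_SOFT, ARM_FLOAT_ABI_SOFTFP, ARM_FLOAT_ABI_HARD };
static int arm_float_abi = ARM_FLOAT_ABI_HARD;
static int have_vfpv2 = 1;               /* stands in for the bitmap_bit_p test */
static int target_32bit = 1;             /* stands in for TARGET_32BIT */
static int target_general_regs_only = 0; /* stands in for TARGET_GENERAL_REGS_ONLY */

/* The ABI-level condition, unaffected by general-regs-only.  */
#define TARGET_HARD_FLOAT_SUB (arm_float_abi != ARM_FLOAT_ABI_SOFT \
                               && have_vfpv2 \
                               && target_32bit)

/* The instruction-level condition, expressed in terms of the one above.  */
#define TARGET_HARD_FLOAT (TARGET_HARD_FLOAT_SUB \
                           && !target_general_regs_only)

static void
set_general_regs_only (int on)
{
  assert (on == 0 || on == 1);
  target_general_regs_only = on;
}

static int
hard_float_enabled (void)
{
  return TARGET_HARD_FLOAT;
}
```

Defining one macro in terms of the other keeps the shared three-part condition in a single place, so the two cannot drift apart.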

[Bug target/89093] [9 Regression] C++ exception handling clobbers d8 VFP register

2019-03-22 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89093

Ramana Radhakrishnan  changed:

   What|Removed |Added

  Attachment #45552 is obsolete|0   |1
  Attachment #45580 is obsolete|0   |1

--- Comment #32 from Ramana Radhakrishnan  ---
Created attachment 46013
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46013&action=edit
updated patch.

Having discussed this further with Richard, instead of adding -mfpu=none we
would prefer a -mgeneral-regs-only option, which keeps the two separate.

I'm sorry about the time it has taken to get back, but this is what I have right
now. The hunk in eh_personality.cc isn't very pleasing to me, but it is needed
because of the warning in the ABI code, which triggers on a hard-float build
because we do have inline functions consuming floating point.


Either I drop the warning or I keep the hunk in eh_personality.cc - any
preferences / thoughts ?

[Bug target/89093] [9 Regression] C++ exception handling clobbers d8 VFP register

2019-02-22 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89093

--- Comment #30 from Ramana Radhakrishnan  ---
(In reply to Jakub Jelinek from comment #29)
> Ramana, any progress on this?

I'm still trying to get the various spec files and the t-multilib bits sorted
and half-term has intervened here in the UK.

[Bug target/89093] [9 Regression] C++ exception handling clobbers d8 VFP register

2019-02-08 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89093

--- Comment #27 from Ramana Radhakrishnan  ---
(In reply to Bernd Edlinger from comment #25)
> you might consider adding something like that to your patch:
> 
> Index: elf.h
> ===
> --- elf.h (revision 268337)
> +++ elf.h (working copy)
> @@ -64,7 +64,7 @@
>  %{mapcs-*:-mapcs-%*} \
>  %(subtarget_asm_float_spec) \
>  %{mthumb-interwork:-mthumb-interwork} \
> -%{mfloat-abi=*} %{!mfpu=auto: %{mfpu=*}} \
> +%{mfloat-abi=*} %{!mfpu=auto: %{!mfpu=none: %{mfpu=*}}} \
>  %(subtarget_extra_asm_spec)"
>  #endif
>  
> 
> 
> otherwise using -mfpu=none won't work on the command line,
> because gas does not understand it.

Yes, that's what I've been playing with. I've run out of time this week because
of other work commitments, I hope to get back to this early next week.

Ramana

[Bug target/89093] [9 Regression] C++ exception handling clobbers d8 VFP register

2019-01-29 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89093

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |ramana at gcc dot gnu.org

--- Comment #21 from Ramana Radhakrishnan  ---
(In reply to Jakub Jelinek from comment #19)
> (In reply to Florian Weimer from comment #18)
> > (In reply to Ramana Radhakrishnan from comment #15)
> > > Testing this and would be grateful for a test run.
> > 
> > Is this hunk needed as well, or will the unwinding information take care of
> > this?  (__cxa_call_unexpected has another d8 register spill.)
> 
> No idea here.

I'll try and analyse that. The key is ensuring that there is absolutely no
floating point code in eh_call.cc; if there is likely to be floating point
anywhere, this isn't correct.

> 
> > --- libstdc++-v3/libsupc++/eh_call.cc   (revision 268364)
> > +++ libstdc++-v3/libsupc++/eh_call.cc   (working copy)
> > @@ -22,6 +22,11 @@
> >  // see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
> >  // <http://www.gnu.org/licenses/>.
> >  
> > +#ifdef __arm__
> > +#pragma GCC target ("fpu=none")
> > +#pragma GCC push_options
> > +#endif
> 
> But why the #pragma GCC push_options?  That makes no sense.
> Either you need to push options before GCC target and pop later on, but if
> you pop at the end of TU and don't really expect anything else to be emitted
> there, only #pragma GCC target should be enough (that applies to the other
> patch too).

I think that's just percolated from my quick hack to discuss the issue.

It should be enough to do #pragma GCC target. The final patch I have does that.
Thanks for confirming that the patch in its essence fixes up the issue.

regards
Ramana
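The ordering Jakub describes (push first, change options, pop afterwards) can be sketched as below. As an assumption made for portability, the sketch swaps in #pragma GCC optimize for #pragma GCC target, since target strings are architecture-specific and "fpu=none" was only a proposal in this thread; the function names are invented for illustration.

```c
#include <assert.h>

/* Save the current options, change them for a region, then restore.
   A push that comes after the change would save the already-modified
   state, which is why the order in the quoted hunk made no sense.  */
#pragma GCC push_options
#pragma GCC optimize ("O0")

static int
sum_in_region (const int *v, int n)
{
  int total = 0;

  assert (n >= 0);
  for (int i = 0; i < n; i++)
    total += v[i];
  return total;
}

#pragma GCC pop_options

/* From here on the command-line options apply again.  */
static int
sum_after_region (const int *v, int n)
{
  int total = 0;

  assert (n >= 0);
  for (int i = 0; i < n; i++)
    total += v[i];
  return total;
}
```

If the changed options are meant to apply to the rest of the translation unit, the push/pop pair can be dropped and the single option-changing pragma is enough, which is the conclusion reached above.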

[Bug target/89093] [9 Regression] C++ exception handling clobbers d8 VFP register

2019-01-29 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89093

--- Comment #16 from Ramana Radhakrishnan  ---
(In reply to Jakub Jelinek from comment #14)
> We require GNU make, so one can use something like:
> unwind-arm.o unwind-c.o libunwind.o pr-support.o: CFLAGS += -mfpu=none
> or similar in libgcc/config/arm/t-arm (or similar) with a comment explaining
> the reason.  For eh_personality.o that needs to be done elsewhere and there
> are no such makefile fragments (and libtool is used).

Sadly that doesn't work for -mfpu=none in t-arm because we still need gcc-9 to
build with older binutils that don't necessarily support -mfpu=none, thus for
now let's hide this with target pragmas.

[Bug target/89093] [9 Regression] C++ exception handling clobbers d8 VFP register

2019-01-29 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89093

Ramana Radhakrishnan  changed:

   What|Removed |Added

  Attachment #45547 is obsolete|0   |1

--- Comment #15 from Ramana Radhakrishnan  ---
Created attachment 45552
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45552&action=edit
new patch.

Testing this and would be grateful for a test run.

[Bug target/89093] [9 Regression] C++ exception handling clobbers d8 VFP register

2019-01-29 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89093

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-01-29
 Ever confirmed|0   |1

--- Comment #13 from Ramana Radhakrishnan  ---
(In reply to Jakub Jelinek from comment #12)
> If one appends -mfloat-abi=soft to command lines of those files, does that
> imply incompatible ABI even if nothing is passed in float/VFP etc. registers
> nor there is any floating point code?

-mfloat-abi=soft is an interesting option: it means use floating point
emulation code using the base PCS, as well as the base parameter passing
conventions for passing floating point parameters to functions.

So it would end up failing at link time. That's why we need an -mfpu=none
option which is silent, and I've not liked it for a while.

(In reply to Jakub Jelinek from comment #11)
> Comment on attachment 45547 [details]
> untested prototype patch.
> 
> Doesn't the whole unwinder (so eh_personality.cc (whole, not just one
> function in it), unwind-arm.c, unwind-c.c, maybe some other unwind-*.c))
> need that?

Yes that would be needed. Reading the EHABI again suggests that - I don't see a
macro that would help with that everywhere.

[Bug target/89093] [9 Regression] C++ exception handling clobbers d8 VFP register

2019-01-29 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89093

--- Comment #10 from Ramana Radhakrishnan  ---
Created attachment 45547
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45547&action=edit
untested prototype patch.

Not sure if this is complete yet but it gives a framework to dig further.

[Bug target/89093] [9 Regression] C++ exception handling clobbers d8 VFP register

2019-01-28 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89093

Ramana Radhakrishnan  changed:

   What|Removed |Added

 CC||ramana at gcc dot gnu.org

--- Comment #7 from Ramana Radhakrishnan  ---
(In reply to Florian Weimer from comment #0)
> In glibc, we have a test, nptl/tst-thread-exit-clobber, that attempts to
> verify if registers are properly restored by unwinding.  (The actual target
> of the test is pthread_exit, but it covers more than that.)
> 
> This test fails when running with GCC 9 libstdc++, even if glibc and the
> test were built with GCC 8, and libgcc_s is replaced with the version for
> GCC 8 (which works when running against GCC 8 libstdc++).
> 
> In the test, the d8 register is not restored properly during unwinding, it
> is set to zero.  d9, d10 etc. are restored.
> 
> I noticed that in GCC 9, __gxx_personality_v0 saves the d8 VFP register:
> 
> 0007b620 <__gxx_personality_v0@@CXXABI_1.3>:
>    7b620:   e92d4ff0    push    {r4, r5, r6, r7, r8, r9, sl, fp, lr}
>    7b624:   ed2d8b02    vpush   {d8}
>    7b628:   e3a03000    mov     r3, #0
>    7b62c:   e1a08001    mov     r8, r1
> 
> And it actually uses s16 and s17, apparently for spilling integer registers.
> Perhaps the unwinder is not prepared to deal with that.

d8 is composed of s16 and s17. That should just be fine. The single precision
FP registers are packed into double precision registers in the VFP
architecture.

Ramana
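The register overlay described above can be written down as simple index arithmetic. A minimal sketch, with helper names invented for illustration, covering s0 to s31 (the single-precision registers that overlay d0 to d15):

```c
#include <assert.h>

/* Hypothetical helper: index of the double-precision register that
   contains single-precision register sN.  */
static unsigned
d_reg_for_s_reg (unsigned s)
{
  assert (s < 32);
  return s / 2;
}

/* Hypothetical helper: 0 if sN is the low half of its d register,
   1 if it is the high half.  */
static unsigned
half_within_d_reg (unsigned s)
{
  assert (s < 32);
  return s % 2;
}
```

With this mapping, s16 and s17 are the low and high halves of d8, so saving and restoring d8 covers both.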

[Bug target/84923] [8 regression] gcc.dg/attr-weakref-1.c failed on aarch64

2019-01-28 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84923

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Status|RESOLVED|NEW
 Resolution|FIXED   |---

--- Comment #12 from Ramana Radhakrishnan  ---
I don't think this is fixed on GCC-8, as the commit to trunk happened on
21 May 2018, after the release was made.


Thanks,
Ramana

[Bug target/88734] [8 Regression] AArch64's ACLE intrinsics give an ICE instead of compile error when option mismatch.

2019-01-25 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88734

Ramana Radhakrishnan  changed:

   What|Removed |Added

 CC||ramana at gcc dot gnu.org

--- Comment #11 from Ramana Radhakrishnan  ---
(In reply to Tamar Christina from comment #10)
> Thanks Jakub! testing hasn't shown any breakages.

I would prefer this to be backported to GCC-8 if it has baked reasonably on
trunk.

[Bug target/88510] GCC generates inefficient U64x2/v2di scalar multiply for NEON32

2019-01-14 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88510

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Target|armv7-a |arm, aarch64
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-01-14
 CC||ramana at gcc dot gnu.org
   Target Milestone|--- |10.0
 Ever confirmed|0   |1

--- Comment #3 from Ramana Radhakrishnan  ---
We are in stage4 at this point in time, and a patch for this isn't appropriate
between now and when GCC 9 releases (i.e. April). Hopefully someone will pick
this up afterwards for both backends, as the logic required for the expansion
should be pretty much identical, give or take backend integration issues.

Though I wonder if this is better handled in the fall back path for expansion
of v2di multiplications instead of duplicating this logic in both arm and
aarch64 backends.
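For context, one common way to build a 64-bit multiply out of 32-bit pieces is shown below as scalar C. This is a sketch of the general decomposition only, not of what either backend or any eventual patch does, and the function names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* a * b modulo 2^64, using one widening 32x32->64 multiply and two
   32x32->32 multiplies.  The a_hi * b_hi term is shifted out entirely.  */
static uint64_t
mul64_from_32bit_parts (uint64_t a, uint64_t b)
{
  uint32_t a_lo = (uint32_t) a, a_hi = (uint32_t) (a >> 32);
  uint32_t b_lo = (uint32_t) b, b_hi = (uint32_t) (b >> 32);

  uint64_t low = (uint64_t) a_lo * b_lo;       /* full 64-bit product */
  uint32_t cross = a_lo * b_hi + a_hi * b_lo;  /* only the low 32 bits matter */

  return low + ((uint64_t) cross << 32);
}

/* Two-lane version, mirroring a v2di multiply lane by lane.  */
static void
mul_v2di (uint64_t out[2], const uint64_t a[2], const uint64_t b[2])
{
  assert (out != NULL && a != NULL && b != NULL);
  for (int lane = 0; lane < 2; lane++)
    out[lane] = mul64_from_32bit_parts (a[lane], b[lane]);
}
```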

[Bug rtl-optimization/87871] [9 Regression] testcases fail after r265398 on arm

2018-12-14 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87871

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-12-14
 Ever confirmed|0   |1

--- Comment #7 from Ramana Radhakrishnan  ---
Confirmed.

[Bug rtl-optimization/87871] [9 Regression] testcases fail after r265398 on arm

2018-12-14 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87871

Ramana Radhakrishnan  changed:

   What|Removed |Added

 CC||ramana at gcc dot gnu.org

--- Comment #6 from Ramana Radhakrishnan  ---
(In reply to Segher Boessenkool from comment #5)
> The first one just needs an xfail.  I don't know if it should be *-*-* there
> or only arm*-*-* should be added.
> 
> The other two need some debugging by someone who knows the target and/or
> these tests.

For the addr-modes-float.c case there are additional vmovs being generated, so
this is certainly a regression.

--- 8.s 2018-12-14 09:41:04.367843079 +
+++ addr-modes-float.s  2018-12-14 09:40:39.907980812 +
@@ -139,10 +139,13 @@
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
+       vmov    q8, q0  @ ti
        mov     r3, r0
+       vmov    q9, q1  @ ti
        add     r0, r0, #48
-       vst3.8  {d0, d2, d4}, [r3]!
-       vst3.8  {d1, d3, d5}, [r3]
+       vmov    q10, q2  @ ti
+       vst3.8  {d16, d18, d20}, [r3]!
+       vst3.8  {d17, d19, d21}, [r3]

[Bug target/88013] can't vectorize rgb to grayscale conversion code

2018-12-14 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88013

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-12-14
 CC||ramana at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #8 from Ramana Radhakrishnan  ---

> vshr.u16    q9, q9, #8
> vshr.u16    q8, q8, #8
> vmovn.i16   d20, q9
> vmovn.i16   d21, q8

Isn't that "just" a missing combine pattern to get us vshrn in both backends ? 

Ramana
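In scalar terms, the equivalence that such a combine pattern would rely on looks like this. It is a per-lane model written for illustration, with invented function names, not code from either backend.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Two-step form: shift each 16-bit lane right by 8 (what vshr.u16 does),
   then keep the low 8 bits of the result (what vmovn.i16 does).  */
static void
shift_then_narrow (uint8_t *dst, const uint16_t *src, size_t n)
{
  assert (dst != NULL && src != NULL);
  for (size_t i = 0; i < n; i++)
    {
      uint16_t shifted = (uint16_t) (src[i] >> 8);
      dst[i] = (uint8_t) shifted;
    }
}

/* Fused form: the per-lane result of a single shift-right-and-narrow
   by 8, which is what a vshrn instruction computes.  */
static void
shift_right_narrow (uint8_t *dst, const uint16_t *src, size_t n)
{
  assert (dst != NULL && src != NULL);
  for (size_t i = 0; i < n; i++)
    dst[i] = (uint8_t) (src[i] >> 8);
}
```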

[Bug tree-optimization/88259] vectorization failure for a typical loop for getting max value and index

2018-11-29 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88259

Ramana Radhakrishnan  changed:

   What|Removed |Added

 CC||ramana at gcc dot gnu.org

--- Comment #2 from Ramana Radhakrishnan  ---
(In reply to ktkachov from comment #1)
> Confirmed.
> Trying to find just the index (not the max value) vectorises as well:
> void test_vec(int *data, int n) {
>     int best_i, best = 0;
> 
>     for (int i = 0; i < n; i++) {
>         if (data[i] > best) {
>             //best = data[i];
>             best_i = i;
>         }
>     }
> 
>     data[best_i] = data[0];
>     data[0] = best;
> }
> 
> 
> -O3:
> .L4:
>         ldr     q1, [x2], 16
>         mov     v3.16b, v2.16b
>         add     v2.4s, v2.4s, v4.4s
>         cmle    v1.4s, v1.4s, #0
>         cmp     x2, x3
>         bif     v0.16b, v3.16b, v1.16b
>         bne     .L4
>         smaxv   s0, v0.4s
>         and     w3, w1, -4
>         umov    w2, v0.s[0]
>         cmn     w2, #1
>         csel    w2, w2, wzr, ne
>         tst     x1, 3
>         beq     .L2
> .L3:
> 
> But their combination seems like it's throwing the machinery off. I'm
> guessing the index-finding needs some if-conversion and masking to happen in
> the vectoriser

ISTR there is some limit in if conversion around the vectorizer where it only
works on very simple if-blocks. But this is from memory and it's a bit fuzzy
now.
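
For reference, a minimal sketch of the combined loop this bug is about (tracking both the maximum and its index). It is reconstructed as an assumption from the bug title and the index-only variant quoted above; the function name, the out-parameter and the initialisation of best_i are mine. The conditional block updates two values, which is the kind of if-block that may be beyond what if-conversion handles ahead of the vectoriser.

```c
/* Returns the index of the first largest element greater than 0 and
   stores that element's value in *best_out. */
int max_and_index(const int *data, int n, int *best_out)
{
    int best_i = 0;
    int best = 0;

    for (int i = 0; i < n; i++) {
        if (data[i] > best) {
            best = data[i];   /* the line commented out in the quoted variant */
            best_i = i;
        }
    }

    *best_out = best;
    return best_i;
}
```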

[Bug debug/65771] ICE (in loc_list_from_tree, at dwarf2out.c:14964) on arm-linux-gnueabihf

2018-11-19 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65771

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED
   Target Milestone|5.5 |6.0

--- Comment #20 from Ramana Radhakrishnan  ---
Fixed for GCC 6 from the timeline here. Won't fix for GCC 5.

[Bug target/53440] [arm] generic thunk code fails for method which uses '...'

2018-11-19 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53440

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |7.0

--- Comment #10 from Ramana Radhakrishnan  ---
Fixed for GCC 7.

[Bug tree-optimization/43721] Failure to optimise (a/b) and (a%b) into single __aeabi_idivmod call

2018-11-19 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43721

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |7.0

--- Comment #12 from Ramana Radhakrishnan  ---
Fixed in GCC7 then.

[Bug target/87867] [7/8 regression] ICE on virtual destructor (-mlong-calls -ffunction-sections) on arm-none-eabi

2018-11-09 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87867

--- Comment #2 from Ramana Radhakrishnan  ---
Author: ramana
Date: Fri Nov  9 12:50:51 2018
New Revision: 265965

URL: https://gcc.gnu.org/viewcvs?rev=265965&root=gcc&view=rev
Log:
[PATCH, arm] Backport -- Fix ICE during thunk generation with -mlong-calls

For Mihail Ionescu.

2018-11-09  Mihail Ionescu  

PR target/87867
Backport from mainline
2018-09-17  Eric Botcazou  

* g++.dg/other/thunk2a.C: New test.
* g++.dg/other/thunk2b.C: Likewise.

2018-11-09  Mihail Ionescu  

Backport from mainline
2018-09-17  Eric Botcazou  

* g++.dg/other/thunk2a.C: New test.
* g++.dg/other/thunk2b.C: Likewise.
* g++.dg/other/vthunk1.C: Rename as thunk1.C


Added:
branches/gcc-8-branch/gcc/testsuite/g++.dg/other/thunk1.C
  - copied unchanged from r265964,
branches/gcc-8-branch/gcc/testsuite/g++.dg/other/vthunk1.C
branches/gcc-8-branch/gcc/testsuite/g++.dg/other/thunk2a.C
branches/gcc-8-branch/gcc/testsuite/g++.dg/other/thunk2b.C
Removed:
branches/gcc-8-branch/gcc/testsuite/g++.dg/other/vthunk1.C
Modified:
branches/gcc-8-branch/gcc/ChangeLog
branches/gcc-8-branch/gcc/config/arm/arm.c
branches/gcc-8-branch/gcc/testsuite/ChangeLog

[Bug target/87330] ICE in scan_rtx_reg, at regrename.c:1097

2018-10-30 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87330

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |9.0

--- Comment #9 from Ramana Radhakrishnan  ---
Fixed then ?

[Bug middle-end/86815] [8/9 regression] ICE on valid code on armhf

2018-10-11 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86815

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Status|UNCONFIRMED |WAITING
   Last reconfirmed||2018-10-11
 Ever confirmed|0   |1

--- Comment #10 from Ramana Radhakrishnan  ---
Tip of gcc-8 isn't failing, so until we have more information this one stays in
WAITING, I'm afraid.

[Bug target/87565] suboptimal memory-indirect tailcalls on arm

2018-10-10 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87565

Ramana Radhakrishnan  changed:

   What|Removed |Added

 CC||ramana at gcc dot gnu.org

--- Comment #3 from Ramana Radhakrishnan  ---
(In reply to Alexander Monakov from comment #2)
> PLT trampolines all end with 'ldr pc, [ip, xxx]!', so do all calls via PLT
> suffer from poor branch prediction of such indirect jumps?

IIRC you still need that form in the PLT trampoline so that folks can run a
Linux-like userland on StrongARM, which still has a small user constituency.

[Bug target/82227] ARM thumb inefficient tailcall return sequence (multiple pops)

2018-10-10 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82227

Ramana Radhakrishnan  changed:

   What|Removed |Added

   Priority|P3  |P4
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-10-10
 CC||ramana at gcc dot gnu.org
 Ever confirmed|0   |1
   Severity|normal  |enhancement

--- Comment #1 from Ramana Radhakrishnan  ---
Confirmed.

(In reply to Peter Cordes from comment #0)
> int ext();
> int tailcall_external() { return ext(); }
>  // https://godbolt.org/g/W43fxw
> 
> gcc6.3 -Os -mthumb
> 
> push {r4, lr}
> bl  ext
> pop {r4}
> pop {r1}    # two separate pop instructions isn't optimal
> bx  r1
> 
> gcc6.3 -Os -mthumb -mno-thumb-interwork
> 
> push {r4, lr}
> bl  ext
> pop {r4, pc}
> 
> A 16-bit thumb pop instruction can only pop "lo" registers and PC, not back
> into LR.  That's why it can't  pop {r4, lr}  / bx lr  like it does in -marm
> mode.
> 
> But there is a more efficient way:
> 
> pop {r1, r2}
> bx  r2

Yep. 


> 
> We never needed a call-preserved register; r4 was pushed only to keep the
> stack aligned.  So as long as we have 2 call-clobbered regs available, we
> can pop the padding that came from r4, and pop the saved lr, both into
> call-clobbered regs.
> 
> If we did need a call-preserved register for anything, two separate pop
> instructions are presumably better than any combination of pop-multiple and
> reg-reg moves.
> 
> 
> 
> This also happens with two identical functions with different names, with
> -Os.  One compiles into a call to the other, done exactly the same way as to
> an external function.  (See the godbolt link above).
> 
> In that case, I don't understand why we can't just tail-call with a `b`
> instruction (like we get with -marm).  Both functions are compiled to Thumb2
> code, so we can jump to the other and let it do an interworking return,
> right?  Especially with -mno-thumb-interwork, I don't understand why
> tail-calls aren't optimized to a jump.

You need to read up on the various levels of the architecture and the
command-line options. Thumb2 isn't available at the default architecture level;
it needs at least -mthumb -march=armv6t2. Try reading this for a beginner's
guide to the architecture:

https://community.arm.com/tools/b/blog/posts/arm-cortex-a-processors-and-gcc-command-lines?CommentSortBy=CreatedDate=Descending

We don't tail call in general for Thumb1, which is what your options imply,
because the branches are just too short (encoded in 16 bits), IIRC.
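
To put a number on "too short": the 16-bit Thumb unconditional branch carries an 11-bit immediate that is shifted left by one bit and sign-extended, which gives a reach of roughly +/-2 KB from the PC. The sketch below only decodes that field; the function name is hypothetical, and it says nothing about how GCC itself decides whether to emit a tail call.

```c
#include <stdint.h>

/* Decode the byte offset held in the low 11 bits of a 16-bit Thumb
   unconditional branch: offset = SignExtend(imm11 : '0'). */
int32_t thumb1_branch_offset(uint16_t insn)
{
    int32_t imm11 = insn & 0x7FF;

    if (imm11 & 0x400)      /* sign bit of the 11-bit field */
        imm11 -= 0x800;

    return imm11 * 2;       /* branch targets are halfword aligned */
}
```

The extreme field values decode to +2046 and -2048 bytes, so any callee further away than that cannot be reached with this encoding.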


> 
> (I'm not an expert on ARM / Thumb stuff, so there might be a reason I'm
> missing.)

[Bug bootstrap/84199] Error building gcc 7.3.0 on Odroid XU4 (ARM, Ubuntu): cannot load liblto_plugin.so

2018-10-10 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84199

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 CC||ramana at gcc dot gnu.org
 Resolution|--- |INVALID

--- Comment #1 from Ramana Radhakrishnan  ---
I don't think anyone is going to go fetch an odroid for this - it sounds like a
problem in your environment as many folks are building / able to build gcc 7.x
on an armhf ubuntu system.

Looking at the build log, try configuring your build with appropriate
--with-arch, --with-float and --with-fpu options.

On armhf this is generally

--with-arch=armv7-a --with-fpu=neon --with-float=hard

though you could pick options better suited to the odroid specifically.

[Bug sanitizer/86755] [ASAN] Libasan failed to be build for arm with -mthumb and -fno-omit-frame-pointer

2018-10-10 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86755

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-10-10
 CC||ramana at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #2 from Ramana Radhakrishnan  ---
Confirmed.

[Bug middle-end/86815] [8/9 regression] ICE on valid code on armhf

2018-10-10 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86815

--- Comment #9 from Ramana Radhakrishnan  ---
(In reply to Martin Liška from comment #8)
> Unfortunately I can't reproduce that with cross compiler.

Me neither today. 

Gianfranco, could you check whether you are running out of memory on the
machine that you are doing this on with GCC-8?

Is there a chance that the OOM killer came along when building this file on the
native arm machine that you were running this on?

I've tried running this with stock gcc 8 in debian on an armhf docker image and
will try building something up later today on my machine, but it may be worth
double-checking that something like the OOM killer or swap isn't what's
throttling the build here.

[Bug middle-end/86815] [8/9 regression] ICE on valid code on armhf

2018-10-09 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86815

Ramana Radhakrishnan  changed:

   What|Removed |Added

   Keywords||needs-reduction
 CC||ramana at gcc dot gnu.org

--- Comment #7 from Ramana Radhakrishnan  ---
(In reply to Gianfranco from comment #6)
> Created attachment 44485 [details]
> another failing output
> 
> I'm attaching another file suffering from the same issue (mostly every cpp
> file has this failure)
> this file is only ~2Mb, so maybe reducing it might be easier

Needs reduction.

[Bug target/86968] Unaligned big-endian (scalar_storage_order) access on armv7-a yields 4 ldrb instructions rather than ldr+rev

2018-10-09 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86968

Ramana Radhakrishnan  changed:

   What|Removed |Added

 CC||ramana at gcc dot gnu.org

--- Comment #5 from Ramana Radhakrishnan  ---
(In reply to jos...@codesourcery.com from comment #4)
> Any unaligned access things that don't work for big-endian ARM are 
> probably fallout from the issues with big-endian NEON (NEON architectural 
> lane numbers are different from the architecture-independent lane numbers 
> in GNU C vector extensions and GCC IR, and GCC expects each machine mode 
> to have a single defined memory layout and a single defined layout in any 
> given register, and to be able to move between core and NEON registers, 
> and between core registers and memory, in the respective layouts used for 
> those registers, but some NEON loads and stores for big-endian don't work 
> with those expectations, so unaligned vector operations are limited for 
> big-endian ARM).

Correct, we don't allow misaligned access for Neon for exactly the reasons
mentioned above.

However, I would have expected misaligned access to work with -march=armv7-a
-munaligned-access -mfpu=vfpv3-d16 -mfloat-abi=softfp/hard on the command line
for the aforementioned testcase, as we do have a movmisalign pattern in arm.md
that should kick in, overriding the movmisalign pattern in neon.md. It probably
needs a little more detailed investigation.
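
A sketch of the two source-level forms involved, assuming GCC's __builtin_bswap32 and its predefined byte-order macros; the function names are hypothetical. The byte-wise version corresponds to the four-ldrb sequence in the bug title, while the memcpy plus byte swap version is the shape one would hope ends up as ldr+rev when unaligned access is permitted.

```c
#include <stdint.h>
#include <string.h>

/* Byte-by-byte big-endian load; valid for any alignment and any host
   endianness. */
uint32_t load_be32_bytes(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Whole-word load followed by a byte swap on little-endian hosts. */
uint32_t load_be32_word(const unsigned char *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof v);        /* alignment-safe word load */
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    v = __builtin_bswap32(v);       /* GCC/Clang builtin */
#endif
    return v;
}
```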

[Bug c/85870] [6/7/8/9 Regression][LTO1] ICE in linemap_line_start, at libcpp/line-map.c:794

2018-10-09 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85870

--- Comment #8 from Ramana Radhakrishnan  ---
(In reply to Martin Liška from comment #5)
> (In reply to Ramana Radhakrishnan from comment #4)
> > (In reply to Martin Liška from comment #3)
> > > Can't reproduce with GCC 7.3.0 on x86_64:
> > > 
> > > + gcc-7 -O2 -flto -c test_1.i -o test_1.o
> > > + gcc-7 -O2 -flto -c test_2.i -o test_2.o
> > > + gcc-7 test_1.o test_2.o
> > > /usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld:
> > > /usr/lib64/gcc/x86_64-suse-linux/7/../../../../lib64/crt1.o: in function
> > > `_start':
> > > /home/abuild/rpmbuild/BUILD/glibc-2.27/csu/../sysdeps/x86_64/start.S:104:
> > > undefined reference to `main'
> > > collect2: error: ld returned 1 exit status
> > > 
> > > Richi how did you achieve to reproduce that?
> > 
> > It's still failing on aarch64-none-linux-gnu. So that doesn't mean this goes
> > waiting.
> 
> Native or cross compiler? Because cross compiler works fine for me:
> 
> $ aarch64-suse-linux-g++-8 -c test_1.i -c -flto
> $ aarch64-suse-linux-g++-8 -c test_2.i -c -flto
> $ /usr/lib64/gcc/aarch64-suse-linux/8/lto1 test_1.o test_2.o
> Reading object files: test_1.o test_2.o {GC start 1697k} 
> Reading the callgraph
> Merging declarations
> Reading summaries
> Reading function bodies:
> Performing interprocedural optimizations
>
> 
> Assembling functions:
>   init_xyz_0 init_xyz_1
> Time variable   usr   sys 
> wall   GGC
>  phase setup:   0.00 (  0%)   0.00 (  0%)   0.00 ( 
> 0%)1847 kB (  1%)
>  phase opt and generate :   2.11 (100%)   0.12 ( 92%)   2.23
> (100%)  188629 kB ( 99%)
>  phase finalize :   0.00 (  0%)   0.01 (  8%)   0.01 ( 
> 0%)   0 kB (  0%)
>  lto stream inflate :   0.12 (  6%)   0.03 ( 23%)   0.15 ( 
> 7%)   0 kB (  0%)
>  ipa lto constructors in:   0.65 ( 31%)   0.03 ( 23%)   0.69 (
> 31%)  188513 kB ( 99%)
>  TOTAL  :   2.11  0.13  2.24
> 190523 kB

This is a cross compiler built from revision r264905; note that we have
--enable-checking=yes turned on. Maybe that makes a difference?

[Bug c/85870] [6/7/8/9 Regression][LTO1] ICE in linemap_line_start, at libcpp/line-map.c:794

2018-10-09 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85870

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Status|WAITING |NEW

--- Comment #4 from Ramana Radhakrishnan  ---
(In reply to Martin Liška from comment #3)
> Can't reproduce with GCC 7.3.0 on x86_64:
> 
> + gcc-7 -O2 -flto -c test_1.i -o test_1.o
> + gcc-7 -O2 -flto -c test_2.i -o test_2.o
> + gcc-7 test_1.o test_2.o
> /usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld:
> /usr/lib64/gcc/x86_64-suse-linux/7/../../../../lib64/crt1.o: in function
> `_start':
> /home/abuild/rpmbuild/BUILD/glibc-2.27/csu/../sysdeps/x86_64/start.S:104:
> undefined reference to `main'
> collect2: error: ld returned 1 exit status
> 
> Richi how did you achieve to reproduce that?

It's still failing on aarch64-none-linux-gnu, so not reproducing it on x86_64
doesn't mean this should go to WAITING.

[Bug target/87563] [9 regression ] ICE with -march=armv8-a+sve

2018-10-09 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87563

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Target||aarch64-none-elf
   Target Milestone|--- |9.0

--- Comment #2 from Ramana Radhakrishnan  ---
Fix target and milestone.

[Bug target/87563] [9 regression ] ICE with -march=armv8-a+sve

2018-10-09 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87563

Ramana Radhakrishnan  changed:

   What|Removed |Added

   Keywords||ice-on-valid-code
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-10-09
 Ever confirmed|0   |1

--- Comment #1 from Ramana Radhakrishnan  ---
Confirmed.

[Bug target/87563] New: [9 regression ] ICE with -march=armv8-a+sve

2018-10-09 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87563

Bug ID: 87563
   Summary: [9 regression ] ICE with -march=armv8-a+sve
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ramana at gcc dot gnu.org
  Target Milestone: ---

Somewhere between r261702 and r262881 the following testcase ICEs with -Ofast
-O3 -march=armv8-a+sve. 


int a, b, c, *e;
int d[2];
void f() {
  while (c) {
    d[0] = 4;
    d[1] = 4;
    *e = b == 0 ? 0 : a / b;
  }
}

/tmp/sve.c:7:21: internal compiler error: in maybe_gen_insn, at optabs.c:7307
 *e = b == 0 ? 0 : a / b;
  ~~~^~~
0xb06c73 maybe_gen_insn(insn_code, unsigned int, expand_operand*)
   
/tmp/dgboter/bbs/bc-b1-2-11--rhe6x86_64/buildbot/rhe6x86_64--aarch64-none-elf/build/src/gcc/gcc/optabs.c:7307
0xb072be maybe_expand_insn(insn_code, unsigned int, expand_operand*)
   
/tmp/dgboter/bbs/bc-b1-2-11--rhe6x86_64/buildbot/rhe6x86_64--aarch64-none-elf/build/src/gcc/gcc/optabs.c:7351
0xb095ef expand_insn(insn_code, unsigned int, expand_operand*)
   
/tmp/dgboter/bbs/bc-b1-2-11--rhe6x86_64/buildbot/rhe6x86_64--aarch64-none-elf/build/src/gcc/gcc/optabs.c:7382
0x9d586a expand_direct_optab_fn
   
/tmp/dgboter/bbs/bc-b1-2-11--rhe6x86_64/buildbot/rhe6x86_64--aarch64-none-elf/build/src/gcc/gcc/internal-fn.c:2921
0x9d6143 expand_COND_DIV
   
/tmp/dgboter/bbs/bc-b1-2-11--rhe6x86_64/buildbot/rhe6x86_64--aarch64-none-elf/build/src/gcc/gcc/internal-fn.def:155
0x9d76bd expand_internal_call(internal_fn, gcall*)
   
/tmp/dgboter/bbs/bc-b1-2-11--rhe6x86_64/buildbot/rhe6x86_64--aarch64-none-elf/build/src/gcc/gcc/internal-fn.c:3524
0x9d76eb expand_internal_call(gcall*)
   
/tmp/dgboter/bbs/bc-b1-2-11--rhe6x86_64/buildbot/rhe6x86_64--aarch64-none-elf/build/src/gcc/gcc/internal-fn.c:3532
0x757bb2 expand_call_stmt
   
/tmp/dgboter/bbs/bc-b1-2-11--rhe6x86_64/buildbot/rhe6x86_64--aarch64-none-elf/build/src/gcc/gcc/cfgexpand.c:2596
0x757bb2 expand_gimple_stmt_1
   
/tmp/dgboter/bbs/bc-b1-2-11--rhe6x86_64/buildbot/rhe6x86_64--aarch64-none-elf/build/src/gcc/gcc/cfgexpand.c:3575
0x757bb2 expand_gimple_stmt
   
/tmp/dgboter/bbs/bc-b1-2-11--rhe6x86_64/buildbot/rhe6x86_64--aarch64-none-elf/build/src/gcc/gcc/cfgexpand.c:3734
0x75b8a5 expand_gimple_basic_block
   
/tmp/dgboter/bbs/bc-b1-2-11--rhe6x86_64/buildbot/rhe6x86_64--aarch64-none-elf/build/src/gcc/gcc/cfgexpand.c:5769
0x75f950 execute
   
/tmp/dgboter/bbs/bc-b1-2-11--rhe6x86_64/buildbot/rhe6x86_64--aarch64-none-elf/build/src/gcc/gcc/cfgexpand.c:6372

[Bug target/86673] [8/9 regression] inline asm sometimes ignores 'register asm("reg")' declarations

2018-07-25 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86673

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Target||arm-none-linux-gnueabi ,
   ||arm-none-eabi
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-07-25
 CC||ramana at gcc dot gnu.org
  Known to work||7.2.0
Summary|inline asm sometimes|[8/9 regression] inline asm
   |ignores 'register   |sometimes ignores 'register
   |asm("reg")' declarations|asm("reg")' declarations
 Ever confirmed|0   |1
  Known to fail||8.1.0, 9.0

--- Comment #3 from Ramana Radhakrishnan  ---
Confirmed.

[Bug middle-end/86640] [8/9 regression] ICE in combine

2018-07-23 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86640

Ramana Radhakrishnan  changed:

   What|Removed |Added

   Keywords||ice-on-valid-code
 Target||arm-none-linux-gnueabihf
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-07-23
 Ever confirmed|0   |1

--- Comment #1 from Ramana Radhakrishnan  ---
confirmed.

[Bug middle-end/86640] New: [8/9 regression] ICE in combine

2018-07-23 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86640

Bug ID: 86640
   Summary: [8/9 regression] ICE in combine
   Product: gcc
   Version: 8.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ramana at gcc dot gnu.org
  Target Milestone: ---

char fn1() {

  long long b[5];

  for (int a = 0; a < 5; a++)

    b[a] = ~0ULL;

  return b[3];

}


$> arm-none-linux-gnueabihf-gcc -c -O3 -mfpu=neon -mfloat-abi=hard
-march=armv7-a /tmp/crash.c 
during RTL pass: combine
/tmp/crash.c: In function ‘fn1’:
/tmp/crash.c:11:1: internal compiler error: in do_SUBST, at combine.c:731
 }
 ^
0x12e637c do_SUBST
   
/tmp/dgboter/bbs/bc-b3-3-13--rhe6x86_64/buildbot/rhe6x86_64--arm-none-linux-gnueabihf/build/src/gcc/gcc/combine.c:730
0x12f913e subst
   
/tmp/dgboter/bbs/bc-b3-3-13--rhe6x86_64/buildbot/rhe6x86_64--arm-none-linux-gnueabihf/build/src/gcc/gcc/combine.c:5589
0x12fb2d1 try_combine
   
/tmp/dgboter/bbs/bc-b3-3-13--rhe6x86_64/buildbot/rhe6x86_64--arm-none-linux-gnueabihf/build/src/gcc/gcc/combine.c:3359
0x1301398 combine_instructions
   
/tmp/dgboter/bbs/bc-b3-3-13--rhe6x86_64/buildbot/rhe6x86_64--arm-none-linux-gnueabihf/build/src/gcc/gcc/combine.c:1299
0x1301398 rest_of_handle_combine
   
/tmp/dgboter/bbs/bc-b3-3-13--rhe6x86_64/buildbot/rhe6x86_64--arm-none-linux-gnueabihf/build/src/gcc/gcc/combine.c:14898
0x1301398 execute
   
/tmp/dgboter/bbs/bc-b3-3-13--rhe6x86_64/buildbot/rhe6x86_64--arm-none-linux-gnueabihf/build/src/gcc/gcc/combine.c:14943

[Bug target/86555] unaligned address for ldrd/strd on armv5e

2018-07-18 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86555

--- Comment #4 from Ramana Radhakrishnan  ---

(In reply to Khem Raj from comment #2)
> we can avoid the problem by altering the structure, thats not an issue, but
> do you think compiler is right here by assuming to generate LDRD on a 4byte
> aligned address when it is told that architecture (-march=armv5te) its
> building for does not support 4byte aligned address for LDRD but only 8-byte
> aligned ?

It is correct for the compiler to be doing this; the compiler has just not
been given enough information. buf is only guaranteed to be aligned to 8 bytes
if there is an attribute setting the alignment explicitly. Otherwise it's a
char array, and the compiler is within its rights not to force its alignment up
to 8 bytes. When the compiler is dereferencing de->d_off it expects it to be
naturally 8-byte aligned.


Fix the source.
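
One way the source could be fixed, sketched under the assumption that buf is a plain char array that is later read through a struct pointer with a 64-bit d_off member. The struct layout and names below are hypothetical, not taken from the reporter's code. Either request 8-byte alignment on the buffer explicitly, or copy the field out with memcpy so that no alignment is assumed.

```c
#include <stdint.h>
#include <stdalign.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record layout with a 64-bit offset field. */
struct entry {
    uint64_t d_ino;
    int64_t  d_off;     /* 8-byte aligned under the ARM EABI */
    uint16_t d_reclen;
};

/* Without alignas, a char array is only guaranteed 1-byte alignment. */
alignas(8) char buf[256];

/* Alternative that assumes nothing about the alignment of raw. */
int64_t read_d_off(const char *raw)
{
    int64_t off;

    memcpy(&off, raw + offsetof(struct entry, d_off), sizeof off);
    return off;
}
```

With the alignment stated on buf, an access such as ((struct entry *)buf)->d_off starts from an address the compiler is entitled to treat as 8-byte aligned.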

[Bug tree-optimization/80641] missed optimization with with std::vector resize in loop

2018-07-16 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80641

Ramana Radhakrishnan  changed:

   What|Removed |Added

  Known to fail||7.3.1

--- Comment #14 from Ramana Radhakrishnan  ---
7.3.1 appears to fail the original testcase for an aarch64 cross compiler to
Linux with -O3 and -Wall.

[Bug tree-optimization/80641] missed optimization with with std::vector resize in loop

2018-07-16 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80641

Ramana Radhakrishnan  changed:

   What|Removed |Added

  Known to work||6.4.1, 8.1.0
  Known to fail||7.2.1

--- Comment #13 from Ramana Radhakrishnan  ---
With the original testcase I can still see a warning come out for a reasonably
recent GCC 7 snapshot on aarch64, while it appears to work fine on gcc 8 and
gcc 6.

Thanks
Ramana

[Bug tree-optimization/80641] missed optimization with with std::vector resize in loop

2018-07-16 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80641

--- Comment #12 from Ramana Radhakrishnan  ---
(In reply to Martin Sebor from comment #11)
> *** Bug 86516 has been marked as a duplicate of this bug. ***

(In reply to Paul Gotch from comment #10)
> I'm afraid the changes made to libstdc++ have only solved part of the
> regression if you say something like
> 
> std::vector v;
> 
> if(c.size() > 0)
>  c.resize(c.size() - 1);
> 
> then you no longer get a warning in 7.3 however if instead you do
> 
> if(! c.empty())
>  c.resize(c.size() -1);
> 
> the warning is produced just as in early 7.x releases. No warning is
> produced in 6.x so this is still a regression.
> 
> I presume this happens as empty wasn't annotated in libstdc++ and the
> underlying data flow analysis bug is yet to be fixed.

So why is this not a regression? It's quite clear that the annotations did not
do enough to work around the issue.

[Bug tree-optimization/85804] [8/9 Regression][AArch64] Mis-compilation of loop with strided array access and xor reduction

2018-07-13 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85804

--- Comment #3 from Ramana Radhakrishnan  ---
(In reply to Ramana Radhakrishnan from comment #2)
> Patch being discussed here.
> https://gcc.gnu.org/ml/gcc-patches/2018-05/msg01026.html


Bin, are you still working on this?

[Bug target/85854] Performance regression from gcc 4.9.2

2018-07-11 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85854

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Status|UNCONFIRMED |WAITING
   Last reconfirmed||2018-07-11
 CC||ramana at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Ramana Radhakrishnan  ---
I'm unable to build the pre-processed file with 4.9. Is it possible for you to
attach a non-preprocessed version as well?

The reason the generated instructions differ from the usual ones is that the
implementation of the intrinsic may well change between compiler versions.

regards
Ramana

[Bug target/86209] Peephole does not happen because the type of zero/sign extended operands is not the same.

2018-07-11 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86209

--- Comment #13 from Ramana Radhakrishnan  ---
Sameera,

If you are working on this , can you please assign this to yourself ? 

Ramana

[Bug target/86209] Peephole does not happen because the type of zero/sign extended operands is not the same.

2018-07-11 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86209

Ramana Radhakrishnan  changed:

   What|Removed |Added

   Keywords||missed-optimization
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-07-11
 Ever confirmed|0   |1

--- Comment #12 from Ramana Radhakrishnan  ---
Confirmed then.

[Bug target/85910] config/aarch64/aarch64.c:15653:12: warning: duplicated ‘if’ condition

2018-07-11 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85910

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-07-11
 CC||ramana at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Ramana Radhakrishnan  ---
Confirmed.

[Bug libgcc/85967] [ARM] No unwinding support for division functions

2018-07-11 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85967

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-07-11
 CC||ramana at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |ramana at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #3 from Ramana Radhakrishnan  ---
This patch would fit under the 10-line rule, but for the future can I also
confirm that you have a copyright assignment in place with the FSF?

It would be good to test this and push it into the tree after a test run; I
can do that for you.

Ramana

[Bug middle-end/83623] [8 Regression] ICE: in convert_move, at expr.c:248 with -march=knl and 16bit vector bswap/rotate

2018-06-20 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83623

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 CC||ramana at gcc dot gnu.org
 Resolution|FIXED   |---

--- Comment #8 from Ramana Radhakrishnan  ---
Seems to need a fix for gcc 6 branch based on PR86166

[Bug target/86209] Peephole does not happen because the type of zero/sign extended operands is not the same.

2018-06-19 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86209

--- Comment #3 from Ramana Radhakrishnan  ---
(In reply to sameerad from comment #2)
> Ramana, it is another peephole that I am trying to explore for falkor. It
> combines loads/stores of shorter types (QI/HI/SI) into single load/store of
> larger type (SI/DI).

Ah, I see. Sorry, not enough coffee yet.

[Bug target/86209] Peephole does not happen because the type of zero/sign extended operands is not the same.

2018-06-19 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86209

Ramana Radhakrishnan  changed:

   What|Removed |Added

 CC||ramana at gcc dot gnu.org

--- Comment #1 from Ramana Radhakrishnan  ---
(In reply to sameerad from comment #0)
> While implementing peephole2 for combining shorter types loads/stores into
> larger type load/store, following testcase was found for aarch64 for which
> peephole does not happen because the type of zero/sign extended operands is
> not the same.
> 
> Test program:
> unsigned short
> subus (unsigned short *array)
> {
>   return array[0] + array[1];
> }
> 
> Expander generated RTL:
> (insn 6 3 7 2 (set (reg:HI 96)
> (mem:HI (reg/v/f:DI 94 [ array ]) [1 *array_4(D)+0 S2 A16]))
>  (nil))
> (insn 7 6 8 2 (set (reg:HI 97)
> (mem:HI (plus:DI (reg/v/f:DI 94 [ array ])
> (const_int 2 [0x2])) [1 MEM[(short unsigned int *)array_4(D)
> + 2B]+0 S2 A16]))
>  (nil))
> (insn 8 7 9 2 (set (reg:SI 99)
> (subreg:SI (reg:HI 97) 0))
>  (nil))
> (insn 9 8 10 2 (set (reg:SI 98)
> (plus:SI (subreg:SI (reg:HI 96) 0)
> (reg:SI 99)))
>  (expr_list:REG_EQUAL (plus:SI (subreg:SI (reg:HI 96) 0)
> (subreg:SI (reg:HI 97) 0))
> (nil)))
> 
> The combiner combines insn 7 and 8 to generate zero extension to SI mode.
>  
> (insn 8 7 9 2 (set (reg:SI 99 [ MEM[(short unsigned int *)array_4(D) + 2B] ])
> (zero_extend:SI (mem:HI (plus:DI (reg/v/f:DI 94 [ array ])
> (const_int 2 [0x2])) [1 MEM[(short unsigned int
> *)array_4(D) + 2B]+0 S2 A16]))) {*zero_extendhisi2_aarch64}
>  (expr_list:REG_DEAD (reg/v/f:DI 94 [ array ])
> (nil)))
> 
>  The reload pass removes SUBREGs, which holds information about desired
> type, because of which HImode regs are zero extended to DImode.
> 
> (insn 8 7 6 2 (set (reg:SI 1 x1 [orig:99 MEM[(short unsigned int
> *)array_4(D) + 2B] ] [99])
> (zero_extend:SI (mem:HI (plus:DI (reg/v/f:DI 0 x0 [orig:94 array ]
> [94])
> (const_int 2 [0x2])) [1 MEM[(short unsigned int
> *)array_4(D) + 2B]+0 S2 A16]))) {*zero_extendhisi2_aarch64}
>  (nil))
> (insn 6 8 9 2 (set (reg:DI 0 x0)
> (zero_extend:DI (mem:HI (reg/v/f:DI 0 x0 [orig:94 array ] [94]) [1
> *array_4(D)+0 S2 A16]))) {*zero_extendhidi2_aarch64}
>  (nil))
> (insn 9 6 14 2 (set (reg:SI 0 x0 [98])
> (plus:SI (reg:SI 0 x0 [orig:96 *array_4(D) ] [96])
> (reg:SI 1 x1 [orig:99 MEM[(short unsigned int *)array_4(D) + 2B]
> ] [99]))){*addsi3_aarch64}
>  (nil))
> (insn 14 9 15 2 (set (reg/i:HI 0 x0)
> (reg:HI 0 x0 [98])) {*movhi_aarch64}
>  (nil))
> (insn 15 14 17 2 (use (reg/i:HI 0 x0)) 
>  (nil))
> (note 17 15 18 NOTE_INSN_DELETED)
> (note 18 17 0 NOTE_INSN_DELETED)
> 
> Now as both memory accesses have different extended types, they cannot be
> combined by peephole.
> 
> Because of this, even when sched_fusion has brought the loads/stores closer,
> they cannot be merged.

Hmmm,

ldr w0, [x0]
ldr w1, [x0, 2]

is not the same as 

ldp w0, w1, [x0]

ldp w0, w1, [x0] is the same as merging

ldr w0, [x0]
ldr w1, [x0, 4]

Am I missing something ? That would mean it isn't possible to merge this
combination. 

Thoughts ...
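
To make the byte ranges concrete, here is a small host-side model of the loads discussed above, using memcpy as a stand-in for ldrh/ldr. The helper names are hypothetical. It shows that two halfword loads at offsets 0 and 2 read the same four bytes as one word load at offset 0, whereas the second register of ldp w0, w1, [x0] comes from offset 4, which a word load at offset 2 does not match.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Alignment-safe loads, standing in for ldrh and ldr. */
static uint16_t load16(const unsigned char *p, size_t off)
{
    uint16_t v;
    memcpy(&v, p + off, sizeof v);
    return v;
}

static uint32_t load32(const unsigned char *p, size_t off)
{
    uint32_t v;
    memcpy(&v, p + off, sizeof v);
    return v;
}

/* Two adjacent halfword loads at offsets 0 and 2 read the same four
   bytes as a single word load at offset 0. */
int halfwords_cover_word(const unsigned char *p)
{
    uint16_t h[2] = { load16(p, 0), load16(p, 2) };
    uint32_t w = load32(p, 0);

    return memcmp(h, &w, sizeof w) == 0;
}

/* ldp w0, w1, [x0] fills w1 from offset 4; check whether a word load
   at offset off would yield the same value. */
int matches_ldp_second(const unsigned char *p, size_t off)
{
    return load32(p, 4) == load32(p, off);
}
```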

[Bug tree-optimization/64946] [AArch64] gcc.target/aarch64/vect-abs-compile.c - "abs" vectorization fails for char/short types

2018-06-18 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64946

Ramana Radhakrishnan  changed:

   What|Removed |Added

   Target Milestone|--- |9.0

[Bug tree-optimization/64946] [AArch64] gcc.target/aarch64/vect-abs-compile.c - "abs" vectorization fails for char/short types

2018-06-18 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64946

--- Comment #25 from Ramana Radhakrishnan  ---
(In reply to kugan from comment #24)
> Author: kugan
> Date: Sat Jun 16 21:34:29 2018
> New Revision: 261681
> 
> URL: https://gcc.gnu.org/viewcvs?rev=261681&root=gcc&view=rev
> Log:
> gcc/ChangeLog:
> 
> 2018-06-16  Kugan Vivekanandarajah  
> 
>   PR middle-end/64946
>   * cfgexpand.c (expand_debug_expr): Hande ABSU_EXPR.
>   * config/i386/i386.c (ix86_add_stmt_cost): Likewise.
>   * dojump.c (do_jump): Likewise.
>   * expr.c (expand_expr_real_2): Check operand type's sign.
>   * fold-const.c (const_unop): Handle ABSU_EXPR.
>   (fold_abs_const): Likewise.
>   * gimple-pretty-print.c (dump_unary_rhs): Likewise.
>   * gimple-ssa-backprop.c (backprop::process_assign_use): Likesie.
>   (strip_sign_op_1): Likesise.
>   * match.pd: Add new pattern to generate ABSU_EXPR.
>   * optabs-tree.c (optab_for_tree_code): Handle ABSU_EXPR.
>   * tree-cfg.c (verify_gimple_assign_unary): Likewise.
>   * tree-eh.c (operation_could_trap_helper_p): Likewise.
>   * tree-inline.c (estimate_operator_cost): Likewise.
>   * tree-pretty-print.c (dump_generic_node): Likewise.
>   * tree-vect-patterns.c (vect_recog_sad_pattern): Likewise.
>   * tree.def (ABSU_EXPR): New.
> 
> gcc/c-family/ChangeLog:
> 
> 2018-06-16  Kugan Vivekanandarajah  
> 
>   * c-common.c (c_common_truthvalue_conversion): Handle ABSU_EXPR.
> 
> gcc/c/ChangeLog:
> 
> 2018-06-16  Kugan Vivekanandarajah  
> 
>   * c-typeck.c (build_unary_op): Handle ABSU_EXPR;
>   * gimple-parser.c (c_parser_gimple_statement): Likewise.
>   (c_parser_gimple_unary_expression): Likewise.
> 
> gcc/cp/ChangeLog:
> 
> 2018-06-16  Kugan Vivekanandarajah  
> 
>   * constexpr.c (potential_constant_expression_1): Handle ABSU_EXPR.
>   * cp-gimplify.c (cp_fold): Likewise.
> 
> gcc/testsuite/ChangeLog:
> 
> 2018-06-16  Kugan Vivekanandarajah  
> 
>   PR middle-end/64946
>   * gcc.dg/absu.c: New test.
>   * gcc.dg/gimplefe-29.c: New test.
>   * gcc.target/aarch64/pr64946.c: New test.
> 
> 
> Added:
> trunk/gcc/testsuite/gcc.dg/absu.c
> trunk/gcc/testsuite/gcc.dg/gimplefe-29.c
> trunk/gcc/testsuite/gcc.target/aarch64/pr64946.c
> Modified:
> trunk/gcc/ChangeLog
> trunk/gcc/c-family/ChangeLog
> trunk/gcc/c-family/c-common.c
> trunk/gcc/c/ChangeLog
> trunk/gcc/c/c-typeck.c
> trunk/gcc/c/gimple-parser.c
> trunk/gcc/cfgexpand.c
> trunk/gcc/config/i386/i386.c
> trunk/gcc/cp/ChangeLog
> trunk/gcc/cp/constexpr.c
> trunk/gcc/cp/cp-gimplify.c
> trunk/gcc/dojump.c
> trunk/gcc/expr.c
> trunk/gcc/fold-const.c
> trunk/gcc/gimple-pretty-print.c
> trunk/gcc/gimple-ssa-backprop.c
> trunk/gcc/match.pd
> trunk/gcc/optabs-tree.c
> trunk/gcc/testsuite/ChangeLog
> trunk/gcc/tree-cfg.c
> trunk/gcc/tree-eh.c
> trunk/gcc/tree-inline.c
> trunk/gcc/tree-pretty-print.c
> trunk/gcc/tree-vect-patterns.c
> trunk/gcc/tree.def

Doesn't this mean we can un-xfail the vect-abs-compile.c test?

[Bug tree-optimization/85804] [8/9 Regression][AArch64] Mis-compilation of loop with strided array access and xor reduction

2018-06-18 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85804

Ramana Radhakrishnan  changed:

   What|Removed |Added

 CC||ramana at gcc dot gnu.org

--- Comment #2 from Ramana Radhakrishnan  ---
Patch being discussed here.
https://gcc.gnu.org/ml/gcc-patches/2018-05/msg01026.html

[Bug debug/84342] Location views breaks cross builds of arm including gnueabihf

2018-06-12 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84342

--- Comment #13 from Ramana Radhakrishnan  ---
(In reply to Jeffrey A. Law from comment #12)
> I'm not familiar enough with the ccfsm bits to know if there's something we
> ought to be doing generically to improve CC handling further.  I think
> downgrading to P2 certainly makes sense though.
> 
> However, I wouldn't be surprised if we find other instances of this kind of
> problem confusing the hell out of the location view support.  So I wouldn't
> dig at the ccfsm stuff just to allow the compiler to handle view support
> slightly more efficiently -- removal of ccfsm should stand on its own.

Agreed about the removal of ccfsm support standing on its own. I do wonder if
doing that will have the side benefit of avoiding these kinds of issues.

IIRC arc (from the days I worked on it) has ccfsm similar to arm, so maybe the
same problem will bite us on other ports as well.

[Bug debug/84342] Location views breaks cross builds of arm including gnueabihf

2018-06-07 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84342

Ramana Radhakrishnan  changed:

   What|Removed |Added

 CC||ramana at gcc dot gnu.org

--- Comment #11 from Ramana Radhakrishnan  ---
(In reply to Jakub Jelinek from comment #10)
> GCC 8.1 has been released.

(In reply to Jeffrey A. Law from comment #9)
> Alex: I realize that's the point of the hook.  But I'm pretty sure there's
> no way to fix the ARM port given the point at which lengths are set and the
> point at which ccfsm is valid are at two different times.  We'd either need
> a revamp of ccfsm or some layering violations to allow dwarf2out to access
> the underlying routines for length query and bypass the cache.
> 
> It's my view this BZ is resolved.  But if you want to keep it open to track
> the incorrect lengths in the ARM port, that's fine.  But it's certainly no
> longer a regression for gcc-8.

In which case should this still retain the P1 status ? 

The builds have been OK since Alex's patch, and we need to look at this at
some point.

Richard E and I have been talking about whether ccfsm actually makes sense in
2018 and whether we should just rip it out anyway and make sure that the rtl
optimizers get it right, rather than carry this into the far future. It's only
used in A32 state and it probably hasn't got a huge amount going for it, so
maybe we should just rip it out.

[Bug target/85733] [8 regression] ARM -mbe8 behaviour doesn't match documentation

2018-05-11 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85733

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |rearnsha at gcc dot gnu.org

[Bug target/85733] [8 regression] ARM -mbe8 behaviour doesn't match documentation

2018-05-11 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85733

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-05-11
 CC||ramana at gcc dot gnu.org
   Target Milestone|--- |8.2
Summary|ARM -mbe8 behaviour doesn't |[8 regression] ARM -mbe8
   |match documentation |behaviour doesn't match
   ||documentation
 Ever confirmed|0   |1

[Bug target/85593] [5,6,7,8 Regression] GCC on ARM allocates R3 for local variable when calling naked function with O2 optimizations enabled

2018-05-04 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85593

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-05-04
Version|9.0 |5.4.1
Summary|GCC on ARM allocates R3 for |[5,6,7,8 Regression] GCC on
   |local variable when calling |ARM allocates R3 for local
   |naked function with O2  |variable when calling naked
   |optimizations enabled   |function with O2
   ||optimizations enabled
 Ever confirmed|0   |1

--- Comment #5 from Ramana Radhakrishnan  ---
(In reply to Austin Morton from comment #4)
> In my particular case I was able to work around the issue by removing the
> naked attribute and using extended assembly with a clobbers list.

Removing the naked attribute and using the extended assembler with a clobbers
list is absolutely the correct thing to do.
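
As a sketch of what that looks like (a made-up function, not the code from this report, and assuming A32 or Thumb-2 code generation), the scratch register is named in the clobber list so the compiler knows not to keep a live value in it across the asm statement. The non-Arm branch is only there so the example builds elsewhere.

```c
#include <stdint.h>

/* Hypothetical example function, not taken from the bug report. */
uint32_t add_via_scratch(uint32_t a, uint32_t b)
{
    uint32_t sum;
#if defined(__arm__)
    /* r3 is used as a scratch register inside the asm, so it is listed as
       clobbered; the compiler will not allocate an operand to r3 or keep a
       live value in it across this statement. */
    __asm__ volatile(
        "mov r3, %1\n\t"
        "add %0, r3, %2"
        : "=r"(sum)
        : "r"(a), "r"(b)
        : "r3");
#else
    /* Fallback so the sketch still builds on non-Arm hosts. */
    sum = a + b;
#endif
    return sum;
}
```

Because this is an ordinary function, GCC generates the prologue and epilogue itself and is free to pick r4 or anything else for the caller's locals.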

> 
> The resulting code is nearly identical (allowing GCC to generate the correct
> pro/epilog instead of hand writing it), and gcc correctly allocates R4
> instead of R3.
> 
> This still feels like a bug in GCC.  In the example I gave, if you compiled
> the naked function in a separate C file and linked them it would generate
> the correct code.  The issue is that GCC is able to "see" the naked function
> and is performing optimizations that it shouldn't as a result.
> 
> I believe that GCC should treat naked functions as opaque as far as
> optimizations are concerned.
> At the very least, there should be a note about this kind of issue included
> in the documentation of the naked attribute.

Yes, it should be opaque as far as this IPA-RA optimization is concerned - I
don't think there are many other optimizations that need to treat this as
opaque.  That's what I alluded to in my previous comment


> IIRC there is a hook for ipa-ra that says what
> registers can be clobbered : can't find it immediately. I suppose for naked
> functions it is *all* registers.

I wasn't looking in the backend when I responded earlier; there is no such
hook. I think the correct fix would be to get arm_emit_call_insn to mark *all*
registers as clobbered if the target of the call insn is a naked function, i.e.
effectively disabling ipa-ra for calls to naked functions. You'd have to figure
out that the DECL for the target of the call had a "naked" attribute attached
to it ...

Do you feel up to writing up a patch assuming you have copyright assignments et
al sewn up ? 




> 
> 
> 
> regards
> Ramana

[Bug target/85593] GCC on ARM allocates R3 for local variable when calling naked function with O2 optimizations enabled

2018-05-02 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85593

Ramana Radhakrishnan  changed:

   What|Removed |Added

 CC||ramana at gcc dot gnu.org

--- Comment #3 from Ramana Radhakrishnan  ---
Changing the testcase to indicate the clobbered register r3 makes the test pass
but that is wrong in a naked function. Extended inline assembler *cannot* be
used within a naked function because you inadvertently could end up requiring a
stack slot and therefore cause the function to require a prologue and epilogue
which is exactly contrary to what we want with the naked function ! 

I *think* the problem really is that ipa-ra comes along and decides that r3
isn't used at all in the naked function. You can see the problem disappear
with -fno-ipa-ra, but that is not a workaround I would recommend putting in
your general C flags, because you would be using a hammer, disabling a nice
optimization, just to make something like the example "work".

Thus I think what you want is to get rid of naked functions altogether, write
the whole thing in assembler and stop faffing about. IIRC there is a hook for
ipa-ra that says what registers can be clobbered : can't find it immediately.
I suppose for naked functions it is *all* registers.



regards
Ramana

[Bug target/84923] [8/9 regression] gcc.dg/attr-weakref-1.c failed on aarch64

2018-04-25 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84923

--- Comment #4 from Ramana Radhakrishnan  ---
(In reply to Richard Biener from comment #3)
> For x86_64 if I append
> 
> const int *dat[] = { ,  };
> 
> the testcase links fine irrespective of where I place the
> 
> .weakref Wv12,wv12
> .weak   wv12
> 
> assembler declarations.
> 
> When I look at the assembler generated by a cross from x86_64 I do not
> see any .weak wv12 directive which is likely the issue.

Yep, the .weak wv12 directive is the one that disappears. Putting that back in
by hand fixes the issue.

[Bug target/68256] Defining TARGET_USE_CONSTANT_BLOCKS_P causes go bootstrap failure on aarch64.

2018-04-25 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68256

--- Comment #12 from Ramana Radhakrishnan  ---
(In reply to Steve Ellcey from comment #11)
> FYI: This caused a regression on aarch64.
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84923


I have marked 84923 as an 8 regression as it wasn't done earlier and probably
slipped through the cracks.

The regression was noticed and a patch was posted, but it appears that the
patch wasn't reviewed.

Ramana

[Bug ada/85380] gnatbind fails with small executable & restricted runtime

2018-04-17 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85380

Ramana Radhakrishnan  changed:

   What|Removed |Added

 CC||ebotcazou at gcc dot gnu.org,
   ||ramana at gcc dot gnu.org

--- Comment #1 from Ramana Radhakrishnan  ---
Adding Eric to the CC list as someone who could comment on this.

[Bug target/85203] cmse_nonsecure_caller intrinsic returns incorrect results

2018-04-17 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85203

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 CC||ramana at gcc dot gnu.org
 Resolution|--- |FIXED
   Target Milestone|--- |7.4

--- Comment #4 from Ramana Radhakrishnan  ---
Fixed I'm assuming ?

[Bug target/85261] __builtin_arm_set_fpscr ICEs with constant input

2018-04-06 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85261

Ramana Radhakrishnan  changed:

   What|Removed |Added

 CC||ramana at gcc dot gnu.org

--- Comment #2 from Ramana Radhakrishnan  ---
What about earlier branches ?

[Bug middle-end/84877] Local stack copy of BLKmode parameter on the stack is not aligned when the requested alignment exceeds MAX_SUPPORTED_STACK_ALIGNMENT

2018-03-28 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84877

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-03-28
 Ever confirmed|0   |1

--- Comment #3 from Ramana Radhakrishnan  ---
Confirmed.

[Bug target/81863] [7 regression] -mword-relocations is unreliable

2018-03-27 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81863

--- Comment #21 from Ramana Radhakrishnan  ---
Author: ramana
Date: Tue Mar 27 14:06:20 2018
New Revision: 258886

URL: https://gcc.gnu.org/viewcvs?rev=258886&root=gcc&view=rev
Log:
[Patch ARM] Fix PR target/81863

This has been in my patch stack for quite some time. The problem here
was that we weren't handling arm_word_relocations in
arm_valid_symbolic_address; handling it there is the surest fix
for GCC8 and GCC7.

Regression tested on arm-none-linux-gnueabihf . Applying to
trunk and backporting to GCC-7 in a day or so.

regards
Ramana

2018-03-27  Ramana Radhakrishnan  

PR target/81863
* config/arm/arm.c (arm_valid_symbolic_address): Handle
arm_word_relocations


2018-03-27  Ramana Radhakrishnan  

PR target/81863
* gcc.target/arm/pr81863.c: New test.


Added:
trunk/gcc/testsuite/gcc.target/arm/pr81863.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/arm/arm.c
trunk/gcc/testsuite/ChangeLog

[Bug middle-end/84877] Local stack copy of BLKmode parameter on the stack is not aligned when the requested alignment exceeds MAX_SUPPORTED_STACK_ALIGNMENT

2018-03-15 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84877

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Target||arm-none-eabi
 CC||ramana at gcc dot gnu.org

--- Comment #1 from Ramana Radhakrishnan  ---
Isn't this something you said you could see from 6.x ?

[Bug target/68256] Defining TARGET_USE_CONSTANT_BLOCKS_P causes go bootstrap failure on aarch64.

2018-03-15 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68256

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED
   Target Milestone|7.4 |8.0

--- Comment #10 from Ramana Radhakrishnan  ---
Fixed for gcc-8

[Bug target/59833] ARM soft-float extendsfdf2 fails to quiet signaling NaN

2018-03-13 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59833

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |7.0

--- Comment #17 from Ramana Radhakrishnan  ---
Fixed for GCC-7

[Bug target/84521] [8 Regression] aarch64: Frame-pointer corruption with __builtin_setjmp/__builtin_longjmp and -fomit-frame-pointer

2018-02-26 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84521

--- Comment #15 from Ramana Radhakrishnan  ---
Author: ramana
Date: Mon Feb 26 09:25:21 2018
New Revision: 257984

URL: https://gcc.gnu.org/viewcvs?rev=257984&root=gcc&view=rev
Log:
[Patch AArch64] Turn on frame pointer / partial fix for PR84521

This fixes a GCC-8 regression where we accidentally switched off frame
pointers in the AArch64 backend when changing the defaults in the common
parts of the code. This breaks an ABI decision that was made in GCC at
the dawn of the port with respect to having a frame pointer at all
times.  If we really want to turn this off let's have a discussion around
that separately.

For now turn this back on and I believe this will leave PR84521 latent
again with -fomit-frame-pointer and (hopefully) make the ruby issue go
away. I'm asking Sudi to pick that up.

Bootstrapped and regression tested on AArch64-none-linux-gnu but I see
one regression in gcc.c-torture/execute/960419-2.c which needs to be
looked at next (PR84528, thanks Kyrill).

Ok to put in and then look at PR84528 ?

2018-02-26  Ramana Radhakrishnan  

PR target/84521
* common/config/aarch64/aarch64-common.c
(aarch_option_optimization_table[]): Switch
off fomit-frame-pointer

2018-02-26  Ramana Radhakrishnan  

PR target/84521
* gcc.target/aarch64/lr_free_2.c: Revert changes in
r254814 disabling -fomit-frame-pointer by default.
* gcc.target/aarch64/spill_1.c: Likewise.
* gcc.target/aarch64/test_frame_11.c: Likewise.
* gcc.target/aarch64/test_frame_12.c: Likewise.
* gcc.target/aarch64/test_frame_13.c: Likewise.
* gcc.target/aarch64/test_frame_14.c: Likewise.
* gcc.target/aarch64/test_frame_15.c: Likewise.
* gcc.target/aarch64/test_frame_3.c: Likewise.
* gcc.target/aarch64/test_frame_5.c: Likewise.
* gcc.target/aarch64/test_frame_9.c: Likewise.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/common/config/aarch64/aarch64-common.c
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/aarch64/lr_free_2.c
trunk/gcc/testsuite/gcc.target/aarch64/spill_1.c
trunk/gcc/testsuite/gcc.target/aarch64/test_frame_11.c
trunk/gcc/testsuite/gcc.target/aarch64/test_frame_12.c
trunk/gcc/testsuite/gcc.target/aarch64/test_frame_13.c
trunk/gcc/testsuite/gcc.target/aarch64/test_frame_14.c
trunk/gcc/testsuite/gcc.target/aarch64/test_frame_15.c
trunk/gcc/testsuite/gcc.target/aarch64/test_frame_3.c
trunk/gcc/testsuite/gcc.target/aarch64/test_frame_5.c
trunk/gcc/testsuite/gcc.target/aarch64/test_frame_9.c

[Bug target/84528] [8 Regression] gcc.c-torture/execute/960419-2.c -O3 fails with -fno-omit-frame-pointer

2018-02-23 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84528

Ramana Radhakrishnan  changed:

   What|Removed |Added

   Priority|P1  |P3
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-02-23
 CC||ramana at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Ramana Radhakrishnan  ---
We are about to turn -fno-omit-frame-pointer back on for gcc-8, so this is
probably important.

Usually the RMs need to set this to P1, so I'm keeping this at P3 for an RM to
confirm whether it goes up to P1 or P2.

[Bug target/84521] [8 Regression] aarch64: Frame-pointer corruption with __builtin_setjmp/__builtin_longjmp and -fomit-frame-pointer

2018-02-22 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84521

--- Comment #10 from Ramana Radhakrishnan  ---
(In reply to Jakub Jelinek from comment #4)
> Is the requirement just for functions that contain setjmp?  If so, the
> backend could just force frame pointers in cfun->calls_setjmp functions.

I think we should flip -fno-omit-frame-pointer back on for gcc-8, as having it
off breaks the guarantee that we've had in the port for quite a while. I'm
testing a patch currently that I will get out first thing tomorrow to turn
this back on.

If we want to turn it off that should be a conscious decision.


> 
> If not, even if the default is tweaked again to be -fno-omit-frame-pointer
> on aarch64, the code is still wrong with explicit -fno-omit-frame-pointer,
> even before that change.

I think we should treat that as a separate but related issue.


Ramana
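
For reference, a minimal sketch of the kind of code this PR is about: __builtin_setjmp in one function and __builtin_longjmp called from another. This is not the testcase from the PR; the function names are invented, and the five-word buffer is what I recall the GCC documentation for these builtins asking for.

```c
/* Five-word buffer used by the builtins (size assumed from the GCC docs). */
static void *jump_buffer[5];

/* __builtin_longjmp must not be called from the function that called
   __builtin_setjmp, and its second argument must be 1. */
__attribute__((noinline))
static void jump_back(void)
{
    __builtin_longjmp(jump_buffer, 1);
}

int run_once(void)
{
    /* volatile so the value written before the jump is visible after it. */
    volatile int stage = 0;

    if (__builtin_setjmp(jump_buffer) == 0) {
        stage = 1;
        jump_back();
        return -1; /* not reached */
    }

    /* We arrive here via __builtin_longjmp; the frame must still be usable
       for the access to the local variable to work. */
    return stage + 1;
}
```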

[Bug tree-optimization/83543] strlen of a local array member not optimized on some targets

2018-02-20 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83543

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-02-20
 CC||ramana at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #4 from Ramana Radhakrishnan  ---

(In reply to Martin Sebor from comment #0)
> Bug 83462 reports (among others) a failure in the new
> c-c++-common/Warray-bounds-4.c test on powerpc64le.  The failure is due to a
> strlen optimization that's for some reason not working on this target (and
> on some others, including arm-none-eabi) but that works fine on
> x86_64-linux.  The test case below shows the difference in cross-compiler
> output between these three architectures.

Is this because arm-none-eabi by default is a STRICT_ALIGNMENT target ? What
happens if for instance you try this with -march=armv7-a where we allow some
limited misaligned access ? 

Ramana

> 
> $ (set -x && cat z.c && for arch in '' arm-none-eabi powerpc64le-linux; do
> /ssd/build/$arch/gcc-git/gcc/xgcc -B /ssd/build/$arch/gcc-git/gcc -O2 -S
> -Wall -fdump-tree-optimized=/dev/stdout z.c; done)
> + cat z.c
> struct S { char a[7]; };
> 
> void f (void)
> {
>   struct S s = { "12345" };
>   if (__builtin_strlen (s.a) != 5)
>     __builtin_abort ();
> }
> + for arch in ''\'''\''' arm-none-eabi powerpc64le-linux
> + /ssd/build//gcc-git/gcc/xgcc -B /ssd/build//gcc-git/gcc -O2 -S -Wall
> -fdump-tree-optimized=/dev/stdout z.c
> 
> ;; Function f (f, funcdef_no=0, decl_uid=1894, cgraph_uid=0, symbol_order=0)
> 
> f ()
> {
>[local count: 1073741825]:
>   return;
> 
> }
> 
> 
> + for arch in ''\'''\''' arm-none-eabi powerpc64le-linux
> + /ssd/build/arm-none-eabi/gcc-git/gcc/xgcc -B
> /ssd/build/arm-none-eabi/gcc-git/gcc -O2 -S -Wall
> -fdump-tree-optimized=/dev/stdout z.c
> 
> ;; Function f (f, funcdef_no=0, decl_uid=4155, cgraph_uid=0, symbol_order=0)
> 
> f ()
> {
>   struct S s;
>   unsigned int _1;
> 
>[local count: 1073741825]:
>   s = *.LC0;
>   _1 = __builtin_strlen ();
>   if (_1 != 5)
> goto ; [0.00%]
>   else
> goto ; [99.96%]
> 
>[count: 0]:
>   __builtin_abort ();
> 
>[local count: 1073312327]:
>   s ={v} {CLOBBER};
>   return;
> 
> }
> 
> 
> + for arch in ''\'''\''' arm-none-eabi powerpc64le-linux
> + /ssd/build/powerpc64le-linux/gcc-git/gcc/xgcc -B
> /ssd/build/powerpc64le-linux/gcc-git/gcc -O2 -S -Wall
> -fdump-tree-optimized=/dev/stdout z.c
> 
> ;; Function f (f, funcdef_no=0, decl_uid=2784, cgraph_uid=0, symbol_order=0)
> 
> f ()
> {
>   struct S s;
>   long unsigned int _1;
> 
>[local count: 1073741825]:
>   s = *.LC0;
>   _1 = __builtin_strlen ();
>   if (_1 != 5)
> goto ; [0.00%]
>   else
> goto ; [99.96%]
> 
>[count: 0]:
>   __builtin_abort ();
> 
>[local count: 1073312327]:
>   s ={v} {CLOBBER};
>   return;
> 
> }

[Bug ipa/83178] [8 regression] g++.dg/ipa/devirt-22.C fail

2017-12-12 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83178

Ramana Radhakrishnan  changed:

   What|Removed |Added

   Keywords||missed-optimization
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2017-12-12
 CC||ramana at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #3 from Ramana Radhakrishnan  ---
Confirmed then.

[Bug target/82248] probe_stack can generate unpredictable STR on arm

2017-12-05 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82248

--- Comment #6 from Ramana Radhakrishnan  ---
Author: ramana
Date: Tue Dec  5 16:32:55 2017
New Revision: 255428

URL: https://gcc.gnu.org/viewcvs?rev=255428&root=gcc&view=rev
Log:
[Patch ARM] Fix probe_stack constraint.

The probe_stack pattern uses r0 as a fixed register. This can cause issues if
we have auto-increment instructions coming out that have r0 as the base
register. 

Tested with a bootstrap and regression run. richi reports that the original
issue was fixed in the run. I did consider whether probe_stack_range was
affected but it all comes back to probe_stack pattern so I think we are ok.

I don't have a testcase that seems to provoke this but it seems to be default
on most distributions so I'm expecting the test coverage to come from there.

Applied.

Ramana

PR target/82248

* config/arm/arm.md (probe_stack) : Use the 'o' constraint.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/arm/arm.md
