[Bug middle-end/100363] gcc generating wider load/store than warranted at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100363

Linus Torvalds changed:

What |Removed |Added
CC | |torvalds@linux-foundation.org

--- Comment #4 from Linus Torvalds ---
(In reply to Andrew Pinski from comment #1)
> The loop gets vectorized, I don't see the problem really.

See https://github.com/foss-for-synopsys-dwc-arc-processors/toolchain/issues/372 and in particular the comment "In the first 8-byte copy, src and dst overlap".

So apparently gcc has decided that they can't overlap, despite the two pointers being literally generated from the same base pointer.

But I don't read arc assembly, so I'll have to take Vineet's word for it.

Vineet, have you been able to generate a smaller test-case?
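To make the overlap concern concrete, here is a minimal sketch of the kind of copy being discussed. It is not the code from the attached test case: the helper name `overlap_copy` and the parameters `dist` and `len` are hypothetical, and the loop copies single bytes for simplicity. The source pointer is derived from the destination pointer, so the two ranges overlap whenever `dist` is smaller than `len`, and the result then depends on each element being loaded only after the earlier stores have landed.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: forward copy where the source sits `dist` bytes
 * behind the destination in the same buffer. When dist < len the ranges
 * overlap, and the loop replicates the last `dist` bytes as a pattern. */
static void overlap_copy(unsigned char *out, size_t dist, size_t len)
{
    const unsigned char *from = out - dist;  /* same base pointer as out */

    while (len--)
        *out++ = *from++;  /* each load may read a byte stored just before */
}

/* Returns 1 if the two-byte seed "ab" was replicated across the buffer. */
static int demo_overlap(void)
{
    unsigned char buf[16] = "ab";

    overlap_copy(buf + 2, 2, 8);
    return memcmp(buf, "ababababab", 10) == 0;
}
```

If a compiler replaced this loop with a single 8-byte load followed by an 8-byte store, the load would read bytes that the element-wise loop had not yet written, and the replicated pattern would be lost. That is only valid if the compiler can prove, or checks at run time, that the ranges do not overlap.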
[Bug middle-end/100363] gcc generating wider load/store than warranted at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100363

--- Comment #3 from Vineet Gupta ---
Created attachment 50723
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50723&action=edit
preprocessed source file (with extra nop annotation)
[Bug middle-end/100363] gcc generating wider load/store than warranted at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100363

--- Comment #2 from Andrew Pinski ---
Note in the tar file there is only:
inffast2.s
inffast2.s.aarch64.gcc10.O3
inffast2.s.aarch64.gcc9.O3
inffast2.s.arc.gcc10.O3
[Bug middle-end/100363] gcc generating wider load/store than warranted at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100363

Andrew Pinski changed:

What |Removed |Added
Ever confirmed |0 |1
Last reconfirmed | |2021-04-30
Status |UNCONFIRMED |WAITING

--- Comment #1 from Andrew Pinski ---
The loop gets vectorized, I don't see the problem really.

Also I don't see the preprocessed source. Can you attach that?

Is the problem that the hardware requires the loads to always be done 2 bytes at a time? If so, then you need to mark the pointer as volatile.
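For reference, the volatile suggestion above could look roughly like the sketch below. The helper name `copy16_volatile` and the 16-bit element type are assumptions made for illustration and are not taken from the attached source. The idea is that accesses through a volatile-qualified lvalue are not supposed to be merged or widened, so each element stays a separate 2-byte load and store. Whether this is the right fix for this bug is not settled by this comment.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy `n` 16-bit units through volatile pointers,
 * so the compiler is expected to keep one 2-byte load and one 2-byte
 * store per element instead of combining them into wider accesses. */
static void copy16_volatile(volatile uint16_t *dst,
                            const volatile uint16_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Returns 1 if an overlapping forward copy propagated the first element. */
static int demo_volatile(void)
{
    uint16_t buf[6] = { 0x1111, 0, 0, 0, 0, 0 };

    copy16_volatile(buf + 1, buf, 5);  /* dst is one element past src */
    for (size_t i = 0; i < 6; i++)
        if (buf[i] != 0x1111)
            return 0;
    return 1;
}
```

The trade-off is that volatile also blocks legitimate optimizations on these accesses, so it is normally reserved for memory with real hardware access-width constraints.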