[Bug target/84719] gcc's __builtin_memcpy performance with certain number of bytes is terrible compared to clang's

2018-03-06 Thread gpnuma at centaurean dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84719 --- Comment #11 from gpnuma at centaurean dot com --- Yes, the init loop is not the problem. Just to make sure, with the following code: #include #include #include #include #include #include #include int main(int argc, char *argv

[Bug target/84719] gcc's __builtin_memcpy performance with certain number of bytes is terrible compared to clang's

2018-03-05 Thread gpnuma at centaurean dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84719 --- Comment #8 from gpnuma at centaurean dot com --- Just to make sure, I commented out the bit masking: #include #include #include #include #include #include #include int main(int argc, char *argv[]) { const uint64_t size = 10

[Bug target/84719] gcc's __builtin_memcpy performance with certain number of bytes is terrible compared to clang's

2018-03-05 Thread gpnuma at centaurean dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84719 --- Comment #6 from gpnuma at centaurean dot com --- If you compile the following code (-O3 being the only flag used): #include #include #include #include #include #include #include int main(int argc, char *argv[]) { const uint64_t

[Bug target/84719] gcc's __builtin_memcpy performance with certain number of bytes is terrible compared to clang's

2018-03-05 Thread gpnuma at centaurean dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84719 --- Comment #5 from gpnuma at centaurean dot com --- Which gcc and which clang? Because on my platform, in the above code, if you isolate 3 bytes at a time and 5 bytes at a time, it is way slower than clang (by doing manual unrolling). Or maybe
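For readers without the full thread, the kind of copy being discussed (a constant 3 or 5 bytes per call to __builtin_memcpy) might look roughly like the sketch below. It is not the reporter's benchmark, which is cut off in the snippets here; the function names and loop shape are hypothetical and only illustrate the pattern.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the bug report): copy `count` records of
 * exactly 3 bytes each. The size passed to __builtin_memcpy is a
 * compile-time constant, so the compiler may expand each copy inline. */
static void copy_3_bytes_at_a_time(uint8_t *dst, const uint8_t *src, size_t count) {
    for (size_t i = 0; i < count; i++) {
        __builtin_memcpy(dst + 3 * i, src + 3 * i, 3);
    }
}

/* Same idea with a constant 5-byte copy per iteration. */
static void copy_5_bytes_at_a_time(uint8_t *dst, const uint8_t *src, size_t count) {
    for (size_t i = 0; i < count; i++) {
        __builtin_memcpy(dst + 5 * i, src + 5 * i, 5);
    }
}
```

Compiling such a file with -O3 and -S under each compiler is one way to compare the code emitted for the two constant sizes.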

[Bug target/84719] gcc's __builtin_memcpy performance with certain number of bytes is terrible compared to clang's

2018-03-05 Thread gpnuma at centaurean dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84719 --- Comment #2 from gpnuma at centaurean dot com --- (In reply to Andrew Pinski from comment #1) > Does -mcpu=native improve it? > Also is GCC calling memcpy instead of doing an inline version? No, -march=native does not make any diff

[Bug bootstrap/84719] New: gcc's __builtin_memcpy performance with certain number of bytes is terrible compared to clang's

2018-03-05 Thread gpnuma at centaurean dot com
Severity: normal Priority: P3 Component: bootstrap Assignee: unassigned at gcc dot gnu.org Reporter: gpnuma at centaurean dot com Target Milestone: --- I am posting this bug report as a follow-up to my post here: https://stackoverflow.com/questions/49098453

[Bug c/66230] Using optimizations causes program to segfault

2015-05-21 Thread gpnuma at centaurean dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66230 --- Comment #2 from gpnuma at centaurean dot com --- I understand you're short of time, but this problem is very difficult to reproduce! I did try to compile and link with -fsanitize=undefined this morning, now here's the interesting part

[Bug c/66230] Using optimizations causes program to segfault

2015-05-21 Thread gpnuma at centaurean dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66230 --- Comment #5 from gpnuma at centaurean dot com --- OK, I just tried -fno-strict-aliasing -fwrapv -fno-aggressive-loop-optimizations and the issue is still there. If I add the printf(something); at the top of the function, everything works

[Bug c/66230] Using optimizations causes program to segfault

2015-05-21 Thread gpnuma at centaurean dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66230 --- Comment #9 from gpnuma at centaurean dot com --- What I mean is that the structs I was using the pointer cast allocations with are instantiated by the program itself, so there could be a way to instantiate them with the required alignment I

[Bug c/66230] Using optimizations causes program to segfault

2015-05-21 Thread gpnuma at centaurean dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66230 --- Comment #4 from gpnuma at centaurean dot com --- Sorry, I meant gcc 4.9.2 / -O3 of course; 4.8 works fine.

[Bug c/66230] Using optimizations causes program to segfault

2015-05-21 Thread gpnuma at centaurean dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66230 --- Comment #8 from gpnuma at centaurean dot com --- Thanks Markus, I didn't think these alignment issues were actually the problem, it goes a long way. By doing memmoves instead of pointer cast allocations I got rid of the segfault
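The fix described in this comment (replacing pointer-cast accesses with memmove or memcpy) can be sketched as follows. This is a minimal illustration, not the reporter's code: the helper names load_u64 and store_u64 are hypothetical, and a 64-bit value is an assumption. Dereferencing a uint64_t pointer cast from an arbitrary byte address is undefined behavior when the address is not suitably aligned, and an optimizing compiler may emit instructions that fault on it; copying through memcpy has no alignment requirement.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 64-bit value from an address that may not be
 * 8-byte aligned. Unlike *(const uint64_t *)src, this is well defined for
 * any alignment; compilers typically turn it into a single load where the
 * target allows it. */
static uint64_t load_u64(const void *src) {
    uint64_t value;
    memcpy(&value, src, sizeof value);
    return value;
}

/* Hypothetical helper: write a 64-bit value to a possibly unaligned address,
 * replacing *(uint64_t *)dst = value. */
static void store_u64(void *dst, uint64_t value) {
    memcpy(dst, &value, sizeof value);
}
```

For example, store_u64(buffer + 1, x) followed by load_u64(buffer + 1) round-trips x even though buffer + 1 is generally not 8-byte aligned. memmove would serve the same purpose and additionally tolerates overlapping source and destination.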

[Bug c/66230] Using optimizations causes program to segfault

2015-05-21 Thread gpnuma at centaurean dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66230 --- Comment #11 from gpnuma at centaurean dot com --- (In reply to Markus Trippelsdorf from comment #10) (In reply to gpnuma from comment #8) Thanks Markus, I didn't think these alignment issues were actually the problem, it goes a long way

[Bug c/66230] New: Using optimizations causes program to segfault

2015-05-20 Thread gpnuma at centaurean dot com
Assignee: unassigned at gcc dot gnu.org Reporter: gpnuma at centaurean dot com Target Milestone: --- Hello, first I'd like to point out that the code producing this error compiles and runs fine in gcc 4.8.4-1 for Linux and OS X and Clang 3.5, 3.6 (Linux) and 6.1 (OS X