from:"linux at carewolf dot com"

[Bug target/59422] New: Support more targets for function multi versioning

2013-12-08 Thread linux at carewolf dot com

: target Assignee: unassigned at gcc dot gnu.org Reporter: linux at carewolf dot com Created attachment 31399 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31399&action=edit Patch Trying to compile a function with an xop multiversion fails with a No dispatcher found for xop

[Bug c++/51033] generic vector subscript and shuffle support was not added to C++

2013-02-17 Thread linux at carewolf dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51033 Allan Jensen linux at carewolf dot com changed: What|Removed |Added CC||linux

[Bug c++/51033] generic vector subscript and shuffle support was not added to C++

2013-02-17 Thread linux at carewolf dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51033 --- Comment #32 from Allan Jensen linux at carewolf dot com 2013-02-17 15:23:49 UTC --- (In reply to comment #31) (In reply to comment #30) Another example is binary operators between scalar and vectors. In C the scalar

[Bug middle-end/53460] New: Internal compiler error: in calc_dfs_tree, at dominance.c:395

2012-05-23 Thread linux at carewolf dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53460 Bug #: 53460 Summary: Internal compiler error: in calc_dfs_tree, at dominance.c:395 Classification: Unclassified Product: gcc Version: 4.7.0 Status: UNCONFIRMED

[Bug middle-end/53460] Internal compiler error: in calc_dfs_tree, at dominance.c:395

2012-05-23 Thread linux at carewolf dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53460 --- Comment #1 from Allan Jensen linux at carewolf dot com 2012-05-23 15:34:35 UTC --- Created attachment 27481 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27481 FontFastPath.ii.gz

[Bug middle-end/53460] Internal compiler error: in calc_dfs_tree, at dominance.c:395

2012-05-23 Thread linux at carewolf dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53460 --- Comment #2 from Allan Jensen linux at carewolf dot com 2012-05-23 15:37:32 UTC --- It appears I am not allowed to make more than one attachment, so you will have to make do with one example. Here is the console output: Using built-in specs

[Bug c++/48026] #pragma optimize ignored for C++

2012-07-25 Thread linux at carewolf dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48026 Allan Jensen linux at carewolf dot com changed: What|Removed |Added CC||linux at carewolf

[Bug libgcc/60429] New: Miscompilation (aliasing) with -finline-functions

2014-03-05 Thread linux at carewolf dot com

: libgcc Assignee: unassigned at gcc dot gnu.org Reporter: linux at carewolf dot com After recently trying to build Qt with -O3, I found one of our tests failing. After investigating I narrowed it down to qregion.cpp and the flag -finline-functions (using -O2 -finline-functions

[Bug libgcc/60429] Miscompilation (aliasing) with -finline-functions

2014-03-05 Thread linux at carewolf dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60429 --- Comment #1 from Allan Jensen linux at carewolf dot com --- Created attachment 32268 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32268&action=edit qregion.cpp intermediate compiled with G++ 4.4 (working)

[Bug libgcc/60429] Miscompilation (aliasing) with -finline-functions

2014-03-05 Thread linux at carewolf dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60429 --- Comment #2 from Allan Jensen linux at carewolf dot com --- Created attachment 32269 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32269&action=edit qregion.cpp intermediate compiled with gcc 4.8

[Bug libgcc/60429] Miscompilation (aliasing) with -finline-functions

2014-03-05 Thread linux at carewolf dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60429 --- Comment #3 from Allan Jensen linux at carewolf dot com --- Created attachment 32270 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32270&action=edit qregion.cpp assembler compiled with gcc 4.8

[Bug libgcc/60429] Miscompilation (aliasing) with -finline-functions

2014-03-05 Thread linux at carewolf dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60429 --- Comment #4 from Allan Jensen linux at carewolf dot com --- Created attachment 32271 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32271&action=edit qregion.cpp assembler compiled with gcc 4.4

[Bug middle-end/60429] Miscompilation (aliasing) with -finline-functions

2014-03-06 Thread linux at carewolf dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60429 --- Comment #6 from Allan Jensen linux at carewolf dot com --- (In reply to Richard Biener from comment #5) Can you identify the inlined call? Is it if (pSLL && y == pSLL->scanline) { loadAET(&AET, pSLL

[Bug middle-end/60429] Miscompilation (aliasing) with -finline-functions

2014-03-07 Thread linux at carewolf dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60429 --- Comment #8 from Allan Jensen linux at carewolf dot com --- Created attachment 32303 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32303&action=edit Reduced test

[Bug middle-end/60429] Miscompilation (aliasing) with -finline-functions

2014-03-07 Thread linux at carewolf dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60429 --- Comment #9 from Allan Jensen linux at carewolf dot com --- Created attachment 32304 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32304&action=edit Reduced test assembler

[Bug middle-end/60429] Miscompilation (aliasing) with -finline-functions

2014-03-07 Thread linux at carewolf dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60429 --- Comment #10 from Allan Jensen linux at carewolf dot com --- I have uploaded a reduced test. Compiled with -O0 or -O1 it outputs 180, compiled with -O2 or higher it outputs 179.

[Bug middle-end/60429] Miscompilation (aliasing) with -finline-functions

2014-03-07 Thread linux at carewolf dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60429 --- Comment #11 from Allan Jensen linux at carewolf dot com --- Note that to run it, it links against Qt5Core.

[Bug middle-end/60429] Miscompilation (aliasing) with -finline-functions

2014-03-07 Thread linux at carewolf dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60429 --- Comment #13 from Allan Jensen linux at carewolf dot com --- (In reply to Andrew Pinski from comment #12) tmpPtBlock->pts = reinterpret_cast<QPoint *>(tmpPtBlock->data); Does this not violate C/C++ aliasing rules later on? I

[Bug middle-end/60429] [4.7/4.8 Regression] Miscompilation (aliasing) with -finline-functions

2014-03-11 Thread linux at carewolf dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60429 --- Comment #24 from Allan Jensen linux at carewolf dot com --- I just tested the latest subversion head of gcc 4.9 and can confirm it fixes the original problem (tst_qregion in Qt 5.2.1 compiled with -O3).

[Bug middle-end/60429] [4.7/4.8 Regression] Miscompilation (aliasing) with -finline-functions

2014-03-15 Thread linux at carewolf dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60429 --- Comment #25 from Allan Jensen linux at carewolf dot com --- Will it be backported to 4.8?

[Bug target/60788] New: Miscompilation of __builtin_clz with -mlzcnt

2014-04-08 Thread linux at carewolf dot com

: target Assignee: unassigned at gcc dot gnu.org Reporter: linux at carewolf dot com Created attachment 32567 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32567&action=edit Test case If you compile the attached program with -O0 and -mlzcnt on x86, it will produce wrong results

[Bug target/60788] Miscompilation of __builtin_clz with -mlzcnt

2014-04-08 Thread linux at carewolf dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60788 --- Comment #1 from Allan Jensen linux at carewolf dot com --- Sorry. The optimization has nothing to do with it, it just causes the constant expressions used for testing to be evaluated at compile time. The real issue is that the lzcnt

[Bug target/60788] Miscompilation of __builtin_clz with -mlzcnt

2014-04-08 Thread linux at carewolf dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60788 --- Comment #3 from Allan Jensen linux at carewolf dot com --- Sorry for the confusion. I thought Intel had added it from Ivy Bridge, but it was Haswell.
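
Editorial sketch, not part of the archived comments: as general background, LZCNT is encoded as BSR with a prefix, so a CPU without LZCNT support executes it as plain BSR. BSR returns the index of the highest set bit rather than the number of leading zeros, which would explain a wrong `__builtin_clz` result on such hardware. The helper names below are hypothetical, and the loop only models the two instructions in portable C++.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of BSR: index of the highest set bit of a non-zero value.
inline int highest_set_bit(std::uint32_t x) {
    assert(x != 0);  // BSR leaves its destination undefined for a zero input
    int index = 0;
    while (x >>= 1) {
        ++index;
    }
    return index;
}

// Hypothetical model of LZCNT: number of leading zero bits, 32 for a zero input.
inline int leading_zeros(std::uint32_t x) {
    return x == 0 ? 32 : 31 - highest_set_bit(x);
}
```

For any non-zero 32-bit value the two models are related by `leading_zeros(x) == 31 - highest_set_bit(x)`; for `0x00010000` the BSR model gives 16 and the LZCNT model gives 15.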

[Bug target/64806] [5 Regression] FAIL: g++.dg/ext/mv1.C

2015-01-26 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64806 --- Comment #3 from Allan Jensen linux at carewolf dot com --- I refer to this: /* Handle arch= if specified. For priority, set it to be 1 more than the best instruction set the processor can handle. For instance

[Bug target/64806] [5 Regression] FAIL: g++.dg/ext/mv1.C

2015-01-26 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64806 Allan Jensen linux at carewolf dot com changed: What|Removed |Added CC||linux

[Bug tree-optimization/65492] Bad optimization in -O3 due to if-conversion and/or unrolling

2015-03-21 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65492 --- Comment #10 from Allan Jensen linux at carewolf dot com --- Just to make things more complicated, I tried the test on a Haswell, and surprisingly, disabling if-convert or tree-vectorize with -O3 has no effect on performance, but activating

[Bug tree-optimization/65492] New: Bad optimization in -O3 on SSE intrinsics

2015-03-20 Thread linux at carewolf dot com

-optimization Assignee: unassigned at gcc dot gnu.org Reporter: linux at carewolf dot com After investigating a loop using SSE intrinsics that was significantly faster in clang than in gcc, I discovered gcc had the same performance as clang in -O2, and only performed

[Bug tree-optimization/65492] Bad optimization in -O3 on SSE intrinsics

2015-03-20 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65492 --- Comment #1 from Allan Jensen linux at carewolf dot com --- Created attachment 35070 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35070&action=edit main

[Bug tree-optimization/65492] Bad optimization in -O3 on SSE intrinsics

2015-03-20 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65492 --- Comment #2 from Allan Jensen linux at carewolf dot com --- Created attachment 35071 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35071&action=edit vector union test

[Bug tree-optimization/65492] Bad optimization in -O3 on SSE intrinsics

2015-03-20 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65492 --- Comment #3 from Allan Jensen linux at carewolf dot com --- The -O3 regression seems to go back a long way, but has lessened over time. With gcc 4.6 and older it runs at 3.1s with -O3, and still at 1.8s with -O2.

[Bug tree-optimization/65492] Bad optimization in -O3 due to if-conversion and/or unrolling

2015-03-20 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65492 --- Comment #8 from Allan Jensen linux at carewolf dot com --- You can remove the branches in the inner loop and still reproduce the issue. There were no branches in the original code, I only added them to the reduced case because I was using

[Bug tree-optimization/65492] Bad optimization in -O3 due to if-conversion and/or unrolling

2015-03-20 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65492 --- Comment #9 from Allan Jensen linux at carewolf dot com --- Looking at the assembler, it does indeed appear that the only difference is loop unrolling and if-conversion. After testing on another machine (an old Phenom II as opposed

[Bug tree-optimization/65492] Bad optimization in -O3 due to if-conversion and/or unrolling

2015-03-24 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65492 --- Comment #11 from Allan Jensen linux at carewolf dot com --- Issues with slow cmov have been seen in several bug reports: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53346 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54073 https://gcc.gnu.org

[Bug tree-optimization/65492] Bad optimization in -O3 due to if-conversion and/or unrolling

2015-03-31 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65492 --- Comment #12 from Allan Jensen linux at carewolf dot com --- I have a very crude fix for this. First though, according to comments in tree-if-conv.c and earlier bugs on the issue, if-conversion is supposed to be conditional. It performed

[Bug lto/65274] New: Internal compiler error: should die in combat

2015-03-02 Thread linux at carewolf dot com

: lto Assignee: unassigned at gcc dot gnu.org Reporter: linux at carewolf dot com When trying to build QtWebkit with LTO I get the internal error: lto1: internal compiler error: in should_move_die_to_comdat, at dwarf2out.c:6846 Note. I do not actually expect an LTO debug build

[Bug lto/65274] Internal compiler error: should die in combat

2015-03-03 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65274 --- Comment #2 from Allan Jensen linux at carewolf dot com --- Yes, it appears to complete the linktime compilation when using GCC trunk.

[Bug c++/65211] Type alignment lost inside templated function

2015-02-25 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65211 --- Comment #2 from Allan Jensen linux at carewolf dot com --- Created attachment 34873 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34873&action=edit Intermediate

[Bug c++/65211] Type alignment lost inside templated function

2015-02-25 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65211 Allan Jensen linux at carewolf dot com changed: What|Removed |Added Attachment #34873|0 |1

[Bug c++/65211] Type alignment lost inside templated function

2015-02-25 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65211 --- Comment #1 from Allan Jensen linux at carewolf dot com --- Created attachment 34872 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34872&action=edit Assembler intermediate It is the movdqa (%rdx), %xmm1 instruction on line 19

[Bug c++/65211] New: Type alignment lost inside templated function

2015-02-25 Thread linux at carewolf dot com

++ Assignee: unassigned at gcc dot gnu.org Reporter: linux at carewolf dot com Created attachment 34871 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34871&action=edit C++ source A specific combination of local typedef inside a templated function causes gcc to lose

[Bug c++/65211] Type alignment lost inside templated function

2015-02-25 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65211 --- Comment #4 from Allan Jensen linux at carewolf dot com --- Note that either removing the template argument or moving the typedef out of the function solves the issue and makes gcc use an unaligned load.

[Bug middle-end/67351] Missed optimisation on 64-bit field compared to 32-bit

2015-08-25 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67351 --- Comment #1 from Allan Jensen linux at carewolf dot com --- Created attachment 36254 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36254&action=edit Compiled test assembler

[Bug middle-end/67351] New: Missed optimisation on 64-bit field compared to 32-bit

2015-08-25 Thread linux at carewolf dot com

Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: linux at carewolf dot com Target Milestone: --- Created attachment 36253 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36253&action=edit Test Gcc will expand and detect field setting on 32-bit

[Bug tree-optimization/67351] Missed optimisation on 64-bit field compared to 32-bit

2015-09-17 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67351 Allan Jensen changed: What|Removed |Added Status|NEW |RESOLVED Resolution|---

[Bug target/68793] Bad optimization by split-wide-type on NEON

2015-12-08 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68793 --- Comment #3 from Allan Jensen --- Created attachment 36959 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36959&action=edit neon-test-no-split-wide-types.s

[Bug target/68793] Bad optimization by split-wide-type on NEON

2015-12-08 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68793 --- Comment #6 from Allan Jensen --- I mean the neon64 case, not 32-bit.

[Bug target/68793] Bad optimization by split-wide-type on NEON

2015-12-08 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68793 --- Comment #1 from Allan Jensen --- Created attachment 36957 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36957&action=edit neon-test.cpp

[Bug target/68793] Bad optimization by split-wide-type on NEON

2015-12-08 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68793 --- Comment #2 from Allan Jensen --- Created attachment 36958 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36958&action=edit neon-test-split-wide-types.s

[Bug target/68793] New: Bad optimization by split-wide-type on NEON

2015-12-08 Thread linux at carewolf dot com

: target Assignee: unassigned at gcc dot gnu.org Reporter: linux at carewolf dot com Target Milestone: --- Enabling the optimization 'split-wide-types' causes worse code for NEON intrinsics than disabling it, and it is enabled by default by -O1. It is triggered by multi-register

[Bug target/68793] Bad optimization by split-wide-type on NEON

2015-12-08 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68793 --- Comment #5 from Allan Jensen --- The test-case uses C++11 initialization. I haven't tested gcc 6, so if you say it is solved, I would trust you. Note the 32-bit case is also suboptimal in both cases (not affected by split-wide-types). Is

[Bug target/68793] Bad optimization by split-wide-type on NEON

2015-12-09 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68793 Allan Jensen changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug target/51509] Inefficient neon intrinsic code sequence

2015-11-26 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51509 Allan Jensen changed: What|Removed |Added CC||linux at carewolf dot com --- Comment #6

[Bug other/70118] New: UBSan claims misaligned access in SSE instrinsics

2016-03-07 Thread linux at carewolf dot com

: other Assignee: unassigned at gcc dot gnu.org Reporter: linux at carewolf dot com Target Milestone: --- The intrinsics _mm_loadl_epi64 and _mm_storel_epi64 trigger UBSan warnings on unaligned access because the intrinsic definitions in emmintrin.h are using __m64

[Bug c++/77796] New: tautological compare warning emitted for inherited static method comparisons

2016-09-29 Thread linux at carewolf dot com

: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: linux at carewolf dot com Target Milestone: --- We have been running into several issues with the tautological compare warning in qtdeclarative, first there was https

[Bug tree-optimization/77902] New: Auto-vectorizes epilogue loops or manually vectorized functions

2016-10-08 Thread linux at carewolf dot com

Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: linux at carewolf dot com Target Milestone: --- Created attachment 39774 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39774&action=edit Example that triggers the pointl

[Bug lto/65274] Internal compiler error: should die in combat

2016-08-29 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65274 --- Comment #4 from Allan Jensen --- It works now.

[Bug tree-optimization/77902] Auto-vectorizes epilogue loops of manually vectorized functions

2016-10-18 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77902 Allan Jensen changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug target/70118] UBSan claims misaligned access in SSE instrinsics

2016-11-23 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70118 --- Comment #3 from Allan Jensen --- Or r217608

[Bug target/70118] UBSan claims misaligned access in SSE instrinsics

2016-11-23 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70118 --- Comment #2 from Allan Jensen --- I believe this to be fixed by r239889

[Bug target/70118] UBSan claims misaligned access in SSE instrinsics

2016-11-23 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70118 --- Comment #4 from Allan Jensen --- Created attachment 40130 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40130&action=edit Proposed patch On closer inspection, we are only almost there; two minor changes are still needed (testing patch).

[Bug target/70118] UBSan claims misaligned access in SSE instrinsics

2016-11-24 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70118 Allan Jensen changed: What|Removed |Added Attachment #40130|0 |1 is obsolete|

[Bug target/31667] Integer extensions vectorization could be improved

2016-11-28 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31667 Allan Jensen changed: What|Removed |Added CC||linux at carewolf dot com --- Comment #3

[Bug target/78563] SSE4.1 pmovzx shuffle pattern not recognized

2016-11-28 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78563 --- Comment #1 from Allan Jensen --- Created attachment 40177 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40177&action=edit Test

[Bug target/31667] Integer extensions vectorization could be improved

2016-11-28 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31667 --- Comment #4 from Allan Jensen --- (In reply to Allan Jensen from comment #3) > Gcc 5 and 6 produce code with pmovzx when compiling the example with -O3 > -msse4.1 > > I assume this can be closed. Note that, as comment 1 says, it will not use

[Bug target/78563] New: SSE4.1 pmovzx shuffle pattern not recognized

2016-11-28 Thread linux at carewolf dot com

: target Assignee: unassigned at gcc dot gnu.org Reporter: linux at carewolf dot com Target Milestone: --- An unpack pattern with a zero constant is neither folded nor recognized as a pmovzx instruction. SSE2 code: _mm_unpacklo_epi32(X, _mm_setzero_si128()) GCC code

[Bug tree-optimization/78394] False positives of maybe-uninitialized with -Og

2016-11-17 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78394 Allan Jensen changed: What|Removed |Added Attachment #40064|0 |1 is obsolete|

[Bug tree-optimization/78394] New: False positives of maybe-uninitialized with -Og

2016-11-17 Thread linux at carewolf dot com

: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: linux at carewolf dot com Target Milestone: --- Created attachment 40064 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40064&action=edit maybe_uninitialized.cpp Compiling with -Og produces a number of uni

[Bug pch/63319] [5 Regression] ICE: Segmentation fault building qt5 with pch

2016-11-03 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63319 Allan Jensen changed: What|Removed |Added CC||linux at carewolf dot com --- Comment

[Bug tree-optimization/77902] Auto-vectorizes epilogue loops of manually vectorized functions

2016-10-10 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77902 --- Comment #1 from Allan Jensen --- Further experimentation shows that GCC can sometimes reason about the remaining range but does so inconsistently. For instance, this example also fails: int result = 0; for (; count >= 4; count -= 4)
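
Editorial sketch, not part of the archived comment: the quoted snippet is cut off after the first `for`, so the following only illustrates the general loop shape under discussion. It assumes the first loop stands in for a hand-vectorized body that consumes four elements per iteration; the function name and loop bodies are hypothetical.

```cpp
#include <cassert>

// Sums `count` ints. The main loop stands in for a manually vectorized body
// that handles four elements per iteration.
int sum_blocks_of_four(const int *data, int count) {
    assert(count >= 0);
    int result = 0;
    for (; count >= 4; count -= 4) {
        result += data[0] + data[1] + data[2] + data[3];
        data += 4;
    }
    // Epilogue: the loop above exits only when count < 4, so at most three
    // elements remain and vectorizing this loop again cannot pay off.
    for (; count > 0; --count) {
        result += *data++;
    }
    return result;
}
```

The point of the report, as far as the snippet shows, is that the compiler does not always derive the `count <= 3` bound for the second loop on its own.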

[Bug tree-optimization/77902] Auto-vectorizes epilogue loops of manually vectorized functions

2016-10-10 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77902 --- Comment #2 from Allan Jensen --- While this has been the case in both GCC 5 and GCC 6, it appears that both failing cases previously mentioned already produce the best-case result with a fairly recent GCC 7. gcc version 7.0.0 20160923

[Bug target/47754] [missed optimization] AVX allows unaligned memory operands but GCC uses unaligned load and register operand

2016-12-10 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47754 --- Comment #8 from Allan Jensen --- Note this happens with -mavx2, but not with -march=haswell. It appears the tuning is a bit too pessimistic when avx2 is enabled on generic x64.

[Bug target/47754] [missed optimization] AVX allows unaligned memory operands but GCC uses unaligned load and register operand

2016-12-10 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47754 Allan Jensen changed: What|Removed |Added CC||linux at carewolf dot com --- Comment #7

[Bug target/47754] [missed optimization] AVX allows unaligned memory operands but GCC uses unaligned load and register operand

2016-12-10 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47754 --- Comment #10 from Allan Jensen --- No I mean it triggers when you compile with -mavx2, it is solved with -march=haswell. It appears the issue is the tune flag X86_TUNE_AVX256_UNALIGNED_LOAD_OPTIMAL is set for all processors that support avx2,

[Bug target/78762] New: Regression: Splitting unaligned AVX loads also when AVX2 is enabled

2016-12-10 Thread linux at carewolf dot com

Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: linux at carewolf dot com Target Milestone: --- Created attachment 40295 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40295&action=edit Test In gcc 7 when not optimizing for sp

[Bug target/78762] Regression: Splitting unaligned AVX loads also when AVX2 is enabled

2016-12-10 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78762 --- Comment #3 from Allan Jensen --- Created attachment 40298 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40298&action=edit Test compiled with gcc 6

[Bug target/78762] Regression: Splitting unaligned AVX loads also when AVX2 is enabled

2016-12-10 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78762 --- Comment #1 from Allan Jensen --- Created attachment 40296 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40296&action=edit Test compiled with -mavx2

[Bug target/78762] Regression: Splitting unaligned AVX loads also when AVX2 is enabled

2016-12-10 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78762 --- Comment #2 from Allan Jensen --- Created attachment 40297 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40297&action=edit Test compiled with -march=haswell

[Bug target/47754] [missed optimization] AVX allows unaligned memory operands but GCC uses unaligned load and register operand

2016-12-10 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47754 --- Comment #11 from Allan Jensen --- I think the issue I noted is completely separate from this one, so I opened https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78762 to deal with it. I think this one could probably be closed though.

[Bug target/59874] Missing builtin (__builtin_clzs) when compiling with g++

2016-12-12 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59874 Allan Jensen changed: What|Removed |Added CC||linux at carewolf dot com --- Comment #5

[Bug c/66970] Add __has_builtin() macro

2016-12-12 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66970 Allan Jensen changed: What|Removed |Added CC||linux at carewolf dot com --- Comment #5

[Bug target/59874] Missing builtin (__builtin_clzs) when compiling with g++

2016-12-15 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59874 --- Comment #15 from Allan Jensen --- Yes, the patch works and it also evaluates at compile time.

[Bug target/59874] Missing builtin (__builtin_clzs) when compiling with g++

2016-12-13 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59874 --- Comment #8 from Allan Jensen --- Thanks that looks good. I will test it when I have a chance. I am changing the Qt sources to not assume the presence of __builtin_clzs when __BMI__ is defined. It can use __builtin_clz() and

[Bug target/70118] UBSan claims misaligned access in SSE instrinsics

2016-12-12 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70118 Allan Jensen changed: What|Removed |Added Status|NEW |RESOLVED Resolution|---

[Bug target/78762] Regression: Splitting unaligned AVX loads also when AVX2 is enabled

2016-12-21 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78762 --- Comment #10 from Allan Jensen --- That would solve the problem, but also leave the behavior as Sandy Bridge only (Nehalem didn't have AVX).

[Bug target/78762] Regression: Splitting unaligned AVX loads also when AVX2 is enabled

2016-12-21 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78762 --- Comment #11 from Allan Jensen --- Btw, did you benchmark store splitting on AMD? It is also enabled for BDVER and ZNVER1.

[Bug target/78762] Regression: Splitting unaligned AVX loads also when AVX2 is enabled

2016-12-21 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78762 --- Comment #13 from Allan Jensen --- The question is whether the unaligned store is still slow on Excavator and Ryzen, which support AVX2. As far as I understand, the Bulldozer architectures just prefer split AVX because it was basically emulating

[Bug target/78921] New: SSE/AVX shuffle intrinsics uses builtins instead of __builtin_shuffle

2016-12-24 Thread linux at carewolf dot com

Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: linux at carewolf dot com Target Milestone: --- The intrinsics for x86 SIMD shuffle instructions could be redeclared using __builtin_shuffle. This would help folding and better

[Bug ipa/80277] New: ipa-icf missing overlooking functions

2017-04-01 Thread linux at carewolf dot com

Assignee: unassigned at gcc dot gnu.org Reporter: linux at carewolf dot com Target Milestone: --- Created attachment 41100 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41100&action=edit icf.cc Several functions that produce identical assembler are not merged by ipa-icf. I h

[Bug target/80040] SSE4.1 ptest not always merged

2017-03-14 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80040 --- Comment #2 from Allan Jensen --- Created attachment 40973 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40973&action=edit Assembler output from gcc 6 Easier to compare

[Bug target/80040] SSE4.1 ptest not always merged

2017-03-14 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80040 --- Comment #1 from Allan Jensen --- Created attachment 40972 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40972&action=edit Assembler output

[Bug target/80040] New: SSE4.1 ptest not always merged

2017-03-14 Thread linux at carewolf dot com

Assignee: unassigned at gcc dot gnu.org Reporter: linux at carewolf dot com Target Milestone: --- Created attachment 40971 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40971&action=edit Example The intrinsics _mm_testz_si128 and _mm_testc_si128 both map to the exact same instruct

[Bug rtl-optimization/81174] New: bswap not recognized in |= statement

2017-06-22 Thread linux at carewolf dot com

-optimization Assignee: unassigned at gcc dot gnu.org Reporter: linux at carewolf dot com Target Milestone: --- Created attachment 41610 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41610&action=edit bswap-issue.cc In writing a big-endian bitfield accessor I noticed that bs
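
Editorial sketch, not part of the archived report: the attachment is not reproduced in this listing, so the following is only a guess at the kind of code involved. It shows a big-endian 32-bit load written once as a single expression and once accumulated with `|=` statements; on a little-endian target both amount to a load plus a byte swap. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Big-endian load written as a single expression.
inline std::uint32_t load_be32_expr(const unsigned char *p) {
    assert(p != nullptr);
    return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
           (std::uint32_t(p[2]) << 8) | std::uint32_t(p[3]);
}

// The same load accumulated step by step with |= statements.
inline std::uint32_t load_be32_accumulate(const unsigned char *p) {
    assert(p != nullptr);
    std::uint32_t r = std::uint32_t(p[0]) << 24;
    r |= std::uint32_t(p[1]) << 16;
    r |= std::uint32_t(p[2]) << 8;
    r |= std::uint32_t(p[3]);
    return r;
}
```

Both functions return the same value for every input, so any difference in the generated code comes from pattern recognition in the compiler, not from semantics.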

[Bug rtl-optimization/81174] bswap not recognized in |= statement

2017-06-22 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81174 Allan Jensen changed: What|Removed |Added Version|6.1.1 |7.1.0 --- Comment #1 from Allan Jensen

[Bug tree-optimization/82426] Missed tree-slp-vectorization on -O2 and -O3

2017-10-04 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82426 --- Comment #3 from Allan Jensen --- Note it appears the fact it can do it at all in -Os is new in gcc 7

[Bug tree-optimization/82426] New: Missed tree-slp-vectorization on -O2 and -O3

2017-10-04 Thread linux at carewolf dot com

: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: linux at carewolf dot com Target Milestone: --- Created attachment 42299 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42299&action=edit vectslp.cpp The attached example is a simple matrix multiplicat

[Bug tree-optimization/82426] Missed tree-slp-vectorization on -O2 and -O3

2017-10-04 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82426 --- Comment #1 from Allan Jensen --- Created attachment 42300 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42300&action=edit Assembler output with -O3

[Bug tree-optimization/82426] Missed tree-slp-vectorization on -O2 and -O3

2017-10-04 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82426 --- Comment #2 from Allan Jensen --- Created attachment 42301 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42301&action=edit Assembler output with -Os -ftree-slp-vectorize

[Bug tree-optimization/85692] New: Two source permute not used for vector initialization

2018-05-08 Thread linux at carewolf dot com

Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: linux at carewolf dot com Target Milestone: --- If a vector initialization is using elements from only a single vector source, it will be optimized as a shuffle, but if it is using elements from two

[Bug tree-optimization/85692] Two source permute not used for vector initialization

2018-05-08 Thread linux at carewolf dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85692 --- Comment #1 from Allan Jensen --- Created attachment 44084 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44084&action=edit construct.cc Motivating examples. Compile with -msse4.1 for the second case.

[Bug rtl-optimization/85551] New: No strength reduction of modulo and integer vision

2018-04-27 Thread linux at carewolf dot com

: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: linux at carewolf dot com Target Milestone: --- Created attachment 44030 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44030&action=edit strmod.cpp Many simple loops using modulo naively can be optimized
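
Editorial sketch, not part of the archived report: the attachment is not shown here, so this is an assumed example of the transformation the title refers to. A loop that computes `i % period` on every iteration can be rewritten by hand to keep a running counter that wraps, which avoids the division. The function names are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Naive form: one modulo (a division) per iteration.
std::vector<int> phases_naive(int n, int period) {
    assert(period > 0);
    std::vector<int> out;
    for (int i = 0; i < n; ++i) {
        out.push_back(i % period);
    }
    return out;
}

// Strength-reduced by hand: a counter that wraps back to zero.
std::vector<int> phases_counter(int n, int period) {
    assert(period > 0);
    std::vector<int> out;
    int phase = 0;
    for (int i = 0; i < n; ++i) {
        out.push_back(phase);
        if (++phase == period) {
            phase = 0;
        }
    }
    return out;
}
```

Both functions yield the same sequence for any non-negative `n` and positive `period`; the report asks for the compiler to perform the second form automatically.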


1 - 100 of 153 matches
