[Bug libstdc++/77776] C++17 std::hypot implementation is poor

2019-01-05 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77776 --- Comment #3 from Matthias Kretz --- Did you consider the error introduced by scaling with __amax? I made sure that the division is without error by zeroing the mantissa bits. Here's a motivating example that shows an error of 1 ulp otherwise:
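
The scaling idea mentioned in this comment can be sketched roughly as follows. This is not the code from the bug report and not the libstdc++ implementation; `zero_mantissa` and `scaled_hypot` are hypothetical names, the two-argument form is chosen only for brevity, and IEEE 754 binary64 `double` is assumed. Zeroing the mantissa bits of the largest magnitude gives a power of two, so dividing by it does not round.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper: keep sign and exponent, clear the 52 mantissa bits.
// For a normal positive `a` this is the largest power of two <= a.
inline double zero_mantissa(double a) {
    std::uint64_t bits;
    std::memcpy(&bits, &a, sizeof bits);
    bits &= ~((std::uint64_t{1} << 52) - 1);
    double r;
    std::memcpy(&r, &bits, sizeof r);
    return r;
}

// Illustrative two-argument hypot that scales by a power of two.
inline double scaled_hypot(double x, double y) {
    x = std::fabs(x);
    y = std::fabs(y);
    if (std::isinf(x) || std::isinf(y))
        return std::numeric_limits<double>::infinity();
    if (std::isnan(x) || std::isnan(y))
        return std::numeric_limits<double>::quiet_NaN();
    const double amax = std::max(x, y);
    if (amax == 0.0)
        return 0.0;
    double scale = zero_mantissa(amax);
    if (scale == 0.0)  // amax is subnormal: fall back to the smallest normal
        scale = std::numeric_limits<double>::min();
    // Dividing by a power of two is exact unless the quotient underflows.
    x /= scale;
    y /= scale;
    return scale * std::sqrt(x * x + y * y);
}

inline void self_check() {
    assert(scaled_hypot(3.0, 4.0) == 5.0);
    // The naive formula overflows here, the scaled one does not.
    assert(std::isinf(std::sqrt(3e200 * 3e200 + 4e200 * 4e200)));
    assert(std::isfinite(scaled_hypot(3e200, 4e200)));
}
```

Scaling by `amax` itself would also avoid overflow, but `x / amax` generally rounds, which is the extra error the comment asks about.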

[Bug c++/85052] Implement support for clang's __builtin_convertvector

2019-01-02 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85052 --- Comment #5 from Matthias Kretz --- Thank you Jakub! Here's a tested x86 library implementation for all conversions and different ISA extension support for reference:

[Bug libstdc++/84949] -ffast-math bugged with respect to NaNs

2018-12-11 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84949 --- Comment #7 from Matthias Kretz --- Example showing the discrepancy: https://godbolt.org/z/D15m71 Also PR83875 is relevant wrt. giving different answers depending on function attributes.

[Bug libstdc++/84949] -ffast-math bugged with respect to NaNs

2018-12-11 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84949 Matthias Kretz changed: What|Removed |Added CC||kretz at kde dot org --- Comment #6

[Bug libstdc++/77776] C++17 std::hypot implementation is poor

2018-12-06 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77776 Matthias Kretz changed: What|Removed |Added CC||kretz at kde dot org --- Comment #1

[Bug target/88152] optimize SSE & AVX char compares with subsequent movmskb

2018-11-29 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88152 --- Comment #5 from Matthias Kretz --- > -fno-signed-zeros isn't a guarantee the operand will not be -0.0 and having > x < 0.0 behave differently based on whether x is -0.0 or 0.0 (with > -fno-signed-zeros quite randomly) is IMHO very bad. I

[Bug target/88152] optimize SSE & AVX char compares with subsequent movmskb

2018-11-23 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88152 --- Comment #2 from Matthias Kretz --- I just realized, the movmsk(x<0) => movmsk(x) transformation also applies to float and double if -ffinite-math-only (i.e. no NaN, it's alright for inf) and -fno-signed-zeros are active.
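
A portable scalar sketch of why `movmsk(x<0)` and `movmsk(x)` agree for signed bytes, and where floats differ. The names `movemask_epi8`, `less_than_zero`, and `sign_bit_agrees` are stand-ins written for illustration; they emulate the lane-wise behaviour and are not the intrinsics themselves.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>

using Bytes = std::array<std::int8_t, 16>;

// Scalar stand-in for a byte movemask: collect the top bit of each lane.
inline unsigned movemask_epi8(const Bytes& v) {
    unsigned mask = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        const unsigned top = static_cast<std::uint8_t>(v[i]) >> 7;
        mask |= top << i;
    }
    return mask;
}

// Lane-wise `x < 0`: all-ones (-1) where true, 0 otherwise.
inline Bytes less_than_zero(const Bytes& v) {
    Bytes r{};
    for (std::size_t i = 0; i < v.size(); ++i)
        r[i] = static_cast<std::int8_t>(v[i] < 0 ? -1 : 0);
    return r;
}

// For floats the sign bit and `x < 0` disagree for -0.0 and for NaNs
// that carry a set sign bit; infinities are fine.
inline bool sign_bit_agrees(float x) {
    return std::signbit(x) == (x < 0.0f);
}

inline void self_check() {
    const Bytes v{{-128, -1, 0, 1, 127, -5, 7, 0, -1, 2, 3, -4, 0, 0, 100, -100}};
    assert(movemask_epi8(less_than_zero(v)) == movemask_epi8(v));
    assert(sign_bit_agrees(-std::numeric_limits<float>::infinity()));
    assert(!sign_bit_agrees(-0.0f));
}
```

The two failing float cases are exactly the ones that -fno-signed-zeros and -ffinite-math-only declare irrelevant, which matches the condition stated in the comment.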

[Bug target/88152] New: optimize SSE & AVX char compares with subsequent movmskb

2018-11-22 Thread kretz at kde dot org
tion Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: kretz at kde dot org Target Milestone: --- Target: x86_64-*-*, i?86-*-* Testcase (https://godbolt.org/z/YNPZyf): #include template u

[Bug target/44551] [missed optimization] AVX vextractf128 after vinsertf128

2018-11-19 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44551 --- Comment #20 from Matthias Kretz --- The original issue I meant to report is fixed. There are many more missed optimizations in the original example, though. I.e. https://godbolt.org/z/7P1o3O should compile to: use_insert_extract():

[Bug target/44551] [missed optimization] AVX vextractf128 after vinsertf128

2018-11-19 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44551 --- Comment #18 from Matthias Kretz --- FWIW, the issue is resolved on trunk. GCC8.2 still has the missed optimization: https://godbolt.org/z/hbgIIi

[Bug c++/87989] [8/9 Regression] Calling operator T() invokes wrong conversion operator overload

2018-11-15 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87989 --- Comment #4 from Matthias Kretz --- Yes, looks like a duplicate of 86246.

[Bug c++/87989] Calling operator T() invokes wrong conversion operator overload

2018-11-12 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87989 Matthias Kretz changed: What|Removed |Added Known to work||7.3.0 Known to fail|

[Bug c++/87989] New: Calling operator T() invokes wrong conversion operator overload

2018-11-12 Thread kretz at kde dot org
Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: kretz at kde dot org Target Milestone: --- Testcase (https://godbolt.org/z/sStNGV): struct X { template operator T() const; operator float() const; }; template T f(const X

[Bug c++/87631] new attribute for passing structures with multiple SIMD data members in registers

2018-10-17 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87631 --- Comment #2 from Matthias Kretz --- My (current) use case is structures (nested) of builtin types and vector types. These structures have a trivial copy constructor. Generalization --- I believe generalization of this approach

[Bug other/87631] New: new attribute for passing structures with multiple SIMD data members in registers

2018-10-17 Thread kretz at kde dot org
Status: UNCONFIRMED Severity: normal Priority: P3 Component: other Assignee: unassigned at gcc dot gnu.org Reporter: kretz at kde dot org Target Milestone: --- Consider: using V [[gnu::vector_size(16)]] = float; struct X1 { V

[Bug target/86896] New: invalid vmovdqa64 instruction for KNL emitted

2018-08-09 Thread kretz at kde dot org
Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: kretz at kde dot org Target Milestone: --- Test case (unreduced) at https://web-docs.gsi.de/~mkretz/invalid_knl_instruction.cpp Compile the test case with `g++ -std=c++17 -O1 -march=knl

[Bug libstdc++/86655] std::assoc_legendre should not constrain the value of m

2018-07-25 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86655 --- Comment #2 from Matthias Kretz --- http://eel.is/c++draft/c.math#sf.cmath-1.3 might be the reason why `m <= l` is enforced. But unless I'm confused the footnote on "mathematically defined" tells us it should work: - "(a) if it is explicitly

[Bug libstdc++/86655] New: std::assoc_legendre should not constrain the value of m

2018-07-24 Thread kretz at kde dot org
Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: kretz at kde dot org Target Milestone: --- https://wg21.link/c.math#sf.cmath.assoc_legendre leaves m unconstrained. __detail::__assoc_legendre_p documents "@param m The

[Bug target/86267] detect conversions between bitmasks and vector masks

2018-07-24 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86267 --- Comment #2 from Matthias Kretz --- Sorry for the delay. Vacation... This pattern appears in many variations in the implementation of wg21.link/p0214r9. The fixed_size ABI tag used with a simd_mask type requires a decision from the

[Bug tree-optimization/86267] New: detect conversions between bitmasks and vector masks

2018-06-21 Thread kretz at kde dot org
Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: kretz at kde dot org Target Milestone: --- Testcase (cf. https://godbolt.org/g/gi6f7V): #include auto f(__m256i a, __m256i b) { __m256i k = a < b; l

[Bug libstdc++/85951] New: make_signed and make_unsigned are incorrect for wchar_t, char16_t, and char32_t

2018-05-28 Thread kretz at kde dot org
: normal Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: kretz at kde dot org Target Milestone: --- make_(un)signed_t of char16_t, char32_t, or wchar_t should never be char16_t/char32_t/wchar_t, just like it is the case
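
The expectation in this report can be written down as a compile-time check. The sketch below is illustrative only; `make_signed_ok`, `make_unsigned_ok`, and `char_types_ok` are hypothetical names, and the checks are expected to hold only on a standard library that does not have the reported problem.

```cpp
#include <cassert>
#include <type_traits>

// True if make_signed_t<T> is a signed integral type of the same size
// that is not T itself.
template <typename T>
inline constexpr bool make_signed_ok =
    std::is_integral_v<std::make_signed_t<T>> &&
    std::is_signed_v<std::make_signed_t<T>> &&
    !std::is_same_v<std::make_signed_t<T>, T> &&
    sizeof(std::make_signed_t<T>) == sizeof(T);

// Same idea for make_unsigned_t<T>.
template <typename T>
inline constexpr bool make_unsigned_ok =
    std::is_integral_v<std::make_unsigned_t<T>> &&
    std::is_unsigned_v<std::make_unsigned_t<T>> &&
    !std::is_same_v<std::make_unsigned_t<T>, T> &&
    sizeof(std::make_unsigned_t<T>) == sizeof(T);

// Checks the three character types named in the report.
inline bool char_types_ok() {
    return make_signed_ok<char16_t> && make_signed_ok<char32_t> &&
           make_signed_ok<wchar_t> && make_unsigned_ok<char16_t> &&
           make_unsigned_ok<char32_t> && make_unsigned_ok<wchar_t>;
}

inline void self_check() {
    assert(char_types_ok());
}
```

Because the variable templates are `constexpr`, the same conditions can be used in a `static_assert` to turn a regression into a compile error.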

[Bug c++/85827] false positive for -Wunused-but-set-variable because of constexpr-if

2018-05-18 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85827 --- Comment #3 from Matthias Kretz --- But macros are different. They remove the code before the C++ parser sees it (at least as-if). One great improvement of constexpr-if over macros is that all the other branches are parsed and their syntax

[Bug c++/85827] false positive for -Wunused-but-set-variable because of constexpr-if

2018-05-18 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85827 --- Comment #1 from Matthias Kretz --- Same issue for -Wunused-variable

[Bug c++/85827] New: false positive for -Wunused-but-set-variable because of constexpr-if

2018-05-18 Thread kretz at kde dot org
Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: kretz at kde dot org Target Milestone: --- Testcase `-std=c++17 -Wall` (cf. https://godbolt.org/g/kfgN2V): template int f() { constexpr bool _1 = N == 1; constexpr bool _2

[Bug target/85819] New: conversion from __v[48]su to __v[48]sf should use FMA

2018-05-17 Thread kretz at kde dot org
Component: target Assignee: unassigned at gcc dot gnu.org Reporter: kretz at kde dot org Target Milestone: --- Testcase (cf. https://godbolt.org/g/UoU3zj): using T = float; using To [[gnu::vector_size(32)]] = T; using From [[gnu::vector_size(32)]] = unsigned; #define A2(I) (T)a[I

[Bug target/85572] New: faster code for absolute value of __v2di

2018-04-30 Thread kretz at kde dot org
Assignee: unassigned at gcc dot gnu.org Reporter: kretz at kde dot org Target Milestone: --- The absolute value for 64-bit integer SSE vectors is only optimized when AVX512VL is available. Test case (`-O2 -ffast-math` and one of -mavx512vl, -msse4, or -msse2): #include __v2di

[Bug target/85538] kortest for 32 and 64 bit masks incorrectly uses k0

2018-04-27 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85538 --- Comment #3 from Matthias Kretz --- Some more observations: 1. The instruction sequence: kmovq %k1,-0x8(%rsp) vmovq -0x8(%rsp),%xmm1 vmovq %xmm1,%rax kmovq %rax,%k0 should be a simple `kmovq %k1,%k0` instead. 2. Adding

[Bug target/85538] kortest for 32 and 64 bit masks incorrectly uses k0

2018-04-26 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85538 --- Comment #1 from Matthias Kretz --- Sorry, I was trying to force GCC to use the k1 register and playing with register asm (which didn't have any effect at all). f8 should actually be (cf. https://godbolt.org/g/hSkoJV): bool f8(__m512i x,

[Bug target/85538] New: kortest for 32 and 64 bit masks incorrectly uses k0

2018-04-26 Thread kretz at kde dot org
Component: target Assignee: unassigned at gcc dot gnu.org Reporter: kretz at kde dot org Target Milestone: --- Test case (`-O2 -march=skylake-avx512`, cf. https://godbolt.org/g/ou3oAZ): #include // bad: bool f8(__m512i x, __m512i y) { register __mmask64 k asm("

[Bug target/85482] unnecessary vmovaps/vmovapd/vmovdqa emitted

2018-04-20 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85482 Matthias Kretz changed: What|Removed |Added Keywords||missed-optimization

[Bug target/85482] New: unnecessary vmovaps/vmovapd/vmovdqa emitted

2018-04-20 Thread kretz at kde dot org
: target Assignee: unassigned at gcc dot gnu.org Reporter: kretz at kde dot org Target Milestone: --- Test case (cf. https://godbolt.org/g/QkJYSK): #include __m256 zero_extend1(__m128 a) { return _mm256_insertf128_ps(__m256(), a, 0); } __m256d zero_extend1(__m128d

[Bug target/85480] New: zero extension from xmm to zmm via _mm512_insert???x? not optimized

2018-04-20 Thread kretz at kde dot org
Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: kretz at kde dot org Target Milestone: --- Test case (cf. https://godbolt.org/g/p4Kt8X): #include __m512 zero_extend2(__m128 a) { return _mm512_insertf32x4(__m512(), a, 0

[Bug target/85324] New: missing constant propagation on SSE/AVX conversion intrinsics

2018-04-10 Thread kretz at kde dot org
Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: kretz at kde dot org Target Milestone: --- The following test case shows that constant propagation through conversion intrinsics does not work: #include template using V [[gnu

[Bug target/85323] SSE/AVX/AVX512 shift by 0 not optimized away

2018-04-10 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85323 --- Comment #1 from Matthias Kretz --- Created attachment 43898 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43898&action=edit idea for a partial solution Constant propagation works using the built-in shift operators. At least for the shifts

[Bug target/85323] New: SSE/AVX/AVX512 shift by 0 not optimized away

2018-04-10 Thread kretz at kde dot org
: target Assignee: unassigned at gcc dot gnu.org Reporter: kretz at kde dot org Target Milestone: --- In the following test case, all three functions should compile to just `ret`: #include __m128i f(__m128i x) { x = _mm_sll_epi64(x, __m128i()); x = _mm_sll_epi32(x

[Bug target/85317] New: missing constant propagation on _mm(256)_movemask_*

2018-04-10 Thread kretz at kde dot org
Component: target Assignee: unassigned at gcc dot gnu.org Reporter: kretz at kde dot org Target Milestone: --- The following test case shows that the movemask intrinsics are a barrier for constant propagation. All of these functions should have a trivial constant return value

[Bug c++/85077] [8 Regression] V[248][SD]F abs not optimized to

2018-03-27 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85077 --- Comment #8 from Matthias Kretz --- Thanks! FWIW my abs implementation now uses: template [[gnu::optimize("finite-math-only,no-signed-zeros")]] constexpr Storage abs(Storage v) { return v.d < 0 ? -v.d : v.d; }

[Bug target/84786] [miscompilation] vunpcklpd accessing xmm16-22 targeting KNL

2018-03-27 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84786 --- Comment #15 from Matthias Kretz --- Here's an idea for a test case (https://godbolt.org/g/SjM2HE: it appears fixed on GCC 8): typedef unsigned short V __attribute__((vector_size (16))); V foo (V x, int y) { x <<= y; asm volatile

[Bug target/84786] [miscompilation] vunpcklpd accessing xmm16-22 targeting KNL

2018-03-27 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84786 --- Comment #14 from Matthias Kretz --- I applied both patches to my GCC 7.2 installation and as a result my complete testsuite passes now. Anything else I can help with?

[Bug target/84786] [miscompilation] vunpcklpd accessing xmm16-22 targeting KNL

2018-03-27 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84786 --- Comment #13 from Matthias Kretz --- I'll try to apply it locally and will report my findings.

[Bug target/84786] [miscompilation] vunpcklpd accessing xmm16-22 targeting KNL

2018-03-26 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84786 --- Comment #11 from Matthias Kretz --- Created attachment 43762 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43762&action=edit test case that produces incorrect vpsrlw Compiled with `g++-7 -std=c++17 -O0 -fabi-version=0 -fabi-compat-version=0

[Bug target/84786] [miscompilation] vunpcklpd accessing xmm16-22 targeting KNL

2018-03-26 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84786 --- Comment #10 from Matthias Kretz --- This is all I have right now: TID 0 SDE-ERROR: Executed instruction not valid for specified chip (KNL): 0x70d281: vpsrlw xmm0, xmm0, xmm16 Image:

[Bug target/84786] [miscompilation] vunpcklpd accessing xmm16-22 targeting KNL

2018-03-26 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84786 --- Comment #8 from Matthias Kretz --- There seems to be a similar bug for vpsrlw and vpsllw. Do you need a testcase? (It's hard to hit the bug... just had one occur on a Travis CI build)

[Bug c++/85077] V[248][SD]F abs not optimized to

2018-03-26 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85077 --- Comment #4 from Matthias Kretz --- Oh, there seems to be a regression in GCC 8. In 7 it works as you say. In 8 I can't get the andps to show up

[Bug c++/85077] V[248][SD]F abs not optimized to

2018-03-26 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85077 --- Comment #3 from Matthias Kretz --- Ouch, right I didn't think of non-finite values. I.e. -0 < 0 is false... However, this is what I wanted: abs(-inf) -> inf abs( inf) -> inf abs( nan) -> nan abs( -0) -> 0 abs( 0) -> 0 The sign bit
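
The table of wanted results above corresponds to clearing the sign bit rather than comparing against zero. A minimal scalar sketch, assuming IEEE 754 binary32 `float` and default (non-fast-math) floating-point semantics; `abs_ternary` and `abs_clear_sign` are names made up for this illustration and are not taken from the bug report.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(float) == sizeof(std::uint32_t), "binary32 float assumed");

// The comparison form: because -0 < 0 is false, -0 is returned unchanged.
inline float abs_ternary(float x) {
    return x < 0 ? -x : x;
}

// The bit-mask form: clear the sign bit, whatever the value is.
inline float abs_clear_sign(float x) {
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    bits &= 0x7fffffffu;
    std::memcpy(&x, &bits, sizeof x);
    return x;
}

inline void self_check() {
    const float inf = std::numeric_limits<float>::infinity();
    const float nan = std::numeric_limits<float>::quiet_NaN();
    assert(abs_clear_sign(-inf) == inf);
    assert(abs_clear_sign(inf) == inf);
    assert(std::isnan(abs_clear_sign(nan)));
    assert(!std::signbit(abs_clear_sign(-0.0f)));  // abs(-0) -> 0
    assert(std::signbit(abs_ternary(-0.0f)));      // comparison form keeps -0
}
```

The only input in the table on which the two forms differ is -0, which is why the comparison form cannot be rewritten as a plain sign-bit mask unless signed zeros are declared irrelevant.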

[Bug target/85077] New: V[248][SD]F abs not optimized to

2018-03-26 Thread kretz at kde dot org
Assignee: unassigned at gcc dot gnu.org Reporter: kretz at kde dot org Target Milestone: --- The following test case (also at https://godbolt.org/g/XEPk7M) shows that `x < 0 ? -x : x` is not optimized to an efficient abs implementation. This is not only the case for SSE, but a

[Bug target/48701] [missed optimization] GCC fails to use aliasing of ymm and xmm registers

2018-03-26 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48701 --- Comment #3 from Matthias Kretz --- Updated test case at https://godbolt.org/g/D5P1N1. `testLoad` was fixed with 4.7. `testStore` still combines via the stack.

[Bug target/43514] use of SSE shift intrinsics introduces unnecessary moves to the stack and back

2018-03-26 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43514 Matthias Kretz changed: What|Removed |Added Status|NEW |RESOLVED Resolution|---

[Bug target/85048] [missed optimization] vector conversions

2018-03-23 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85048 --- Comment #3 from Matthias Kretz --- Just opened PR85052 for tracking __builtin_convertvector support.

[Bug c++/85052] New: Implement support for clang's __builtin_convertvector

2018-03-23 Thread kretz at kde dot org
Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: kretz at kde dot org Target Milestone: --- clang implements __builtin_convertvector to simplify conversions between different vector builtins. In contrast to bitcasts, supported through C casts, this builtin converts

[Bug target/85048] [missed optimization] vector conversions

2018-03-23 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85048 --- Comment #1 from Matthias Kretz --- Godbolt link:

[Bug target/85048] New: [missed optimization] vector conversions

2018-03-23 Thread kretz at kde dot org
Assignee: unassigned at gcc dot gnu.org Reporter: kretz at kde dot org Target Milestone: --- The following testcase lists all integer and/or float conversions applied to vector builtins of the same number of elements. All of those functions can be compiled to a single

[Bug target/84786] [miscompilation] vunpcklpd accessing xmm16-22 targeting KNL

2018-03-10 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84786 --- Comment #2 from Matthias Kretz --- Created attachment 43618 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43618&action=edit unreduced testcase Compile with `g++ -std=c++17 -O2 -march=knl -o knl-fail knl-fail.cpp`. The function

[Bug target/84786] New: [miscompilation] vunpcklpd accessing xmm16-22 targeting KNL

2018-03-09 Thread kretz at kde dot org
Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: kretz at kde dot org Target Milestone: --- I see generated code, such as: 424821:· vpxord %zmm17,%zmm17,%zmm17 424827:· vpxord %zmm18,%zmm18,%zmm18 [...] 424855:· vunpcklpd

[Bug target/84781] New: [missed optimization] ignore bitmask after movemask

2018-03-09 Thread kretz at kde dot org
Component: target Assignee: unassigned at gcc dot gnu.org Reporter: kretz at kde dot org Target Milestone: --- Testcase: https://godbolt.org/g/S3tfrL #include int f(__m128 a) { return _mm_movemask_ps(a)& 0xf; } int f(__m128d a) { return _mm_movemask_pd(a)& 0x3;

[Bug c++/83875] [feature request] target_clones compatible SIMD capability/length check

2018-01-20 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83875 --- Comment #9 from Matthias Kretz --- > inside multi-versioned (target_clones/target) function it depends on the > active target Yes, this part is easy. > inside a constexpr context (function/variable, your examples) or > always_inline

[Bug c++/83875] [feature request] target_clones compatible SIMD capability/length check

2018-01-17 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83875 --- Comment #7 from Matthias Kretz --- Hmm, what should the following print? constexpr int native_simd_width = __builtin_target_supports("avx512f") ? 64 : __builtin_target_supports("avx") ? 32 : __builtin_target_supports("sse") ? 16 :

[Bug c++/83875] [feature request] target_clones compatible SIMD capability/length check

2018-01-17 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83875 Matthias Kretz changed: What|Removed |Added CC||kretz at kde dot org --- Comment #6

[Bug target/83894] [missed optimization] __v16qu shift instruction sequence on x86

2018-01-16 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83894 --- Comment #2 from Matthias Kretz --- I compiled with: g++-7 -march=haswell -std=c++17 -O3 -flax-vector-conversions -o char_shift char_shift.cpp

[Bug target/83894] [missed optimization] __v16qu shift instruction sequence on x86

2018-01-16 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83894 --- Comment #1 from Matthias Kretz --- Created attachment 43149 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43149&action=edit tsc.h Header required for the benchmark code.

[Bug target/83894] New: [missed optimization] __v16qu shift instruction sequence on x86

2018-01-16 Thread kretz at kde dot org
Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: kretz at kde dot org Target Milestone: --- Created attachment 43148 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43148&action=edit benchmark shifts of vector builtins with 8-

[Bug c++/83793] Pack expansion outside of lambda containing the pack incorrectly rejected

2018-01-15 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83793 --- Comment #4 from Matthias Kretz --- (In reply to Jonathan Wakely from comment #2) > Looks like a dup of PR 47226 Ah, yes. Sorry for missing it, I recall seeing it before. I agree, a backport would be nice, but an overhaul is not a

[Bug c++/83856] New: ICE in tsubst_copy;

2018-01-15 Thread kretz at kde dot org
: unassigned at gcc dot gnu.org Reporter: kretz at kde dot org Target Milestone: --- Testcase (cf. https://godbolt.org/g/jFkk7N): ``` #include template struct simd { static constexpr size_t size() { return 4; } template simd(F &, decltype(std::declval()(0)) * = nul

[Bug c++/83793] New: Pack expansion outside of lambda containing the pack incorrectly rejected

2018-01-11 Thread kretz at kde dot org
: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: kretz at kde dot org Target Milestone: --- Testcase (https://godbolt.org/g/mNhetZ): #include #include using std::size_t; template auto f(std::index_sequence) { std

[Bug c++/47226] [C++0x] GCC doesn't expand template parameter pack that appears in a lambda-expression

2017-07-06 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47226 Matthias Kretz changed: What|Removed |Added CC||kretz at kde dot org --- Comment #10

[Bug target/80517] New: [missed optimization] constant propagation through Intel intrinsics

2017-04-25 Thread kretz at kde dot org
Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: kretz at kde dot org Target Milestone: --- Related: #55894 Testcase: #include int f() { __m128i x{}; x = _mm_cmpeq_epi16(x, x); return _pext_u32(_mm_movemask_epi8(x

[Bug c++/68049] template instantiation involving may_alias defines symbol twice

2015-12-08 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68049 --- Comment #1 from Matthias Kretz --- Is there anything I can do to help finding a resolution to this issue? It's a rather annoying issue for my SIMD code.

[Bug c++/68049] New: template instantiation involving may_alias defines symbol twice

2015-10-22 Thread kretz at kde dot org
Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: kretz at kde dot org Target Milestone: --- The following testcase fails to compile at -O0, but works at -O1 and higher (-std=c++11 is the only required compiler flag): template struct

[Bug libstdc++/67011] division by zero in std::exponential_distribution

2015-07-27 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67011 Matthias Kretz kretz at kde dot org changed: What|Removed |Added CC||kretz at kde dot

[Bug target/66866] New: [miscompile] incorrect load address on manual vector shuffle

2015-07-14 Thread kretz at kde dot org
Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: kretz at kde dot org Target Milestone: --- The following testcase fails at -O2: #include <xmmintrin.h> typedef short A __attribute__((__may_alias__)); short extr(const __m128i d, int index

[Bug c++/63423] New: internal compiler error: in cp_parser_abort_tentative_parse, at cp/parser.c

2014-10-01 Thread kretz at kde dot org
: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: kretz at kde dot org Testcase: template typename F, typename A, typename = decltype(static_castvoid ()(A )(F::operator())(A())) void test(); Compile it with '-std=c

[Bug c++/50800] Internal compiler error in finish_member_declarations, possibly related to may_alias attribute

2014-09-26 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50800 Matthias Kretz kretz at kde dot org changed: What|Removed |Added CC||kretz at kde dot

[Bug c++/63385] New: internal compiler error: in pop_binding, at cp/name-lookup.c for implicitly captured variable called closure

2014-09-26 Thread kretz at kde dot org
Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: kretz at kde dot org Testcase: template <typename F> void f(F closure) { auto g = []() { return closure; }; } Compile

[Bug c++/63385] internal compiler error: in pop_binding, at cp/name-lookup.c for implicitly captured variable called closure

2014-09-26 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63385 Matthias Kretz kretz at kde dot org changed: What|Removed |Added Known to work||4.9.0, 4.9.1, 4.9.2

[Bug tree-optimization/57156] miscompilation of call to _mm_cmpeq_epi8(a, a) or _mm_comtrue_epu8(a, a) with uninitialized a

2013-07-11 Thread kretz at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57156 --- Comment #8 from Matthias Kretz kretz at kde dot org --- I just noticed the following in the Intel Optimization Reference Manual (Version 028 from July 2013), section 2.2 Sandy Bridge: 2.2.3.1 Renamer [...] There is another dependency breaking

[Bug c++/57532] New: [4.8.1 regression] operator broken when used on rvalues

2013-06-05 Thread kretz at kde dot org
Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: kretz at kde dot org int main() { return (int() int()); } compile with g++ -O0 -o test main.cpp this compiles with all compilers I know, except GCC 4.8.1

[Bug target/57156] New: miscompilation of call to _mm_cmpeq_epi8(a, a) or _mm_comtrue_epu8(a, a) with uninitialized a

2013-05-03 Thread kretz at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57156 Bug #: 57156 Summary: miscompilation of call to _mm_cmpeq_epi8(a, a) or _mm_comtrue_epu8(a, a) with uninitialized a Classification: Unclassified Product: gcc Version:

[Bug target/57156] miscompilation of call to _mm_cmpeq_epi8(a, a) or _mm_comtrue_epu8(a, a) with uninitialized a

2013-05-03 Thread kretz at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57156 Matthias Kretz kretz at kde dot org changed: What|Removed |Added Known to fail||4.7.0, 4.7.1

[Bug target/57156] miscompilation of call to _mm_cmpeq_epi8(a, a) or _mm_comtrue_epu8(a, a) with uninitialized a

2013-05-03 Thread kretz at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57156 --- Comment #2 from Matthias Kretz kretz at kde dot org 2013-05-03 09:15:33 UTC --- The failure disappears with -fno-tree-ccp

[Bug target/57156] miscompilation of call to _mm_cmpeq_epi8(a, a) or _mm_comtrue_epu8(a, a) with uninitialized a

2013-05-03 Thread kretz at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57156 --- Comment #4 from Matthias Kretz kretz at kde dot org 2013-05-03 09:37:58 UTC --- (In reply to comment #3) I think this is undefined code as you use a uninitialized. I wouldn't know how to counter this for the _mm_cmpeq_epi8 case

[Bug target/57156] miscompilation of call to _mm_cmpeq_epi8(a, a) or _mm_comtrue_epu8(a, a) with uninitialized a

2013-05-03 Thread kretz at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57156 --- Comment #5 from Matthias Kretz kretz at kde dot org 2013-05-03 09:56:00 UTC --- (In reply to comment #4) I wouldn't know how to counter this for the _mm_cmpeq_epi8 case Actually, I have yet to find something in the standard

[Bug target/47769] [missed optimization] use of btr (bit test and reset)

2013-05-03 Thread kretz at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47769 --- Comment #5 from Matthias Kretz kretz at kde dot org 2013-05-03 11:45:49 UTC --- Another ping. The bug status is still WAITING...

[Bug tree-optimization/57156] miscompilation of call to _mm_cmpeq_epi8(a, a) or _mm_comtrue_epu8(a, a) with uninitialized a

2013-05-03 Thread kretz at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57156 Matthias Kretz kretz at kde dot org changed: What|Removed |Added Component|target |tree-optimization

[Bug tree-optimization/56918] New: incorrect auto-vectorization of array initialization

2013-04-11 Thread kretz at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56918 Bug #: 56918 Summary: incorrect auto-vectorization of array initialization Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED Severity:

[Bug tree-optimization/56920] New: another static initialization of an array miscompiled

2013-04-11 Thread kretz at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56920 Bug #: 56920 Summary: another static initialization of an array miscompiled Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED Severity:

[Bug c++/56038] declarations in xmmintrin.h conflict with mingw-w64 intrin.h in c++ mode

2013-02-15 Thread kretz at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56038 Matthias Kretz kretz at kde dot org changed: What|Removed |Added CC||kretz at kde

[Bug tree-optimization/56253] New: fp-contract does not work with SSE and AVX FMAs (neither FMA4 nor FMA3)

2013-02-08 Thread kretz at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56253 Bug #: 56253 Summary: fp-contract does not work with SSE and AVX FMAs (neither FMA4 nor FMA3) Classification: Unclassified Product: gcc Version: 4.7.2

[Bug libstdc++/56019] New: max_align_t should be in std namespace

2013-01-17 Thread kretz at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56019 Bug #: 56019 Summary: max_align_t should be in std namespace Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED Severity: normal Priority:

[Bug middle-end/56022] New: [4.8 regression] ICE (segfault) at convert_memory_address_addr_space (explow.c:334)

2013-01-17 Thread kretz at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56022 Bug #: 56022 Summary: [4.8 regression] ICE (segfault) at convert_memory_address_addr_space (explow.c:334) Classification: Unclassified Product: gcc Version: 4.8.0

[Bug libstdc++/55727] New: better support for dynamic allocation of over-aligned types

2012-12-18 Thread kretz at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55727 Bug #: 55727 Summary: better support for dynamic allocation of over-aligned types Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED

[Bug libstdc++/55727] better support for dynamic allocation of over-aligned types

2012-12-18 Thread kretz at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55727 --- Comment #1 from Matthias Kretz kretz at kde dot org 2012-12-18 08:53:24 UTC --- Created attachment 28992 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=28992 simple testcase for std::vector

[Bug libstdc++/55727] better support for dynamic allocation of over-aligned types

2012-12-18 Thread kretz at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55727 --- Comment #2 from Matthias Kretz kretz at kde dot org 2012-12-18 09:11:41 UTC --- (In reply to comment #0) Right now it does not even suffice to reimplement new/delete inside Foo to make std::vector<Foo> work. Sorry, this statement

[Bug libstdc++/55727] better support for dynamic allocation of over-aligned types

2012-12-18 Thread kretz at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55727 --- Comment #3 from Matthias Kretz kretz at kde dot org 2012-12-18 13:26:21 UTC --- (In reply to comment #2) (In reply to comment #0) Right now it does not even suffice to reimplement new/delete inside Foo to make std::vector<Foo>

[Bug libstdc++/55727] better support for dynamic allocation of over-aligned types

2012-12-18 Thread kretz at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55727 --- Comment #4 from Matthias Kretz kretz at kde dot org 2012-12-18 18:20:00 UTC --- Created attachment 29002 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=29002 support for over-aligned types in new_allocator I finished my allocator to fix

[Bug libstdc++/55727] better support for dynamic allocation of over-aligned types

2012-12-18 Thread kretz at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55727 Matthias Kretz kretz at kde dot org changed: What|Removed |Added Attachment #29002|0 |1

[Bug target/55448] using const-reference SSE or AVX types leads to unnecessary unaligned loads

2012-11-24 Thread kretz at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55448 --- Comment #3 from Matthias Kretz kretz at kde dot org 2012-11-24 21:38:21 UTC --- BTW, the problem is just as well visible with only SSE. The __m128 case then compiles to movlps and movhps instead of the memory operand.

[Bug target/55448] New: using const-reference SSE or AVX types leads to unnecessary unaligned loads

2012-11-23 Thread kretz at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55448 Bug #: 55448 Summary: using const-reference SSE or AVX types leads to unnecessary unaligned loads Classification: Unclassified Product: gcc Version: 4.7.2

[Bug target/54703] _mm_sub_pd is incorrectly substituted with vandnps

2012-09-26 Thread kretz at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54703 --- Comment #7 from Matthias Kretz kretz at kde dot org 2012-09-26 10:52:38 UTC --- Thanks for the quick response! You guys are cool! :) The pattern here is for calculation with extended precision: xh = x mask; xl = x - xh; yh = y
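
The high/low split named in this comment can be sketched in scalar form. This is an assumption-laden illustration, not the reporter's code: the operator between `x` and `mask` is missing from the snippet and is taken here to be a bitwise AND, the mask width of 27 low mantissa bits is chosen for the example, IEEE 754 binary64 `double` is assumed, and `Split` and `split_by_mask` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

struct Split {
    double hi;  // x with the low mantissa bits cleared
    double lo;  // remainder, so that hi + lo == x
};

// xh = x & mask; xl = x - xh (the AND is an assumption, see above).
inline Split split_by_mask(double x) {
    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    bits &= ~((std::uint64_t{1} << 27) - 1);  // clear the 27 low mantissa bits
    double hi;
    std::memcpy(&hi, &bits, sizeof hi);
    return {hi, x - hi};  // the subtraction is exact for finite x
}

inline void self_check() {
    const double x = 1.0 / 3.0;
    const Split s = split_by_mask(x);
    assert(s.hi + s.lo == x);
    // hi carries at most 26 significant bits, so hi * hi needs no rounding.
    assert(std::fma(s.hi, s.hi, -(s.hi * s.hi)) == 0.0);
}
```

The subtraction only does its job if it is really performed as a floating-point subtract, which is why replacing it with a bitwise operation, as in the bug title, changes the result.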

[Bug target/54703] New: [miscompilation] _mm_sub_pd is incorrectly substituted with vandnps

2012-09-25 Thread kretz at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54703 Bug #: 54703 Summary: [miscompilation] _mm_sub_pd is incorrectly substituted with vandnps Classification: Unclassified Product: gcc Version: 4.8.0 Status:

[Bug target/54703] [miscompilation] _mm_sub_pd is incorrectly substituted with vandnps

2012-09-25 Thread kretz at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54703 --- Comment #1 from Matthias Kretz kretz at kde dot org 2012-09-25 13:32:27 UTC --- Um, sorry. Forgot to note the compiler switches: gcc -O1 -march=bdver1 I can't reproduce the error with corei7-avx or any other non-AVX target.
