[Bug target/90984] PowerPC cast from vector unsigned long long to vector double does not do an integer to float conversion

2019-06-24 Thread slandden at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90984

--- Comment #1 from Shawn Landden  ---
Also, how do I do an integer-to-float conversion to work around this bug?

[Bug target/90984] New: PowerPC cast from vector unsigned long long to vector double does not do an integer to float conversion

2019-06-24 Thread slandden at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90984

Bug ID: 90984
   Summary: PowerPC cast from vector unsigned long long to vector
double does not do an integer to float conversion
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: slandden at gmail dot com
  Target Milestone: ---

Implicit integer to float conversions are not my favorite part of C, but they
are a part of C, and not doing them in this case results in unexpected
behavior.

#include <altivec.h>
int main() {
    vector unsigned long long in = {1, 1};
    vector double out = (vector double)in;
    if (out[0] != 1.0 || out[1] != 1.0)
        return 1;
}
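
On the workaround question in comment #1: as far as I understand it, a cast between vector types of the same size reinterprets the bits, unlike a scalar cast, which converts the value. Here is a minimal sketch of the difference using GCC's generic vector extensions instead of <altivec.h>, so it is not PowerPC-specific. The type names v2u64 and v2f64 and both helper names are made up for illustration.

```c
#include <string.h>

/* Hypothetical type names; 16-byte vectors of two 64-bit lanes. */
typedef unsigned long long v2u64 __attribute__((vector_size(16)));
typedef double v2f64 __attribute__((vector_size(16)));

/* Value conversion: each lane goes through a scalar integer-to-double cast. */
static v2f64 convert_lanes(v2u64 in)
{
    v2f64 out;
    for (int i = 0; i < 2; i++)
        out[i] = (double)in[i];
    return out;
}

/* Bit reinterpretation: same 16 bytes, viewed as two doubles. */
static v2f64 reinterpret_lanes(v2u64 in)
{
    v2f64 out;
    memcpy(&out, &in, sizeof out);
    return out;
}
```

With in = {1, 1}, convert_lanes() gives lanes equal to 1.0, while reinterpret_lanes() gives two tiny denormals. On PowerPC, GCC's <altivec.h> also documents a vec_double() built-in for this conversion, if I recall correctly; I have not checked which releases provide it.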

[Bug tree-optimization/90774] avoid doing vector splat arithmetic where possible

2019-06-07 Thread slandden at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90774

--- Comment #3 from Shawn Landden  ---
> So this kind of reassociation can only be done with either -fwrapv or 
> unsigned types.  Due to integer overflow being undefined.


That depends on 1) whether the operations are reordered differently from the
order of operations in the source, and 2) how the other optimizations handle
integer overflow that they determine is UB.

Behaving as if -fwrapv were passed is completely legal even when it is not, and
generally I think this optimization (if applicable) would outweigh some UB
optimizations.

[Bug middle-end/90774] New: avoid doing vector splat arithmetic where possible

2019-06-06 Thread slandden at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90774

Bug ID: 90774
   Summary: avoid doing vector splat arithmetic where possible
   Product: gcc
   Version: 8.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: slandden at gmail dot com
  Target Milestone: ---

When gcc knows that it is dealing with splats it should just do regular
arithmetic, and only convert to splat at the end.

https://simd.godbolt.org/z/6P3Qcq
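
A small sketch of the idea, not the code behind the godbolt link: adding two splats lane by lane gives the same result as adding the scalars once and splatting the sum. It uses GCC's generic vector extensions and unsigned lanes, so wraparound is well defined and the reassociation concern raised in the comments does not apply. The type v4u32 and the function names are hypothetical.

```c
/* Hypothetical type name; four 32-bit unsigned lanes. */
typedef unsigned int v4u32 __attribute__((vector_size(16)));

/* Broadcast one scalar into every lane. */
static v4u32 splat(unsigned int x)
{
    v4u32 v = {x, x, x, x};
    return v;
}

/* What the source might say: two splats, then a vector add. */
static v4u32 splat_then_add(unsigned int a, unsigned int b)
{
    return splat(a) + splat(b);
}

/* The cheaper equivalent: one scalar add, then a single splat. */
static v4u32 add_then_splat(unsigned int a, unsigned int b)
{
    return splat(a + b);
}
```

Both functions return the same vector for every a and b, because unsigned addition wraps identically in scalar and vector form.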

[Bug target/90768] better range analysis for converting bit tests into less-than greater-than

2019-06-05 Thread slandden at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90768

Shawn Landden  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|middle-end                  |target

--- Comment #3 from Shawn Landden  ---
Yeah I think this is a target issue. These two functions should produce
identical code: https://godbolt.org/z/omu09e

[Bug middle-end/90768] better range analysis for converting lt/gt into bit tests

2019-06-05 Thread slandden at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90768

Shawn Landden  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |segher at gcc dot gnu.org

--- Comment #2 from Shawn Landden  ---
Segher: Not sure if this is a PowerPC issue or a middle-end issue.

[Bug middle-end/90768] better range analysis for converting lt/gt into bit tests

2019-06-05 Thread slandden at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90768

--- Comment #1 from Shawn Landden  ---
Whoops I got that backwards, converting the bit test to a
greater-than-or-equal-to is better.

[Bug middle-end/90768] New: better range analysis for converting lt/gt into bit tests

2019-06-05 Thread slandden at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90768

Bug ID: 90768
   Summary: better range analysis for converting lt/gt into bit
tests
   Product: gcc
   Version: 8.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: slandden at gmail dot com
  Target Milestone: ---

Converting the s >= 8 test to s & 8 results in one fewer instruction on POWER9.

https://godbolt.org/z/0QPN3z

#include <stddef.h>
#include <stdint.h>
int bcmp_2(char *a, char *b, size_t s) {
    if (s < 16) {
        if (s >= 8)
            if (*(uint64_t *)a != *(uint64_t *)b)
                return 1;

    }
    return 0;
}
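
The range fact this relies on can be checked directly: once s < 16 is known, s >= 8 and (s & 8) != 0 agree for every possible value. A minimal sketch; the helper names are made up for illustration.

```c
#include <stddef.h>

/* The comparison form. */
static int ge8(size_t s)
{
    return s >= 8;
}

/* The bit-test form. */
static int bit8(size_t s)
{
    return (s & 8) != 0;
}

/* Returns 1 if both forms agree for every s in [0, 16). */
static int equivalent_below_16(void)
{
    for (size_t s = 0; s < 16; s++)
        if (ge8(s) != bit8(s))
            return 0;
    return 1;
}
```

Outside that range the two forms differ (for s = 16 the comparison is true and the bit test is false), which is why the compiler needs the range information from the enclosing s < 16 check.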

[Bug target/90763] New: vec_xl_len should take const

2019-06-05 Thread slandden at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90763

Bug ID: 90763
   Summary: vec_xl_len should take const
   Product: gcc
   Version: 8.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: slandden at gmail dot com
  Target Milestone: ---

This should compile (that const is preventing it):

#include <altivec.h>
vector char vec_load_const(const unsigned char *s, int num) {
    return vec_xl_len(s, num);
}

https://ppc.godbolt.org/z/KkV5JN

[Bug c/90580] New: error: ‘offsetof’ undeclared when it is declared, but used with the wrong number of arguments

2019-05-22 Thread slandden at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90580

Bug ID: 90580
   Summary: error: ‘offsetof’ undeclared when it is declared, but
used with the wrong number of arguments
   Product: gcc
   Version: 8.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: slandden at gmail dot com
  Target Milestone: ---

test2.c: In function ‘main’:
test2.c:113:42: error: macro "offsetof" requires 2 arguments, but only 1 given
  113 | printf("offsetof %u", offsetof(key.rounds));
  |  ^
In file included from test2.c:64:
/usr/lib/gcc/powerpc64le-linux-gnu/9/include/stddef.h:406: note: macro
"offsetof" defined here
  406 | #define offsetof(TYPE, MEMBER) __builtin_offsetof (TYPE, MEMBER)
  | 
test2.c:113:23: error: ‘offsetof’ undeclared (first use in this function)
  113 | printf("offsetof %u", offsetof(key.rounds));
  |   ^~~~
test2.c:65:1: note: ‘offsetof’ is defined in header ‘<stddef.h>’; did you
forget to ‘#include <stddef.h>’?
   64 | #include 
  +++ |+#include <stddef.h>
   65 | /*
test2.c:113:23: note: each undeclared identifier is reported only once for each
function it appears in
  113 | printf("offsetof %u", offsetof(key.rounds));
  |   ^~~~
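
For reference, offsetof takes a type and a member name, not an expression like key.rounds. A minimal sketch of the two-argument form; struct example_key and rounds_offset() are hypothetical, since the report does not show the type of key.

```c
#include <stddef.h>

/* Hypothetical layout, only for illustration. */
struct example_key {
    unsigned char bytes[32];
    int rounds;
};

/* offsetof(TYPE, MEMBER): byte offset of MEMBER within TYPE, as a size_t. */
static size_t rounds_offset(void)
{
    return offsetof(struct example_key, rounds);
}
```

The result is a size_t, so the matching printf conversion is %zu rather than %u.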

[Bug target/90453] PowerPC/AltiVec VSX: Provide vec_pack/vec_unpackh/vec_unpackl for 32<->64

2019-05-18 Thread slandden at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90453

--- Comment #6 from Shawn Landden  ---
Ahh, sorry for wasting your time. I didn't notice the signed requirement, which
is why it didn't work.

[Bug target/90453] PowerPC/AltiVec VSX: Provide vec_pack/vec_unpackh/vec_unpackl for 32<->64

2019-05-18 Thread slandden at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90453

--- Comment #4 from Shawn Landden  ---
Oh my bad, I got it backwards

vector unsigned long long unpackedl, unpackedr;
vector unsigned int packed;

packed = vec_pack(unpackedl, unpackedr);
unpackedl = vec_unpackh(packed);
unpackedr = vec_unpackl(packed);

The point is that it is similar to the other pack/unpack functions. Yet
somehow this one doesn't exist (probably because there is no hardware
instruction for it).
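
To make the request concrete, here is a portable sketch of the semantics I would expect, written with GCC's generic vector extensions rather than AltiVec built-ins. All names are hypothetical, the lanes are unsigned and zero-extended, and which half counts as "high" on PowerPC depends on endianness, so the helpers are named by element position instead.

```c
/* Hypothetical type names. */
typedef unsigned long long v2u64 __attribute__((vector_size(16)));
typedef unsigned int v4u32 __attribute__((vector_size(16)));

/* Pack: truncate four 64-bit lanes (two vectors) into four 32-bit lanes. */
static v4u32 pack_u64(v2u64 a, v2u64 b)
{
    v4u32 r = {(unsigned int)a[0], (unsigned int)a[1],
               (unsigned int)b[0], (unsigned int)b[1]};
    return r;
}

/* Unpack elements 0 and 1, zero-extending to 64 bits. */
static v2u64 unpack_first(v4u32 p)
{
    v2u64 r = {p[0], p[1]};
    return r;
}

/* Unpack elements 2 and 3, zero-extending to 64 bits. */
static v2u64 unpack_second(v4u32 p)
{
    v2u64 r = {p[2], p[3]};
    return r;
}
```

Packing and then unpacking returns the original values as long as each one fits in 32 bits; anything above that is truncated by the pack step.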

[Bug target/90453] PowerPC/AltiVec VSX: Provide vec_pack/vec_unpackh/vec_unpackl for 32<->64

2019-05-18 Thread slandden at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90453

--- Comment #2 from Shawn Landden  ---
vector unsigned long long unpacked;
vector unsigned int packedl, packedr;

unpacked = vec_pack(packedl, packedr);
packedl = vec_unpackh(unpacked);
packedr = vec_unpackl(unpacked);

[Bug target/90453] New: PowerPC/AltiVec VSX: Provide vec_pack/vec_unpackh/vec_unpackl for 32<->64

2019-05-13 Thread slandden at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90453

Bug ID: 90453
   Summary: PowerPC/AltiVec VSX: Provide
vec_pack/vec_unpackh/vec_unpackl for 32<->64
   Product: gcc
   Version: 8.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: slandden at gmail dot com
  Target Milestone: ---

I know these are not part of the spec, but it would make coding easier, as a
GNU extension.

[Bug other/90431] New: support __builtin_cpu_supports() in Linux kernel code

2019-05-10 Thread slandden at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90431

Bug ID: 90431
   Summary: support __builtin_cpu_supports() in Linux kernel code
   Product: gcc
   Version: 8.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: slandden at gmail dot com
  Target Milestone: ---

Given that the glibc feature is based on AT_HWCAPS, which comes from Linux,
this should be quite straightforward.

#ifdef __KERNEL__
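
For context, this is roughly how user space can read the hardware capability bits that the kernel passes in the auxiliary vector. A small sketch assuming Linux and a libc that provides <sys/auxv.h>; hwcap_bits() is a made-up helper, and the meaning of the individual bits is architecture-specific.

```c
#include <sys/auxv.h>

/* Return the AT_HWCAP word that the kernel placed in the auxiliary vector. */
static unsigned long hwcap_bits(void)
{
    return getauxval(AT_HWCAP);
}
```

Inside the kernel the same information is available directly, without going through the auxiliary vector, which is the point of the request.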

[Bug ipa/82625] lower-optimization are not inlined with symbol multiversioning

2019-05-09 Thread slandden at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82625

--- Comment #8 from Shawn Landden  ---
Included in gcc 9

[Bug other/90403] New: __target_clones__ should directly call other __target_clones__ functions, as appropriate

2019-05-08 Thread slandden at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90403

Bug ID: 90403
   Summary: __target_clones__ should directly call other
__target_clones__ functions, as appropriate
   Product: gcc
   Version: 8.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: slandden at gmail dot com
  Target Milestone: ---

If I define two functions that both use the target_clones attribute, and list
the same architectures, when they call each other they still call each other
through the resolver, and the symbol table. This is unnecessary. They should
instead call the version that matches the caller.

[Bug target/90323] powerpc should convert equivalent sequences to vec_sel()

2019-05-06 Thread slandden at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323

--- Comment #3 from Shawn Landden  ---
Instead:

.globl without_sel
.type   without_sel, @function
without_sel:
.LFB0:
.cfi_startproc
xxlxor 36,34,36
xxland 36,36,35
xxlxor 34,34,36
blr
.long 0
.byte 0,0,0,0,0,0,0,0
.cfi_endproc
.LFE0:
.size   without_sel,.-without_sel
.align 2
.p2align 4,,15
.globl with_sel
.type   with_sel, @function
with_sel:
.LFB1:
.cfi_startproc
xxsel 34,34,36,35
blr
.long 0
.byte 0,0,0,0,0,0,0,0
.cfi_endproc
.LFE1:
.size   with_sel,.-with_sel

[Bug target/90323] powerpc should convert equivalent sequences to vec_sel()

2019-05-06 Thread slandden at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323

--- Comment #4 from Shawn Landden  ---
that was compiled with -O3

[Bug target/90323] powerpc should convert equivalent sequences to vec_sel()

2019-05-06 Thread slandden at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323

--- Comment #2 from Shawn Landden  ---
Created attachment 46305
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46305&action=edit
test case.

These two functions should produce identical code.

[Bug target/90323] New: ppc should convert equivalent sequences to vec_sel()

2019-05-02 Thread slandden at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323

Bug ID: 90323
   Summary: ppc should convert equivalent sequences to vec_sel()
   Product: gcc
   Version: 8.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: slandden at gmail dot com
  Target Milestone: ---

Something like this:

xi = xi & ~is_subnormal;
xi |= subnormal & is_subnormal;

should be converted to:

xi = vec_sel(xi, subnormal, is_subnormal);
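
The equivalence can be checked with a portable sketch using GCC's generic vector extensions (not AltiVec built-ins). The and/or form above and the xor/and/xor form, which matches the xxlxor, xxland, xxlxor sequence that GCC emits for without_sel, both compute a bitwise select. The type and function names are hypothetical.

```c
/* Hypothetical type name; four 32-bit unsigned lanes. */
typedef unsigned int v4u32 __attribute__((vector_size(16)));

/* Select bits of y where m is set, bits of x elsewhere (and/andc/or form). */
static v4u32 select_and_or(v4u32 x, v4u32 y, v4u32 m)
{
    return (x & ~m) | (y & m);
}

/* Same result via xor, and, xor. */
static v4u32 select_xor(v4u32 x, v4u32 y, v4u32 m)
{
    return x ^ ((x ^ y) & m);
}
```

Both are three bitwise operations, while a select instruction does the same work in one.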

[Bug tree-optimization/58774] tree-switch-conversion doesn't optimize with content in default case

2017-11-07 Thread slandden at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58774

--- Comment #5 from Shawn Landden  ---
Appears fixed

[Bug tree-optimization/58774] tree-switch-conversion doesn't optimize with content in default case

2017-11-07 Thread slandden at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58774

Shawn Landden  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |slandden at gmail dot com

--- Comment #4 from Shawn Landden  ---
Created attachment 42556
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42556&action=edit
appers-fixed