[Bug jit/107877] New: segfault in libgccjit when using asse

2022-11-26 Thread andreas_roever at web dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107877

Bug ID: 107877
   Summary: segfault in libgccjit when using asse
   Product: gcc
   Version: 11.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: jit
  Assignee: dmalcolm at gcc dot gnu.org
  Reporter: andreas_roever at web dot de
  Target Milestone: ---

Created attachment 53969
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53969=edit
example of bug

Due to bug 107849 I tried to create functions that emulate the builtin AVX
intrinsics.

It seems like there is a bug in libgccjit though.

In the attached sample there is a simple function, test, at the top that should
provide functionality identical to the _mm256_broadcast_ss intrinsic.

When I try to re-create this function inside the jit compiler, it simply
crashes with a segfault inside the compile function.

Am I doing something wrong that causes the segfault?

Also I guess an error message would be more helpful than a segfault :)

[Bug jit/107849] New: All SIMD instrinsics are missing

2022-11-23 Thread andreas_roever at web dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107849

Bug ID: 107849
   Summary: All SIMD instrinsics are missing
   Product: gcc
   Version: 11.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: jit
  Assignee: dmalcolm at gcc dot gnu.org
  Reporter: andreas_roever at web dot de
  Target Milestone: ---

Created attachment 53956
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53956=edit
A simple example showing the problem

It seems like the jit library only provides the "standard" builtin functions.
All the i386 SSE, AVX, ... stuff is missing.

I attach a silly little program to show what I mean.

The program defines a jit function that takes an argument, calculates the sqrt
using the builtin function and returns that value. It also calls the builtin
directly and then prints out the results.


When compiled without defining VECTOR it uses a single float and everything
works as expected. 

When VECTOR is defined, everything gets redefined to use a 256-bit vector with 8
floats and also uses the appropriate builtin function. In that case the jit
function doesn't find the builtin.



I guess that is the case because these builtins are not defined in the
same .def file as the functions that are known by libgccjit.

The library only uses the builtins defined in the gcc subdirectory, but the SIMD
builtins are defined in the folder gcc/config/i386.

[Bug libstdc++/103005] experimental simd sin and cos with big arguments returns values bigger than 1

2021-11-06 Thread andreas_roever at web dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103005

--- Comment #3 from Andreas Röver  ---
I think the source of the problem is the function

__fold_input(const simd& __x)

in simd_math.h.

This function should return values between -pi/4 and pi/4 but doesn't for the
big argument. E.g. for

2985064393126969344

it returns

27.678747

According to Wolfram Alpha it should return

0.189812

This "big" value throws off the calculation of the power series that is
performed after this.


The 27.678747 is the result of a rounding error:

2985064393126969344 / (pi/2) is (according to Wolfram Alpha again)

1900351014455063569.620838

but the CPU gets

1900351014455063552

This seems to be the closest value that the CPU can get to the real value.

The difference between those two values, multiplied by pi/2, is the above mentioned 27.67...

... now my numerics skills are not good enough to find a cure for that problem.

[Bug libstdc++/103005] New: experimental simd sin and cos with big arguments returns values bigger than 1

2021-10-30 Thread andreas_roever at web dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103005

Bug ID: 103005
   Summary: experimental simd sin and cos with big arguments
returns values bigger than 1
   Product: gcc
   Version: 11.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: andreas_roever at web dot de
  Target Milestone: ---

Created attachment 51704
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51704=edit
example program displaying the problem

There seems to be a bug in the sin and cos functions of the experimental/simd
header.

When given a very big argument, those functions return values outside
the range -1 .. 1. I understand that the result will be quite meaningless for
such big values, but I'd argue that it should always be in the range -1 .. 1.

For example, calculating sin(-2985064393126969344) gives 2527133379.389218.


This is on Linux,

compiled with

g++-11.2.0 -g -march=native --std=c++20 test1.cpp

Here is the CPU info (for one core):

processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model   : 94
model name  : Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz
stepping: 3
microcode   : 0xc2
cpu MHz : 3301.335
cache size  : 6144 KB
physical id : 0
siblings: 4
core id : 0
cpu cores   : 4
apicid  : 0
initial apicid  : 0
fpu : yes
fpu_exception   : yes
cpuid level : 22
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb
rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology
nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est
tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt
tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch
cpuid_fault invpcid_single pti ibrs ibpb stibp tpr_shadow vnmi flexpriority ept
vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx
rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida
arat pln pts hwp hwp_notify hwp_act_window hwp_epp
vmx flags   : vnmi preemption_timer invvpid ept_x_only ept_ad ept_1gb
flexpriority tsc_offset vtpr mtf vapic ept vpid unrestricted_guest ple
shadow_vmcs pml
bugs: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds
swapgs taa itlb_multihit srbds
bogomips: 6399.96
clflush size: 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual