Re: [tim-janik/beast] TESTS: testresampler: fix resampler tests for clang9 (#140)

2020-07-14 Thread Tim Janik via beast
Great! Thanks a lot for the detailed analysis.

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/tim-janik/beast/pull/140#issuecomment-658463809___
beast mailing list
beast@gnome.org
https://mail.gnome.org/mailman/listinfo/beast


Re: [tim-janik/beast] TESTS: testresampler: fix resampler tests for clang9 (#140)

2020-03-13 Thread Stefan Westerfeld via beast
As additional information, g++-8 also generates different code if you use -mfma.
```
$ out/tests/suite1 --resampler accuracy --fpu --precision=24 --subsample 
--freq-scan=90,9000,983 --freq-scan-verbose --verbose

## g++ 8.3.0 without fma

# accuracy test for factor 2 subsampling using FPU instructions
#   input frequency range used [ 90.00 Hz, 9000.00 Hz ] (SR = 44100.0 Hz, freq 
increment = 983.00)
90.0 -129.40792803150048940
1073.0 -130.14152510098830362
2056.0 -131.64395956575623359
3039.0 -131.48377752074048885
4022.0 -127.52623479658792860
5005.0 -131.74602243860934436
5988.0 -131.71784952867952256
6971.0 -133.09696883539237433
7954.0 -131.53986659892970579
8937.0 -131.78909542900839824
#   max difference between correct and computed output: 0.00 = -127.526235 
dB

## g++ 8.3.0 with fma

# accuracy test for factor 2 subsampling using FPU instructions
#   input frequency range used [ 90.00 Hz, 9000.00 Hz ] (SR = 44100.0 Hz, freq 
increment = 983.00)
90.0 -129.82104550124302023
1073.0 -131.03707434810391419
2056.0 -132.40996614732904391
3039.0 -133.12443453736065635
4022.0 -129.32954166513573568
5005.0 -131.90536134591866357
5988.0 -130.93045250711062977
6971.0 -132.01135805265508338
7954.0 -132.97021680274394839
8937.0 -133.11067142912733630
#   max difference between correct and computed output: 0.00 = -129.329542 
dB
```

So (unlike clang9), the fma version provides slightly more accurate results for 
our 4022 Hz sine wave.

My mental model is this: we start from the error the filter gives us under 
perfect conditions (infinite precision coefficients and arithmetic). Then we 
add some random error (with some distribution) that depends on the type and 
order of the instructions generated by the compiler, due to finite precision. 
From everything I saw, both clang9 and g++ 8.3.0 provide valid translations 
with and without -mfma.

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/tim-janik/beast/pull/140#issuecomment-598647103___
beast mailing list
beast@gnome.org
https://mail.gnome.org/mailman/listinfo/beast


Re: [tim-janik/beast] TESTS: testresampler: fix resampler tests for clang9 (#140)

2020-03-12 Thread Stefan Westerfeld via beast
> Did you find out _why_ this is needed?
> I.e. peeked at the generated assembly to figure if -ffast-math related 
> options possibly allow transformations that could become problematic for us 
> in the long term?

**Short answer:**
I spent some time debugging it, -mfma is causing this - removing the flag gives 
us the old behaviour, clang does use fused multiply add instructions in the 
inner loop of the resampler. I don't believe that this optimization has 
negative impact for us, so adjusting the threshold is the sane thing to do here.

**Long answer:**

Lets first look at what exactly is failing here. We have (in explicit form):
```
$ out/tests/suite1 --resampler accuracy --fpu --precision=24 --subsample 
--freq-scan=90,9000,983 --freq-scan-verbose --verbose

## clang9 without fma

# accuracy test for factor 2 subsampling using FPU instructions
#   input frequency range used [ 90.00 Hz, 9000.00 Hz ] (SR = 44100.0 Hz, freq 
increment = 983.00)
90.0 -129.45954766716391759
1073.0 -130.16677118621706200
2056.0 -129.70751203525446726
3039.0 -132.36528831269109219
4022.0 -128.39849621726884266
5005.0 -128.95512230052295877
5988.0 -131.49506213647262598
6971.0 -131.66641240173382243
7954.0 -131.26927901299603718
8937.0 -134.06944494072371299
#   max difference between correct and computed output: 0.00 = -128.398496 
dB

## clang9 with fma

# accuracy test for factor 2 subsampling using FPU instructions
#   input frequency range used [ 90.00 Hz, 9000.00 Hz ] (SR = 44100.0 Hz, freq 
increment = 983.00)
90.0 -129.87588683579858184
1073.0 -126.11881934685534645
2056.0 -128.02398873831205606
3039.0 -128.56669380739046460
4022.0 -124.92935085933235939
5005.0 -128.10076816664792432
5988.0 -128.49709000891502342
6971.0 -127.39047422816582866
7954.0 -127.65962334132801459
8937.0 -130.40537251311835121
#   max difference between correct and computed output: 0.01 = -124.929351 
dB
```

So we're testing 24 bit downsampling followed by upsampling with FPU 
instructions here. Using a 4022 Hz sine wave performs worse with fma. Note that 
24 bit is the most problematic test we have, since the accuracy of the floating 
point computations is not really good enough to reliably evaluate the 
convolution of the large FIR filter. Anyway if we look at the source and 
assembly of the FPU code, we'll see the difference:

**Source Code:**
```
template static inline Accumulator
fir_process_one_sample (const float *input,
const float *taps, /* [0..order-1] */
const uint   order)
{
  Accumulator out = 0;
  for (uint i = 0; i < order; i++)
out += input[i] * taps[i];
  return out;
}
```
Both assembly dumps use loop unrolling, so I'm truncating the assembly code. 
Also I only show the upsampling step here, but downsampling looks the same.

**Code generated with clang9 & -mfma**
```
 ::process_sample_unaligned(float const*, float*)>:
   0:   48 8b 47 08 mov0x8(%rdi),%rax
   4:   c5 fa 10 06 vmovss (%rsi),%xmm0
   8:   c5 fa 10 4e 04  vmovss 0x4(%rsi),%xmm1
   d:   c5 f2 59 48 04  vmulss 0x4(%rax),%xmm1,%xmm1
  12:   c4 e2 79 b9 08  vfmadd231ss (%rax),%xmm0,%xmm1
  17:   c5 fa 10 46 08  vmovss 0x8(%rsi),%xmm0
  1c:   c4 e2 71 99 40 08   vfmadd132ss 0x8(%rax),%xmm1,%xmm0
  22:   c5 fa 10 4e 0c  vmovss 0xc(%rsi),%xmm1
  27:   c4 e2 79 99 48 0c   vfmadd132ss 0xc(%rax),%xmm0,%xmm1
  2d:   c5 fa 10 46 10  vmovss 0x10(%rsi),%xmm0
  32:   c4 e2 71 99 40 10   vfmadd132ss 0x10(%rax),%xmm1,%xmm0
  38:   c5 fa 10 4e 14  vmovss 0x14(%rsi),%xmm1
  3d:   c4 e2 79 99 48 14   vfmadd132ss 0x14(%rax),%xmm0,%xmm1
  43:   c5 fa 10 46 18  vmovss 0x18(%rsi),%xmm0
  48:   c4 e2 71 99 40 18   vfmadd132ss 0x18(%rax),%xmm1,%xmm0
...
```
**Code generated with clang9 without -mfma**
```
 6204 
 6205  ::process_sample_unaligned(float const*, float*)>:
 62060:   48 8b 47 08 mov0x8(%rdi),%rax
 62074:   c5 fa 10 06 vmovss (%rsi),%xmm0
 62088:   c5 fa 10 4e 04  vmovss 0x4(%rsi),%xmm1
 6209d:   c5 fa 59 00 vmulss (%rax),%xmm0,%xmm0
 6210   11:   c5 f2 59 48 04  vmulss 0x4(%rax),%xmm1,%xmm1
 6211   16:   c5 fa 58 c1 vaddss %xmm1,%xmm0,%xmm0
 6212   1a:   c5 fa 10 4e 08  vmovss 0x8(%rsi),%xmm1
 6213   1f:   c5 f2 59 48 08  vmulss 0x8(%rax),%xmm1,%xmm1
 6214   24:   c5 fa 10 56 0c  vmovss 0xc(%rsi),%xmm2
 6215   29:   c5 ea 59 50 0c  vmulss 0xc(%rax),%xmm2,%xmm2
 6216   2e:   c5 f2 58 ca vaddss %xmm2,%xmm1,%xmm1
 

Re: [tim-janik/beast] TESTS: testresampler: fix resampler tests for clang9 (#140)

2020-02-13 Thread Tim Janik via beast
> Relax expected accuracy for 24-bit subsampling from 126 to 124.5 dB, to fix
> testresampler for clang9.
> 
> This should fix #139.

Did you find out *why* this is needed?
I.e. peeked at the generated assembly to figure if -ffast-math related options 
possibly allow transformations that could become problematic for us in the long 
term?

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/tim-janik/beast/pull/140#issuecomment-585864349___
beast mailing list
beast@gnome.org
https://mail.gnome.org/mailman/listinfo/beast


[tim-janik/beast] TESTS: testresampler: fix resampler tests for clang9 (#140)

2020-02-13 Thread Stefan Westerfeld via beast
Relax expected accuracy for 24-bit subsampling from 126 to 124.5 dB, to fix
testresampler for clang9.

This should fix #139.
You can view, comment on, or merge this pull request online at:

  https://github.com/tim-janik/beast/pull/140

-- Commit Summary --

  * TESTS: testresampler: fix resampler tests for clang9

-- File Changes --

M tests/testresampler.cc (2)

-- Patch Links --

https://github.com/tim-janik/beast/pull/140.patch
https://github.com/tim-janik/beast/pull/140.diff

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/tim-janik/beast/pull/140
___
beast mailing list
beast@gnome.org
https://mail.gnome.org/mailman/listinfo/beast