Re: [tim-janik/beast] TESTS: testresampler: fix resampler tests for clang9 (#140)
Great! Thanks a lot for the detailed analysis. -- You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub: https://github.com/tim-janik/beast/pull/140#issuecomment-658463809___ beast mailing list beast@gnome.org https://mail.gnome.org/mailman/listinfo/beast
Re: [tim-janik/beast] TESTS: testresampler: fix resampler tests for clang9 (#140)
As additional information, g++-8 also generates different code if you use -mfma. ``` $ out/tests/suite1 --resampler accuracy --fpu --precision=24 --subsample --freq-scan=90,9000,983 --freq-scan-verbose --verbose ## g++ 8.3.0 without fma # accuracy test for factor 2 subsampling using FPU instructions # input frequency range used [ 90.00 Hz, 9000.00 Hz ] (SR = 44100.0 Hz, freq increment = 983.00) 90.0 -129.40792803150048940 1073.0 -130.14152510098830362 2056.0 -131.64395956575623359 3039.0 -131.48377752074048885 4022.0 -127.52623479658792860 5005.0 -131.74602243860934436 5988.0 -131.71784952867952256 6971.0 -133.09696883539237433 7954.0 -131.53986659892970579 8937.0 -131.78909542900839824 # max difference between correct and computed output: 0.00 = -127.526235 dB ## g++ 8.3.0 with fma # accuracy test for factor 2 subsampling using FPU instructions # input frequency range used [ 90.00 Hz, 9000.00 Hz ] (SR = 44100.0 Hz, freq increment = 983.00) 90.0 -129.82104550124302023 1073.0 -131.03707434810391419 2056.0 -132.40996614732904391 3039.0 -133.12443453736065635 4022.0 -129.32954166513573568 5005.0 -131.90536134591866357 5988.0 -130.93045250711062977 6971.0 -132.01135805265508338 7954.0 -132.97021680274394839 8937.0 -133.11067142912733630 # max difference between correct and computed output: 0.00 = -129.329542 dB ``` So (unlike clang9), the fma version provides slightly more accurate results for our 4022 Hz sine wave. My mental model is this: we start from the error the filter gives us under perfect conditions (infinite precision coefficients and arithmetic). Then we add some random error (with some distribution) that depends on the type and order of the instructions generated by the compiler, due to finite precision. From everything I saw, both clang9 and g++ 8.3.0 provide valid translations with and without -mfma. -- You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub: https://github.com/tim-janik/beast/pull/140#issuecomment-598647103___ beast mailing list beast@gnome.org https://mail.gnome.org/mailman/listinfo/beast
Re: [tim-janik/beast] TESTS: testresampler: fix resampler tests for clang9 (#140)
> Did you find out _why_ this is needed? > I.e. peeked at the generated assembly to figure if -ffast-math related > options possibly allow transformations that could become problematic for us > in the long term? **Short answer:** I spent some time debugging it, -mfma is causing this - removing the flag gives us the old behaviour, clang does use fused multiply add instructions in the inner loop of the resampler. I don't believe that this optimization has negative impact for us, so adjusting the threshold is the sane thing to do here. **Long answer:** Lets first look at what exactly is failing here. We have (in explicit form): ``` $ out/tests/suite1 --resampler accuracy --fpu --precision=24 --subsample --freq-scan=90,9000,983 --freq-scan-verbose --verbose ## clang9 without fma # accuracy test for factor 2 subsampling using FPU instructions # input frequency range used [ 90.00 Hz, 9000.00 Hz ] (SR = 44100.0 Hz, freq increment = 983.00) 90.0 -129.45954766716391759 1073.0 -130.16677118621706200 2056.0 -129.70751203525446726 3039.0 -132.36528831269109219 4022.0 -128.39849621726884266 5005.0 -128.95512230052295877 5988.0 -131.49506213647262598 6971.0 -131.66641240173382243 7954.0 -131.26927901299603718 8937.0 -134.06944494072371299 # max difference between correct and computed output: 0.00 = -128.398496 dB ## clang9 with fma # accuracy test for factor 2 subsampling using FPU instructions # input frequency range used [ 90.00 Hz, 9000.00 Hz ] (SR = 44100.0 Hz, freq increment = 983.00) 90.0 -129.87588683579858184 1073.0 -126.11881934685534645 2056.0 -128.02398873831205606 3039.0 -128.56669380739046460 4022.0 -124.92935085933235939 5005.0 -128.10076816664792432 5988.0 -128.49709000891502342 6971.0 -127.39047422816582866 7954.0 -127.65962334132801459 8937.0 -130.40537251311835121 # max difference between correct and computed output: 0.01 = -124.929351 dB ``` So we're testing 24 bit downsampling followed by upsampling with FPU instructions here. Using a 4022 Hz sine wave performs worse with fma. Note that 24 bit is the most problematic test we have, since the accuracy of the floating point computations is not really good enough to reliably evaluate the convolution of the large FIR filter. Anyway if we look at the source and assembly of the FPU code, we'll see the difference: **Source Code:** ``` template static inline Accumulator fir_process_one_sample (const float *input, const float *taps, /* [0..order-1] */ const uint order) { Accumulator out = 0; for (uint i = 0; i < order; i++) out += input[i] * taps[i]; return out; } ``` Both assembly dumps use loop unrolling, so I'm truncating the assembly code. Also I only show the upsampling step here, but downsampling looks the same. **Code generated with clang9 & -mfma** ``` ::process_sample_unaligned(float const*, float*)>: 0: 48 8b 47 08 mov0x8(%rdi),%rax 4: c5 fa 10 06 vmovss (%rsi),%xmm0 8: c5 fa 10 4e 04 vmovss 0x4(%rsi),%xmm1 d: c5 f2 59 48 04 vmulss 0x4(%rax),%xmm1,%xmm1 12: c4 e2 79 b9 08 vfmadd231ss (%rax),%xmm0,%xmm1 17: c5 fa 10 46 08 vmovss 0x8(%rsi),%xmm0 1c: c4 e2 71 99 40 08 vfmadd132ss 0x8(%rax),%xmm1,%xmm0 22: c5 fa 10 4e 0c vmovss 0xc(%rsi),%xmm1 27: c4 e2 79 99 48 0c vfmadd132ss 0xc(%rax),%xmm0,%xmm1 2d: c5 fa 10 46 10 vmovss 0x10(%rsi),%xmm0 32: c4 e2 71 99 40 10 vfmadd132ss 0x10(%rax),%xmm1,%xmm0 38: c5 fa 10 4e 14 vmovss 0x14(%rsi),%xmm1 3d: c4 e2 79 99 48 14 vfmadd132ss 0x14(%rax),%xmm0,%xmm1 43: c5 fa 10 46 18 vmovss 0x18(%rsi),%xmm0 48: c4 e2 71 99 40 18 vfmadd132ss 0x18(%rax),%xmm1,%xmm0 ... ``` **Code generated with clang9 without -mfma** ``` 6204 6205 ::process_sample_unaligned(float const*, float*)>: 62060: 48 8b 47 08 mov0x8(%rdi),%rax 62074: c5 fa 10 06 vmovss (%rsi),%xmm0 62088: c5 fa 10 4e 04 vmovss 0x4(%rsi),%xmm1 6209d: c5 fa 59 00 vmulss (%rax),%xmm0,%xmm0 6210 11: c5 f2 59 48 04 vmulss 0x4(%rax),%xmm1,%xmm1 6211 16: c5 fa 58 c1 vaddss %xmm1,%xmm0,%xmm0 6212 1a: c5 fa 10 4e 08 vmovss 0x8(%rsi),%xmm1 6213 1f: c5 f2 59 48 08 vmulss 0x8(%rax),%xmm1,%xmm1 6214 24: c5 fa 10 56 0c vmovss 0xc(%rsi),%xmm2 6215 29: c5 ea 59 50 0c vmulss 0xc(%rax),%xmm2,%xmm2 6216 2e: c5 f2 58 ca vaddss %xmm2,%xmm1,%xmm1
Re: [tim-janik/beast] TESTS: testresampler: fix resampler tests for clang9 (#140)
> Relax expected accuracy for 24-bit subsampling from 126 to 124.5 dB, to fix > testresampler for clang9. > > This should fix #139. Did you find out *why* this is needed? I.e. peeked at the generated assembly to figure if -ffast-math related options possibly allow transformations that could become problematic for us in the long term? -- You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub: https://github.com/tim-janik/beast/pull/140#issuecomment-585864349___ beast mailing list beast@gnome.org https://mail.gnome.org/mailman/listinfo/beast
[tim-janik/beast] TESTS: testresampler: fix resampler tests for clang9 (#140)
Relax expected accuracy for 24-bit subsampling from 126 to 124.5 dB, to fix testresampler for clang9. This should fix #139. You can view, comment on, or merge this pull request online at: https://github.com/tim-janik/beast/pull/140 -- Commit Summary -- * TESTS: testresampler: fix resampler tests for clang9 -- File Changes -- M tests/testresampler.cc (2) -- Patch Links -- https://github.com/tim-janik/beast/pull/140.patch https://github.com/tim-janik/beast/pull/140.diff -- You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub: https://github.com/tim-janik/beast/pull/140 ___ beast mailing list beast@gnome.org https://mail.gnome.org/mailman/listinfo/beast