I haven't benchmarked it, so I don't know whether there is any real benefit, but changing the include in fast_float_math_hack.h to <mathimf.h> is all that is required to build with the latest ICC.

John

On 17/05/2014 10:26, lvqcl wrote:
> The file
> src/share/replaygain_synthesis/include/private/fast_float_math_hack.h
> redefines 'tanh' as 'tanhf'. This file is intended for Intel Compiler only,
> but it includes outdated mathf.h and doesn't work with current versions
> of ICC.
>
> The fixes are trivial though, and I compiled 2 versions of flac.exe: with
> this 'hack' turned off and on. The difference in decoding speed is very
> close to measurement inaccuracy: for 32-bit encoder the decoding time
> decreases from 94.5s to 94.0s, for 64-bit it increases from 82.6s to 82.9s.
> (the option for this test was:
> --apply-replaygain-which-is-not-lossless=Ln0)
>
> So this hack is really useless today, and the first patch removes
> fast_float_math_hack.h from the sources.
>
>
> MSVS profiler shows that tanh calculation doesn't require too much CPU
> resources, the real problem is an integer division (int_64/int_32) in
> this line:
>
>     val64 = dither_output_(........) / conv_factor;
>
> Since all possible values of conv_factor are powers of 2, it's possible to
> replace division with shift. The second patch does this.
>
> Decoding time decreases from 94.5s to 64.1s for 32-bit ICC compile, and
> from 82.6s to 50.0s for 64-bit ICC compile.
>
>
> *************************************************
> P.S. Actually, shift ( x >> n ) and division ( x / (1<<n) ) can give
> different results if x < 0. The difference is very small though: WAV files
> differ by 1 LSB. And probably shift gives better results than division.
>
> Let's compare shift by 2 and division by (1<<2) == 4:
>
> *** shift ***
>         argument            result
>           ....
>      12, 13, 14, 15     ->     3
>       8,  9, 10, 11     ->     2
>       4,  5,  6,  7     ->     1
>       0,  1,  2,  3     ->     0
>      -4, -3, -2, -1     ->    -1
>      -8, -7, -6, -5     ->    -2
>           ....
>
> *** division ***
>         argument            result
>           ....
>      12, 13, 14, 15     ->     3
>       8,  9, 10, 11     ->     2
>       4,  5,  6,  7     ->     1
> -3, -2, -1, 0, 1, 2, 3  ->     0
>      -7, -6, -5, -4     ->    -1
>     -11,-10, -9, -8     ->    -2
>           ....
>
> So, shift results in small DC offset (1/2 LSB), division results in
> small 'nonlinearity' near 0.
>
> _______________________________________________
> flac-dev mailing list
> flac-dev@xiph.org
> http://lists.xiph.org/mailman/listinfo/flac-dev
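
For anyone who wants to reproduce the shift-versus-division comparison from the P.S., here is a minimal C sketch. It is not the code from either patch: the helper names (shift_for_factor, scale_div, scale_shift, scale_floor) are made up for illustration, and it assumes the compiler implements >> on negative signed integers as an arithmetic shift. The C standard leaves that implementation-defined, although the common compilers behave this way.

```c
#include <stdint.h>

/* Hypothetical helper: returns n such that factor == (1 << n),
 * or -1 if factor is not a positive power of two. */
static int shift_for_factor(int32_t factor)
{
    int n = 0;
    if (factor <= 0 || (factor & (factor - 1)) != 0)
        return -1;
    while ((factor >> n) != 1)
        n++;
    return n;
}

/* Division by 2^shift: C integer division truncates toward zero,
 * so -3..3 all map to 0 when shift == 2. */
static int64_t scale_div(int64_t x, unsigned shift)
{
    return x / ((int64_t)1 << shift);
}

/* Right shift: ASSUMES an arithmetic shift for negative x
 * (implementation-defined in C), which rounds toward negative
 * infinity, so -4..-1 map to -1 when shift == 2. */
static int64_t scale_shift(int64_t x, unsigned shift)
{
    return x >> shift;
}

/* Portable equivalent of the arithmetic shift: floor division,
 * built from truncating division plus a correction for negative
 * values that are not exact multiples of 2^shift. */
static int64_t scale_floor(int64_t x, unsigned shift)
{
    int64_t d = (int64_t)1 << shift;
    int64_t q = x / d;
    if (x < 0 && (x % d) != 0)
        q--;
    return q;
}
```

Calling scale_div and scale_shift with shift == 2 over the range -8..15 should reproduce the two tables quoted above; the results differ only for negative arguments that are not multiples of 4. scale_floor is included only as a way to get the shift behaviour without relying on the implementation-defined shift.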