> -----Original Message-----
> From: Matt Turner [mailto:[email protected]]
> Sent: Wednesday, March 11, 2015 10:20 AM
> To: Song, Ruiling
> Cc: Luo, Xionghu; [email protected]
> Subject: Re: [Beignet] [PATCH 6/7] replace mad with llvm intrinsic.
>
> On Tue, Mar 10, 2015 at 6:55 PM, Song, Ruiling <[email protected]> wrote:
> >> I'm not sure that it matters for this patch, but do we know if Gen's
> >> MAD instruction is a fused-multiply-add? That is, does it not do an
> >> intermediate rounding step after the multiply?
> > I also have such kind of concern, so I did a simple test:
> > on cpu side, I use "reference = (double)x1*(double)x2 + (double)x3;"
>
> Some recent CPUs have FMA instructions. You should make sure you know
> whether your code is compiled using FMA or not.
>
> > And on gpu side, I use "result = mad(x1, x2, x3);"
> > Then compare the result and reference, the bits are exactly the same, so I
> > think gen's MAD does not do intermediate rounding after multiply.
>
> The intermediate rounding step will not affect many pairs of numbers that
> are multiplied together. You need to make sure you're testing a pair of
> numbers that are affected by the intermediate rounding step.
>
> I wrote a small program to find cases where fmaf(x, y, z) != x*y+z (attached).
> Compile with -std=c99 -O2 -march=native -lm. I'm testing on Haswell which
> has FMA.
>
> It shows that
>
> fmaf(1, 0.333333, 0.666667) is 1 (0x1.000002p+0), but 1 * 0.333333 +
> 0.666667 is 1 (0x1p+0)
>
> Please test that Gen's MAD instruction produces what fmaf() produces for
> 1.0 * 0.333333 + 0.666667.
>
> Assuming glibc's fmaf() is correct... I'm again surprised by floating-point
> numbers. :)

My gcc doesn't provide definitions of nextafterf and fmaf, so I built the
program with g++ on my Ivy Bridge (ivb) machine:

    g++ -O2 -march=native -lm -o fma fma.c

Its output (I changed the printf to use "%a"):

    fmaf(0x1.000002p+0, 0x1.555556p-2, 0x1.555556p-1) is 1 (0x1.000002p+0), but
    0x1.000002p+0 * 0x1.555556p-2 + 0x1.555556p-1 is 1 (0x1p+0)

I also tried Gen's MAD with the same inputs, and its result is the same as
fmaf's. You can try it on your Haswell machine; I think the result will be
the same.
