On 08/01/14 21:39, Julian Taylor wrote:
> An issue is software emulation of real fma. This can be enabled in the
> test ufunc with npfma.set_type("libc").
> This is unfortunately incredibly slow, about a factor of 300 on my
> machine without hardware fma.
> This means we either have a function that is fast on some platforms and
> slow on others but always gives the same result, or we have a fast
> function that gives better results on some platforms.
> Given that we are not worse than what numpy currently provides, I favor
> the latter.
>
> Any opinions on whether this should go into numpy or maybe stay a third
> party ufunc?

My preference would be to initially add an "madd" intrinsic. This can be
supported on all platforms and can be documented to permit the use of FMA
where available.

A 'true' FMA intrinsic function should only be provided when hardware FMA
support is available. Many of the more interesting applications of FMA
depend on there only being a single rounding step, and as such "FMA" should
probably mean "a*b + c with only a single rounding".

Regards, Freddie.
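
To illustrate the single-rounding distinction, here is a minimal sketch in plain Python. The names `madd` and `fma_single_rounding` are hypothetical and are not part of numpy or of the npfma test ufunc. The second function emulates a fused result with exact rational arithmetic, so it is slow by construction and only meant to show the difference in results, not how a real implementation would work.

```python
from fractions import Fraction


def madd(a, b, c):
    # Hypothetical "madd": plain a*b + c. In pure Python the product is
    # rounded to a double first, then the sum is rounded again.
    return a * b + c


def fma_single_rounding(a, b, c):
    # Hypothetical software emulation of a fused multiply-add: compute
    # a*b + c exactly as a rational number, then round once to a double.
    return float(Fraction(a) * Fraction(b) + Fraction(c))


# Inputs chosen so that the exact product 1 - 2**-60 is not representable
# as a double and rounds to 1.0 before the addition.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

two_roundings = madd(a, b, c)                # 0.0
one_rounding = fma_single_rounding(a, b, c)  # -2**-60
print(two_roundings, one_rounding)
```

With these inputs the two-step version loses the result entirely, while the single-rounding version returns -2**-60 exactly. Code that relies on recovering such a residual needs the single-rounding guarantee, which an "madd" that merely permits FMA would not give.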
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion