> a patch would be great, if it is not too much work. Attached you find three
> assembly language files:
>
> * fma-test_original.s (unchanged csc C to assembly)
> * fma-test_modified.s (modified csc C from previous mail)
> * fma-test_modified_mfma.s (modified csc C and -mfma gcc option)
>
> all files were created with the additional gcc arguments -O3 -S
> -fverbose-asm. Are these files sufficient?

Yes, thanks a lot. The third case shows that the intrinsic is indeed used.

> I hoped the fma libc function would insulate one from intrinsics; the
> compiler option -mfma should activate (I think via defining a C macro) the
> use of the corresponding CPU instruction (fma3 on current x86), which my CPU
> supports, but using it does not seem to make a difference.

Surprising, but I guess the speedup is too minor in the presence of all
the noise regarding stack usage, etc. In straight-line, dumb,
cache-friendly C code it may make a difference, I guess...

I will provide an experimental patch for the new operation. Stay tuned.


felix
