> a patch would be great, if it is not too much work. Attached you find three 
> assembly language files:
>
> * fma-test_original.s (unchanged csc c to assembly)
> * fma-test_modified.s (modified csc c from previous mail)
> * fma-test_modified_mfma.s (modified csc c and -mfma gcc option)
>
> all files were created with the additional gcc arguments -O3 -S 
> -fverbose-asm. Are these files sufficient?

Yes, thanks a lot. The third case shows that the intrinsic is indeed used.

>
> I hoped the fma libc function would insulate one from intrinsics; the 
> compiler option -mfma should activate (I think via defining a C macro) the 
> use of the corresponding CPU instruction (fma3 on current x86), which my CPU 
> supports, but using it does not seem to make a difference.

Surprising, but I guess the speedup is too minor in the presence of all the
noise regarding stack usage, etc. In straight-line, dumb, cache-friendly C
code it may make a difference, I guess...

I will provide an experimental patch for the new operation. Stay tuned.


felix

