bearophile,

Thank you so much for all your help. It seems you're very into ASM. I kept the D_InlineAsm_X86 version check in my code as you suggested. The code I gave here was just an example, but the version block in my code looks like this:

    version(D_InlineAsm_X86)
    {
        // ASM code.
    }
    else
    {
        // D code.
    }

This results in much more robust code. You were right about it. You are also right about the "load-load-load, processing-processing-processing, store-store-store" ordering instead of "load-processing-store, load-processing-store, load-processing-store". I'll modify my code to follow this model. It will require moving some elements to the stack, but that's no big deal; I don't think it will hurt performance, as it is designed to work this way.

- Does ASM kill inlining only for the function where the asm block is present, or for the whole compilation?
- In your opinion, how bad can it be if function inlining is not present?

Some docs from the net:
http://www.parashift.com/c++-faq-lite/inline-functions.html

Cheers,
Heinz