Hi,
We are trying to create a memory barrier with the following testcase.
=====================================
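#include <stdio.h>

void Test()
{
    float fDivident = 0.000000001f;
    float fResult = 0.0f;

    /* Division by zero: the result is expected to be infinity. */
    fResult = ( fDivident / fResult );

    /* Intended barrier between the division and the printf call. */
    __asm volatile ("mfence" ::: "memory");

    printf("\nResult: %f\n", fResult);
}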
======================================
'mfence' performs a serializing operation on all load-from-memory and
store-to-memory instructions that were issued prior to the MFENCE instruction.
This serializing operation guarantees that every load and store instruction
that precedes the MFENCE instruction in program order becomes globally visible
before any load or store instruction that follows the MFENCE instruction.
We therefore expected the asm statement, with mfence and a "memory" clobber,
to act as a barrier between the division and the printf call.
When the testcase is compiled with optimization level -O1 or above, it can be
observed that the mfence instruction is reordered and precedes the division
instruction.
We expected that the two sets of assembly instructions, one for the division
and one for the printf call, would not be reordered across each other by the
GCC optimizer, because the __asm volatile ("mfence" ::: "memory"); line sits
between them.
However, the generated assembly, which is inlined below for reference, does
not match our expectation.
====================================================================
pushl %ebp # 23 *pushsi2 [length = 1]
movl %esp, %ebp # 24 *movsi_internal/1 [length = 2]
subl $24, %esp # 25 pro_epilogue_adjust_stack_si_add/1 [length = 3]
mfence
fldz # 20 *movxf_internal/3 [length = 2]
fdivrs .LC0 # 13 *fop_xf_4_i387/1 [length = 6]
====================================================================
You may note that the mfence instruction is generated before the fdivrs
instruction.
Can you please let us know whether using the mfence asm statement as given in
the above testcase is the right way to create the expected memory barrier
between the two sets of instructions for the division and the printf call?
If yes, then we think this is a bug in the compiler. Could you please confirm?
If no, then what is the correct usage of the mfence asm statement to achieve
the memory barrier functionality expected in the above testcase?
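For reference, one variant we are considering is sketched below. It rests on
an assumption we have not confirmed: that the "memory" clobber only orders
accesses to memory, and that fResult, being a local whose address is never
taken, may be kept in a register and so is not covered by the clobber. The
sketch passes fResult to the asm statement as a read-write memory operand
("+m"), so the division would have to be complete before the asm statement
and the printf call would have to read the value after it. The function name
TestWithOperand, its two parameters, the return value and the non-x86 fallback
are our own additions for illustration and are not part of the testcase above.

```c
#include <stdio.h>

/* Hypothetical variant of Test(): the operands are parameters so that the
   division cannot be folded at compile time, and the result is returned so
   that a caller can inspect it. */
float TestWithOperand(float fDividend, float fDivisor)
{
    float fResult = fDividend / fDivisor;

#if defined(__i386__) || defined(__x86_64__)
    /* "+m" makes fResult an input and output of the asm statement, which
       gives the compiler a data dependency from the division to the asm. */
    __asm__ volatile ("mfence" : "+m" (fResult) : : "memory");
#else
    /* Assumption: on other targets we keep only the compiler-level
       dependency and emit no fence instruction. */
    __asm__ volatile ("" : "+m" (fResult) : : "memory");
#endif

    printf("\nResult: %f\n", fResult);
    return fResult;
}
```

Calling TestWithOperand(0.000000001f, 0.0f) corresponds to the values used in
the original testcase.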
Thanks,
Vivek Kinhekar