On 2/2/2015 9:06 AM, Johannes Pfau wrote:
> _Dmain:
>     push rbp
>     mov rbp,rsp
>     sub rsp,0x10
>     mov rax,0x5 <==
>     mov QWORD PTR [rbp-0x8],rax
>     mov ecx,DWORD PTR [rax] <== a register based load
>
> The instruction it should generate is
>     mov ecx, [0x5]
In 64-bit mode, there is no direct addressing like that. The above would be
relative to the instruction pointer, RIP, and is actually:
    mov ECX, 5[RIP]
So, to load from address 5, you would have to load the address into a register
first.
(You'd be right for 32-bit x86. But also, all 32-bit x86s have an MMU rather
than direct addressing, and it would be strange to set up an x86 embedded
system to use MMIO rather than the IO instructions, which are designed for that
purpose.)
> Not sure if it's actually more efficient on X86 but it makes a huge
> difference on real microcontroller architectures.
What addressing mode is generated by the back end has nothing whatsoever to do
with using volatileLoad() or pragma(address).
To reiterate, volatileLoad() and volatileStore() are not reordered by the
optimizer, and replacing them with pragma(address) is not going to make for
better code generation.
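
To make the ordering guarantee concrete, here is a minimal sketch. It assumes a
current druntime, where the two intrinsics are imported from core.volatile
(older releases exposed them from core.bitop, as far as I recall). The names
fakeRegister and pollTwice are made up for illustration, and an ordinary global
stands in for the hardware register so the program can run on a desktop.

```d
import core.volatile : volatileLoad, volatileStore;

// Hypothetical stand-in for a memory-mapped register. On a real embedded
// target this would be a fixed address, e.g. cast(uint*) 0x5.
__gshared uint fakeRegister;

// Reads the register twice. Because both reads go through volatileLoad,
// the optimizer may not merge them into one load or reorder them.
uint pollTwice(uint* reg)
{
    uint first = volatileLoad(reg);
    uint second = volatileLoad(reg);
    return first + second;
}

void main()
{
    // The store is likewise kept in place relative to the loads that follow.
    volatileStore(&fakeRegister, 21);
    assert(pollTwice(&fakeRegister) == 42);
}
```

Which addressing mode the back end picks for those two loads is a separate
question from whether they are kept and kept in order.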
The only real issue is the forceinline one.