Joerg Wunsch wrote:
As Stu Bell wrote:
Before the rant below, let me make sure: I interpret this comment to
mean that there is no way for the AVR GCC compiler writers to tell
the optimizer, "Thou Shalt Not Reorder Code Around This Boundary".
Even more, there is no mechanism (not even a #pragma?) by which C
source can tell the compiler this.
That's how I understand it, yes. A couple of years ago, we analyzed
a snippet of code in a German-language forum, and that was the
conclusion. Interestingly enough, IAR produced the very same code
reordering as GCC did.
Just as it was time to introduce "volatile" when C was first
standardized in 1989, I think it's time to add another tweak to the
language to tell the compiler "do not reorder code across this
point".
So blaming the language by saying "well, it just happens" is
specious. GCC allows Linux to run, somehow.
A C compiler has to generate code that behaves as though it were
being executed on the C virtual machine. But it only has to match up
behaviours when it interacts with the "outside world" - calls to
external libraries or code, and volatile accesses. It has to make these
external interactions in the specified order, and with the expected
values. But between these interactions, it can do as it wants - there
is no way in the language to limit or control the ordering or timing
of internal calculations. This is fundamental to the way the language
works, and fundamental to the freedom the optimiser has to generate
better code.
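To make that concrete, here is a minimal sketch (not from the thread,
identifiers invented) of what "matching up volatile accesses" means in
practice - only the volatile stores and their values are pinned down,
the plain arithmetic is not:

#include <stdint.h>

volatile uint8_t port;          /* stand-in for a hardware register */

uint8_t demo(uint8_t a, uint8_t b)
{
    uint8_t t = a * b;          /* internal calculation - no ordering guarantee */
    port = 1;                   /* volatile access: must happen first */
    port = t;                   /* volatile access: must happen second, with value t */
    return t;
}

The compiler must emit the two stores to "port" in that order and with
those values, but it is free to schedule the multiplication before or
between them, and to keep "t" in whatever register it likes.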
Linux is not time-critical, unlike many things that run on
microcontrollers. Moving a couple of instructions around does not
matter in a complex operating system. The code that led us to the
conclusion mentioned above was something like:
void foo(void)
{
    some_temp_variable = result_of_expensive_computation;
    /* I think it was a division. */
    cli();
    something_time_critical = some_temp_variable;
    sei();
}
Both GCC and IAR reordered the expensive computation *into* the
cli/sei sequence: since the result was only used once, there was no
point in executing the instructions earlier. Adding a memory barrier
doesn't change this, as some_temp_variable won't end up in memory
anyway.
I couldn't get any simple examples to exhibit this effect, though I know
the compiler may well do such reordering. If you have an example that
demonstrates it, I'd like to test it.
Eric suggested making "some_temp_variable" volatile, which is the
traditional way to enforce ordering in C programming. Strictly
speaking, you don't have to make it volatile - it's enough to make sure
it lives in memory, provided cli() and sei() include memory barriers.
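As a hedged sketch of what that means (the cli()/sei() definitions are
written out by hand below, on the assumption that they match the
avr-libc macros with a "memory" clobber - real code would just include
<avr/interrupt.h>):

#include <stdint.h>

#define cli()  __asm__ __volatile__ ("cli" ::: "memory")
#define sei()  __asm__ __volatile__ ("sei" ::: "memory")

uint8_t some_temp_variable;                /* ordinary object in memory */
volatile uint8_t something_time_critical;

void foo(uint8_t x, uint8_t y)
{
    some_temp_variable = x / y;            /* the expensive computation */
    cli();    /* "memory" clobber: the store above cannot be sunk past here */
    something_time_critical = some_temp_variable;
    sei();
}

Because some_temp_variable is a real memory object and the asm
statements clobber memory, the store (and therefore the division
feeding it) has to be completed before the cli().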
For a slightly more convoluted trick that puts the least impediment in
the optimiser's way and avoids any extra code or memory accesses, try this:
void foo(void)
{
    uint8_t some_temp_variable = result_of_expensive_computation;
    /* I think it was a division. */
    asm volatile("" : : "r" (some_temp_variable));
    cli();
    something_time_critical = some_temp_variable;
    sei();
}
some_temp_variable can remain in a register all the time, but the
dummy asm input forces the compiler to compute the value before that
point in the code. Obviously "something_time_critical" will have to be
volatile, or cli() and sei() need memory barriers, to avoid
re-ordering there.
Code reordering like that would likely go unnoticed in any high-level
operating system.
That's true - you can never be too careful with code like this. With
every new version of gcc, the optimiser gets smarter and it surprises
people. I've written many posts on this topic in different places, but
much of what I know here I've learned the hard way.
If I remember a remark from Eric correctly, GCC recently introduced
per-function optimization settings. Quite possibly one reason for
this is that it allows workarounds for similar problems...
Per-function optimisation settings have a number of useful applications,
including this situation - sometimes you get the best code with
particular settings. A more common case, I think, would be to
explicitly enable more aggressive settings such as loop unrolling in
particularly time-critical code, while keeping the rest of your code
size-optimised.
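As a sketch of what that looks like (the optimize attribute is a GCC
extension, from around the 4.4 series if I recall correctly; the
function names below are made up for illustration):

#include <stdint.h>

/* Compile the whole file with -Os, but ask for loop unrolling here: */
__attribute__((optimize("unroll-loops")))
void copy_fast(uint8_t *dst, const uint8_t *src, uint8_t n)
{
    while (n--)
        *dst++ = *src++;
}

/* Or drop the optimisation level for a routine whose instruction
   ordering you want to keep as predictable as possible: */
__attribute__((optimize("O0")))
void fiddle_with_hardware(void)
{
    /* ... */
}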
Now assume the language were extended to allow for what I'd call a
"volatile block"; then the above could become:
void foo(void)
{
    some_temp_variable = result_of_expensive_computation;
    /* I think it was a division. */
    volatile {
        cli();
        something_time_critical = some_temp_variable;
        sei();
    }
}
...telling the compiler not to reorder code across the boundaries of
that block. Suddenly, cli/sei wouldn't even need to imply a memory
barrier anymore.
I'm not sure there is any way to give a solid definition of the
semantics of "volatile" here. And I suspect that you'll have /slightly/
more difficulty getting this into mainline gcc than for other extensions
such as 0b binary constants...