Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
David Brown wrote: ... you are ... protecting the code sequence, not the data itself - the data is indirectly protected by always using protected code sequences to access it. I think this is just a point-of-view thing. I could be wrong, but I think the prevalent point of view is that it is the data that needs to be protected. Any means to do that would be acceptable. It seems strange to talk about protecting the code sequence. Making sure that the code executes in sequence and without interruption is just a means to the primary end of protecting the data. Graham. ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
Graham Davies wrote: David Brown wrote: ... you are ... protecting the code sequence, not the data itself - the data is indirectly protected by always using protected code sequences to access it. I think this is just a point-of-view thing. I could be wrong, but I think the prevalent point of view is that it is the data that needs to be protected. Any means to do that would be acceptable. It seems strange to talk about protecting the code sequence. Making sure that the code executes in sequence and without interruption is just a means to the primary end of protecting the data. Graham. In some cases, it makes sense to think about protecting the data - in particular, you often want to make sure that no part of the program can access the data in an inconsistent state. In that sense, you protect code sequences as a way to protect the data. But you also want sequences of code that access different data objects rather than directly protecting the data. Ultimately, code is merely something that manipulates data, and data is merely intermediate results of code, so it is, as you say, a point-of-view thing. My point was just that it is often most practical to think in terms of protecting code sequences rather than data - you can't view all cases in terms of protecting data until you abstract to the level of all the data in the system being part of a single large object, and all the code in the system being methods manipulating that data object. At least, that's how I view it. Perhaps if I were a C++ programmer rather than a C programmer, I'd view it in a more data-centric way. mvh., David ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
RE: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
Hi everyone, A lurker pops his head out of his hole for a comment - please don't shoot me! :) Normally, critical sections are supposed to be used to protect *data*, not *code*. The atomic operations listed in the URL are doing very specific atomic operations on data. In other words, they also follow the theme of protecting data, not code. It seems to me that applying the critical attribute to generic functions can lead a programmer to start thinking in terms of protecting code instead of data. Eventually that can lead to a disastrous result, especially if you ever work with multi-threaded code. All of the examples I have seen in the thread so far were used for very simple data-modifying operations that are suitable for inlining. I think that is acceptable use of critical. However, not everyone will follow that principle - they may treat critical as a cure-all. In the past, I have solved the critical section problem by using some #defines. Unfortunately, I had to use THAT_OTHER_COMPILER(tm). I'd give you the exact definitions, but I'm at home and don't have access to my work code. DECLARE_CRITICAL_SECTION - at the top of each function with a critical section - creates a variable to store SREG BEGIN_CRITICAL_SECTION - records the value of SREG and clears global interrupt flag END_CRITICAL_SECTION - restores SREG Technically it's no better than doing the code inline, but it is more self-documenting. Hope this generates some more ideas/discussion, Scott -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Andy H Sent: Thursday, May 22, 2008 5:36 PM To: David Brown Cc: avr-gcc-list@nongnu.org Subject: Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os http://gcc.gnu.org/onlinedocs/gcc-4.1.0/gcc/Atomic-Builtins.html David Brown wrote: [EMAIL PROTECTED] wrote: Eric, The problem with the function is typical housekeeping overhead gets wrapped up. To take a silly example: Say you want to send some message to ISR (via memory) using critical function critical void function(int *buffer, int *cValue) { *cValue++ = *buffer++; } Getting *buffer++ does not need ISR disabled. Perhaps neither does the cValue++; It is far more important to be correct than to be fast - thus it is better to err on the side of disabling interrupts for too long. Of course, in time-critical code disabling the interrupts for too long could also make the code fail. The point of the critical attribute is that it makes it very easy to write correct code. If you want code to be as fast as possible, your example should be written thus: static inline critical void functionHelper(int v, int *cValue) { *cValue = v; } void function(int *buffer, int *cValue) { functionHelper(*buffer++, cValue++); } With small static inline functions, you have exactly the granularity you want. And of course, you are still free to write out the cli()/sreg code if you find that makes sense for critical code. There are also ways to get atomic access which don't block interrupts. (I expect you are already familiar with several methods, but this could be of interest to others on this list too.) For example: #define volatileAccess(v) *((volatile typeof((v)) *) (v)) #define nonblock_atomic_read(v) ({ \ typeof(v) a, b; \ a = volatileAccess(v); \ b = volatileAccess(v); \ while (a != b) { \ a = b; \ b = volatileAccess(v); \ }; \ b; \ }) It's also possible to use a 1-byte flag for access control, since it can always be written and read atomically. But these sorts of solutions require more care and thought - a critical attribute is makes it easier for programmers to write working code. So this method, although not wrong, is something I would normally reject for time critical systems, where latency is key. Inline coding has less overhead but even then its not perfect - though perhaps more palatable than assembler. I do not know much about the gcc atomic operations. But it might be worth reviewing. There may be some hooks that make it work well with optimisations without adding a bunch of code hacks. Also it makes code more portable (Atmel ARM or course!) I believe the atomic operations are target-specific, and use target-specific assembly. Many larger processors have atomic operations of some sort - disabling interrupts won't work if you have an SMP system. It would be *very* nice if gcc had a set of atomic access buildins that were consistent across a range of targets. It would also be *very* nice if function attributes were consistent across ports. It seems that every gcc port has its own function attribute for specifying interrupt functions, for example. Sometimes they have the same name, but different functions - the msp430 port has interrupt and signal with slightly different syntax from the avr port, and opposite effects. There are also attributes
Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
Andy H wrote: BTW I believe the common attribute/keyword for this should be monitor as in monitor void dostuff(void) { putchar ( ’Z’ ); } This will give a degree of compatibility with other compilers. If other compilers use monitor, then that makes sense. It is also arguably more accurate, at least for those who have a background in more theoretical computing, while the word critical could just mean you want the function to be as fast as possible. http://en.wikipedia.org/wiki/Monitor_%28synchronization%29 mvh., David ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
Paulo Marques wrote: Weddington, Eric wrote: [...] If ATOMIC_BLOCK() is *the* way to do it, perhaps the docs could be improved a little by emphasizing the use of ATOMIC_BLOCK() as a necessity rather than as a convenience. I'll go a step further: Is there any reason to NOT include the memory barrier in the interrupt macros? Would that even work? Like this, # define sei() __asm__ __volatile__ (sei ::: memory) # define cli() __asm__ __volatile__ (cli ::: memory) This has been discussed before (on the avr-libc list, December 2005, just search for 'cli and sei should clobber memory'), and this exact solution was actually proposed by me. The downside is that a memory clobber like that drops all optimizations of global variables that the compiler might be temporarily storing in registers. So, it actually hurts performance for code unrelated to the critical section. Another thing we could do to help the common cases of reading / writing multi-byte vars (or even single byte non-volatile vars) was to write accessors like (totaly untested): static inline uint16_t atomic_read16(uint16_t *var) { uint16_t ret; uint8_t old_sreg = SREG; cli(); ret = *((volatile uint16_t *)var); SREG = old_sreg; } pedantic You forgot the return ret. /pedantic static inline void atomic_write16(uint16_t *var, uint16_t value) { uint8_t old_sreg = SREG; cli(); *((volatile uint16_t *)var) = value; SREG = old_sreg; } (repeat for other sizes) I'd have to check this code better, because I always end up placing the volatile in the wrong place and casting the pointer to volatile instead of the variable ;) Anyway, the advantage of something like this over ATOMIC_BLOCK() is that it doesn't need any memory clobbers, since all operations are already volatile. So, this should produce slightly better code. One idea worth trying for simplifying this might be to use macros rather than static inlines (although normally I prefer static inlines to macros) along with gcc's typeof operator: #define volatileAccess(v) *((volatile typeof((v)) *) (v)) #define atomic_read(v) ({ \ uint8_t old_sreg = SREG; \ cli(); \ typeof(v) ret = volatileAccess(v); \ SREG = old_sreg; \ ret; \ }) #define atomic_write(v, x) ({ \ uint8_t old_sreg = SREG; \ cli(); \ volatileAccess(v) = x; \ SREG = old_sreg; \ }) An alternative way to handle this sort of thing would be to implement a critical function attribute like in the msp430 port of gcc - a function with this attribute has interrupts disabled at the start, and restored at the end. I find it very convenient when programming for the msp430. ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
Hi. An alternative way to handle this sort of thing would be to implement a critical function attribute like in the msp430 port of gcc - a function with this attribute has interrupts disabled at the start, and restored at the end. It will work for for tiny and mega devices, but xmega devices have Non-Maskable Interrupts, and clearing of the I-bit does not guarantee full continuity execution of a code. I shall make a patch for critical attribute. Anatoly.___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
You can do it this way but you won't have precise control over disable/restore timing. So you will typically disable interrupts longer than is needed. Also disabling all interrupts is often not what you need. Disabling specfic interrupts is more palatable on real time systems. NMI is a feature not a limitation - it supposed to stop such software hanging up the CPU. BTW gcc provides atomic operations on other ports. Andy -Original Message- From: Anatoly Sokolov [EMAIL PROTECTED] To: Weddington, Eric [EMAIL PROTECTED]; David Brown [EMAIL PROTECTED]; avr-gcc-list@nongnu.org Sent: Thu, 22 May 2008 1:08 pm Subject: Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os Hi. An alternative way to handle this sort of thing would be to implement a critical function attribute like in the msp430 port of gcc - a function with this attribute has interrupts disabled at the start, and restored at the end. It will work for for tiny and mega devices, but xmega devices have Non-Maskable Interrupts, and clearing of the I-bit does not guarantee full continuity execution of a code. I shall make a patch for critical attribute. Anatoly. ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
RE: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
-Original Message- From: Anatoly Sokolov [mailto:[EMAIL PROTECTED] Sent: Thursday, May 22, 2008 11:09 AM To: Weddington, Eric; David Brown; avr-gcc-list@nongnu.org Subject: Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os Hi. An alternative way to handle this sort of thing would be to implement a critical function attribute like in the msp430 port of gcc - a function with this attribute has interrupts disabled at the start, and restored at the end. It will work for for tiny and mega devices, but xmega devices have Non-Maskable Interrupts, and clearing of the I-bit does not guarantee full continuity execution of a code. Sure. I shall make a patch for critical attribute. Thanks for doing this. When you make the patch, be sure that it also works if the function is inlined. Eric ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
RE: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Sent: Thursday, May 22, 2008 11:48 AM To: [EMAIL PROTECTED]; Weddington, Eric; [EMAIL PROTECTED]; avr-gcc-list@nongnu.org Subject: Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os You can do it this way but you won't have precise control over disable/restore timing. So you will typically disable interrupts longer than is needed. If the disable can happen right after prologue and the enable right before the epilogue then I don't see an issue. Caveat emptor. Maybe I'm missing something? Also disabling all interrupts is often not what you need. Disabling specfic interrupts is more palatable on real time systems. True, but it can't be easily done in the compiler. Disabling all interrupts is a common enough approach, and while it may not ideal for every application, I think that having this function attribute would be a useful tool available to the programmer. NMI is a feature not a limitation - it supposed to stop such software hanging up the CPU. BTW gcc provides atomic operations on other ports. What kinds of different mechanisms do they provide? Would any of them be useful for the AVR port? ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
Eric, The problem with the function is typical housekeeping overhead gets wrapped up. To take a silly example: Say you want to send some message to ISR (via memory) using critical function critical void function(int *buffer, int *cValue) { *cValue++ = *buffer++; } Getting *buffer++ does not need ISR disabled. Perhaps neither does the cValue++; So this method, although not wrong, is something I would normally reject for time critical systems, where latency is key. Inline coding has less overhead but even then its not perfect - though perhaps more palatable than assembler. I do not know much about the gcc atomic operations. But it might be worth reviewing. There may be some hooks that make it work well with optimisations without adding a bunch of code hacks. Also it makes code more portable (Atmel ARM or course!) Andy -Original Message- From: Weddington, Eric [EMAIL PROTECTED] To: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; avr-gcc-list@nongnu.org Sent: Thu, 22 May 2008 2:33 pm Subject: RE: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Sent: Thursday, May 22, 2008 11:48 AM To: [EMAIL PROTECTED]; Weddington, Eric; [EMAIL PROTECTED]; avr-gcc-list@nongnu.org Subject: Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os You can do it this way but you won't have precise control over disable/restore timing. So you will typically disable interrupts longer than is needed. If the disable can happen right after prologue and the enable right before the epilogue then I don't see an issue. Caveat emptor. Maybe I'm missing something? Also disabling all interrupts is often not what you need. Disabling specfic interrupts is more palatable on real time systems. True, but it can't be easily done in the compiler. Disabling all interrupts is a common enough approach, and while it may not ideal for every application, I think that having this function attribute would be a useful tool available to the programmer. NMI is a feature not a limitation - it supposed to stop such software hanging up the CPU. BTW gcc provides atomic operations on other ports. What kinds of different mechanisms do they provide? Would any of them be useful for the AVR port? ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
[EMAIL PROTECTED] wrote: Eric, The problem with the function is typical housekeeping overhead gets wrapped up. To take a silly example: Say you want to send some message to ISR (via memory) using critical function critical void function(int *buffer, int *cValue) { *cValue++ = *buffer++; } Getting *buffer++ does not need ISR disabled. Perhaps neither does the cValue++; It is far more important to be correct than to be fast - thus it is better to err on the side of disabling interrupts for too long. Of course, in time-critical code disabling the interrupts for too long could also make the code fail. The point of the critical attribute is that it makes it very easy to write correct code. If you want code to be as fast as possible, your example should be written thus: static inline critical void functionHelper(int v, int *cValue) { *cValue = v; } void function(int *buffer, int *cValue) { functionHelper(*buffer++, cValue++); } With small static inline functions, you have exactly the granularity you want. And of course, you are still free to write out the cli()/sreg code if you find that makes sense for critical code. There are also ways to get atomic access which don't block interrupts. (I expect you are already familiar with several methods, but this could be of interest to others on this list too.) For example: #define volatileAccess(v) *((volatile typeof((v)) *) (v)) #define nonblock_atomic_read(v) ({ \ typeof(v) a, b; \ a = volatileAccess(v); \ b = volatileAccess(v); \ while (a != b) { \ a = b; \ b = volatileAccess(v); \ }; \ b; \ }) It's also possible to use a 1-byte flag for access control, since it can always be written and read atomically. But these sorts of solutions require more care and thought - a critical attribute is makes it easier for programmers to write working code. So this method, although not wrong, is something I would normally reject for time critical systems, where latency is key. Inline coding has less overhead but even then its not perfect - though perhaps more palatable than assembler. I do not know much about the gcc atomic operations. But it might be worth reviewing. There may be some hooks that make it work well with optimisations without adding a bunch of code hacks. Also it makes code more portable (Atmel ARM or course!) I believe the atomic operations are target-specific, and use target-specific assembly. Many larger processors have atomic operations of some sort - disabling interrupts won't work if you have an SMP system. It would be *very* nice if gcc had a set of atomic access buildins that were consistent across a range of targets. It would also be *very* nice if function attributes were consistent across ports. It seems that every gcc port has its own function attribute for specifying interrupt functions, for example. Sometimes they have the same name, but different functions - the msp430 port has interrupt and signal with slightly different syntax from the avr port, and opposite effects. There are also attributes that only some ports have, that could be easily* and usefully provided in all ports. Examples are critical, naked, interrupt, near, far, and saveall. * At least, it's easy to think out what the generated assembly would look like - I don't know if that makes it easy to implement in practice! mvh., David ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
http://gcc.gnu.org/onlinedocs/gcc-4.1.0/gcc/Atomic-Builtins.html David Brown wrote: [EMAIL PROTECTED] wrote: Eric, The problem with the function is typical housekeeping overhead gets wrapped up. To take a silly example: Say you want to send some message to ISR (via memory) using critical function critical void function(int *buffer, int *cValue) { *cValue++ = *buffer++; } Getting *buffer++ does not need ISR disabled. Perhaps neither does the cValue++; It is far more important to be correct than to be fast - thus it is better to err on the side of disabling interrupts for too long. Of course, in time-critical code disabling the interrupts for too long could also make the code fail. The point of the critical attribute is that it makes it very easy to write correct code. If you want code to be as fast as possible, your example should be written thus: static inline critical void functionHelper(int v, int *cValue) { *cValue = v; } void function(int *buffer, int *cValue) { functionHelper(*buffer++, cValue++); } With small static inline functions, you have exactly the granularity you want. And of course, you are still free to write out the cli()/sreg code if you find that makes sense for critical code. There are also ways to get atomic access which don't block interrupts. (I expect you are already familiar with several methods, but this could be of interest to others on this list too.) For example: #define volatileAccess(v) *((volatile typeof((v)) *) (v)) #define nonblock_atomic_read(v) ({ \ typeof(v) a, b; \ a = volatileAccess(v); \ b = volatileAccess(v); \ while (a != b) { \ a = b; \ b = volatileAccess(v); \ }; \ b; \ }) It's also possible to use a 1-byte flag for access control, since it can always be written and read atomically. But these sorts of solutions require more care and thought - a critical attribute is makes it easier for programmers to write working code. So this method, although not wrong, is something I would normally reject for time critical systems, where latency is key. Inline coding has less overhead but even then its not perfect - though perhaps more palatable than assembler. I do not know much about the gcc atomic operations. But it might be worth reviewing. There may be some hooks that make it work well with optimisations without adding a bunch of code hacks. Also it makes code more portable (Atmel ARM or course!) I believe the atomic operations are target-specific, and use target-specific assembly. Many larger processors have atomic operations of some sort - disabling interrupts won't work if you have an SMP system. It would be *very* nice if gcc had a set of atomic access buildins that were consistent across a range of targets. It would also be *very* nice if function attributes were consistent across ports. It seems that every gcc port has its own function attribute for specifying interrupt functions, for example. Sometimes they have the same name, but different functions - the msp430 port has interrupt and signal with slightly different syntax from the avr port, and opposite effects. There are also attributes that only some ports have, that could be easily* and usefully provided in all ports. Examples are critical, naked, interrupt, near, far, and saveall. * At least, it's easy to think out what the generated assembly would look like - I don't know if that makes it easy to implement in practice! mvh., David ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
BTW I believe the common attribute/keyword for this should be monitor as in monitor void dostuff(void) { putchar ( ’Z’ ); } This will give a degree of compatibility with other compilers. David Brown wrote: [EMAIL PROTECTED] wrote: Eric, The problem with the function is typical housekeeping overhead gets wrapped up. To take a silly example: Say you want to send some message to ISR (via memory) using critical function critical void function(int *buffer, int *cValue) { *cValue++ = *buffer++; } Getting *buffer++ does not need ISR disabled. Perhaps neither does the cValue++; It is far more important to be correct than to be fast - thus it is better to err on the side of disabling interrupts for too long. Of course, in time-critical code disabling the interrupts for too long could also make the code fail. The point of the critical attribute is that it makes it very easy to write correct code. If you want code to be as fast as possible, your example should be written thus: static inline critical void functionHelper(int v, int *cValue) { *cValue = v; } void function(int *buffer, int *cValue) { functionHelper(*buffer++, cValue++); } With small static inline functions, you have exactly the granularity you want. And of course, you are still free to write out the cli()/sreg code if you find that makes sense for critical code. There are also ways to get atomic access which don't block interrupts. (I expect you are already familiar with several methods, but this could be of interest to others on this list too.) For example: #define volatileAccess(v) *((volatile typeof((v)) *) (v)) #define nonblock_atomic_read(v) ({ \ typeof(v) a, b; \ a = volatileAccess(v); \ b = volatileAccess(v); \ while (a != b) { \ a = b; \ b = volatileAccess(v); \ }; \ b; \ }) It's also possible to use a 1-byte flag for access control, since it can always be written and read atomically. But these sorts of solutions require more care and thought - a critical attribute is makes it easier for programmers to write working code. So this method, although not wrong, is something I would normally reject for time critical systems, where latency is key. Inline coding has less overhead but even then its not perfect - though perhaps more palatable than assembler. I do not know much about the gcc atomic operations. But it might be worth reviewing. There may be some hooks that make it work well with optimisations without adding a bunch of code hacks. Also it makes code more portable (Atmel ARM or course!) I believe the atomic operations are target-specific, and use target-specific assembly. Many larger processors have atomic operations of some sort - disabling interrupts won't work if you have an SMP system. It would be *very* nice if gcc had a set of atomic access buildins that were consistent across a range of targets. It would also be *very* nice if function attributes were consistent across ports. It seems that every gcc port has its own function attribute for specifying interrupt functions, for example. Sometimes they have the same name, but different functions - the msp430 port has interrupt and signal with slightly different syntax from the avr port, and opposite effects. There are also attributes that only some ports have, that could be easily* and usefully provided in all ports. Examples are critical, naked, interrupt, near, far, and saveall. * At least, it's easy to think out what the generated assembly would look like - I don't know if that makes it easy to implement in practice! mvh., David ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
Don't worry about being shot down. Nobody hear can shoot straight, unless it's their foot. (myself included) This is very similar to what we do. It can also be targeted to specific interrupts (or tasks). We have to do it like this - or latency will loose us data. Like your it CAPITALISED with red flashing lights for review! Andy Scott and Roxanne Munns wrote: Hi everyone, A lurker pops his head out of his hole for a comment - please don't shoot me! :) Normally, critical sections are supposed to be used to protect *data*, not *code*. The atomic operations listed in the URL are doing very specific atomic operations on data. In other words, they also follow the theme of protecting data, not code. It seems to me that applying the critical attribute to generic functions can lead a programmer to start thinking in terms of protecting code instead of data. Eventually that can lead to a disastrous result, especially if you ever work with multi-threaded code. All of the examples I have seen in the thread so far were used for very simple data-modifying operations that are suitable for inlining. I think that is acceptable use of critical. However, not everyone will follow that principle - they may treat critical as a cure-all. In the past, I have solved the critical section problem by using some #defines. Unfortunately, I had to use THAT_OTHER_COMPILER(tm). I'd give you the exact definitions, but I'm at home and don't have access to my work code. DECLARE_CRITICAL_SECTION - at the top of each function with a critical section - creates a variable to store SREG BEGIN_CRITICAL_SECTION - records the value of SREG and clears global interrupt flag END_CRITICAL_SECTION - restores SREG Technically it's no better than doing the code inline, but it is more self-documenting. Hope this generates some more ideas/discussion, Scott -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Andy H Sent: Thursday, May 22, 2008 5:36 PM To: David Brown Cc: avr-gcc-list@nongnu.org Subject: Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os http://gcc.gnu.org/onlinedocs/gcc-4.1.0/gcc/Atomic-Builtins.html David Brown wrote: [EMAIL PROTECTED] wrote: Eric, The problem with the function is typical housekeeping overhead gets wrapped up. To take a silly example: Say you want to send some message to ISR (via memory) using critical function critical void function(int *buffer, int *cValue) { *cValue++ = *buffer++; } Getting *buffer++ does not need ISR disabled. Perhaps neither does the cValue++; It is far more important to be correct than to be fast - thus it is better to err on the side of disabling interrupts for too long. Of course, in time-critical code disabling the interrupts for too long could also make the code fail. The point of the critical attribute is that it makes it very easy to write correct code. If you want code to be as fast as possible, your example should be written thus: static inline critical void functionHelper(int v, int *cValue) { *cValue = v; } void function(int *buffer, int *cValue) { functionHelper(*buffer++, cValue++); } With small static inline functions, you have exactly the granularity you want. And of course, you are still free to write out the cli()/sreg code if you find that makes sense for critical code. There are also ways to get atomic access which don't block interrupts. (I expect you are already familiar with several methods, but this could be of interest to others on this list too.) For example: #define volatileAccess(v) *((volatile typeof((v)) *) (v)) #define nonblock_atomic_read(v) ({ \ typeof(v) a, b; \ a = volatileAccess(v); \ b = volatileAccess(v); \ while (a != b) { \ a = b; \ b = volatileAccess(v); \ }; \ b; \ }) It's also possible to use a 1-byte flag for access control, since it can always be written and read atomically. But these sorts of solutions require more care and thought - a critical attribute is makes it easier for programmers to write working code. So this method, although not wrong, is something I would normally reject for time critical systems, where latency is key. Inline coding has less overhead but even then its not perfect - though perhaps more palatable than assembler. I do not know much about the gcc atomic operations. But it might be worth reviewing. There may be some hooks that make it work well with optimisations without adding a bunch of code hacks. Also it makes code more portable (Atmel ARM or course!) I believe the atomic operations are target-specific, and use target-specific assembly. Many larger processors have atomic operations of some sort - disabling interrupts won't work if you have an SMP system. It would be *very* nice if gcc had a set of atomic access buildins that were consistent across a range of targets. It would also be *very* nice if function attributes
RE: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
David Brown wrote: The point is, the compiler is allowed to do this sort of optimisation. It can be a bit annoying during testing and debugging - especially because such reorderings are relatively rare in practice, so that it's easy to think it worked before, what's wrong now?. I wouldn't have thought the compiler was allowed to re-order statements *around* a volatile access. Perhaps someone can help my understanding, given :- 1: Volatile Statementssp 2: Statementssp 3: Volatile Statementssp where sp is a sequence point. Line 3 has a volatile access and therefore has side-effects that the compiler doesn't know about, so surely it must complete all preceeding statements up to and including the sequence point before the volatile access occurs? Jon -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of David Brown Sent: Friday, 16 May 2008 5:03 PM To: avr-gcc-list@nongnu.org Subject: Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os Thomas D. Dean wrote: dtostrf also does not impact the value written to PORTA, so, why does it not reorder a similar series of statements? There are several factors that can affect such reordering. In particular, the compiler must know that it is safe to reorder the statements. If the compiler knows that the function is const (through __attribute__((const)), or if it has the function definition on hand), then it can safely be reordered. pure functions (which can also read global memory) can be reordered to a lesser extent. Some maths functions will be declared const - others will not, perhaps because they set errno, or because they are not re-entrant and use a fixed buffer. Things like compiler versions and optimisation options can obviously make a difference. Sometimes it can be just a matter of luck, as far as the user is concerned - small differences in the sizes of functions, the usage of local variables, or other minor details can change the result of the optimiser output. The point is, the compiler is allowed to do this sort of optimisation. It can be a bit annoying during testing and debugging - especially because such reorderings are relatively rare in practice, so that it's easy to think it worked before, what's wrong now?. A liberal sprinkling of volatile is the usual way to constrain the code in the way you want here. ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
Hi, Loveny Design schrieb: I wouldn't have thought the compiler was allowed to re-order statements *around* a volatile access. Perhaps someone can help my understanding, given :- 1: Volatile Statementssp 2: Statementssp 3: Volatile Statementssp where sp is a sequence point. Line 3 has a volatile access and therefore has side-effects that the compiler doesn't know about, so surely it must complete all preceeding statements up to and including the sequence point before the volatile access occurs? the problem here is that atan() is declared as const, because it is matematicaly only a table look up. If you make two or more calls to atan() with the same parameter, it could be reduced to one. That is essential in loops: for(uint8_t i = 0; i 32; i++) { PORTB = atan(foobar) + i * 2; } could be reordered to: tmp = atan(foobar); for(uint8_t i = 0; i 32; i++) { PORTB = tmp + i * 2; } Alex ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
Loveny Design wrote: David Brown wrote: The point is, the compiler is allowed to do this sort of optimisation. It can be a bit annoying during testing and debugging - especially because such reorderings are relatively rare in practice, so that it's easy to think it worked before, what's wrong now?. I wouldn't have thought the compiler was allowed to re-order statements *around* a volatile access. Perhaps someone can help my understanding, given :- 1: Volatile Statementssp 2: Statementssp 3: Volatile Statementssp where sp is a sequence point. Line 3 has a volatile access and therefore has side-effects that the compiler doesn't know about, so surely it must complete all preceeding statements up to and including the sequence point before the volatile access occurs? No, because statement 2 doesn't have side effects, so it can not affect the outcome of statement 3. Note! This is not even the worst case for this optimization. The worst case I've seen is something like: int my_flag; cli(); my_flag = 0x1234; sei(); Contrary to what you might expect, that my_flag access isn't being protected by the cli/sei pair, and might be reordered by the compiler (and might here means it has happened, not some theoretical scenario). Currently, the official way of doing something like that is using the macros provided in util/atomic.h. -- Paulo Marques Software Development Department - Grupo PIE, S.A. Phone: +351 252 290600, Fax: +351 252 290601 Web: www.grupopie.com As far as we know, our computer has never had an undetected error. Weisert ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
On Friday 16 May 2008 08:05:41 am Paulo Marques wrote: Note! This is not even the worst case for this optimization. The worst case I've seen is something like: int my_flag; cli(); my_flag = 0x1234; sei(); Contrary to what you might expect, that my_flag access isn't being protected by the cli/sei pair, and might be reordered by the compiler (and might here means it has happened, not some theoretical scenario). Currently, the official way of doing something like that is using the macros provided in util/atomic.h. Wow. I think the cli()/sei() construct is used fairly universally, and it's also in the avr-libc docs. What gives ATOMIC_BLOCK() it's special immunity? The memory clobber? -mark ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
RE: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] org] On Behalf Of Mark Litwack Sent: Friday, May 16, 2008 9:05 AM To: avr-gcc-list@nongnu.org Subject: Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os On Friday 16 May 2008 08:05:41 am Paulo Marques wrote: Note! This is not even the worst case for this optimization. The worst case I've seen is something like: int my_flag; cli(); my_flag = 0x1234; sei(); Contrary to what you might expect, that my_flag access isn't being protected by the cli/sei pair, and might be reordered by the compiler (and might here means it has happened, not some theoretical scenario). Currently, the official way of doing something like that is using the macros provided in util/atomic.h. Wow. I think the cli()/sei() construct is used fairly universally, and it's also in the avr-libc docs. Acutally it's not. What is more appropriate is this: uint8_t old_sreg = SREG; cli(); // code SREG = old_sreg; You don't want to blindly set the interrupts after they have been cleared. You want to *restore the interrupts to its previous state*. This is why the status register (SREG) is saved and restored as it contains the 'I' flag, which is the global interrupts flag. Restoring SREG restores the previous state of the globabl interrupts flag. Really the only time that one should see cli()...sei() is in the initialization of the chip, and generally there is a lot more code than one line between the two. Eric ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
Mark Litwack wrote: On Friday 16 May 2008 08:05:41 am Paulo Marques wrote: [...] Contrary to what you might expect, that my_flag access isn't being protected by the cli/sei pair, and might be reordered by the compiler (and might here means it has happened, not some theoretical scenario). Currently, the official way of doing something like that is using the macros provided in util/atomic.h. Wow. I think the cli()/sei() construct is used fairly universally, and it's also in the avr-libc docs. What gives ATOMIC_BLOCK() it's special immunity? The memory clobber? Yes. -- Paulo Marques Software Development Department - Grupo PIE, S.A. Phone: +351 252 290600, Fax: +351 252 290601 Web: www.grupopie.com To know recursion, you must first know recursion. ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
A few random facts: - accesses to volatiles without intervening sequence points may be reordered - accesses to volatiles with intervening sequence points may not be reordered - accesses to volatiles may be reordered with respect to non-side effecting operations (e.g. atan()) regardless of sequence points John Regehr ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
In my case, I want to allow interrupts, just want to keep the statement ordering in the code segment. The PORTA bits are used for hardware control. I want to use the atan2(), etc. calls as pulse stretching. Making the resultant variable volatile works. tomdean ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
On Friday 16 May 2008 11:32:48 am Weddington, Eric wrote: Wow. I think the cli()/sei() construct is used fairly universally, and it's also in the avr-libc docs. Acutally it's not. What is more appropriate is this: uint8_t old_sreg = SREG; cli(); // code SREG = old_sreg; You don't want to blindly set the interrupts after they have been cleared. You want to *restore the interrupts to its previous state*. This is why the status register (SREG) is saved and restored as it contains the 'I' flag, which is the global interrupts flag. Restoring SREG restores the previous state of the globabl interrupts flag. Really the only time that one should see cli()...sei() is in the initialization of the chip, and generally there is a lot more code than one line between the two. Eric Actually, I had considered your example to be included when I said cli()/sei(). Clearly you would want to save SREG in almost every situation. Are you saying your construct above protects // code in all cases? Or is // code still subject to being re-ordered to outside of the protection if there are no dependencies? If ATOMIC_BLOCK() is *the* way to do it, perhaps the docs could be improved a little by emphasizing the use of ATOMIC_BLOCK() as a necessity rather than as a convenience. -mark ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
Weddington, Eric wrote: [...] If ATOMIC_BLOCK() is *the* way to do it, perhaps the docs could be improved a little by emphasizing the use of ATOMIC_BLOCK() as a necessity rather than as a convenience. I'll go a step further: Is there any reason to NOT include the memory barrier in the interrupt macros? Would that even work? Like this, # define sei() __asm__ __volatile__ (sei ::: memory) # define cli() __asm__ __volatile__ (cli ::: memory) This has been discussed before (on the avr-libc list, December 2005, just search for 'cli and sei should clobber memory'), and this exact solution was actually proposed by me. The downside is that a memory clobber like that drops all optimizations of global variables that the compiler might be temporarily storing in registers. So, it actually hurts performance for code unrelated to the critical section. Another thing we could do to help the common cases of reading / writing multi-byte vars (or even single byte non-volatile vars) was to write accessors like (totaly untested): static inline uint16_t atomic_read16(uint16_t *var) { uint16_t ret; uint8_t old_sreg = SREG; cli(); ret = *((volatile uint16_t *)var); SREG = old_sreg; } static inline void atomic_write16(uint16_t *var, uint16_t value) { uint8_t old_sreg = SREG; cli(); *((volatile uint16_t *)var) = value; SREG = old_sreg; } (repeat for other sizes) I'd have to check this code better, because I always end up placing the volatile in the wrong place and casting the pointer to volatile instead of the variable ;) Anyway, the advantage of something like this over ATOMIC_BLOCK() is that it doesn't need any memory clobbers, since all operations are already volatile. So, this should produce slightly better code. -- Paulo Marques Software Development Department - Grupo PIE, S.A. Phone: +351 252 290600, Fax: +351 252 290601 Web: www.grupopie.com All I ask is a chance to prove that money can't make me happy. ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
RE: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
The PORTA bits are used for hardware control. I want to use the atan2(), etc. calls as pulse stretching. Then I recommend using the calls in util/delay.h to get exact delays instead of monkeying around with floating point routine calls. They'll be a lot more exact as well as being a boatlad smaller! Perhaps the atan2() call are accomplishing useful work... John ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
RE: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
-Original Message- From: Paulo Marques [mailto:[EMAIL PROTECTED] Sent: Friday, May 16, 2008 12:07 PM To: Weddington, Eric Cc: Mark Litwack; avr-gcc-list@nongnu.org; Joerg Wunsch Subject: Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os Weddington, Eric wrote: [...] If ATOMIC_BLOCK() is *the* way to do it, perhaps the docs could be improved a little by emphasizing the use of ATOMIC_BLOCK() as a necessity rather than as a convenience. I'll go a step further: Is there any reason to NOT include the memory barrier in the interrupt macros? Would that even work? Like this, # define sei() __asm__ __volatile__ (sei ::: memory) # define cli() __asm__ __volatile__ (cli ::: memory) This has been discussed before (on the avr-libc list, December 2005, just search for 'cli and sei should clobber memory'), and this exact solution was actually proposed by me. Oh! Thanks for the reference! :-) That was a while ago... The downside is that a memory clobber like that drops all optimizations of global variables that the compiler might be temporarily storing in registers. So, it actually hurts performance for code unrelated to the critical section. Another thing we could do to help the common cases of reading / writing multi-byte vars (or even single byte non-volatile vars) was to write accessors like (totaly untested): static inline uint16_t atomic_read16(uint16_t *var) { uint16_t ret; uint8_t old_sreg = SREG; cli(); ret = *((volatile uint16_t *)var); SREG = old_sreg; } static inline void atomic_write16(uint16_t *var, uint16_t value) { uint8_t old_sreg = SREG; cli(); *((volatile uint16_t *)var) = value; SREG = old_sreg; } (repeat for other sizes) I'd have to check this code better, because I always end up placing the volatile in the wrong place and casting the pointer to volatile instead of the variable ;) Anyway, the advantage of something like this over ATOMIC_BLOCK() is that it doesn't need any memory clobbers, since all operations are already volatile. So, this should produce slightly better code. Interesting... ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
John Regehr wrote: # define sei() __asm__ __volatile__ (sei ::: memory) # define cli() __asm__ __volatile__ (cli ::: memory) snip Adding these clobbers is an extremely good idea because they make cli/sei act like proper locks, in two senses. First, they prevent reordering of computations as has been discussed here. Second, they force data cached in registers to be written to RAM when the interrupt lock is released. Both of these are important. Here's a subtle but important point: If you consistently protect shared data with clobbering locks, you don't need to make data volatile when it is shared between interrupts and main(). Of course you still need volatile for accessing hardware registers. Well, isn't the net effect of volatile simply a more fine-grained clobbering lock? snip The downside is that a memory clobber like that drops all optimizations of global variables that the compiler might be temporarily storing in registers. So, it actually hurts performance for code unrelated to the critical section. We studied the impact of adding clobbers to critical sections in TinyOS and found that both code size and CPU usage actually decreased slightly due to the clobbers. No idea why. Details here: http://www.cs.utah.edu/~regehr/papers/plos06b.pdf TinyOS currently clobbers memory on every cli/sei. That's a counter-intuitive result. The No idea why. part makes me a little squinty-eyed. It certainly *could* be a generalizable result, but then again it might be an artifact of your code structure. I could see including both clobbering and non-clobbering flavors of sei()/cli() in avrlibc -- then we can have a whole new and entertaining argument about whether the old names sei()/cli() should have the new or old behavior :) :) -dave John Regehr ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
RE: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
The PORTA bits are used for hardware control. I want to use the atan2(), etc. calls as pulse stretching. Then I recommend using the calls in util/delay.h to get exact delays instead of monkeying around with floating point routine calls. They'll be a lot more exact as well as being a boatlad smaller! Perhaps the atan2() call are accomplishing useful work... Perhaps so. However, if all they are doing is pulse stretching as was indicated, a flat delay is better. That said, I thought the OP was originally using the PORTA bit flipping as a simple way to time how long each given function takes, perhaps as a calibration or benchmarking exercise. I found the fact that sin, cos, and so on worked, but atan2 didn't, the most interesting part of the discussion. And I think the answer to that has been adequately explained (look-up tables not affecting volatility, etc.). Then we branched off into whether clobber memory is the best way to handle sei() and cli(). Again, a fun discussion. Enough explanation of my comments. Back to the argument, which is already in progress... ;-) Stu ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
Well, isn't the net effect of volatile simply a more fine-grained clobbering lock? Almost but not quite: - volatile says nothing about the atomicity of any given access - volatile does not suppress reordering (except with other volatiles) - volatile has no effect on caches and out-of-order memory subsystems (not an issue for AVR obviously) Basically volatile has the effect of sucking in programmers and then leaving them high and dry as system complexity increases. Also volatile is usually too fine-grained, ensuring consistency-always instead of what you want (consistency on lock release) and this can easily lead to inefficiencies. That's a counter-intuitive result. The No idea why. part makes me a little squinty-eyed. It certainly *could* be a generalizable result, but then again it might be an artifact of your code structure. Yep. My off-the-wall guess was that the clobbers reduced register pressure. I could not think of an easy way to test that hypothesis. Finally I'll just add a random plug for a piece of work that a colleague and I recently completed where we found that most compilers have problems implementing the volatile qualifier: http://www.cs.utah.edu/~regehr/papers/emsoft08_submit.pdf This is indeed bad news since (as we have seen so many times on this list) programmers have trouble with volatile too. Section 2 of our paper gives a fairly concise introduction to the semantics of volatile. John ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
RE: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] org] On Behalf Of John Regehr Sent: Friday, May 16, 2008 2:17 PM To: avr-gcc-list@nongnu.org Subject: Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os Finally I'll just add a random plug for a piece of work that a colleague and I recently completed where we found that most compilers have problems implementing the volatile qualifier: http://www.cs.utah.edu/~regehr/papers/emsoft08_submit.pdf This is indeed bad news since (as we have seen so many times on this list) programmers have trouble with volatile too. Section 2 of our paper gives a fairly concise introduction to the semantics of volatile. FYI for everyone: John was kind enough to let me see his paper above early and I thought that it's a very interesting read. It's nice that he's included AVR GCC as part of his analysis. Eric ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
John Regehr wrote: Well, isn't the net effect of volatile simply a more fine-grained clobbering lock? Almost but not quite: - volatile says nothing about the atomicity of any given access - volatile does not suppress reordering (except with other volatiles) - volatile has no effect on caches and out-of-order memory subsystems (not an issue for AVR obviously) Well, true, but lack of cache coherency is a hardware bug. But I agree it throws in some interesting ordering issues w.r.t the volatile keyword. In any case if the CPU writes it's local data cache, that should invalidate all other copies. (Well, all other copies in data caches. Instruction caches often require explicit invalidates.) (I spent quite a few years as a CPU logic designer, starting about 1980. Pretty much every machine I worked on was out-of-order to some extent, and had caches of some flavor.) Also volatile is usually too fine-grained, ensuring consistency-always instead of what you want (consistency on lock release) and this can easily lead to inefficiencies. True in the case of trying to make volatile into a critical section. Not true for its original use as a way to talk to PDP-11 memory mapped I/O. That's a counter-intuitive result. The No idea why. part makes me a little squinty-eyed. It certainly *could* be a generalizable result, but then again it might be an artifact of your code structure. Yep. My off-the-wall guess was that the clobbers reduced register pressure. I could not think of an easy way to test that hypothesis. Forcing early spills helps? Does that say that the optimizer should be more aggressive about spilling? I should probably get quiet about now since I'm wandering outside my expertise talking about register allocators. Finally I'll just add a random plug for a piece of work that a colleague and I recently completed where we found that most compilers have problems implementing the volatile qualifier: http://www.cs.utah.edu/~regehr/papers/emsoft08_submit.pdf Interesting. I read over section 2. Will have to go back and read the whole thing. -dave OT shaggy dog story about caches: In the 1980's I remember reading the weekly highlights from another CPU project in our same design center. During checkout they had discovered a performance bug in the cache invalidate equation which required a single OR-gate to fix, and boosted the performance of an important database benchmark by 9% on 4-CPU systems. One senior logic designer on our project, who was a bit of a wag, said: 9% from just one OR-gate! We need to get some of those OR-gates for *our* project! ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
Mark Litwack wrote: On Friday 16 May 2008 04:00:17 pm Dave N6NZ wrote: [...] I could see including both clobbering and non-clobbering flavors of sei()/cli() in avrlibc -- then we can have a whole new and entertaining argument about whether the old names sei()/cli() should have the new or old behavior :) :) We already get clobbering versions of sei()/cli() locks inside ATOMIC_BLOCK() and friends. Won't that do it instead of new versions of sei()/cli() (or whatever name)? Yes, that makes perfect sense. Maybe we should capture the essence of this e-mail thread and include it in the avr-libc documentation so that the difference between the two is clear. After all, every once in a while somebody *does* read the documentation :) -dave -mark ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
I have to do the atan2(). It takes a little longer than I would like the pulse, but, the longer pulse does not hurt anything. So, including the atan2() in the pulse duration does not waste as much time as delayxxx() and then do the atan2(). tomdean ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
The result of the atan2 does not impact the bit value written to PORTA, so the compiler is free to re-order it. That's what optimizers do. -dave Thomas D. Dean wrote: I have a code segment which 1. sets a bit in PORTA. 2. calls atan2. 3. clears the same bit in PORTA. The compiler produces code that 1. sets the bit in PORTA. 2. clears the same bit in PORT. 3. calls atan2. With -O0, the code is correct. # uname -a FreeBSD dv6000.tddhome 7.0-STABLE FreeBSD 7.0-STABLE #0: Sun May 4 07:58:25 PDT 2008 [EMAIL PROTECTED]:/usr/src/sys/GENERIC i386 # avr-gcc --version avr-gcc (GCC) 4.2.2 Copyright (C) 2007 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. The code fragment, ... dtostrf(cos_rad,6,3,line4[14]); line3[20] = ' '; line4[20] = ' '; TRACE_ON(TRACE4); // portb |= _bv(04) atan_rad = atan2(cos_rad,sin_rad); TRACE_OFF(TRACE4); // portb = ~_bv(04) dtostrf(atan_rad,6,3,line4[26]); ... # gmake ATMEGA16=1 TRACE=1 8_BIT=1 clean all rm -f *.o *.elf *.s19 *.b *.a *.map *.hex avr-gcc -mmcu=atmega16 -Wall -Wmissing-prototypes -Os -DTRACE -DLCD_8_BIT -I.-c -o main.o main.c avr-gcc -mmcu=atmega16 -Wall -Wmissing-prototypes -Os -DTRACE -DLCD_8_BIT -I.-c -o timer.o ../common/timer.c avr-gcc -mmcu=atmega16 -Wall -Wmissing-prototypes -Os -DTRACE -DLCD_8_BIT -I.-c -o lcd-8-bit.o ../common/lcd-8-bit.c avr-gcc -o main.elf main.o timer.o lcd-8-bit.o -mmcu=atmega16 -L../../../asus-avr-1.0/config/stk500 -L../../../asus-avr-1.0/lib -L/usr/local/avr/lib/ -lasus -lc -lm -lprintf_flt -lscanf_flt avr-objcopy -j .text -j .data -O ihex main.elf main.hex # avr-objdump -D main.elf | grep 'call.*atan2' -B12 -A10 398: 0e 94 71 03 call0x6e2 ; 0x6e2 dtostrf 39c: 80 e2 ldi r24, 0x20 ; 32 39e: f1 01 movwr30, r2 3a0: 84 8b std Z+20, r24 ; 0x14 3a2: f2 01 movwr30, r4 3a4: 84 8b std Z+20, r24 ; 0x14 3a6: dc 9a sbi 0x1b, 4 ; 27 3a8: dc 98 cbi 0x1b, 4 ; 27 3aa: a5 01 movwr20, r10 3ac: 94 01 movwr18, r8 3ae: c7 01 movwr24, r14 3b0: b6 01 movwr22, r12 3b2: 0e 94 0f 06 call0xc1e ; 0xc1e atan2 3b6: cf 54 subir28, 0x4F ; 79 3b8: df 4f sbcir29, 0xFF ; 255 3ba: 08 81 ld r16, Y 3bc: 19 81 ldd r17, Y+1; 0x01 3be: c1 5b subir28, 0xB1 ; 177 3c0: d0 40 sbcir29, 0x00 ; 0 3c2: 23 e0 ldi r18, 0x03 ; 3 3c4: 46 e0 ldi r20, 0x06 ; 6 3c6: 0e 94 71 03 call0x6e2 ; 0x6e2 dtostrf 3ca: b1 01 movwr22, r2 ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
On Thursday 15 May 2008, Thomas D. Dean wrote: I changed the code to asm volatile(sbi 0x14, 4::); atn_rad = atan2(cos_rad,sin_rad); asm volatile(cbi 0x14, 4::); and the compiler still reordered the statements to put both the sbi and cbi statements before the atan(). Changing atan2() to asin() or acos() gives the same incorrect result. Changing atan2() to sin() keeps the statements in the correct order. tomdean If you declare atn_rad as a volatile, won't that make the order work out correctly? As someone already mentioned, the port accesses are already volatile, so you should be able to use the C statments for port pin changing, and with a volatile atn_rad, the compiler should keep everything in the source order. Or else I misunderstand how volatile works. Blake ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
Hi, if i read: The code fragment, ... dtostrf(cos_rad,6,3,line4[14]); line3[20] = ' '; line4[20] = ' '; TRACE_ON(TRACE4); // portb |= _bv(04) atan_rad = atan2(cos_rad,sin_rad); TRACE_OFF(TRACE4); // portb = ~_bv(04) dtostrf(atan_rad,6,3,line4[26]); portb it looks like you don´t use the original avr includes, otherwise it should be PORTB. Maybe you don´t declare portb as volatile? Why someone does not use the standard include? Alex ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
Thomas D. Dean wrote: I have a code segment which 1. sets a bit in PORTA. 2. calls atan2. 3. clears the same bit in PORTA. The compiler produces code that 1. sets the bit in PORTA. 2. clears the same bit in PORT. 3. calls atan2. With -O0, the code is correct. # uname -a FreeBSD dv6000.tddhome 7.0-STABLE FreeBSD 7.0-STABLE #0: Sun May 4 07:58:25 PDT 2008 [EMAIL PROTECTED]:/usr/src/sys/GENERIC i386 # avr-gcc --version avr-gcc (GCC) 4.2.2 Copyright (C) 2007 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. The code fragment, ... dtostrf(cos_rad,6,3,line4[14]); line3[20] = ' '; line4[20] = ' '; TRACE_ON(TRACE4); // portb |= _bv(04) atan_rad = atan2(cos_rad,sin_rad); TRACE_OFF(TRACE4); // portb = ~_bv(04) dtostrf(atan_rad,6,3,line4[26]); ... As others have said, the compiler can freely move the atan2 call around. The easiest way to avoid that here is to declare atan_rad as volatile (to stop it moving past the TRACE_OFF), and either cos_rad or sin_rad as volatile (to stop it moving before the TRACE_ON). You can also try using a memory barrier (asm volatile( : : : memory) ) as a blocker, if the *_rad variables are in memory rather than being local variables. ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
I changed the variables to volatile and that fixed things. The compiler does not seem to move things around the sin/cos/tan/asin/acos/atan functions, only the atan2. I put the code segment below. I was looking into size/time for these functions. I saw what I expected around all the functions except atan2(). Is this because of two arguments??? BTW, I use standard includes and the code does use PORTA and _BV(), just the comment was in lower case. tomdean double rad; double sin_rad, cos_rad; double atan_rad; while (1) { TRACE_ON(TRACE0); start = timer_current(); rad = M_PI/4.0; TRACE_ON(TRACE1); sin_rad = sin(rad); // 214 usec TRACE_OFF(TRACE1); TRACE_ON(TRACE2); dtostrf(rad,6,3,line3[4]); // 85 usec TRACE_OFF(TRACE2); line3[10]=')'; dtostrf(sin_rad,6,3,line3[14]); TRACE_ON(TRACE3); cos_rad = atan(sin_rad);// 220 usec TRACE_OFF(TRACE3); dtostrf(rad,6,3,line4[4]); line4[10]=')'; dtostrf(cos_rad,6,3,line4[14]); line3[20] = ' '; line4[20] = ' '; TRACE_ON(TRACE4); // porta |= _bv(04) atan_rad = atan2(cos_rad,sin_rad); TRACE_OFF(TRACE4); // porta = ~_bv(04) dtostrf(atan_rad,6,3,line4[26]); lcd_print(1,line1[0]); // 1.6 msec lcd_print(2,line2[0]); // 1.6 msec lcd_print(3,line3[0]); // 1.6 msec lcd_print(4,line4[0]); // 1.6 msec TRACE_OFF(TRACE0); } ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list