RE: [avr-gcc-list] testsuite saga continues
> The logging version will always be slower. This is not just a matter of outputting the log, it is also a matter of building the log. We can avoid the output cost by only printing the last N lines, but we cannot avoid the build cost. The code to do this was there at some point, but I decided to remove it, because under Linux you can probably do the same by running
>
>     avrtest_log test_program | tail -n N
>
> and it should run almost as fast as a native solution.

The only information you print is register info, right? Since the parsing is so heavy, would it make sense to save the complete register file (up to SPH) and parse it afterwards? It is just 96 bytes. The only info missing would be the addresses of memory accesses, but that could be ignored, since it only matters for loads and stores. When the address is absolutely relevant, just re-run using the log version. Just thinking out loud; it is probably nasty to create and the gain is almost nothing...

> So, I can add a --tail option to the log version, but the naked version will never be able to print any log at all, so that it runs as fast as possible. Remember that the main purpose of avrtest is to run gcc's testsuite. While running the testsuite, having a log is useless, but speed is important.

Yes, you are right, you can't have it all.

> BTW, I've done some more optimizations and the version I have now is almost twice as fast as the one on CVS, doing 30 P4 clocks per AVR clock, i.e., on my P4 3GHz I can simulate a 100MHz avr :)

> I don't have those numbers right now, but since there are tests that don't even fit in 128KB of flash, there are probably some more that don't fit in 8KB.

Aha, well, it is going to be hard to test the 8KB parts then. Are these 128KB even in an optimized build? I can imagine them not fitting when using -O0.

>> Do you already have a format for doing this? XML based?
>
> Nope. I haven't even started to think about the details. I would give my full support to anyone trying to set up a benchmarking framework, though ;)

Hmm, nope, never done such a thing before. Let me first try to get gcc compiling on my Windows machine. It is a lot faster than my Linux machine (dual core 2.1 GHz vs Duron 1.3 GHz).

Wouter

___
AVR-GCC-list mailing list
AVR-GCC-list@nongnu.org
http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
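The trade-off discussed in this message (the log still has to be built for every instruction, but only the last N lines need to be printed) can be illustrated with a bounded buffer. This is only a sketch in Python, not avrtest's code; `TailLog` and its methods are hypothetical names invented for the example.

```python
from collections import deque


class TailLog:
    """Keep only the most recent n log lines (hypothetical helper)."""

    def __init__(self, n):
        # A deque with maxlen silently drops the oldest entry when full.
        self.lines = deque(maxlen=n)

    def record(self, line):
        # The caller still pays for formatting `line` on every instruction;
        # only the cost of printing everything is avoided.
        self.lines.append(line)

    def dump(self):
        # Return the retained lines, oldest first.
        return list(self.lines)


log = TailLog(3)
for pc in range(10):
    log.record(f"pc={pc:#06x}")
tail = log.dump()
```

This mirrors what piping through `tail -n N` does from the outside: the formatting work is unchanged, which is why the logging build stays slower than the plain one.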
Re: [avr-gcc-list] testsuite saga continues
Wouter van Gulik wrote:
>> Can you elaborate on that? What exactly are you calling a dump log?
>
> Just printing out the last N instructions executed before the abort, so it's easier to spot where the test fails, but without the penalty of running avrtest_log, which on my Windows machine is much, much slower. avrtest finishes almost immediately, while the _log edition takes several seconds. This is with the latest CVS version.

The logging version will always be slower. This is not just a matter of outputting the log, it is also a matter of building the log. We can avoid the output cost by only printing the last N lines, but we cannot avoid the build cost. The code to do this was there at some point, but I decided to remove it, because under Linux you can probably do the same by running

    avrtest_log test_program | tail -n N

and it should run almost as fast as a native solution.

So, I can add a --tail option to the log version, but the naked version will never be able to print any log at all, so that it runs as fast as possible. Remember that the main purpose of avrtest is to run gcc's testsuite. While running the testsuite, having a log is useless, but speed is important.

BTW, I've done some more optimizations and the version I have now is almost twice as fast as the one on CVS, doing 30 P4 clocks per AVR clock, i.e., on my P4 3GHz I can simulate a 100MHz avr :)

[...]

>> That probably won't work because (IIRC) smaller parts don't have the JMP and CALL instructions and only have RJMP / RCALL, so they can't address a large flash in a natural way.
>
> Aha, yes, I forgot about that... good point... right... ;) Do you have an idea how many tests compile to a binary bigger than 8KB? Most test cases look really small, although they may link in huge amounts of library code.

I don't have those numbers right now, but since there are tests that don't even fit in 128KB of flash, there are probably some more that don't fit in 8KB.

>> One thing we could do is set up our own benchmark suite (not the testsuite) that could be used to test size / speed regressions in all architectures. This could also trigger failures if gcc started using invalid instructions for a particular architecture. We could then have a nice matrix with compiler version / optimization setting / architecture, to show regressions on every architecture we support.
>
> Do you already have a format for doing this? XML based?

Nope. I haven't even started to think about the details. I would give my full support to anyone trying to set up a benchmarking framework, though ;)

-- 
Paulo Marques
Software Development Department - Grupo PIE, S.A.
Phone: +351 252 290600, Fax: +351 252 290601
Web: www.grupopie.com
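The compiler version / optimization setting / architecture matrix mentioned above could be represented along these lines. This is a rough sketch only: no format was ever agreed on in the thread, and every name and number here (`gcc-A`, `gcc-B`, the sizes) is a made-up placeholder.

```python
from itertools import product

# Placeholder axis values, chosen only for illustration.
compilers = ["gcc-A", "gcc-B"]
opt_levels = ["-O0", "-O2", "-Os"]
architectures = ["avr2", "avr5"]

# One cell per combination; a real run would store a measured code size
# or cycle count here instead of a constant.
matrix = {key: 1000 for key in product(compilers, opt_levels, architectures)}
matrix[("gcc-B", "-Os", "avr2")] = 1040  # pretend the newer compiler got worse


def regressions(matrix, old, new):
    """Return the (opt, arch) cells where `new` is worse than `old`."""
    return [
        (opt, arch)
        for opt, arch in product(opt_levels, architectures)
        if matrix[(new, opt, arch)] > matrix[(old, opt, arch)]
    ]


worse = regressions(matrix, "gcc-A", "gcc-B")
```

Keying the cells by a plain tuple keeps the storage format open; the same dictionary could be written out as CSV or XML later.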
Re: [avr-gcc-list] testsuite saga continues
Hi, all

Wouter van Gulik wrote:
>> The exit through the exit port already shows the address. It's only the exit through rjmp +0 that doesn't. I'll change that to make it consistent.
>
> On exit you print CPU_C while in log you print CPU_C * 2. That's the difference, and CPU_C is not so useful. Having the true address makes referencing back into a .lss file easier.

Just a little note to let you know that the above change and other small cleanups have been committed to CVS. From the changelog:

 - give more information at program exit
 - cleanup a lot of #ifdef's
 - change the timeout from cycles to instructions, because the simulator runs slightly faster this way
 - add a barrier for the stack at 0x60, that makes avrtest abort with stack overflow when crossed

The next step will definitely be ELF loading support. With ELF loading, I can decode symbols like __bss_end to know where the stack overflows exactly, or use __stack to know where the stack underflows. I can also do a more symbolic log, by decoding addresses to their symbol names.

-- 
Paulo Marques
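The factor of 2 discussed above comes from the AVR program counter counting 16-bit words, while listing files show byte addresses. A minimal sketch of the conversion follows; the function names are hypothetical and are not taken from avrtest.

```python
def word_to_byte_address(pc_words):
    """Convert a word-indexed program counter to a flash byte address."""
    return pc_words * 2


def format_exit(pc_words):
    # Print the byte address, which is what a disassembly listing shows.
    return f"exit at byte address {word_to_byte_address(pc_words):#06x}"
```

For example, a program counter of 0x1A3 words corresponds to byte address 0x346 in the listing.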
RE: [avr-gcc-list] testsuite saga continues
> Just a little note to let you know that the above change and other small cleanups have been committed to cvs. From the changelog:
>
>  - give more information at program exit
>  - cleanup a lot of #ifdef's
>  - change the timeout from cycles to instructions, because the simulator runs slightly faster this way
>  - add a barrier for the stack at 0x60, that makes avrtest abort with stack overflow when crossed

Yes, I have seen it, and used it. My clz no longer passes the test. It bails out on stack overflow. But if I comment out the long long parts, all is OK.

> The next step will definitely be ELF loading support. With ELF loading, I can decode symbols like __bss_end to know where the stack overflows exactly or use __stack to know where the stack underflows. I can also do a more symbolic log, by decoding addresses to their symbol names.

That would be very cool! Although a dump log at an abort might also be useful when debugging testcases.

Some more thoughts about the smaller AVRs. I did not intend to catch wrong instructions; I was aiming at finding bugs that do not apply to the mega series. If there are bugs in the less capable devices, they are very likely to be in the avr backend, which is easier to fix. Can't avrtest/gcc fake an avr2 device (e.g. at90s8535) with tons of flash and RAM? Just like you now fake huge amounts of external memory?

Thanks for the good work!

Wouter
Re: [avr-gcc-list] testsuite saga continues
Wouter van Gulik wrote:
[...]
>> The next step will definitely be ELF loading support. With ELF loading, I can decode symbols like __bss_end to know where the stack overflows exactly or use __stack to know where the stack underflows. I can also do a more symbolic log, by decoding addresses to their symbol names.
>
> That would be very cool! Although a dump log at an abort might also be useful when debugging testcases.

Can you elaborate on that? What exactly are you calling a dump log?

> Some more thoughts about the smaller AVRs. I did not intend to catch wrong instructions; I was aiming at finding bugs that do not apply to the mega series. If there are bugs in the less capable devices, they are very likely to be in the avr backend, which is easier to fix. Can't avrtest/gcc fake an avr2 device (e.g. at90s8535) with tons of flash and RAM? Just like you now fake huge amounts of external memory?

That probably won't work because (IIRC) smaller parts don't have the JMP and CALL instructions and only have RJMP / RCALL, so they can't address a large flash in a natural way.

One thing we could do is set up our own benchmark suite (not the testsuite) that could be used to test size / speed regressions in all architectures. This could also trigger failures if gcc started using invalid instructions for a particular architecture. We could then have a nice matrix with compiler version / optimization setting / architecture, to show regressions on every architecture we support.

> Thanks for the good work!

You're welcome :)

-- 
Paulo Marques
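To make the RJMP / RCALL limitation concrete: these instructions encode a 12-bit signed word displacement, and on parts with at most 4K words (8KB) of flash the program counter wraps, so the whole flash is reachable. With a larger "faked" flash that no longer holds. The sketch below illustrates this; `rjmp_reaches` is a hypothetical helper, not part of avrtest or gcc.

```python
RJMP_MIN, RJMP_MAX = -2048, 2047  # range of the 12-bit signed displacement


def rjmp_reaches(pc, target, flash_words):
    """True if an RJMP at word address `pc` can reach word address `target`."""
    k = target - (pc + 1)  # RJMP computes PC <- PC + k + 1
    if RJMP_MIN <= k <= RJMP_MAX:
        return True
    # With at most 4K words of flash the PC wraps around, so every
    # target is reachable; beyond that, far targets are out of range.
    return flash_words <= 4096


small_part = rjmp_reaches(0, 4000, 4096)    # 8KB device
faked_part = rjmp_reaches(0, 4000, 65536)   # same code on a huge flash
```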
Re: [avr-gcc-list] testsuite saga continues
Wouter van Gulik wrote:
> Paulo Marques wrote:
>> The program used more than 4k of stack? Yikes!
>
> Well, I think it's the 64-bit stack bug... if anything goes wrong with the stack you might end up having a huge stack. It's a bug in the program.

That makes sense.

>>> Can you make avrtest check on stack overflow?
>>
>> I can, especially if I start accepting command line arguments to define memory regions, so that I also know where the stack really ends. I'll post a new version as soon as I have this. In the meanwhile, you can work around that specific problem by switching the addresses of the exit and the abort ports, so that the abort port is hit first ;)
>
> Yes, I already thought about doing so. Could you then also print the real flash address of the exit, just like you do with the log?

The exit through the exit port already shows the address. It's only the exit through rjmp +0 that doesn't. I'll change that to make it consistent.

> And the total number of cycles passed.

The latest version in CVS already does that. Just use:

    cvs -z3 -d:pserver:[EMAIL PROTECTED]:/cvsroot/winavr co -P avrtest

to check it out, and then a simple cvs update will be enough to always keep the latest version at hand.

BTW, Andrew sent me a test case he tried in both avrora and avrtest and the total cycle count matched almost exactly: 7934 cycles for avrora, 7935 cycles for avrtest. I bet the one cycle difference is from the last OUT instruction: avrora simply breaks before executing it, while avrtest actually executes it.

So, the cycle counts seem to be all correct now.

-- 
Paulo Marques
Re: [avr-gcc-list] testsuite saga continues
Wouter van Gulik wrote:
>> The exit through the exit port already shows the address. It's only the exit through rjmp +0 that doesn't. I'll change that to make it consistent.
>
> On exit you print CPU_C while in log you print CPU_C * 2. That's the difference, and CPU_C is not so useful. Having the true address makes referencing back into a .lss file easier.

Ah! Ok, that's easy to fix. Also, one of the things that I still want to develop is reading the ELF file directly. At that point I can read the symbol information too, and give an even better log output, by converting addresses to symbols, not only in data accesses, but also in function calls, exit addresses, etc.

>>> And the total number of cycles passed.
>>
>> The latest version in CVS already does that. Just use:
>
> Oops, missed that change; I thought it was just a speed update. And now it also has more readable logs. Keep up the good work!

Thanks :)

[...]

>> BTW, Andrew sent me a test case he tried in both avrora and avrtest and the total cycle count matched almost exactly: 7934 cycles for avrora, 7935 cycles for avrtest. I bet the one cycle difference is from the last OUT instruction, that avrora simply breaks before executing it, while avrtest actually executes it.
>
> This makes me think: can't we make the BREAK opcode the exit code? That would make it absolutely impossible to exit while writing to (the wrong) memory. Now any write to the exit code memory location can end the program. The point is, BREAK is probably not supported on other architectures?

I thought about that too, but it seemed harder to write the exit function in C. I've been thinking about writing an avrtest.h include file that has all the wrappers for avrtest. This will include the cycle counting functions and abort/exit. At that point it should be easy enough to change the exit to a break opcode.

I don't think that we will have problems with different architectures. In the worst case, we can always emit the binary opcode for the break instruction and we can allow avrtest to execute it always, independently of the architecture.

> Have you tried the testsuite with a different architecture already?

Right now, there is no support in avrtest to specify the architecture and disallow opcodes based on that. On top of that, fixing the tests that fail on even the most powerful AVRs seems to be a higher priority than getting other tests to start failing because of resource limitations on the less powerful ones ;)

-- 
Paulo Marques
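Converting addresses to symbol names, as planned above once ELF loading exists, usually boils down to a search in a sorted symbol table. The sketch below shows only that lookup step, under the assumption that the symbols have already been read from the ELF file; the table contents and the `symbolize` helper are invented for illustration.

```python
from bisect import bisect_right

# Made-up symbol table: (start byte address, name), sorted by address.
symbols = [(0x0000, "__vectors"), (0x008C, "main"), (0x0120, "exit")]
starts = [addr for addr, _ in symbols]


def symbolize(addr):
    """Map an address to 'symbol' or 'symbol+offset' for log output."""
    i = bisect_right(starts, addr) - 1  # last symbol starting at or before addr
    if i < 0:
        return f"{addr:#06x}"  # below the first symbol: print the raw address
    base, name = symbols[i]
    offset = addr - base
    return name if offset == 0 else f"{name}+{offset:#x}"
```

A log line could then show `main+0x4` instead of a bare `0x0090`, which is easier to match against a listing.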
Re: [avr-gcc-list] testsuite saga continues
After updating avr to gcc head, I can no longer get the correct startup file for the ATmega128 (can't find file crtm128.o?). There have been changes to the architecture groupings; do I need to patch binutils or something to get back in sync?

atmega169 worked, but I forgot it only has 1K of RAM, so it failed the testsuite :-(

Or maybe there is another ATmega I can try that is unaffected by the new groupings?

Andy
Re: [avr-gcc-list] testsuite saga continues
Andrew Hutchinson wrote:
> One might argue that carry is the result of a compare with the largest integer value (255 for bytes). But these situations do not directly arise in C, or I assume any other supported language, so it is not considered. (Though the ability to propagate carry would indeed help create mode-independent arithmetic operations.)

Having carry as a condition code indeed seems not very useful. But the main benefit of teaching gcc about the carry is the propagation of the carry; that is my main concern. Is it not possible to create a special register for carry (not in cc0), just for doing arithmetic using carry? This would lead to expanding sub/add/shift/cmp(?) into simple byte patterns, giving gcc much more knowledge of what's going on. This is close to what Dave has suggested in the other thread.

I have too little knowledge of gcc's internals to oversee all the consequences; I guess there are very good reasons not to do this.

Wouter
RE: [avr-gcc-list] testsuite saga continues
> In the meantime, I tried not marking long long as unsupported, with similar results. Without no_long_long, 32 more tests fail, but 556 fewer tests are marked as unsupported. Which means that 64-bit long longs are in fact mostly supported. The question is: are long longs officially supported? Should we be running the tests that use them?
>
> BTW: 64-bit long long is really hard for an 8-bit microcontroller. At least one of the tests (with -O0 optimization) was initially failing from timeout, which means that it was taking more than 500 million cycles to execute. Increasing the timeout to 2 billion cycles solved the problem, though.

Well, today I found out why this could be. I am testing a new version of the clz fixes and I also implemented some DI versions (DI = double int = 64 bit in gcc's internal terms). To my surprise, some options did not change a thing in CPU cycles, while the program got much shorter... So I took another look at it, and guess what... The stack usage was too much, so it was now pushing its values into I/O memory, including the special exit code memory. The program now exited successfully on a push r15 :D

Can you make avrtest check on stack overflow?

Wouter
RE: [avr-gcc-list] testsuite saga continues
Quoting Wouter van Gulik [EMAIL PROTECTED]:
[...]
>> BTW: 64-bit long long is really hard for an 8-bit microcontroller. At least one of the tests (with -O0 optimization) was initially failing from timeout, which means that it was taking more than 500 million cycles to execute. Increasing the timeout to 2 billion cycles solved the problem, though.
>
> Well, today I found out why this could be. I am testing a new version of the clz fixes and I also implemented some DI versions (DI = double int = 64 bit in gcc's internal terms).

Nice.

> To my surprise, some options did not change a thing in CPU cycles, while the program got much shorter... So I took another look at it, and guess what... The stack usage was too much, so it was now pushing its values into I/O memory, including the special exit code memory. The program now exited successfully on a push r15 :D

The program used more than 4k of stack? Yikes!

> Can you make avrtest check on stack overflow?

I can, especially if I start accepting command line arguments to define memory regions, so that I also know where the stack really ends. I'll post a new version as soon as I have this.

In the meanwhile, you can work around that specific problem by switching the addresses of the exit and the abort ports, so that the abort port is hit first ;)

-- 
Paulo Marques
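The failure described above (a runaway stack pushing into I/O space until a write lands on the exit port) and the requested overflow check can be modelled in a few lines. This is a toy model, not avrtest's implementation: the barrier at 0x60 matches the value mentioned elsewhere in this thread, while `EXIT_PORT = 0x4F` and all function names are assumptions made for the example.

```python
STACK_BARRIER = 0x60   # first SRAM address; below it lie I/O and registers
EXIT_PORT = 0x4F       # hypothetical data-space address of the exit port


class StackOverflow(Exception):
    pass


def push(memory, sp, value, check_barrier):
    """Simulate an AVR PUSH: store at SP, then decrement SP."""
    if check_barrier and sp < STACK_BARRIER:
        raise StackOverflow(f"push at {sp:#04x} crosses the barrier")
    memory[sp] = value
    return sp - 1


def run(check_barrier):
    """Push repeatedly from just above the barrier and report the outcome."""
    memory = {}
    sp = 0x62
    try:
        for _ in range(0x20):
            sp = push(memory, sp, 0xAA, check_barrier)
            if EXIT_PORT in memory:
                return "exit"   # a stray push looked like a clean exit
    except StackOverflow:
        return "abort"
    return "running"
```

Without the check, `run(False)` ends in a bogus "exit"; with it, `run(True)` aborts as soon as the stack pointer drops below the barrier.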
Re: [avr-gcc-list] testsuite saga continues
Paulo Marques wrote:
> The program used more than 4k of stack? Yikes!

Well, I think it's the 64-bit stack bug... if anything goes wrong with the stack you might end up having a huge stack. It's a bug in the program.

>> Can you make avrtest check on stack overflow?
>
> I can, especially if I start accepting command line arguments to define memory regions, so that I also know where the stack really ends. I'll post a new version as soon as I have this. In the meanwhile, you can work around that specific problem by switching the addresses of the exit and the abort ports, so that the abort port is hit first ;)

Yes, I already thought about doing so. Could you then also print the real flash address of the exit, just like you do with the log? And the total number of cycles passed.

Thanks in advance,
Wouter
Re: [avr-gcc-list] testsuite saga continues
Wouter van Gulik wrote:
> Well, the GCC library provides most (if not all) functions for 64-bit operations. Alas, this is written in C and the compiler can't make it nearly as good as a handwritten assembler routine, mostly due to the lack of carry support in gcc.

Is it feasible to introduce carry support in gcc? There are many architectures that would benefit from direct access to the carry flag (and other flags commonly found in CPUs, such as overflow flags, sign flags, the AVR's T flag, or the extended flag on m68k CPUs). It would mean that long arithmetic functions could be written almost optimally in C, as well as benefiting other types of code (such as CRC routines or anything else involving rotations). It's certainly a feature that many commercial microcontroller compilers support; the ability to write things like if (CARRY) ... can be a big win for some code.

I guess it's just wishful thinking...

Best regards,
David
Re: [avr-gcc-list] testsuite saga continues
> I have not dug enough into the details of gcc, but I thought that flags were only visible at a low level, such as in the avr.md file, where you are defining the assembly code sequences for different effects. Thus it is possible to define a 16-bit addition instruction with an add, adc sequence - but you can't really make use of the carry flag after that.

Yes, this is exactly what I wanted to point out. The carry is now only used in handwritten assembler (in avr.md). GCC's RTL does not know anything about the carry bit: when it is available, when it's set/cleared and when it's clobbered.

HTH,
Wouter
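The add, adc sequence mentioned above works by feeding the carry out of the low-byte addition into the high-byte addition. The following Python model shows that propagation; it mimics the semantics of the two instructions and is not gcc or avr.md code.

```python
def add8(a, b, carry_in=0):
    """8-bit add returning (result, carry_out), like ADD / ADC."""
    total = a + b + carry_in
    return total & 0xFF, total >> 8


def add16(x, y):
    """16-bit addition built from two 8-bit steps linked by the carry."""
    lo, carry = add8(x & 0xFF, y & 0xFF)        # ADD on the low bytes
    hi, carry = add8(x >> 8, y >> 8, carry)     # ADC on the high bytes
    return (hi << 8) | lo, carry
```

The final carry returned by `add16` is exactly the information that is lost once the instruction pattern ends, which is the limitation being discussed: a 32-bit or 64-bit addition could chain further `add8` steps if the compiler could track it.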
Re: [avr-gcc-list] testsuite saga continues
Wouter van Gulik wrote:
> Yes, this is exactly what I wanted to point out. The carry is now only used in handwritten assembler (in avr.md). GCC's RTL does not know anything about the carry bit being available when it's set/cleared and when it's clobbered.

Is there some limitation in the RTL that keeps one from describing condition code bits as registers and describing when they are set and used in the RTL? (He asks naively, never having looked at gcc's RTL...)

Wouldn't that be transparent to normal code, since the cc's would be mostly unused, and data flow analysis should discard the result in most cases... except when you want to pick them up.

-dave
(sits quietly waiting for the clue-bat to descend...)
Re: [avr-gcc-list] testsuite saga continues
gcc uses its own status register, cc0, that is set as a result of compares and operators such as EQ, LT, GT. This is effectively a translation of the avr status register effects for signed/unsigned operations. So after each instruction gcc knows what status is available or not. This register may be set implicitly by an operation (often an implied compare with zero) or by a specific compare.

Unsigned CARRY is a special case. It is really an overflow during an operation and cannot be equated to a compare operation. So it does not fit gcc's model.

One might argue that carry is the result of a compare with the largest integer value (255 for bytes). But these situations do not directly arise in C, or I assume any other supported language, so it is not considered. (Though the ability to propagate carry would indeed help create mode-independent arithmetic operations.)

The cc0 system at present cannot deal with instructions between the setter and the user of cc0. Tracking of avr status register effects is already there to support this, but the rest of gcc does not fully utilize it. Some of this limitation is often hidden in the final code by back end optimization, so it may appear at times that gcc is smarter than I describe.

There has been some suggestion of changing gcc's cc0-based system, but as yet nobody has completely figured out a way of doing it.

Andy

Wouter van Gulik wrote:
>> I have not dug enough into the details of gcc, but I thought that flags were only visible at a low level, such as in the avr.md file, where you are defining the assembly code sequences for different effects. Thus it is possible to define a 16-bit addition instruction with an add, adc sequence - but you can't really make use of the carry flag after that.
>
> Yes, this is exactly what I wanted to point out. The carry is now only used in handwritten assembler (in avr.md). GCC's RTL does not know anything about the carry bit being available when it's set/cleared and when it's clobbered.
>
> HTH,
> Wouter
RE: [avr-gcc-list] testsuite saga continues
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] org] On Behalf Of Andrew Hutchinson
Sent: Wednesday, January 30, 2008 3:55 PM
To: Wouter van Gulik
Cc: avr-gcc-list@nongnu.org
Subject: Re: [avr-gcc-list] testsuite saga continues

> gcc uses its own status register cc0 that is set as a result of compares and operators such as EQ,LT,GT.

[snip]

> There has been some suggestion of changing gcc cc0 based system, But as yet, nobody has completely figured out a way of doing it.

See this page on the GCC web site:

http://gcc.gnu.org/projects/beginner.html

This page lists projects which are feasible for people who aren't intimately familiar with GCC's internals. Many of them are things which would be extremely helpful if they got done, but the core team never seems to get around to them.

This page lists this item:

 - Convert md files that use (cc0) so they don't anymore.

It also gives some suggestions on how to do it, but it says that it would be difficult.

The AVR is one of the ports that uses the cc0 system. It would be great if we could move to the new system eventually.

Eric Weddington
RE: [avr-gcc-list] testsuite saga continues
Quoting Wouter van Gulik [EMAIL PROTECTED]:
[...]
> Do you have a clue on why the tests fail? There is an ugly bug concerning stack allocation and 64-bit variables; maybe that is the evil one. See:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27386
>
> for details.

All the failed tests I've seen so far do in fact pass long long arguments to functions together with a bunch of other arguments (sometimes using va_args, too).

In one of the cases (gcc.c-torture/execute/20030307-1.c), the test only fails at -O0 and -O1, but passes with other optimization levels, because the functions get inlined and disappear completely, so the argument passing problem disappears too.

So, I would say that it is very likely the same bug...

-- 
Paulo Marques