RE: Simulator for GCC Testing [was: RE: [Fwd: Re: [avr-gcc-list]GCC-AVR Register optimisations]]
For what it is worth, I would prefer that simulavrxx proper could be used, even if it were just built as a separate executable along with the full-up code. This is one point of view, of course.

However, I also know that libbfd is a pain for us the way we use it, because over time it changes in ways we often don't care about but that cause trouble for our simulavrxx users, who have to get it built and installed... then simulavrxx has to find and use it x-p

I'm pretty sure one of my build clean-up activities should be to include a suitable version of the libbfd sources in simulavrxx and dispense with the special build requirements we have today. Hence I'm actually contemplating doing just what you did (and I've been told this is a wrong approach too ;-p ).

So in the end I say, more power to you. Thanks for posting. The free and open communication certainly is in the spirit of FSF and OSS. It's all good.

BTW: Where would you host your new tool?

For my own information, how do you use it in conjunction with the GCC testsuite? Feel free to take this part offline or ignore if you prefer. :-)

On Sun, 2008-01-13 at 23:15 +, Paulo Marques wrote:
> Now, I don't mind at all discussing technical merits of the idea, especially if I can show my own code to use as a counter argument. So, I was trying to delay my replies (including the reply to Joerg Wunsch) to a point where I could show some code instead of the natural handwaving that these kinds of discussions inevitably degenerate into.

___
AVR-GCC-list mailing list
AVR-GCC-list@nongnu.org
http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
RE: Simulator for GCC Testing [was: RE: [Fwd: Re: [avr-gcc-list]GCC-AVR Register optimisations]]
Quoting William Rivet [EMAIL PROTECTED]:

> For what it is worth, I would prefer that simulavrxx proper could be used, even if it was just built as a separate executable along with the full-up code. This is one point of view of course.

Maybe I gave the wrong impression, but I did look into simulavrxx before taking on this task, because that was my initial thought too. The thing is, I think of avrtest more as a test tool than as a simulator: a test tool that we can tweak in any way that simplifies the setup needed to run the testsuite.

> I also however know that libbfd is a pain for us the way we use it because over time it changes in ways we often don't care about, but causes trouble for our simulavrxx users who have to cause it to be built and installed...then simulavrxx has to find and use it x-p I'm pretty sure one of my build clean-up activities should include just including a suitable version of libbfd sources in simulavrxx and dispense with the special build requirements we have today. Hence I'm actually contemplating doing just what you did. (and I've been told this is a wrong approach too ;-p )

Please note that I took just one small function from simulavrxx, and one that I probably still want to rewrite someday anyway. Most of the code is written from scratch to be much simpler than the simulavrxx version (just compare the almost 5000 lines of code for the decode.* part of simulavrxx alone with the 1400 for the complete avrtest). I'm not going to tell you that it is the wrong approach, but you should look at avrtest, too ;)

> So in the end I say, more power to you. Thanks for posting. The free and open communication certainly is in the spirit of FSF and OSS. It's all good.

Thanks :)

> BTW: Where would you host your new tool?

I haven't looked into it yet, but my idea was to host it in a place that gives the idea of "this is what you need to run the gcc testsuite for gcc" and not so much "this is where you can find yet another avr simulator". I was just trying to make it work myself before thinking about an official release.

> For my own information, how do you use it in conjunction with the GCC testsuite? Feel free to take this part offline or ignore if you prefer.

At this point I'm still reading the dejagnu documentation and trying to figure out how everything fits together. From what I've already seen, it looks like avrtest can indeed be very helpful in terms of simplifying the total setup, increasing execution speed and improving portability.

As soon as I get some concrete results, I'll post them on the list, together with the steps needed to reproduce them. Just give me a few more days.

--
Paulo Marques
Re: Simulator for GCC Testing [was: RE: [Fwd: Re:[avr-gcc-list]GCC-AVR Register optimisations]]
FYI, changing simulators is very easy. Getting testsuite cases to pass is another thing! I'm trying avrora - just for speed right now. I'll compare when I've got my build under control.

Weddington, Eric wrote:
> > -----Original Message-----
> > From: Paulo Marques [mailto:[EMAIL PROTECTED]
> > Sent: Tuesday, January 15, 2008 5:01 PM
> > To: [EMAIL PROTECTED]
> > Cc: Weddington, Eric; Andrew Hutchinson; [EMAIL PROTECTED]; avr-gcc-list@nongnu.org; KlausRudolph
> > Subject: RE: Simulator for GCC Testing [was: RE: [Fwd: Re:[avr-gcc-list]GCC-AVR Register optimisations]]
> >
> > Quoting William Rivet [EMAIL PROTECTED]:
> > > BTW: Where would you host your new tool?
> >
> > I still didn't look into it, but my idea was to host in place that gave the idea of this is what you need to run the gcc testsuite for gcc and not so much this is where you can find yet another avr simulator. I was just trying to make it work myself before thinking about an official release.
>
> FWIW, I'm willing to host it on the WinAVR CVS repository.
>
> Thanks,
> Eric Weddington
Re: Simulator for GCC Testing [was: RE: [Fwd: Re: [avr-gcc-list]GCC-AVR Register optimisations]]
Hi all,

> I also however know that libbfd is a pain for us the way we use it because over time it changes in ways we often don't care about, but causes trouble for our simulavrxx users who have to cause it to be built and installed...then simulavrxx has to find and use it x-p

Nothing has changed that was a problem for simulavrxx. The only trouble comes from mixing headers with different libs.

> I'm pretty sure one of my build clean-up activities should include just including a suitable version of libbfd sources in simulavrxx

Oh no! We have discussed that many times, and I will not include any sources from foreign projects! Not TCL/TK and not libbfd. There is no reason for it! The only requirement is that we have the bfd.h that corresponds to the libbfd compiled for avr. Nothing more needs to be fulfilled.

For a gcc test suite, simulavrxx and simulavr are nearly the same. The decoder uses the same instruction set (sleep is not supported, but that instruction will not be part of any standard C code; writing flash is also not included, but that is not what the compiler is interested in :-)

I have not understood why we need a reduced, smaller simulavrxx for a test suite. Is the size a problem?

Back to the point of build tools: Bill has done a lot to get the tool building on different platforms, and a lot of searching for all the dependencies. But all that work has resulted in a very complex build system. As Knut already mentioned, we currently have a build tool chain which must have python to build the tcl examples. That sounds terrible to me and makes things much too complex.

From my point of view: I use my own old Makefile with one config file which contains 2 lines of information: the path to libbfd and the path to tcl/tk. That's all. No need for any kind of external tooling (autotools and so on). Making things automatically simple can lead to complex results :-)

For a gcc test suite there is also no need to have tcl/tk or python or any other scripting language available. Simply read the elf files and watch the results of the simulation with some environment for automation.

If there is a need for a more elaborate solution, let me know! Maybe I can do that for you!

Bye
Klaus
Re: AVR Benchmark Test Suite [was: RE: [avr-gcc-list] GCC-AVR Register optimisations]
Dave N6NZ [EMAIL PROTECTED] wrote:

> > the Atmel 802.15.4 MAC,
>
> Need to check license on that one -- but a good choice otherwise

BSD-style.

> > If it is desired to have it in a more neutral place, such as avr-libc, I'm open to that too, if Joerg Wunsch is willing.
>
> Seems to me that as long as they are publicly available under an appropriate license, it doesn't really matter much who backs them up :)

Agreed, I think both locations (sf.net, or savannah.nongnu.org) would do fine.

--
cheers, Jorg .-.-. --... ...-- -.. . DL8DTL
http://www.sax.de/~joerg/ NIC: JW11-RIPE
Never trust an operating system you don't have sources for. ;-)
RE: Simulator for GCC Testing [was: RE: [Fwd: Re: [avr-gcc-list]GCC-AVR Register optimisations]]
> -----Original Message-----
> From: Paulo Marques [mailto:[EMAIL PROTECTED]
> Sent: Saturday, January 12, 2008 8:21 AM
> To: Weddington, Eric
> Cc: Andrew Hutchinson; [EMAIL PROTECTED]; avr-gcc-list@nongnu.org; [EMAIL PROTECTED]; Klaus Rudolph
> Subject: Re: Simulator for GCC Testing [was: RE: [Fwd: Re: [avr-gcc-list]GCC-AVR Register optimisations]]
>
> Quoting Weddington, Eric [EMAIL PROTECTED]:
> > I strongly recommend that the wheel not be reinvented. If people are interested in running the GCC Regression Test Suite, I would recommend using available tools, and improving the available tools rather than invent new ones.
>
> I've looked at the code for both simulavr and simulavrxx. It seems to me that these are more geared towards people trying to debug problems with their own projects, and not so much automate compiler tests. (more like AVR studio, too)

The goal is for both. Simulavr is already being used for testing the compiler and for avr-libc.

> Most of the code there is to handle all sorts of peripherals that can be found on avr microcontrollers, as is to be expected from full emulators. However, my idea is much simpler: it is probably just the size of decoder.cpp of simulavrxx, re-written in plain C. This should make it really easy to port it to any platform (cygwin, etc.).

So now we're talking about simulavr again. Simulavr is written in C and can at least be built for Cygwin. But it's unmaintained. The goal is to get simulavrxx working for Cygwin AND for running the GCC test suite.

> The major advantage is that we are _not_ trying to emulate a specific avr model, and as such we can do all sorts of hacks to help the test / benchmark suite as best as we can. We can allow the program to write to files on the host. We can measure accurate cycle timings and dump the results in a convenient way for the benchmark suite. We can emulate an AVR with 8Mb of flash and 2Mb of RAM. We can force the start/stop cycle counter instructions to use zero cycles, so that they don't interfere with the counts themselves. We can report exit codes from the avr code, so that the test suite can use it to determine success/failure in some of the tests. Etc., etc. So, I don't think I'm reinventing the wheel here. This is getting to a point where I'm very tempted to just do it (it seems so simple) and publish it so that I can show what I mean...

And this will further split the community. Why can't we work together, instead of always separately?

Eric Weddington
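[Editorial illustration, not part of the original message.] The test-oriented hacks Paulo lists above (an exit code reported to the host, and a start/stop cycle counter that costs zero cycles itself) can be sketched roughly as follows. This is only a sketch under stated assumptions: the port numbers, the `sim_t` struct and both function names are hypothetical and are not taken from avrtest, simulavr or simulavrxx.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "magic" port numbers, chosen only for this sketch. */
enum { PORT_TIMER_CTL = 0x4E, PORT_EXIT = 0x4F };

typedef struct {
    uint64_t cycles;      /* total cycles executed so far */
    uint64_t measured;    /* cycles accumulated while the counter ran */
    uint64_t started_at;  /* value of `cycles` when measuring started */
    int      measuring;   /* non-zero while the counter is running */
    int      halted;      /* set once the program reports an exit code */
    int      exit_code;   /* value the simulated program wrote to PORT_EXIT */
} sim_t;

/* Account for an ordinary instruction that costs n cycles. */
void sim_tick(sim_t *s, unsigned n)
{
    assert(s != NULL);
    s->cycles += n;
}

/* Handle a write to a magic port. It never calls sim_tick(), so the
 * start/stop and exit operations add nothing to the cycle count. */
void sim_magic_write(sim_t *s, uint8_t port, uint8_t value)
{
    assert(s != NULL);
    switch (port) {
    case PORT_TIMER_CTL:
        if (value != 0 && !s->measuring) {          /* start measuring */
            s->started_at = s->cycles;
            s->measuring = 1;
        } else if (value == 0 && s->measuring) {    /* stop measuring */
            s->measured += s->cycles - s->started_at;
            s->measuring = 0;
        }
        break;
    case PORT_EXIT:                                 /* report result to host */
        s->exit_code = value;
        s->halted = 1;
        break;
    default:
        break;
    }
}
```

A host-side main loop would then stop when `halted` is set and return `exit_code` as its own process exit status, which is the kind of pass/fail signal a test harness can consume directly.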
Re: AVR Benchmark Test Suite [was: RE: [avr-gcc-list] GCC-AVR Register optimisations]
I'll just plug Avrora again: http://compilers.cs.ucla.edu/avrora/

It runs on many platforms (written all in Java), is quite fast, and is well designed. Best of all it is easy to extend: you just add monitors that can be configured to receive a wide variety of callbacks about program events such as memory operations, I/O operations, execution of different kinds of instructions, interrupts, etc.

> focus on AVR-specific code, and GCC-specific AVR code at that.

Definitely. If people want to test avr-gcc against other compilers, or compare AVR to other architectures, that's a separate exercise.

MiBench is an aging but useful collection of embedded C codes: http://www.eecs.umich.edu/mibench/

> John, I would welcome publicly available code from TinyOS, but it would need to be already compiled with nesc, so that way we just have straight C that we can feed into avr-gcc.

Sure, this is easy. It'll target ATmega128 only, however.

Re. floating point: I believe that the papabench codes do a lot of this: http://www.irit.fr/recherches/ARCHI/MARCH/rubrique.php3?id_rubrique=97

This is code extracted from the Paparazzi UAV project, which uses an ATmega for onboard flight control.

> There needs to be some consensus on what we measure, how we measure it, what output files we want generated, and hopefully some way to automatically generate composite results. I'm certainly open to anything in this area.

Code size and static RAM consumption are obvious. Some sort of throughput metric is useful. For interrupt-driven codes, my group often uses processor duty cycle as a measure of efficiency. This is the % of time that the CPU is not in a sleep mode.

Dynamic stack memory consumption is good, though this is not a very consistent metric for interrupt-driven codes, since in a short simulation run the worst-case stack usage is unlikely to be encountered. Perhaps adding up the stack memory usage of main + all interrupts would be better.

John Regehr
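[Editorial illustration, not part of the original message.] Two of the metrics John describes are simple enough to write down: duty cycle as the percentage of cycles the CPU is not asleep, and a stack figure obtained by adding the stack usage of main to that of every interrupt handler. The sketch below assumes the simulator can supply per-state cycle counts and per-context stack depths; the function names and parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Duty cycle: percentage of all cycles spent outside sleep modes.
 * Returns 0.0 for an empty run rather than dividing by zero. */
double duty_cycle_percent(uint64_t active_cycles, uint64_t sleep_cycles)
{
    uint64_t total = active_cycles + sleep_cycles;
    if (total == 0)
        return 0.0;
    return 100.0 * (double)active_cycles / (double)total;
}

/* Stack figure: depth of main plus the depth of every interrupt handler.
 * This deliberately over-counts compared with what a short run observes,
 * because it adds all handlers whether or not they ever nested. */
size_t stack_sum(size_t main_depth, const size_t *isr_depth, size_t n_isr)
{
    assert(isr_depth != NULL || n_isr == 0);
    size_t sum = main_depth;
    for (size_t i = 0; i < n_isr; i++)
        sum += isr_depth[i];
    return sum;
}
```

For example, a run with 250 active cycles and 750 sleep cycles has a 25% duty cycle, and a program whose main uses 40 bytes of stack with two handlers using 12 and 20 bytes gets a summed figure of 72 bytes.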
RE: Simulator for GCC Testing [was: RE: [Fwd: Re: [avr-gcc-list]GCC-AVR Register optimisations]]
Quoting Weddington, Eric [EMAIL PROTECTED]:

> > -----Original Message----- [...]
> > I've looked at the code for both simulavr and simulavrxx. It seems to me that these are more geared towards people trying to debug problems with their own projects, and not so much automate compiler tests. (more like AVR studio, too)
>
> The goal is for both. Simulavr is already being used for testing the compiler and for avr-libc.

Yes, but simulavr isn't being maintained any more, right? Even more, one of my points is that a single piece of software that handles both cases might be harder to maintain than a simple simulator and a full-featured hardware emulator as separate projects. This is not my strongest point, though.

> > Most of the code there is to handle all sorts of peripherals that can be found on avr microcontrollers, as is to be expected from full emulators. However, my idea is much simpler: it is probably just the size of decoder.cpp of simulavrxx, re-written in plain C. This should make it really easy to port it to any platform (cygwin, etc.).
>
> So now we're talking about simulavr again. Simulavr is written in C and can at least be built for Cygwin. But it's unmaintained. The goal is to get simulavrxx working for Cygwin AND for running the GCC test suite.

No, simulavrxx has a _lot_ of code to handle peripherals. In fact it is the majority of the code of simulavrxx. The CPU part is just 4897 lines out of a total of 20586.

> [...] And this will further split the community.

You make it sound like a bad thing.

> Why can't we work together, instead of always separately?

Because it's not the way open source works (or at least not the way it works best). Open source works the other way: projects blossom or die by natural selection, with the advantage that we can pick the best parts of each project and mix and match as we please (doing a bit of artificial selection in the process).

So, I don't like the way simulavrxx works. The pre-decoding of flash into instances of opcode classes doesn't seem like a good idea to me, and it is a fundamental concept of simulavrxx. That _is_ my personal opinion and everyone is entitled to his own.

Now, I don't mind at all discussing technical merits of the idea, especially if I can show my own code to use as a counter argument. So, I was trying to delay my replies (including the reply to Joerg Wunsch) to a point where I could show some code instead of the natural handwaving that these kinds of discussions inevitably degenerate into.

Attached is a beta version of avrtest. It also has a small "Hello, World!" demo that actually runs under avrtest and produces the expected output.

It is a single file of C code, 1391 lines long. It must still have some bugs in there, but I was actually quite surprised when, after writing the whole thing, it ran "Hello, World!" on the first attempt.

I used the lookup_opcode function from simulavrxx to save some time, but if the author has a problem with me using it (although I'm legally entitled to use it), I'll write my own too, because I believe that respecting the author's wishes is more important than respecting the actual license.

I'll be polishing it up a bit over the next few days because it still lacks a few things:

- a few opcodes are still not implemented at all
- RAMPx registers still need to be handled in a few places
- a few hidden bugs (hopefully) that need to be chased down and shot

So, please, instead of just dismissing this project as a vanity hacker's project or NIH syndrome, just take a look at it. If you still don't like it, that's fine. But at least now we can talk more technical and less handwaving.

At least, I bet you can easily compile it under cygwin ;)

--
Paulo Marques

Attachment: avrtest.tar.gz
Description: GNU Zip compressed data
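[Editorial illustration, not part of the original message.] For readers unfamiliar with the alternative to pre-decoding flash into opcode objects, one common approach is to decode each 16-bit word at the moment it is executed, using a table of mask/match pairs. The sketch below is not avrtest's code and not the lookup_opcode function mentioned above; the type and function names are hypothetical. The four encodings shown (NOP, ADD, LDI, RJMP) follow the published AVR instruction set; a real table would cover the whole instruction set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { OP_UNKNOWN, OP_NOP, OP_ADD, OP_LDI, OP_RJMP } op_t;

typedef struct {
    uint16_t mask;   /* bits that identify the instruction */
    uint16_t match;  /* required value of those bits */
    op_t     op;
} pattern_t;

/* A small excerpt of the AVR encoding table. */
static const pattern_t patterns[] = {
    { 0xFFFF, 0x0000, OP_NOP  },  /* 0000 0000 0000 0000 */
    { 0xFC00, 0x0C00, OP_ADD  },  /* 0000 11rd dddd rrrr */
    { 0xF000, 0xE000, OP_LDI  },  /* 1110 KKKK dddd KKKK */
    { 0xF000, 0xC000, OP_RJMP },  /* 1100 kkkk kkkk kkkk */
};

/* Decode one flash word on the fly: first matching pattern wins. */
op_t decode(uint16_t word)
{
    for (size_t i = 0; i < sizeof patterns / sizeof patterns[0]; i++) {
        if ((word & patterns[i].mask) == patterns[i].match)
            return patterns[i].op;
    }
    return OP_UNKNOWN;
}

/* Extract the operands of LDI Rd,K: Rd is r16..r31, K is 8 bits. */
void ldi_operands(uint16_t word, unsigned *rd, unsigned *k)
{
    assert(rd != NULL && k != NULL);
    assert(decode(word) == OP_LDI);
    *rd = 16u + ((word >> 4) & 0x0Fu);
    *k  = ((word >> 4) & 0xF0u) | (word & 0x0Fu);
}
```

The trade-off is the usual one: decoding on every execution repeats work that pre-decoding does once, but it needs no per-instruction objects and keeps the simulator core to a table and a switch.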
Re: AVR Benchmark Test Suite [was: RE: [avr-gcc-list] GCC-AVR Register optimisations]
Weddington, Eric wrote:
> Hi John, Dave, others,
>
> Here are some random thoughts about a benchmark test suite:
>
> - GCC has a page on benchmarks: http://gcc.gnu.org/benchmarks/ However all of those are geared towards larger processors and host systems. There is a link to a benchmark that focuses on code size, CSiBE, http://www.inf.u-szeged.hu/csibe/. Again, that benchmark is geared towards larger processors. This creates a need to have a benchmark that is geared towards 8-bit microcontroller environments in general, and specifically for the AVR.
>
> What would we like to test? Code size for sure. Everyone always seems to be interested in code size. There is an interest in seeing how the GCC compiler performs from one version to the next, to see if optimizations have improved or if they have regressed.

Which I would call regression tests, not benchmarks, per se. Of performance regressions, I would guess that code size regressions under -Os are the #1 priority for the typical user. (A friend is currently tearing his hair out over a code size regression in a commercial PIC C compiler -- he needs to release a minor firmware update to the field... but not even the original code fits his flash any more...)

It's worth drawing a distinction between benchmarks and regression tests. They need to be written differently. A regression test needs to sensitize a particular condition, and needs to be small enough to be debuggable. A benchmark needs to be realistic, which often makes them harder to debug. I say we need both. The performance regression tests can easily roll into release criteria. A suite of performance benchmarks is more useful as a confirmatory measure of goodness -- but actual mysteries in the aggregate score will most likely be chased with smaller tests.

My guess is that existing tests may help us a lot in the benchmark category, but the regression tests will require some elbow grease on our part to get a good set. There's a good chance we can extract good regression tests from existing benchmark-sized tests.

A semi-related question is how many of these tests can be pushed up stream? If we could get a handful of uCtlr-oriented code size regression tests packaged up so that the developers of the generic optimizer could run them as release criteria, it would, I would think, improve the overall quality of gcc for all uCtlr targets.

> There is also an interest in comparing AVR compilers, such as how GCC compares to IAR, Codevision or ImageCraft compilers.

Who is interested? gcc developers, as a means to keep gcc competitive? Or potential users? The former is benchmarking, the latter is moving towards bench-marketing. Not that marketing is bad, but that sort of thing can be a distraction. In any case, the tests that are meaningful here are the benchmark "overall goodness" test suite, not the targeted test suite.

> And sometimes there is an interest in comparing AVR against other microcontrollers, notably Microchip's PIC and TI's MSP430.

Different processor with same compiler? Different processor with best compiler? -- Now this is beginning to sound like SPEC.

> Because there are these different interests, it is challenging to come up with appropriate code samples to showcase and benchmark these different issues. But we could also implement this in stages, and focus on AVR-specific code, and GCC-specific AVR code at that.

Clarity of classification is important. Different buckets for different issues.

> If we are going to put together a benchmark test suite, like other benchmarks for GCC (for larger processors), then I would think that it would be better to model it somewhat after those other benchmarks. I see that they tend to use publicly available code, and a variety of different types of applications.

For benchmarking, and bench-marketing, that's a good approach. I'll be redundant and say those are probably not what you want to be debugging. It would make sense for what I'll call an avr-gcc dashboard. I see a web page with a bunch of bar graphs on it. A summary bar at the top that is the weighted sum of individual test bars. As an avr-gcc user, that kind of summary page would be very useful from one release to the next for setting expectations regarding performance on your own application. As an avr-gcc release master, it's a good dashboard for tracking progress and release worthy-ness.

> We should have something similar. Some suggested projects: FreeRTOS (for the AVR)

Sounds good,

> , uIP (however, we need to pick a specific implementation of it for the AVR; I have a copy of uIP-Crumb644),

Another good one

> the Atmel 802.15.4 MAC,

Need to check license on that one -- but a good choice otherwise

> and the GCC version of the Butterfly firmware. I also have a copy of the TI Competitive Benchmark, which they, and other semiconductor companies, have used to do comparisons between processors.

Not familiar with it. Also, check the license.
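[Editorial illustration, not part of the original message.] The summary bar Dave describes, a weighted sum of the individual test bars, could be computed along these lines. This is a sketch only: the function name is hypothetical, and dividing by the total weight (so the summary stays on the same scale as the individual scores) is an assumption added here, not something stated in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Weighted summary of per-test scores.
 * score[i]  : result of test i, e.g. code size relative to a baseline
 * weight[i] : how much test i counts towards the summary bar
 * The sum is normalised by the total weight; returns 0.0 if that is zero. */
double weighted_summary(const double *score, const double *weight, size_t n)
{
    assert((score != NULL && weight != NULL) || n == 0);
    double num = 0.0;
    double den = 0.0;
    for (size_t i = 0; i < n; i++) {
        num += weight[i] * score[i];
        den += weight[i];
    }
    return den > 0.0 ? num / den : 0.0;
}
```

With two tests scoring 1.0 and 0.5 and weights 1 and 3, the summary is (1.0 + 1.5) / 4 = 0.625. How the weights are chosen is exactly the kind of consensus question raised elsewhere in this thread.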
Re: AVR Benchmark Test Suite [was: RE: [avr-gcc-list] GCC-AVR Register optimisations]
> (A friend is currently tearing his hair out over a code size regression in a commercial PIC C compiler -- he needs to release a minor firmware update to the field... but not even the original code fits his flash any more...)

Embedded compiler rule #1: If you find a version of the compiler that works, keep a copy around for the life of the product.

John
Re: Simulator for GCC Testing [was: RE: [Fwd: Re: [avr-gcc-list] GCC-AVR Register optimisations]]
Eric, please share what info you have with me. I wouldn't mind running whatever works through simulavr to see what it is about (I'm new to the regression suite... I've only hacked GCC for its C++ front-end bits, and certainly broke many things when I did :-p )

Given the recent interest, I'm trying to make some time to improve what I can on simulavrxx, since it was supposed to supersede the old simulavr.

If you CC the simulavr list, I'll also pick up on relevant threads a bit quicker... I'm bad about following the avr list :-)

Cheers,
Bill

On Fri, 2008-01-11 at 20:14 -0700, Weddington, Eric wrote:
> > -----Original Message-----
> > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] org] On Behalf Of Andrew Hutchinson
> > Sent: Thursday, January 10, 2008 6:27 PM
> > To: [EMAIL PROTECTED]
> > Cc: avr-gcc-list@nongnu.org
> > Subject: Re: [Fwd: Re: [avr-gcc-list] GCC-AVR Register optimisations]
> >
> > Here's my input:
> >
> > For starters, gcc has a testsuite that can be used. It's not perfect but it's quite demanding - even if we can't do all the tests. Do we have info on setting this up with a simulator? I did have some instructions - once!
> >
> > After that I suggest some benchmark that would produce more normal code and also give qualitative indications of performance (size is easy, speed would be nice).
> >
> > Finally, regression tests using testcases and bug reports.
>
> Hi All,
>
> Some points:
>
> - Yes, GCC does have a Regression Test Suite, and it can execute for the AVR using the SimulAVR simulator. There are many, many tests that pass for the AVR. There are quite a few that don't, but for most of those failures that I have looked at, either the test needs fixing (because it assumes a 32-bit processor), or the test doesn't apply to the AVR. Some work needs to be done to get the Regression Test Suite in shape for the AVR.
>
> - As mentioned, simulavr is known to work as a simulator for the GCC test suite. However, simulavr is not really maintained anymore. At the simulavr project on Savannah, there is a new code base called simulavrxx which is based on C++. This is maintained, but it could use help: It doesn't run on Cygwin yet, and AFAIK it cannot run the GCC Test Suite yet. Any help on this is deeply appreciated. https://savannah.nongnu.org/projects/simulavr
>
> I strongly recommend that the wheel not be reinvented. If people are interested in running the GCC Regression Test Suite, I would recommend using available tools, and improving the available tools rather than inventing new ones.
>
> I have instructions on running the GCC Regression Test Suite (from Bjoern Haase, IIRC). I have yet to run it myself, but others have done so successfully. However, there are reports about difficulties running the tests on Cygwin. I have heard that it is successful on Linux.
>
> There is a person from Belgium, Mike Stein, who has been running the GCC Test Suite for the AVR pretty much on a daily basis and he has been posting the results regularly on the gcc-testresults mailing list: http://gcc.gnu.org/ml/gcc-testresults/. Just search for avr to see the results. (It looks like he last did it in December.)
>
> I'll be attempting to run the GCC Test Suite probably sometime in Q1 2008. (I'm going to be busy in January and February.) Just email me if anyone is interested in the instructions.
>
> Note that running the GCC Test Suite is imperative for anyone who works on GCC, because in order to submit any patches to the GCC project, they require that the patch is tested with the Test Suite and that there are no new regressions. That's the main purpose of the test suite.
>
> I'll start a new thread about a benchmark suite...
>
> Thanks,
> Eric Weddington
RE: AVR Benchmark Test Suite [was: RE: [avr-gcc-list] GCC-AVR Register optimisations]
> -----Original Message-----
> From: Dave N6NZ [mailto:[EMAIL PROTECTED]
> Sent: Sunday, January 13, 2008 4:19 PM
> To: Weddington, Eric
> Cc: John Regehr; avr-gcc-list@nongnu.org
> Subject: Re: AVR Benchmark Test Suite [was: RE: [avr-gcc-list] GCC-AVR Register optimisations]
>
> Weddington, Eric wrote:
>
> It's worth drawing a distinction between benchmarks and regression tests. They need to be written differently. A regression test needs to sensitize a particular condition, and needs to be small enough to be debuggable. A benchmark needs to be realistic, which often makes them harder to debug. I say we need both. The performance regression tests can easily roll into release criteria. A suite of performance benchmarks is more useful as a confirmatory measure of goodness -- but actual mysteries in the aggregate score will most likely be chased with smaller tests.

Ok. Regression tests should really fit within the GCC Regression Test framework. I would rather not duplicate the work that they have there. So I'm really looking for benchmark tests, under your definition. That's not to say I want to ignore the regression tests. I just want to fill in a gap that's missing for the AVR.

> A semi-related question is how many of these tests can be pushed up stream? If we could get a handful of uCtlr-oriented code size regression tests packaged up so that the developers of the generic optimizer could run them as release criteria, it would, I would think, improve the overall quality of gcc for all uCtlr targets.

Nothing can be pushed upstream right now. As I mentioned in another post in this thread, the AVR target is not that important in the eyes of the overall members of the GCC project. I'm working diligently to change that. But it's one of those "if we want something done, do it ourselves" situations.

> > There is also an interest in comparing AVR compilers, such as how GCC compares to IAR, Codevision or ImageCraft compilers.
>
> Who is interested? gcc developers, as a means to keep gcc competitive? Or potential users? The former is benchmarking, the latter is moving towards bench-marketing. Not that marketing is bad, but that sort of thing can be a distraction. In any case, the tests that are meaningful here are the benchmark "overall goodness" test suite, not the targeted test suite.

As a gcc developer, I am interested in some kind of metric to keep gcc competitive with other AVR compilers. Honestly, it seems that it is urban myth that IAR optimizes better than GCC. Is that really true? For what applications? For what compiler switches? Eventually I'd like to have something definitive to combat any FUD.

I don't want to get into bench-marketing. I would really like to have something of value and meaningful, and not have to tweak numbers to arrive at good results to show off. If AVR GCC sucks in an area, I don't want to paper over it. I want to show it so we know what needs improvement.

> > And sometimes there is an interest in comparing AVR against other microcontrollers, notably Microchip's PIC and TI's MSP430.
>
> Different processor with same compiler? Different processor with best compiler? -- Now this is beginning to sound like SPEC.

Well, lofty goals for sure. I don't want to get outside of the 8-bit microcontroller realm. I certainly want to do first things first. But I think it might be interesting, at some point in the future, if some of those things could be achieved.

> > If we are going to put together a benchmark test suite, like other benchmarks for GCC (for larger processors), then I would think that it would be better to model it somewhat after those other benchmarks. I see that they tend to use publicly available code, and a variety of different types of applications.
>
> For benchmarking, and bench-marketing, that's a good approach. I'll be redundant and say those are probably not what you want to be debugging. It would make sense for what I'll call an avr-gcc dashboard. I see a web page with a bunch of bar graphs on it. A summary bar at the top that is the weighted sum of individual test bars. As an avr-gcc user, that kind of summary page would be very useful from one release to the next for setting expectations regarding performance on your own application. As an avr-gcc release master, it's a good dashboard for tracking progress and release worthy-ness.

That's definitely the idea.

> > the Atmel 802.15.4 MAC,
>
> Need to check license on that one -- but a good choice otherwise

:-)

> > and the GCC version of the Butterfly firmware. I also have a copy of the TI Competitive Benchmark, which they, and other semiconductor companies, have used to do comparisons between processors.
>
> Not familiar with it. Also, check the license. Processor manufacturers (like, oh, for instance, *all* the several I have worked for) are very touchy about benchmarks and benchmark
Re: Simulator for GCC Testing [was: RE: [Fwd: Re: [avr-gcc-list] GCC-AVR Register optimisations]]
Quoting Weddington, Eric [EMAIL PROTECTED]: [...] Hi All, Hi, Some points: - Yes, GCC does have a Regression Test Suite, and it can execute for the AVR using the SimulAVR simulator. There are many, many tests that pass for the AVR. There are quite a few that don't, but most of those failures that I have looked at either the test needs fixing (because it assumes a 32-bit processor), or the tests don't apply to the AVR. Some work needs to be done to get the Regression Test Suite in shape for the AVR. - As mentioned, simulavr is known to work as a simulator for the GCC test suite. However, simulavr is not really maintained anymore. At the simulavr project on Savannah, there is a new code base called simulavrxx which is based on C++. This is maintained, but it could use help: It doesn't run on Cygwin yet, and AFAIK it cannot run the GCC Test Suite yet. Any help on this is deeply appreciated. https://savannah.nongnu.org/projects/simulavr I strongly recommend that the wheel not be reinvented. If people are interested in running the GCC Regression Test Suite, I would recommend using available tools, and improving the available tools rather then invent new ones. I've looked at the code for both simulavr and simulavrxx. It seems to me that these are more geared towards people trying to debug problems with their own projects, and not so much automate compiler tests. (more like AVR studio, too) Most of the code there is to handle all sorts of peripherals that can be found on avr microcontrollers, as is to be expected from full emulators. However, my idea is much simpler: it is probably just the size of decoder.cpp of simulavrxx, re-written in plain C. This should make it really easy to port it to any platform (cygwin, etc.). The major advantage is that we are _not_ trying to emulate a specific avr model, and as such we can do all sorts of hacks to help the test / benchmark suite as best as we can. We can allow the program to write to files on the host. 
We can measure accurate cycle timings and dump the results in a convenient way for the benchmark suite. We can emulate an AVR with 8Mb of flash and 2Mb of RAM. We can force the start/stop cycle counter instructions to use zero cycles, so that they don't interfere with the counts themselves. We can report exit codes from the avr code, so that the test suite can use it to determine success/failure in some of the tests. Etc., etc. So, I don't think I'm reinventing the wheel here. This is getting to a point where I'm very tempted to just do it (it seems so simple) and publish it so that I can show what I mean... -- Paulo Marques
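To make the "just a decoder in plain C" idea concrete, here is a minimal sketch of what such a core could look like. It is not taken from avrtest, simulavr or simulavrxx; the names `mini_cpu`, `mini_step` and `mini_run` are hypothetical, only NOP, LDI and ADD are decoded, status flags are not modelled, and anything else (including BREAK, 0x9598) simply stops the run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical minimal CPU state: no peripherals, no status flags. */
typedef struct {
    uint8_t       r[32];   /* general-purpose registers R0..R31 */
    uint16_t      pc;      /* program counter, counted in 16-bit words */
    unsigned long cycles;  /* cycles consumed by executed instructions */
} mini_cpu;

/* Execute one instruction. Returns 1 to keep running, 0 to stop. */
int mini_step(mini_cpu *c, const uint16_t *flash, size_t words)
{
    assert(c != NULL && flash != NULL);
    if (c->pc >= words)
        return 0;

    uint16_t op = flash[c->pc++];

    if (op == 0x0000) {                      /* NOP: 0000 0000 0000 0000 */
        c->cycles += 1;
        return 1;
    }
    if ((op & 0xF000) == 0xE000) {           /* LDI Rd,K: 1110 KKKK dddd KKKK */
        unsigned d = 16 + ((op >> 4) & 0x0F);            /* d is R16..R31 */
        uint8_t  k = (uint8_t)(((op >> 4) & 0xF0) | (op & 0x0F));
        c->r[d] = k;
        c->cycles += 1;
        return 1;
    }
    if ((op & 0xFC00) == 0x0C00) {           /* ADD Rd,Rr: 0000 11rd dddd rrrr */
        unsigned d = (op >> 4) & 0x1F;
        unsigned r = ((op >> 5) & 0x10) | (op & 0x0F);
        c->r[d] = (uint8_t)(c->r[d] + c->r[r]);          /* flags omitted */
        c->cycles += 1;
        return 1;
    }
    return 0;  /* BREAK or any instruction this sketch does not implement */
}

/* Run until the core stops; returns the cycle count of what was executed. */
unsigned long mini_run(mini_cpu *c, const uint16_t *flash, size_t words)
{
    while (mini_step(c, flash, words)) {
        /* nothing else to do: there is no hardware to simulate */
    }
    return c->cycles;
}
```

Feeding it the four words 0xE288 (ldi r24,40), 0xE092 (ldi r25,2), 0x0F89 (add r24,r25) and 0x9598 (break) leaves 42 in r24 after three counted cycles. A real test tool would of course need the full instruction set and the status register.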
Re: Simulator for GCC Testing [was: RE: [Fwd: Re: [avr-gcc-list] GCC-AVR Register optimisations]]
Writing this sounds intriguing and amusing to me. If you need more coders, add me to the list. I'd be interested in contributing. Gre7g --- Paulo Marques [EMAIL PROTECTED] wrote: Quoting Weddington, Eric [EMAIL PROTECTED]: So, I don't think I'm reinventing the wheel here. This is getting to a point where I'm very tempted to just do it (it seems so simple) and publish it so that I can show what I mean...
Re: Simulator for GCC Testing [was: RE: [Fwd: Re: [avr-gcc-list] GCC-AVR Register optimisations]]
Paulo Marques [EMAIL PROTECTED] wrote: I've looked at the code for both simulavr and simulavrxx. It seems to me that these are more geared towards people trying to debug problems with their own projects, and not so much automate compiler tests. (more like AVR studio, too)

No, that's one of their points but not the only one. simulavr is already in use by the avr-libc project for their own testsuite, and as Eric told you, it used to be in use for the GCC testsuite as well. This is done by using the simulator on a standalone basis (as opposed to coupling it to GDB, as is the case for interactive debugging).

I don't think another simulator will do any good. My opinion is that simulavrxx particularly needs help on the documentation front (the original author isn't very fluent in writing English), and it might need some help in upgrading some of the implemented features for more recent AVRs (but except for adding the ATmega256x architecture, this is not of much relevance for plain compiler or library tests). Another simulator will only further split the development forces rather than bundle them, it will further confuse the users about why there's a multitude of simulators (with none of them being really good), and it will introduce further bugs rather than fixing those that are already there. Some of the CPU details are not so obvious from the datasheet, so since a further simulator is likely to ignore all the experience the other simulator writers have already collected, it stands a good chance of simulating wrong in some situations.

Excuse me, but it sounds more like a vanity hacker's project to me than any serious help. NIH, so to speak.

-- cheers, Jorg .-.-. --... ...-- -.. . DL8DTL http://www.sax.de/~joerg/ NIC: JW11-RIPE Never trust an operating system you don't have sources for. ;-)
Re: [Fwd: Re: [avr-gcc-list] GCC-AVR Register optimisations]
Andrew Hutchinson wrote: PS Please report as a bug - gcc should be better than this.

I did, it got number 34737. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34737 I hope all info is ok. I wanted to add a link to your e-mail. But it's not on the list archives yet. Wouter
Re: [Fwd: Re: [avr-gcc-list] GCC-AVR Register optimisations]
Dave N6NZ wrote: [...] The test matrix deserves some thought. The same test might have different pass/fail criteria under different options. For example, imagine a suite of 10 tests. You might say 10/10 must show zero code growth under -Os, 7/10 show no speed degradation under -Os, the same 10 tests must show zero slowdown for 10/10 under -O3, 6/10 no code growth under -O3. (Just an example, may not be realistic.)

One thing that might help us a lot here is a cooperative simulator. Something along these lines:

- the simulator would be used to test gcc, so most hardware simulation could be ignored. The test code could be written to only use what the simulator offered.
- the simulator could have special ports to control output and generate statistics to test regressions. Something along these lines (for instance):
  - mem address 0xFF is the control port. Writing 0xEn starts cycle counter 'n' and writing 0xFn stops cycle counter 'n' and outputs the result
  - writing a value less than 0x20 to the control port stops the emulator and returns that as the emulator exit code
  - writing to address 0xFE sends data to standard output
  - reading from address 0xFE reads one byte from standard input

The possibilities for commands and emulator control from the actual code being executed are endless, and these are just a few ideas off the top of my head. Writing a simulator like this is pretty easy (I've written one for the CPU on my watch [1]) because most of the work in doing an emulator is writing the hardware emulation part. The CPU is the easy part, especially with a CPU with such a regular instruction set. With this emulator we could build test scripts that would run the generated code under the emulator and could compare cycle counts, code size, return values, etc. The idea is to allow avr code to run almost like any other unix process. What do you guys think? Is it worth doing? (and I'm volunteering for the initial work, at least)

-- Paulo Marques Software Development Department - Grupo PIE, S.A. Phone: +351 252 290600, Fax: +351 252 290601 Web: www.grupopie.com Anything is possible, unless it's not.

[1] http://sourceforge.net/projects/virtualdatalink/ and http://tech.groups.yahoo.com/group/timexdatalinkusbdevelop/
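The control-port idea above could be sketched on the simulator side roughly as follows. This is only an illustration of the proposal, not code from any existing tool: `sim_state`, `sim_write` and the output format are hypothetical, the port addresses and command ranges are the ones suggested in the message, and the read side of port 0xFE is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define CTRL_PORT    0xFF  /* control port address from the proposal */
#define DATA_PORT    0xFE  /* stdout/stdin port address from the proposal */
#define NUM_COUNTERS 16    /* 0xEn / 0xFn leave room for n = 0..15 */

/* Hypothetical simulator state relevant to the special ports. */
typedef struct {
    unsigned long now;                  /* current cycle count of the core */
    unsigned long start[NUM_COUNTERS];  /* cycle count when counter n started */
    unsigned long last_elapsed;         /* result of the last stopped counter */
    int           exited;               /* set once the program asked to stop */
    int           exit_code;            /* exit code handed back to the host */
} sim_state;

/* Handle a store by the simulated program.
 * Returns 1 if the address was a special port, 0 for ordinary memory. */
int sim_write(sim_state *s, uint16_t addr, uint8_t value, FILE *out)
{
    assert(s != NULL && out != NULL);

    if (addr == DATA_PORT) {            /* byte goes to standard output */
        fputc(value, out);
        return 1;
    }
    if (addr != CTRL_PORT)
        return 0;

    if (value < 0x20) {                 /* stop and report exit code */
        s->exited = 1;
        s->exit_code = value;
    } else if ((value & 0xF0) == 0xE0) {    /* 0xEn: start counter n */
        s->start[value & 0x0F] = s->now;
    } else if ((value & 0xF0) == 0xF0) {    /* 0xFn: stop counter n, print it */
        s->last_elapsed = s->now - s->start[value & 0x0F];
        fprintf(out, "counter %u: %lu cycles\n",
                (unsigned)(value & 0x0F), s->last_elapsed);
    }
    return 1;
}
```

A test script could then treat `exit_code` like the exit status of any other unix process and parse the counter lines for cycle regressions.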
Simulator for GCC Testing [was: RE: [Fwd: Re: [avr-gcc-list] GCC-AVR Register optimisations]]
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] org] On Behalf Of Andrew Hutchinson Sent: Thursday, January 10, 2008 6:27 PM To: [EMAIL PROTECTED] Cc: avr-gcc-list@nongnu.org Subject: Re: [Fwd: Re: [avr-gcc-list] GCC-AVR Register optimisations] Here my input: For starters gcc has testsuite that can be used. It's not perfect but its quite demanding - even if we cant do all the tests. Do we have info on setting this up with simulator? I did have some instruction - once! After than I suggest some benchmark that would produce more normal code and also give qualitative indications of performance (size is easy, speed would be nice). Finally, regression tests using testcases and bug reports. Hi All, Some points: - Yes, GCC does have a Regression Test Suite, and it can execute for the AVR using the SimulAVR simulator. There are many, many tests that pass for the AVR. There are quite a few that don't, but most of those failures that I have looked at either the test needs fixing (because it assumes a 32-bit processor), or the tests don't apply to the AVR. Some work needs to be done to get the Regression Test Suite in shape for the AVR. - As mentioned, simulavr is known to work as a simulator for the GCC test suite. However, simulavr is not really maintained anymore. At the simulavr project on Savannah, there is a new code base called simulavrxx which is based on C++. This is maintained, but it could use help: It doesn't run on Cygwin yet, and AFAIK it cannot run the GCC Test Suite yet. Any help on this is deeply appreciated. https://savannah.nongnu.org/projects/simulavr I strongly recommend that the wheel not be reinvented. If people are interested in running the GCC Regression Test Suite, I would recommend using available tools, and improving the available tools rather then invent new ones. I have instructions on running the GCC Regression Test Suite (from Bjoern Haase, IIRC). I have yet to run it myself, but others have done so successfully. 
However, there are reports about difficulties on running the test on Cygwin. I have heard that it is successful on Linux. There is a person from Belgium, Mike Stein, who has been running the GCC Test Suite for the AVR pretty much on a daily basis and he has been posting the results regularly on the gcc-testresults mailing list: http://gcc.gnu.org/ml/gcc-testresults/. Just search for avr to see the results. (It looks like he last did it in December.) I'll be attempting to run the GCC Test Suite probably sometime in Q1 2008. (I'm going to be busy in January and February.) Just email me if anyone is interested in the instructions. Note that running the GCC Test Suite is imperative for anyone who works on GCC, because in order to submit any patches to the GCC project, they require that the patch is tested with the Test Suite and that there are no new regressions. That's the main purpose of the test suite. I'll start a new thread about a benchmark suite... Thanks, Eric Weddington
AVR Benchmark Test Suite [was: RE: [avr-gcc-list] GCC-AVR Register optimisations]
-Original Message- From: Dave N6NZ [mailto:[EMAIL PROTECTED] Sent: Thursday, January 10, 2008 1:35 PM To: gEDA user mailing list (by accident) Subject: Re: [avr-gcc-list] GCC-AVR Register optimisations Hi, Over the course of years I have worked in and managed various different hardware and software validation teams. This is probably a good way for me to contribute, since I'm not a compiler back-end guru, but I have relevant experience writing and reviewing test plans. Some of those test plans even targeted C compilers :) -dave -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] org] On Behalf Of John Regehr Sent: Thursday, January 10, 2008 4:34 PM To: avr-gcc-list@nongnu.org Subject: RE: [avr-gcc-list] GCC-AVR Register optimisations - I, and others, are very interested in what you are using to test your proposed changes. I have plans to put together an AVR Benchmark Suite, consisting of a variety of publicly available programs that can be used to test the compiler performance over time. It definitely needs to have different types of programs, and publicly available programs so there are no issues with distributing such a Benchmark Suite. I welcome any collaboration on this. This is excellent and much needed, Eric. I can help with some TinyOS applications. Hopefully the benchmarks can be packaged up with a simulator in order to get dynamic information as well as static. I use Avrora (first google hit for avrora) for everything, though it supports relatively few chips. Hi John, Dave, others, Here are some random thoughts about a benchmark test suite: - GCC has a page on benchmarks: http://gcc.gnu.org/benchmarks/ However all of those are geared towards larger processors and host systems. There is a link to a benchmark that focuses on code size, CSiBE, http://www.inf.u-szeged.hu/csibe/. Again, that benchmark is geared towards larger processors. 
This creates a need to have a benchmark that is geared towards 8-bit microcontroller environments in general, and specifically for the AVR. What would we like to test? Code size for sure. Everyone always seems to be interested in code size. There is an interest in seeing how the GCC compiler performs from one version to the next, to see if optimizations have improved or if they have regressed. There is also an interest in comparing AVR compilers, such as how GCC compares to IAR, Codevision or ImageCraft compilers. And sometimes there is an interest in comparing AVR against other microcontrollers, notably Microchip's PIC and TI's MSP430. Because there are these different interests, it is challenging to come up with appropriate code samples to showcase and benchmark these different issues. But we could also implement this in stages, and focus on AVR-specific code, and GCC-specific AVR code at that. If we are going to put together a benchmark test suite, like others benchmarks for GCC (for larger processors), then I would think that it would be better to model it somewhat after those other benchmarks. I see that they tend to use publicly available code, and a variety of different types of applications. We should have something similar. Some suggested projects: FreeRTOS (for the AVR), uIP (however, we need to pick a specific implementation of it for the AVR; I have a copy of uIP-Crumb644), the Atmel 802.15.4 MAC, and the GCC version of the Butterfly firmware. I also have a copy of the TI Competitive Benchmark, which they, and other semiconductor companies, have used to do comparisons between processors. I also have a copy of some Atmel internal AES code. I believe that this code is publicly available in some kits that Atmel offers, but I need to do some double-checking to make sure that it is public. 
John, I would welcome publicly available code from TinyOS, but it would need to be already compiled with nesc, so that way we just have straight C that we can feed into avr-gcc. I've been aware of avrora for several years. If it is useful for this kind of work, I'm open-minded about it. I just want to make sure that it works and is somewhat easy to use. Does anyone have other suggestions on projects to include in the Benchmark? One area that seems to be lacking is some application that uses floating point. Any help to find some application in this area would be much appreciated. There needs to be some consensus on what we measure, how we measure it, what output files we want generated, and hopefully some way to automatically generate composite results. I'm certainly open to anything in this area. I would think that we need to be as open as possible on this, with documentation (minimal, it can be a text file) on what our methods are and how the results were arrived at, and, importantly, that the secondary/generated files be available for others to review and verify the results. On practicalities: I am certainly willing to host the benchmark test
Re: [avr-gcc-list] GCC-AVR Register optimisations
Registers 17 downwards are call saved and push/popped in prescribed order by prolog/epilog functions. Also R28,29 is potential frame pointer and so that is best left alone. So the key registers are: R18-R27 R30,31 Note that in some cases it could be very interesting to use r27, or Y, register. Consider this example: char *x; volatile int y; void foo(char *p) { y += *p; } void main(void) { char *p1 = x; foo(p1++); foo(p1++); foo(p1++); foo(p1++); foo(p1++); foo(p1++); foo(p1++); foo(p1++); foo(p1++); foo(p1++); } This will generate very bad code. /* prologue: frame size=0 */ push r14 push r15 push r16 push r17 /* prologue end (size=4) */ lds r24,x lds r25,(x)+1 movw r16,r24 subi r16,lo8(-(1)) sbci r17,hi8(-(1)) call foo movw r14,r16 sec adc r14,__zero_reg__ adc r15,__zero_reg__ movw r24,r16 call foo movw r16,r14 subi r16,lo8(-(1)) sbci r17,hi8(-(1)) movw r24,r14 call foo movw r14,r16 sec adc r14,__zero_reg__ adc r15,__zero_reg__ movw r24,r16 call foo movw r16,r14 subi r16,lo8(-(1)) sbci r17,hi8(-(1)) movw r24,r14 call foo etc.. A more optimal scheme would be call foo movw r24, r16 adiw r24, 1 movw r16, r24 call foo etc.. Using the r24 capability to do a 16 bit increment But in this special case there is no frame pointer. So we could use R28 to store instead of R16. Then we can add on r28 and do something like this: call foo adiw r28, 1 movw r24, r28 call foo So yes using R28 as last resort looks like a sane thing. Unless there is no frame pointer at all, and there is a need for 16 (or 32 bit) arithmetic on saved registers. This is probably incredibly difficult. But I thought to mention it anyway HTH, Wouter ps. Writing it like foo(p); p++; Will produce better code?!? I will fill a bug report for this. With the order, there are several problems: 1) Initial register allocation fragments the register set. For example, allocating r25 will prevent R24-25 being used for 16bit register and prevent R22-25 and R24-27 being used as 32 bit registers. 
The gcc register allocator does not seem to overcome this fragmentation.
2) The situation is made worse by the order of 16bit+ register used for call and return values - which are allocated in reverse order. e.g. R24-R25, R22-24, R18-24. This means that the function parameters or return values are rarely in the right place - except for 16bit values.
3) Allocating a byte to an odd-numbered register precludes it being extended to a 16bit value without a move.

So, I tried creating an order which would preserve the contiguous register space and avoid the above issues as much as possible. This is what I ended up with: R18,26,22,30,20,24,19,21,23,25,27,31,28,29, \ 17,16,15,14,13,12,11,10,9,8,7,6,5,4,3,2,\ The result is a 1.25% saving in code size for a simple mixed application. Pretty good for such a simple change! For more floating point, the saving might well be higher as it demands more contiguous 32 bit registers. On the same basis, the current order of the call-saved registers R2-R17, dictated by the (mcall) prolog, limits further improvement and is clearly imperfect. These are used less frequently, though their cost is much higher. So it's difficult to gauge the impact. I might take a look at some intense floating point functions to see if it is worth pursuing reordering these too. Andy
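Point 3 can be checked mechanically against the proposed order. The sketch below does not model gcc's allocator at all; it only counts how many leading entries of an order are even-numbered registers whose odd partner is still unallocated at that point, i.e. how many byte values could later be widened to 16 bits in place. `widenable_prefix` and `sequential_order` are hypothetical names, and the sequential order is an illustrative baseline, not gcc's actual default.

```c
#include <assert.h>
#include <stddef.h>

/* Call-used part of the order proposed in the message above. */
const int proposed_order[] = { 18, 26, 22, 30, 20, 24,
                               19, 21, 23, 25, 27, 31 };

/* Illustrative baseline only (an assumption, not gcc's real default). */
const int sequential_order[] = { 18, 19, 20, 21, 22, 23,
                                 24, 25, 26, 27, 30, 31 };

/* Count the leading entries that are even registers whose odd partner
 * appears later in the order, so a byte placed there can still grow
 * into an aligned 16-bit pair without a move. */
size_t widenable_prefix(const int *order, size_t n)
{
    size_t i, j;

    assert(order != NULL);
    for (i = 0; i < n; i++) {
        int partner_later = 0;

        if (order[i] % 2 != 0)
            break;                      /* odd register: cannot be a low byte */
        for (j = i + 1; j < n; j++) {
            if (order[j] == order[i] + 1)
                partner_later = 1;
        }
        if (!partner_later)
            break;
    }
    return i;
}
```

With the proposed order the first six byte allocations all land on the low half of a free pair; with the sequential baseline only the first one does.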
Re: [avr-gcc-list] GCC-AVR Register optimisations
Wouter van Gulik wrote: Note that in some cases it could be very interesting to use r27, or Y, register.

Should have written R28 of course. Since gcc seems down at the moment I did some more testing. Now consider this example:

    void main(void)
    {
        char *p = x;
        foo(p); p+=65;
        foo(p); p+=65;
        foo(p); p+=65;
        foo(p); p+=65;
        foo(p); p+=65;
        foo(p); p+=65;
        foo(p); p+=65;
        foo(p); p+=65;
        foo(p); p+=65;
        foo(p); p+=65;
    }

This must be done using a subi/sbci pair. But the compiler now seems to realize that p is a constant offset to x. So we now get:

    main:
    /* prologue: frame size=0 */
        push r16
        push r17
    /* prologue end (size=2) */
        lds r16,x
        lds r17,(x)+1
        movw r24,r16
        call foo
        movw r24,r16
        subi r24,lo8(-(65))
        sbci r25,hi8(-(65))
        call foo
        movw r24,r16
        subi r24,lo8(-(130))
        sbci r25,hi8(-(130))

Here x is stored in r16 and the cumulative offset is added to R24. But if the compiler can realize this... Then why not do this for adds within the adiw range?!? So for p++/p+=1 we would get something like:

        movw r24, r16
        adiw r24, 1
        call foo
        movw r24, r16
        adiw r24, 2
    etc..

This is just as small as the earlier suggested use of R28! Wouter
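For readers wondering why 65 forces the subi/sbci pair: adiw takes a 6-bit immediate (0..63), so larger constants are added by subtracting the negated constant byte by byte with borrow. The sketch below emulates that in host C as a sanity check; `fits_adiw` and `add_via_subi_sbci` are hypothetical helper names, and LO8/HI8 merely mimic the assembler's lo8()/hi8() operators.

```c
#include <assert.h>
#include <stdint.h>

#define LO8(x) ((uint8_t)((x) & 0xFF))         /* like the assembler's lo8() */
#define HI8(x) ((uint8_t)(((x) >> 8) & 0xFF))  /* like the assembler's hi8() */

/* ADIW encodes a 6-bit immediate, so only 0..63 can be added directly. */
int fits_adiw(unsigned k)
{
    return k <= 63;
}

/* Emulate:  subi lo,lo8(-(k))
 *           sbci hi,hi8(-(k))
 * on a 16-bit register pair holding `value`. */
uint16_t add_via_subi_sbci(uint16_t value, uint16_t k)
{
    uint16_t neg = (uint16_t)(0u - k);   /* -(k) as a 16-bit constant */
    uint8_t  lo  = LO8(value);
    uint8_t  hi  = HI8(value);
    unsigned borrow = lo < LO8(neg);     /* carry flag left by SUBI */
    uint16_t result;

    lo = (uint8_t)(lo - LO8(neg));              /* SUBI */
    hi = (uint8_t)(hi - HI8(neg) - borrow);     /* SBCI consumes the borrow */
    result = (uint16_t)((hi << 8) | lo);

    /* Cross-check against a plain 16-bit addition. */
    assert(result == (uint16_t)(value + k));
    return result;
}
```

So `subi r24,lo8(-(65))` / `sbci r25,hi8(-(65))` really is "add 65", just two words long on any upper register pair, whereas `adiw` does the same in one word but only for constants up to 63 and only on r24, r26, r28 and r30.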
Re: [avr-gcc-list] GCC-AVR Register optimisations
Thanks for feedback! I will try your example latter today and see what I get. The change in register allocation order allows gcc to fixes some other things. Part of the problem in your example is the strange move away from R16 to R14: movw r14,r16 sec adc r14,__zero_reg__ adc r15,__zero_reg__ movw r24,r16 It is not obvious why this is not optimised out (unless optimisation was disabled or retricted). This normally only happens when all the higher numbered registers are used up - or it needs to preserve result across call. It should have picked a higher number register (24 or even R28) - as I would expect these to be unused - and obviously ahead of R14 in the allocation order. If the move had been made to R16 or higher, then the addition would have been simpler - or even as simple as your example. However, often looking at intermediate RTL gives some clue why. Can you tell me what optimisation setting was used? Andy Wouter van Gulik [EMAIL PROTECTED] wrote: Registers 17 downwards are call saved and push/popped in prescribed order by prolog/epilog functions. Also R28,29 is potential frame pointer and so that is best left alone. So the key registers are: R18-R27 R30,31 Note that in some cases it could be very interesting to use r27, or Y, register. Consider this example: char *x; volatile int y; void foo(char *p) { y += *p; } void main(void) { char *p1 = x; foo(p1++); foo(p1++); foo(p1++); foo(p1++); foo(p1++); foo(p1++); foo(p1++); foo(p1++); foo(p1++); foo(p1++); } This will generate very bad code. 
/* prologue: frame size=0 */ push r14 push r15 push r16 push r17 /* prologue end (size=4) */ lds r24,x lds r25,(x)+1 movw r16,r24 subi r16,lo8(-(1)) sbci r17,hi8(-(1)) call foo movw r14,r16 sec adc r14,__zero_reg__ adc r15,__zero_reg__ movw r24,r16 call foo movw r16,r14 subi r16,lo8(-(1)) sbci r17,hi8(-(1)) movw r24,r14 call foo movw r14,r16 sec adc r14,__zero_reg__ adc r15,__zero_reg__ movw r24,r16 call foo movw r16,r14 subi r16,lo8(-(1)) sbci r17,hi8(-(1)) movw r24,r14 call foo etc.. A more optimal scheme would be call foo movw r24, r16 adiw r24, 1 movw r16, r24 call foo etc.. Using the r24 capability to do a 16 bit increment But in this special case there is no frame pointer. So we could use R28 to store instead of R16. Then we can add on r28 and do something like this: call foo adiw r28, 1 movw r24, r28 call foo So yes using R28 as last resort looks like a sane thing. Unless there is no frame pointer at all, and there is a need for 16 (or 32 bit) arithmetic on saved registers. This is probably incredibly difficult. But I thought to mention it anyway HTH, Wouter ps. Writing it like foo(p); p++; Will produce better code?!? I will fill a bug report for this. With the order, there are several problems: 1) Initial register allocation fragments the register set. For example, allocating r25 will prevent R24-25 being used for 16bit register and prevent R22-25 and R24-27 being used as 32 bit registers. gcc register allocator does not seem to overcome this fragmentation. 2) The situation is made worse by the order of 16bit+ register used for call and return values - which are allocated in reverse order. eg R24-R25, R22-24, R18-24. This means that the function parameters or return values are rarely in the right place - except for 16bit values. 3) Allocating a byte to odd number register precluded it being extended to 16bit value without a move. So, I tried creating an order which would preserve the contiguous register space and avoid the above issues as much as possible. 
This is what I ended up with: R18,26,22,30,20,24,19,21,23,25,27,31,28,29, \ 17,16,15,14,13,12,11,10,9,8,7,6,5,4,3,2,\ The result is a 1.25% saving in code size for a simple mixed application. Pretty good for such a simple change! For more floating point, the saving might well be higher as it demands more contiguous 32 bit registers. On the same basis, the current order of called saved registers R2-R17 dictated by (mcall) prolog limit further improvement is clearly imperfect. These are used less frequently, though their cost is much higher. So its difficult to gauge impact. I might take a look at some intense floating point functions to see if this if it is worth pursuing reordering these too. Andy ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org
RE: [avr-gcc-list] GCC-AVR Register optimisations
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] org] On Behalf Of Andrew Hutchinson Sent: Wednesday, January 09, 2008 6:52 PM To: avr-gcc-list@nongnu.org Subject: [avr-gcc-list] GCC-AVR Register optimisations Hi all, just spend some days going over gcc-avr and missed optimizations. One area I looked at was register allocation snip So is there an better order? I certainly appreciate all your effort in looking at missed optimizations. I know that not very many people are able to look into this area in AVR GCC right now. However, I'd like to bring up a few points: - Changing the register order, while it seems promising, introduces a major backwards incompatibility. Avr-libc is written in mostly hand-optimized assembly, which means that C functions in the application call assembly routines in avr-libc. Changing the register order means a complete overhaul of avr-libc; something that is not likely to happen quickly or without a lot of effort. Would you be prepared to help take this on? - There are many missed optimization bugs in the bug database that probably could be fixed without resorting to changing the register order. These are definitely real world problems that need to be fixed. http://www.nongnu.org/avr-libc/bugs.html - I, and others, are very interested in what you are using to test your proposed changes. I have plans to put together an AVR Benchmark Suite, consisting of a variety of publicly available programs that can be used to test the compiler performance over time. It definitely needs to have different types of programs, and publicly available programs so there are no issues with distributing such a Benchmark Suite. I welcome any collaboration on this. - The bug list http://www.nongnu.org/avr-libc/bugs.html has a number of bugs that are wrong-code bugs or bugs that generate an internal compiler error on valid code (ICE-on-valid-code). 
These bugs are much more important to fix right now than tackling the various missed optimization bugs. These higher priority bugs show where the compiler, or AVR back end of the compiler, is *failing*. Any help in fixing these would be very much appreciated. IMHO, after these high-priority bugs get fixed, then it would be worthwhile to start looking at fixing missed optimizations. Eric Weddington
RE: [avr-gcc-list] GCC-AVR Register optimisations
Weddington wrote: -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] org] On Behalf Of Andrew Hutchinson Sent: Wednesday, January 09, 2008 6:52 PM To: avr-gcc-list@nongnu.org Subject: [avr-gcc-list] GCC-AVR Register optimisations Hi all, just spend some days going over gcc-avr and missed optimizations. One area I looked at was register allocation snip So is there an better order? I certainly appreciate all your effort in looking at missed optimizations. I know that not very many people are able to look into this area in AVR GCC right now. However, I'd like to bring up a few points: - Changing the register order, while it seems promising, introduces a major backwards incompatibility. Avr-libc is written in mostly hand-optimized assembly, which means that C functions in the application call assembly routines in avr-libc. Changing the register order means a complete overhaul of avr-libc; something that is not likely to happen quickly or without a lot of effort. Would you be prepared to help take this on? No!. Changing the order has not effect of the registers used to call functions or return values. They are separately controlled in back end. The allocation order refers to the order in which registers are used for intermediate values or locals. So even if order starts with R18, functions will still expect and return an int in R24-25. Changing the lower registers (CALL SAVED) does introduce a libgcc incompatibility, in that the routines for prolog/epilog invoked by -mcall-prolog assume that these registers are push/popped in a contigous sequence starting with R17. So changing to a different allocation order may well incur more push/ops than needed - unless prolog/epilog push/pop order was changed to match the same order. For this reason, I have left that alone. (although its not a big deal). prolog/epilog call are parts of gcc not replaced by libc. So even that would leaves libc untouched. 
Of course, any C still used in libc would benefit from recompilation.

- There are many missed optimization bugs in the bug database that probably could be fixed without resorting to changing the register order. These are definitely real world problems that need to be fixed. http://www.nongnu.org/avr-libc/bugs.html

Yes, I have tried to look at underlying problems rather than concentrate on specifics. That way, you fix more problems.

- I, and others, are very interested in what you are using to test your proposed changes. I have plans to put together an AVR Benchmark Suite, consisting of a variety of publicly available programs that can be used to test the compiler performance over time. It definitely needs to have different types of programs, and publicly available programs so there are no issues with distributing such a Benchmark Suite. I welcome any collaboration on this.

Absolutely!

- The bug list http://www.nongnu.org/avr-libc/bugs.html has a number of bugs that are wrong-code bugs or bugs that generate an internal compiler error on valid code (ICE-on-valid-code). These bugs are much more important to fix right now than tackling the various missed optimization bugs. These higher priority bugs show where the compiler, or AVR back end of the compiler, is *failing*. Any help in fixing these would be very much appreciated. IMHO, after these high-priority bugs get fixed, then it would be worthwhile to start looking at fixing missed optimizations.

I have not ignored the higher priority bugs. Indeed you have my patch for register spill. The register allocation order is an offshoot from this to cover the possibility that the patch would produce less optimal code. Some of the other bugs are less easy to reproduce on newer versions of gcc - also, fixing one problem often prevents the other occurring. And having multiple gcc/winavr versions is tricky enough with 2. I have some WIP for other bugs - but have not posted any resolution yet.
Eric Weddington ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
RE: [avr-gcc-list] GCC-AVR Register optimisations
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Sent: Thursday, January 10, 2008 11:48 AM To: avr-gcc-list@nongnu.org; Weddington, Eric Subject: RE: [avr-gcc-list] GCC-AVR Register optimisations

- Changing the register order, while it seems promising, introduces a major backwards incompatibility. Avr-libc is written in mostly hand-optimized assembly, which means that C functions in the application call assembly routines in avr-libc. Changing the register order means a complete overhaul of avr-libc; something that is not likely to happen quickly or without a lot of effort. Would you be prepared to help take this on?

No! Changing the order has no effect on the registers used to call functions or return values. They are separately controlled in the back end. The allocation order refers to the order in which registers are used for intermediate values or locals. So even if the order starts with R18, functions will still expect and return an int in R24-25. Changing the lower registers (CALL SAVED) does introduce a libgcc incompatibility, in that the prolog/epilog routines invoked by -mcall-prologues assume that these registers are pushed/popped in a contiguous sequence starting with R17. So changing to a different allocation order may well incur more push/pops than needed, unless the prolog/epilog push/pop order was changed to match. For this reason, I have left that alone (although it's not a big deal). The prolog/epilog calls are part of gcc and are not replaced by libc, so even that would leave libc untouched. Of course, any C still used in libc would benefit from recompilation.

Thanks for the clarification. It certainly helps that the call order essentially won't change. I'm still not fond of the idea of having to change libgcc. It brings up a whole host of issues of synchronizing these changes and introducing them to the end user.

- The bug list http://www.nongnu.org/avr-libc/bugs.html has a number of bugs that are wrong-code bugs or bugs that generate an internal compiler error on valid code (ICE-on-valid-code). These bugs are much more important to fix right now than tackling the various missed optimization bugs. These higher priority bugs show where the compiler, or AVR back end of the compiler, is *failing*. Any help in fixing these would be very much appreciated. IMHO, after these high-priority bugs get fixed, then it would be worthwhile to start looking at fixing missed optimizations.

I have not ignored the higher priority bugs. Indeed you have my patch for register spill.

Thanks again! I hope that your patch can be reviewed soon. :-)

The register allocation order is an offshoot of this, to cover the possibility that the patch would produce less optimal code. Some of the other bugs are less easy to reproduce on newer versions of gcc; also, fixing one problem often prevents another from occurring. And having multiple gcc/winavr versions installed is tricky enough with two. I have some WIP for other bugs, but have not posted any resolution yet.

I look forward to seeing your work when it's ready! :-)

Eric Weddington
Re: [avr-gcc-list] GCC-AVR Register optimisations
Ok, I checked the instruction patterns in GCC AVR.MD, and use of the ADIW registers is marked ! for ADD 16 bits, ADD 32 bits and TEST 16 bits. This means that it will not be used by reload and it will be a second/third choice elsewhere. Which seems to match your observations! It will also push allocation away from R24-R30, which might explain why R14 was getting used. I looked back through the change history and this has been there since the original. It could be that it fixes a problem that no longer exists. For sure it will produce poor code as you describe. It so happens I noticed this the other day and removed it from my working copy (to see if anything bad happened and also to smoke test my patch for BASE_POINTER register spill, since I wanted to force more use of pointer registers). Nothing bad has happened so far. I will post results later.

Wouter van Gulik [EMAIL PROTECTED] wrote:

Wouter van Gulik schreef: Note that in some cases it could be very interesting to use r27, or Y, register.

Should have written R28 of course. Since gcc seems down at the moment I did some more testing. Now consider this example:

    void main(void)
    {
        char *p = x;
        foo(p); p+=65;
        foo(p); p+=65;
        foo(p); p+=65;
        foo(p); p+=65;
        foo(p); p+=65;
        foo(p); p+=65;
        foo(p); p+=65;
        foo(p); p+=65;
        foo(p); p+=65;
        foo(p); p+=65;
    }

This must be done using a subi/sbci pair. But the compiler now seems to realize that p is a constant offset to x. So we now get:

    main:
    /* prologue: frame size=0 */
        push r16
        push r17
    /* prologue end (size=2) */
        lds r16,x
        lds r17,(x)+1
        movw r24,r16
        call foo
        movw r24,r16
        subi r24,lo8(-(65))
        sbci r25,hi8(-(65))
        call foo
        movw r24,r16
        subi r24,lo8(-(130))
        sbci r25,hi8(-(130))

Here x is stored in r16 and the cumulative offset is added to R24. But if the compiler can realize this... then why not do this for adds within the adiw range?!? So for p++/p+=1 we would get something like:

        movw r24, r16
        adiw r24, 1
        call foo
        movw r24, r16
        adiw r24, 2
        etc..

This is just as small as the earlier suggested use of R28!

Wouter
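The "adiw range" mentioned above comes from the instruction encoding: ADIW takes a 6-bit immediate (0 to 63) and only operates on the register pairs starting at r24, r26, r28 and r30, which is why the p+=65 example has to fall back to subi/sbci. A minimal C sketch of that check follows; the helper name fits_adiw is hypothetical and is not a GCC function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not part of GCC: returns true when "add constant k
   to the 16-bit value held in the pair starting at base_reg" can be
   encoded as a single AVR ADIW instruction. */
bool fits_adiw(int base_reg, long k)
{
    assert(base_reg >= 0 && base_reg <= 30);
    /* ADIW only accepts the upper pairs r25:r24, X (r26), Y (r28), Z (r30). */
    bool pair_ok = (base_reg == 24 || base_reg == 26 ||
                    base_reg == 28 || base_reg == 30);
    /* The immediate field is 6 bits wide: 0..63. */
    bool imm_ok = (k >= 0 && k <= 63);
    return pair_ok && imm_ok;
}
```

With this model, fits_adiw(24, 1) holds while fits_adiw(24, 65) and fits_adiw(16, 1) do not, matching the observation that a pointer kept in r16 needs a movw into r24 before adiw can be used.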
[Fwd: Re: [avr-gcc-list] GCC-AVR Register optimisations]
Well, after spamming the wrong list with this, I hope I've got the right place now :( -dave ---BeginMessage--- Weddington, Eric wrote: - I, and others, are very interested in what you are using to test your proposed changes. I have plans to put together an AVR Benchmark Suite, consisting of a variety of publicly available programs that can be used to test the compiler performance over time. It definitely needs to have different types of programs, and publicly available programs so there are no issues with distributing such a Benchmark Suite. I welcome any collaboration on this. Hi, Over the course of years I have worked in and managed various different hardware and software validation teams. This is probably a good way for me to contribute, since I'm not a compiler back-end guru, but I have relevant experience writing and reviewing test plans. Some of those test plans even targeted C compilers :) -dave ---End Message---
Re: [Fwd: Re: [avr-gcc-list] GCC-AVR Register optimisations]
Here is my input: For starters, gcc has a testsuite that can be used. It's not perfect but it's quite demanding, even if we can't do all the tests. Do we have info on setting this up with a simulator? I did have some instructions, once! After that I suggest some benchmark that would produce more normal code and also give qualitative indications of performance (size is easy, speed would be nice). Finally, regression tests using testcases and bug reports. Andy
[Fwd: Re: [avr-gcc-list] GCC-AVR Register optimisations]
I tried the earlier example:

    char *p1 = x;
    foo(p1++);
    foo(p1++);
    foo(p1++);
    etc

Even with a different register allocation, the result is bad.

a) The basic flow that gcc creates is something like:

    b=a+1; R24=a; foo(R24);
    c=b+1; R24=b; foo(R24)

This needs 1 variable to be saved over the call. But due to overlapped lifetimes, it creates 2. For example, both a and b must exist at the same time. If it had reversed the ordering, it would not need this. For example:

    R24=a; b=a+1; foo(R24);
    R24=b; c=b+1; foo(R24)

With reordering, when b is created, a is dead. So we only need 1 register. I am not experienced enough to know why gcc cannot optimise this case. But it looks like a weakness with gcc (not gcc-avr).

b) The register costs used to preference allocation are all equal for AVR, so there is no preference for ADIW regs (even when I removed !w). So the backend does not indicate a preference between R16=R16+1, R14=R14+1 or R28=R28+1. In current gcc, the frame pointer (r28-29) does not get used for register allocation; clearly that would be the best call-saved register, which could use ADIW and avoid moves. It looks like allocation is made with the frame pointer in use. Then, if the frame pointer is not required, it still does not use R28-29. But it does not retry allocation without frame_pointer.

I tried the improved foo(p);p++; it produces much better code. Still not using R28 (for the same reason). In this case, the increment is specified after the function call, so we don't have overlapped lifetimes of registers; only one is then used and all becomes simple.

Andy

PS Please report as a bug - gcc should be better than this.

Wouter van Gulik wrote:

The RTL dump will tell me why it chose R14 before.

What do you mean with RTL dump exactly? I tried looking through some dumps but I could not make sense of it. I used -dP and --save-temps. But all looked the same to me.

I recollect there were some odd ! markers that stop the possibility of ADIW registers being used by reload for certain operations. That might be the reason. If so I'll have to dig out why they were put in - maybe to fix some other problem.

Well, for all possible ADIW uses (addsi, addhi) it's a !w. If this could be undone, much pointer arithmetic could be done better, I guess/hope. Any clue on why foo(p++) gives even poorer code compared to foo(p); p++?

HTH, Wouter
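For the bug report requested above, the two call patterns can be put side by side in one small file. This is only a sketch: the names call_postinc, call_then_inc, seen and n_seen are made up for illustration, and foo is defined here so the file runs on a host; when inspecting avr-gcc output, foo would presumably need to live in another translation unit so the calls are not inlined.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative recorder: remembers every pointer passed to foo(). */
const char *seen[8];
size_t n_seen;

void foo(const char *p)
{
    assert(n_seen < sizeof seen / sizeof seen[0]);
    seen[n_seen++] = p;
}

/* Pattern 1: increment inside the call expression. Old and new value of p
   are both live across the call, which is the overlap described above. */
void call_postinc(const char *x)
{
    const char *p = x;
    foo(p++);
    foo(p++);
    foo(p++);
}

/* Pattern 2: increment after the call. Only one value of p is live at
   any time. */
void call_then_inc(const char *x)
{
    const char *p = x;
    foo(p); p++;
    foo(p); p++;
    foo(p); p++;
}
```

Both functions pass exactly the same three pointers to foo, so any difference in the generated code is down to the compiler rather than the semantics.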
Re: [Fwd: Re: [avr-gcc-list] GCC-AVR Register optimisations]
Andrew Hutchinson wrote:

Here is my input: For starters, gcc has a testsuite that can be used. It's not perfect but it's quite demanding, even if we can't do all the tests.

Yes, we could probably pull a subset of meaningful tests from that, which would give us a great start. The hard/tedious part will be generating our expected results.

After that I suggest some benchmark that would produce more normal code and also give qualitative indications of performance (size is easy, speed would be nice). Finally, regression tests using testcases and bug reports.

Performance regressions are particularly nasty to test. First you have to have code that can sensitize an optimization that you want. Then, you need to have some expected results. The compare of actual to expected is difficult -- what are you measuring? And how close is good enough? Exact clock count? Clock count within a guard band? A specific assembler sequence that must happen (but actual registers used don't matter)? Some registers matter? Then you have the test matrix... what is the expected result with -Os versus -O3? etc, etc, etc.

All this presumes a test framework that can capture all the required outputs to check against expected results: .S, clock count, code size, exit code. Oh, and getting the right numerical (or whatever) answer... and are we getting it for the right reason? The compiler might show a great speed and code size improvement by mistakenly ignoring the volatile keyword in some optimization, and a simulator might never show a wrong answer :(

Having been purely a gcc and avr-gcc user, I don't have any idea if any parts of this framework already exist in a form that addresses these problems. Anyway, the test matrix is huge, and a simulator is going to limit the ultimate performance of the test rig. So we won't be able to do everything we can think of. On the plus side of the ledger: our audience, being embedded developers, have very specific needs, so that can help prioritize.

OK, this e-mail was a shotgun blast of issues completely devoid of specifics :) :)

Andy

-dave
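One of the questions raised above, "clock count within a guard band?", can be made concrete with a small comparison helper. This is a sketch under stated assumptions: the function name within_guard_band and the percentage-based band are illustrative choices, not something an existing framework is known to provide, and the counts are assumed small enough that multiplying by 100 does not overflow.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative check: is the measured cycle count within +/- band_percent
   of the expected cycle count? */
bool within_guard_band(unsigned long actual, unsigned long expected,
                       unsigned band_percent)
{
    assert(band_percent <= 100);
    unsigned long diff = actual > expected ? actual - expected
                                           : expected - actual;
    /* diff / expected <= band / 100, rewritten without division so that
       an expected count of zero is handled. */
    return diff * 100UL <= expected * (unsigned long)band_percent;
}
```

For instance, a measured 530 cycles against an expected 526 passes a 1% band, while 1285 against 1197 fails a 5% band. An exact-match policy is simply a band of 0.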
Re: [Fwd: Re: [avr-gcc-list] GCC-AVR Register optimisations]
Your input is valued. For speed/size I was thinking we need to have sanity checks, as gcc versions have been known to suddenly dump a bunch of unexpected data in the middle of a linked program. Of course all the functional tests still give correct results and all looks good! The second use would be evaluating the impact of a patch, so we can see if it causes significant degradations in other areas. Qualitative perhaps, but the alternative is nothing. The gcc test suite has a full set of answers and a framework. We may not pass all the marked applicable tests due to limited memory or unimplemented functions. Yet even with this, functional regressions can easily be seen. Andy
Re: [Fwd: Re: [avr-gcc-list] GCC-AVR Register optimisations]
Andrew Hutchinson wrote:

For speed/size I was thinking we need to have sanity checks, as gcc versions have been known to suddenly dump a bunch of unexpected data in the middle of a linked program. Of course all the functional tests still give correct results and all looks good!

Yup. A common compiler testing problem. Optimizations are always a balancing act, since some that improve speed increase code size and so forth.

The second use would be evaluating the impact of a patch, so we can see if it causes significant degradations in other areas. Qualitative perhaps, but the alternative is nothing.

I agree. If there were a good test set that could be applied to a patch in isolation it would be very useful. Even just a qualitative dashboard report reveals good information, especially after everyone gets used to seeing the same data presented in the same format every time.

The test matrix deserves some thought. The same test might have different pass/fail criteria under different options. For example, imagine a suite of 10 tests. You might say 10/10 must show zero code growth under -Os, 7/10 show no speed degradation under -Os, the same 10 tests must show zero slowdown for 10/10 under -O3, 6/10 no code growth under -O3. (Just an example, may not be realistic.)

W.r.t. options, we might also want to have a depth-first test set and a breadth-first test set. Depth first being going after specific test points with targeted tests, and breadth first being a few tests under a broad range of conditions.

The gcc test suite has a full set of answers and a framework. We may not pass all the marked applicable tests due to limited memory or unimplemented functions. Yet even with this, functional regressions can easily be seen.

I'll have to go take a look at the framework. So far, I've been a trusting user: ./configure; make all; and get a cup of tea.

-dave
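The "N out of 10" style of criteria described above could be evaluated along these lines. It is a rough sketch only: struct result, the field names and meets_criteria are hypothetical, and each delta is assumed to be the patched measurement minus the baseline for one test under one option set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-test record: patched minus baseline. */
struct result {
    long size_delta;    /* bytes of code growth (negative = smaller) */
    long cycles_delta;  /* extra clock cycles (negative = faster)    */
};

/* Number of tests that did not grow in code size. */
size_t count_no_growth(const struct result *r, size_t n)
{
    assert(r != NULL || n == 0);
    size_t ok = 0;
    for (size_t i = 0; i < n; i++)
        if (r[i].size_delta <= 0)
            ok++;
    return ok;
}

/* Number of tests that did not get slower. */
size_t count_no_slowdown(const struct result *r, size_t n)
{
    assert(r != NULL || n == 0);
    size_t ok = 0;
    for (size_t i = 0; i < n; i++)
        if (r[i].cycles_delta <= 0)
            ok++;
    return ok;
}

/* One cell of the test matrix, e.g. for -Os: min_no_growth = 10,
   min_no_slowdown = 7 in the example from the message. */
bool meets_criteria(const struct result *r, size_t n,
                    size_t min_no_growth, size_t min_no_slowdown)
{
    return count_no_growth(r, n) >= min_no_growth &&
           count_no_slowdown(r, n) >= min_no_slowdown;
}
```

Each option set (-Os, -O3, ...) would then get its own results array and its own pair of thresholds, which keeps the pass/fail policy separate from the measurements.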
Re: [Fwd: Re: [avr-gcc-list] GCC-AVR Register optimisations]
On Friday 11 January 2008 11:38, Andrew Hutchinson wrote:

I tried the earlier example: char *p1 = x; foo(p1++); foo(p1++); foo(p1++); [...] I am not experienced enough to know why gcc cannot optimise this case. But it looks like a weakness with gcc (not gcc-avr)

Possibly, yes. Some other targets with GCC 4.2.2:

    pdp11     -- ugly
    arm/thumb -- ugly
    arm/arm   -- OK

Regards, Dmitry
[avr-gcc-list] GCC-AVR Register optimisations
Hi all, just spent some days going over gcc-avr and missed optimizations. One area I looked at was register allocation - this is not gcc's strong point. However, the current settings we use are making life more difficult than it needs to be. The current order is:

    R24,25,\
    18,19,\
    20,21,\
    22,23,\
    30,31,\
    26,27,\
    28,29,\
    17,16,15,14,13,12,11,10,9,8,7,6,5,4,3,2,\

You can tweak it with the gcc-avr/winavr compile option -morder1 to get a better result (or -morder2 to get a much worse one!)

So is there a better order? Registers 17 downwards are call saved and pushed/popped in a prescribed order by the prolog/epilog functions. Also R28,29 is the potential frame pointer and so that is best left alone. So the key registers are: R18-R27 and R30,31.

With the current order, there are several problems:

1) Initial register allocation fragments the register set. For example, allocating R25 will prevent R24-25 being used as a 16-bit register and prevent R22-25 and R24-27 being used as 32-bit registers. The gcc register allocator does not seem to overcome this fragmentation.

2) The situation is made worse by the order of the 16-bit+ registers used for call and return values, which are allocated in reverse order, e.g. R24-R25, R22-R25, R18-R25. This means that the function parameters or return values are rarely in the right place, except for 16-bit values.

3) Allocating a byte to an odd-numbered register precludes it being extended to a 16-bit value without a move.

So, I tried creating an order which would preserve the contiguous register space and avoid the above issues as much as possible. This is what I ended up with:

    R18,26,22,30,20,24,19,21,23,25,27,31,28,29, \
    17,16,15,14,13,12,11,10,9,8,7,6,5,4,3,2,\

The result is a 1.25% saving in code size for a simple mixed application. Pretty good for such a simple change! For more floating point, the saving might well be higher as it demands more contiguous 32-bit registers.

On the same basis, the current order of the call-saved registers R2-R17, dictated by the (-mcall-prologues) prolog, limits further improvement and is clearly imperfect. These are used less frequently, though their cost is much higher. So it's difficult to gauge the impact. I might take a look at some intense floating point functions to see if it is worth pursuing reordering these too.

Andy
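Point 3 of the message above (a byte in an odd register cannot be widened to 16 bits without a move) can be illustrated with a toy model. This sketch is not GCC's allocator: it simply hands out registers strictly in the given order, one byte value each, and counts how many of those bytes sit in an even register whose odd partner is still free. The function and array names are made up for the illustration; the two orders are the call-used prefixes quoted in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: place n_bytes single-byte values in the first n_bytes
   registers of `order`, then count the values that could be widened in
   place to 16 bits (even register, odd neighbour unused). */
int extendable_bytes(const int *order, size_t order_len, size_t n_bytes)
{
    bool used[32] = { false };
    assert(n_bytes <= order_len);
    for (size_t i = 0; i < n_bytes; i++) {
        assert(order[i] >= 0 && order[i] < 32);
        used[order[i]] = true;
    }
    int count = 0;
    for (size_t i = 0; i < n_bytes; i++) {
        int r = order[i];
        if (r % 2 == 0 && !used[r + 1])
            count++;
    }
    return count;
}

/* Call-used part of the current and the proposed allocation order. */
const int current_order[]  = { 24, 25, 18, 19, 20, 21, 22, 23,
                               30, 31, 26, 27, 28, 29 };
const int proposed_order[] = { 18, 26, 22, 30, 20, 24, 19, 21,
                               23, 25, 27, 31, 28, 29 };
```

With four byte values, the current order fills R24, R25, R18, R19 and leaves none of them widenable in place, whereas the proposed order uses R18, R26, R22, R30 and all four can still grow into their odd partner. The model ignores lifetimes, spills and call conventions, so it only shows the direction of the effect, not its size.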
Re: [avr-gcc-list] GCC-AVR Register optimisations
Andrew Hutchinson wrote:

Hi all, just spent some days going over gcc-avr and missed optimizations.

Which is a huge bunch of work! Thanks!

snip The result is a 1.25% saving in code size for a simple mixed application. Pretty good for such a simple change!

Very good! What were you using as a test case? 1.25% code size is very significant, since it translates into a speed improvement as well, so a 1.25% reduction of an inner loop can buy even more of a speed-up.

For more floating point, the saving might well be higher as it demands more contiguous 32-bit registers.

That makes sense. I don't have any good test cases for 32-bit AVR code. But actually, simply measuring libm is probably a very good test case in a practical sense.

-dave
Re: [avr-gcc-list] GCC-AVR Register optimisations
Very interesting! I will try this new order with Avr-libc's C-functions (probably this coming weekend). Today (with default order) the results are:

AVR: at90s8515                             |  atmega8
GCC: 3.3.6 3.4.6 4.0.4 4.1.2 4.2.2 4.3.X  |  3.3.6 3.4.6 4.0.4 4.1.2 4.2.2 4.3.X

--- bsearch(z,s,sizeof(s),1,cmp)
Flh:   272   270   266   266   266   268  |    214   212   208   208   208   204
Stk:    16    16    16    16    16    16  |     16    16    16    16    16    16
Tim:   526   533   530   530   530   530  |    327   334   331   331   331   327

--- dtostre(1.2345,s,6,0)
Flh:  1000   998  1104  1194  1184  1174  |    932   930  1020  1094  1088  1086
Stk:    15    15    15    17    17    17  |     15    15    15    17    17    17
Tim:  1197  1197  1285  1313  1313  1290  |   1058  1058  1119  1152  1152  1143

--- dtostrf(1.2345,15,6,s)
Flh:  1666  1688  1696  1668  1676  1690  |   1512  1528  1568  1544  1548  1566
Stk:    35    38    38    38    36    39  |     35    38    38    38    36    39
Tim:  1670  1621  1667  1607  1608  1618  |   1480  1437  1493  1444  1443  1456

--- free(p)
Flh:   540   548   544   544   556   568  |    486   494   500   500   508   512
Stk:     4     4     4     4     4     4  |      4     4     4     4     4     4
Tim:   222   229   229   229   231   229  |    201   208   211   211   212   210

--- malloc(1)
Flh:   540   548   544   544   556   568  |    486   494   500   500   508   512
Stk:     2     4     4     4     4     4  |      2     4     4     4     4     4
Tim:   186   193   195   195   197   195  |    167   174   178   178   179   177

--- qsort(s,sizeof(s),1,cmp)
Flh:  1314  1302  1220  1222  1242  1496  |   1086  1078   994   996  1008  1268
Stk:    36    36    36    36    38    40  |     36    36    36    36    38    40
Tim: 21915 21896 20182 20474 20914 21091  |  16976 16964 16002 16294 16678 16926

--- rand()
Flh:   548   548   498   492   508   498  |    508   508   478   480   484   456
Stk:    18    18    18    18    18    18  |     18    18    18    18    18    18
Tim:  1505  1505  1484  1484  1488  1484  |   1497  1497  1482  1482  1484  1475

--- realloc((void*)0,1)
Flh:  1162  1170  1156  1130  1156  1166  |   1036  1046  1044  1032  1046  1046
Stk:    18    20    20    18    20    22  |     18    20    20    18    20    22
Tim:   293   300   302   294   304   311  |    269   276   280   272   281   288

--- sprintf_min(s,%d,12345)
Flh:  1306  1272  1292  1210  1216  1274  |   1158  1138  1160  1082  1086  1142
Stk:    55    55    54    53    59    54  |     55    55    54    53    59    54
Tim:  1847  1841  1805  1811  1846  1801  |   1706  1703  1673  1678  1711  1666

--- sprintf(s,%d,12345)
Flh:  1720  1696  1704  1642  1674  1608  |   1534  1514  1524  1462  1498  1422
Stk:    54    54    57    57    58    57  |     54    54    57    57    58    57
Tim:  1633  1627  1639  1618  1610  1623  |   1545  1543  1555  1536  1528  1537

--- sprintf_flt(s,%e,1.2345)
Flh:  3422  3372  3300  3262  3334  3320  |   3130  3088  3006  2968  3040  2966
Stk:    61    61    63    64    66    66  |     61    61    63    64    66    67
Tim:  2503  2496  2500  2482  2513  2492  |   2281  2277  2282  2263  2297  2302

--- sscanf_min(12345,%d,i)
Flh:  1468  1466  1486  1484  1498  1504  |   1334  1334  1350  1352  1360  1364
Stk:    49    49    53    53    59    55  |     49    49    53    53    59    55
Tim:  1681  1672  1640  1638  1643  1685  |   1371  1367  1359  1357  1356  1392

--- sscanf(12345,%d,i)
Flh:  1792  1784  1908  1874  1848  1876  |   1624  1616  1700  1674  1662  1668
Stk:    50    50    54    54    61    56  |     50    50    54    54    61    56
Tim:  1715  1736  1714  1694  1749  1751  |   1413  1434  1432  1417  1461  1463

--- sscanf_flt(1.2345,%e,x)
Flh:  4134  4094  4190  4114  4220  4472  |   3802  3762  3808  3772  3856  4086
Stk:   124   124   126   128   140   132