My understanding is that FP scheduling will not give you significant 
speed gains. The Geode already has an out-of-order execution unit, so 
no matter in what order you throw ops at it, they will take almost the 
same number of clocks to execute. You can switch on -march=geode, shave 
about 1% off the size of the code and gain a little speed from the 
extra free cache, and that's all.

What you can do:
- fix build scripts (that 1% speed gain will not hurt)
- replace double with float
- kill operations not needed (rewrite algorithms)
- replace integer divide with mul/FP/3dnow where appropriate
- put a lot of prefetch ops into the appropriate places (very important!!!)
- measure real execution times and schedule FP ops ahead of time (even 
if their results turn out not to be needed)
- use 3dnow ops (gcc will not generate them for you), but you have to 
measure their real execution times as well
(The last three require inline asm.)
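
A rough sketch in C of two of the items above (integer divide replaced 
by a multiply, and software prefetch), in case it helps. The names 
recip16, recip16_init, recip16_div, sum_prefetch and PREFETCH_AHEAD are 
made up for illustration, the 16-bit limit is my own assumption to keep 
the reciprocal exact, and the prefetch distance is a guess that has to 
be tuned by measurement on real hardware. __builtin_prefetch is the 
gcc builtin; which instruction it turns into depends on the -march 
setting.

```c
#include <stddef.h>
#include <stdint.h>

/* Precomputed reciprocal for a divisor that stays fixed across many
 * divisions. Hypothetical helper type, not part of any library. */
typedef struct {
    uint64_t magic;   /* floor(2^32 / d) + 1 */
} recip16;

/* d must be in 1..65535. This is the only place a real divide happens. */
static recip16 recip16_init(uint16_t d)
{
    recip16 r = { (UINT64_C(1) << 32) / d + 1 };
    return r;
}

/* Exact x / d for any 16-bit x: one multiply and one shift, no divide.
 * Exactness relies on x * d < 2^32, which holds for 16-bit operands. */
static uint16_t recip16_div(uint16_t x, recip16 r)
{
    return (uint16_t)(((uint64_t)x * r.magic) >> 32);
}

/* How many elements ahead to prefetch. A guess, tune it by measuring. */
#define PREFETCH_AHEAD 64

/* Sum a float array while asking for data a fixed distance ahead.
 * Issuing one prefetch per element is wasteful; one per cache line
 * would be enough, but this keeps the sketch short. */
static float sum_prefetch(const float *a, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; i++) {
#if defined(__GNUC__)
        if (i + PREFETCH_AHEAD < n)
            __builtin_prefetch(&a[i + PREFETCH_AHEAD], 0, 0);
#endif
        s += a[i];
    }
    return s;
}
```

Note that gcc already turns a divide by a compile-time constant into a 
multiply, so the reciprocal trick only pays off when the divisor is 
known at run time but reused many times: call recip16_init(d) once 
outside the loop and recip16_div(x, r) inside it.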

If you decide to go into asm land and measure execution speeds, then 
please let me know your results. What I am interested in is the latency 
of the FP/3dnow pipeline and the real execution speeds of the 
PF2ID/PF2IW/FIST/FCOMI ops, since there is a high chance that the Geode 
databook does not match reality in that area. I would have tested 
them myself, but my login to a real XO stopped working.
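
For the measuring part, something along these lines could be a starting 
point. It is only a sketch under assumptions: ticks, min_ticks, 
kernel_fn and chain_float_to_int are names I made up, the time stamp 
counter is read through __rdtsc() from gcc's x86intrin.h (with a 
clock_gettime fallback in nanoseconds on other machines), and the 
kernel is plain C, so which instruction the compiler actually emits 
for the float-to-int conversion depends on the flags. Check the 
disassembly, or replace the loop body with inline asm, before trusting 
any number.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#include <stdint.h>
#include <time.h>

#if defined(__i386__) || defined(__x86_64__)
#include <x86intrin.h>
/* Read the CPU time stamp counter. */
static uint64_t ticks(void) { return __rdtsc(); }
#else
/* Fallback for non-x86 hosts: monotonic clock in nanoseconds. */
static uint64_t ticks(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * UINT64_C(1000000000) + (uint64_t)ts.tv_nsec;
}
#endif

typedef float (*kernel_fn)(float seed, int iters);

/* Dependent chain: every conversion needs the previous result, so the
 * loop time is bound by latency rather than throughput. */
static float chain_float_to_int(float seed, int iters)
{
    float v = seed;
    for (int i = 0; i < iters; i++) {
        int t = (int)v;          /* float -> int, truncating */
        v = (float)t + 0.5f;     /* int -> float, feeds the next round */
    }
    return v;
}

/* Keeps the compiler from discarding the kernel's result. */
static volatile float sink;

/* Run the kernel several times and keep the smallest tick count, which
 * filters out interrupts and cold-cache runs. */
static uint64_t min_ticks(kernel_fn fn, float seed, int iters, int repeats)
{
    uint64_t best = UINT64_MAX;
    for (int r = 0; r < repeats; r++) {
        uint64_t t0 = ticks();
        sink = fn(seed, iters);
        uint64_t t1 = ticks();
        if (t1 - t0 < best)
            best = t1 - t0;
    }
    return best;
}
```

Dividing min_ticks(chain_float_to_int, 3.7f, 100000, 20) by the 
iteration count gives a rough per-iteration cost for the whole chain 
(conversion, add and loop overhead together), so compare it against an 
empty or add-only chain to isolate the op you care about.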

Brian Carnes wrote:
> The developers page on the wiki
> (http://wiki.laptop.org/go/Developers_program) mentions:
>
> "compiler optimization: if you are a compiler wizard, we understand that
> the Geode lacks a specific back end code scheduler, which limits
> performance, particularly FP performance. We'd love to see work go on in
> this area which would help everyone."
>
>
> What aspects of this issue/request for help are still open?  I'll go take
> a look at the OLPC build system tonight to see what is being used (late
> versions of GCC do have some Geode -mtune/-march modes), but would love to
> be hooked into whatever project is addressing this
>
> ...Or start my own project if I'm the first to step to the plate on this
> issue.  If anyone knows any particularly lengthy floating-point dominated
> operations in the current software, let me know and we'll use them as our
> metric for improvement.
>
> Thanks,
>
>   Brian
>
>
> _______________________________________________
> Devel mailing list
> Devel@lists.laptop.org
> http://lists.laptop.org/listinfo/devel