On 8/31/06 6:28 PM, "Chris Hanson" <[EMAIL PROTECTED]> wrote:
> This isn't directly OSG, but since almost all of OSG's platforms at least
> have an X86 variant, I thought it might be of interest.
>
> Reading AMD's optimization guide (which addresses 32-bit and 64-bit
> optimizations, as well as optimizations that are generally applicable to
> all X86 CPUs, not just AMD):
>
> http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/25112.PDF
>
> Chapter 5 deals with memory and cache.
>
> 5.2 recommends aligning data members on their natural alignments. Dynamic
> memory allocations typically are already aligned on a known boundary --
> accounts seem to disagree whether this is 32-bit (long/float) or 64-bit
> (longlong/double) alignment. It may be OS-dependent.
>
> 5.5 suggests that misalignment can cause the Store-to-Load forwarding
> mechanism to be ineffective -- which is one of the main cures for the
> X86-32 CPU's terrible shortage of registers.
>
> 5.11 suggests reordering structs/classes by the size of their atomic
> members -- doubles, then floats/longs, shorts, bytes to avert this
> misalignment (using padding where necessary).
>
> Has anyone gone this route? Using AMD's CodeAnalyst tool for Windows, one
> would come to the conclusion that a lot of time is spent in CPU pipeline
> stalls. Is this an effective code optimization, or is it a lot of work for
> very little benefit on a codebase the size of OSG?

It all depends on the types and amounts of data in each struct. At a recent conference I attended, a session on optimization used this very subject as an example. The example was a class that had some integers and a string. The string sat between the integers and caused the memory layout to cross page boundaries. Moving all the integers to the top of the class got rid of all the page boundary crossings for the integers, and the before-and-after difference in speed was amazing.
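
To make the reordering idea from 5.11 concrete, here is a minimal sketch that compares the padding in two layouts of the same members. The struct names and members are hypothetical (nothing here comes from OSG or from the conference example), and the exact sizes depend on the compiler and ABI, so the code reports them rather than assuming them.

```cpp
#include <cstddef>  // offsetof, std::size_t
#include <cstdio>   // std::printf

// Hypothetical members in arbitrary declaration order. The compiler
// inserts padding so that each member lands on its natural alignment.
struct Unordered {
    char   flag;
    double scale;
    short  id;
    int    count;
    char   tag;
};

// The same members, largest first, as section 5.11 suggests.
struct Ordered {
    double scale;
    int    count;
    short  id;
    char   flag;
    char   tag;
};

// Sum of the member sizes; anything beyond this in sizeof() is padding.
constexpr std::size_t kPayloadBytes =
    sizeof(double) + sizeof(int) + sizeof(short) + 2 * sizeof(char);

constexpr std::size_t padding_unordered() { return sizeof(Unordered) - kPayloadBytes; }
constexpr std::size_t padding_ordered()   { return sizeof(Ordered) - kPayloadBytes; }

// Assumption: sorting members by descending size never increases the
// struct size on the target ABI. The build fails if that does not hold.
static_assert(sizeof(Ordered) <= sizeof(Unordered),
              "reordering increased the struct size on this ABI");

// Prints the size, padding, and offset of the double in each layout.
void print_layout_report() {
    std::printf("Unordered: size=%zu padding=%zu offsetof(scale)=%zu\n",
                sizeof(Unordered), padding_unordered(),
                offsetof(Unordered, scale));
    std::printf("Ordered:   size=%zu padding=%zu offsetof(scale)=%zu\n",
                sizeof(Ordered), padding_ordered(),
                offsetof(Ordered, scale));
}
```

Calling print_layout_report() from main() shows how much of each struct is padding on your platform. Note that compilers already place members on their natural alignment unless packing is forced, so what this kind of reordering mostly buys is a smaller object that spans fewer cache lines. Whether that shows up as a measurable speedup is something only a profiler can tell you.
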
They also stated that this was an extreme example of what can happen if the memory layout is not thought through. I would say grouping the data members by type might be a good habit to get into. If some programmers out there have performance tools to run against OSG, profiling first would make more sense than just blindly restructuring classes in the hope of getting some optimization.

Just my 2 cents.
-- 
Mike Jackson
imikejackson <at> gmail <dot> com

_______________________________________________
osg-users mailing list
http://openscenegraph.net/mailman/listinfo/osg-users
http://www.openscenegraph.org/
