Re: [osg-users] X86 optimization

Mike Jackson Thu, 31 Aug 2006 18:06:29 -0700

On 8/31/06 6:28 PM, "Chris Hanson" <[EMAIL PROTECTED]> wrote:


>    This isn't directly OSG, but since almost all of OSG's platforms at least
> have an X86 variant, I thought it might be of interest.
> 
>    Reading AMD's optimization guide (which addresses 32-bit and 64-bit
> optimizations, as well as optimizations that are generally applicable to all
> X86 CPUs, not just AMD):
> 
> http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/25112.PDF
> 
>    Chapter 5 deals with memory and cache.
> 
>    5.2 recommends aligning data members on their natural alignments. Dynamic
> memory allocations typically are already aligned on a known boundary --
> accounts seem to disagree whether this is 32-bit (long/float) or 64-bit
> (long long/double) alignment. It may be OS-dependent.
> 
>    5.5 suggests that misalignment can defeat the Store-to-Load forwarding
> mechanism, which is one of the main cures for the X86-32 CPU's terrible
> shortage of registers.
> 
>    5.11 suggests reordering structs/classes by the size of their atomic
> members -- doubles, then floats/longs, shorts, bytes -- to avert this
> misalignment (using padding where necessary).
> 
> 
>    Has anyone gone this route? Using AMD's CodeAnalyst tool for Windows, one
> would come to the conclusion that a lot of time is spent in CPU pipeline
> stalls. Is this an effective code optimization, or is it a lot of work for
> very little benefit on a codebase the size of OSG?
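
On the 5.2 question of what boundary dynamic allocations land on, one way to
settle it for a given platform is simply to look. The following is a rough
sketch, not anything from OSG or the AMD guide; the helper names
observed_alignment and malloc_alignment_sample are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>   // std::size_t, std::max_align_t
#include <cstdint>   // std::uintptr_t
#include <cstdlib>   // std::malloc, std::free

// Hypothetical helper: largest power of two that divides the address,
// i.e. the alignment this particular pointer happens to have.
inline std::size_t observed_alignment(const void* p) {
    const std::uintptr_t a = reinterpret_cast<std::uintptr_t>(p);
    return static_cast<std::size_t>(a & (~a + 1));  // isolate lowest set bit
}

// Hypothetical helper: allocate `bytes`, note the alignment of the block
// that came back, then free it again.
inline std::size_t malloc_alignment_sample(std::size_t bytes) {
    void* p = std::malloc(bytes);
    assert(p != nullptr && "allocation failed");
    const std::size_t align = observed_alignment(p);
    std::free(p);
    return align;
}
```

Calling malloc_alignment_sample(64) and comparing the result with
alignof(std::max_align_t) shows what the local allocator does. A single sample
can come back more aligned than the allocator promises, so it is an
observation rather than a guarantee.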

It all depends on the types and amounts of data in each struct. At a recent
conference I attended, a session on optimization used this very subject as an
example. The class in question had some integers and a string. The string sat
between the integers and caused the memory layout to cross page boundaries.
Moving all the integers to the top of the class got rid of the page-boundary
crossings for the integers, and the before-and-after difference in speed was
dramatic. The presenters also stated that this was an extreme example of what
can happen if the memory layout is not thought through.
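
To make the layout point concrete, here is a small sketch with two made-up
structs (Interleaved and Ordered are hypothetical, not OSG classes and not the
class from the conference talk). Compilers normally insert padding so that
each member stays naturally aligned, so what reordering mostly changes is how
much padding is needed and how large the object ends up.

```cpp
#include <cstddef>  // std::size_t, offsetof

// Hypothetical struct: members declared in no particular order, so the
// compiler has to pad before `value` and `count` and at the end.
struct Interleaved {
    char   flag;
    double value;
    short  id;
    int    count;
    char   tag;
};

// The same members sorted largest first, as section 5.11 suggests.
struct Ordered {
    double value;
    int    count;
    short  id;
    char   flag;
    char   tag;
};

// Bytes of real data in either struct; anything beyond this is padding.
constexpr std::size_t kPayload =
    sizeof(double) + sizeof(int) + sizeof(short) + 2 * sizeof(char);

constexpr std::size_t padding_interleaved = sizeof(Interleaved) - kPayload;
constexpr std::size_t padding_ordered     = sizeof(Ordered) - kPayload;

// Holds on any ABI where alignment does not increase as member size shrinks.
static_assert(sizeof(Ordered) <= sizeof(Interleaved),
              "sorting members by size should never grow the struct");
```

On common x86-64 ABIs this works out to 32 bytes for Interleaved against 16
for Ordered, so twice as many of the reordered objects fit in a cache line.
Other targets may give different numbers, which is why the check above is
relative rather than absolute.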

  I would say grouping the data members by type might be a good habit to get
into. 

  If some programmers out there have performance tools to run against OSG,
profiling first would make more sense than blindly restructuring classes in
the hope of getting some optimization.

Just my 2 cents.
-- 
Mike Jackson
imikejackson <at> gmail <dot> com


_______________________________________________
osg-users mailing list
[email protected]
http://openscenegraph.net/mailman/listinfo/osg-users
http://www.openscenegraph.org/
