OK, I thought it was the collision detection, but that is not the case. Here are some of the numbers with collision detection disabled:

CS:EIP      Symbol + Offset                                            64-bit  Timer samples
0x10083cc0  osg::Group::traverse                                               1434
0x10083d60  osg::Group::computeBound                                           1391
0x10099ca0  osg::Matrixf::mult                                                 833
0x1001a9d0  osg::PositionAttitudeTransform::accept                             409
0x10099370  osg::Matrixf::preMult                                              407
0x1000e840  osg::AnimationPathCallback::update                                 352
0x1009bb50  osg::Node::dirtyBound                                              340
0x100dcee0  osg::Transform::computeBound                                       318
0x100a9df0  osg::PositionAttitudeTransform::computeLocalToWorldMatrix          294
0x100126f0  osg::AnimationPath::getInterpolatedControlPoint                    285
0x10009c70  osg::AnimationPathCallback::setPause                               251
0x1000c8e0  osg::StateSet::requiresUpdateTraversal                             228

12 functions, 806 instructions, Total: 6542 samples, 50.85% of samples in the module, 16.36% of total session samples

Ok here is with collision detection:
=======================
CS:EIP      Symbol + Offset                                            64-bit  Timer samples
0x10083cc0  osg::Group::traverse                                               1382
0x10083d60  osg::Group::computeBound                                           1237
0x10099ca0  osg::Matrixf::mult                                                 924
0x10099370  osg::Matrixf::preMult                                              600
0x1001a9d0  osg::PositionAttitudeTransform::accept                             394
0x1000e840  osg::AnimationPathCallback::update                                 292
0x100dcee0  osg::Transform::computeBound                                       284
0x1009bb50  osg::Node::dirtyBound                                              280
0x100126f0  osg::AnimationPath::getInterpolatedControlPoint                    274
0x100a9df0  osg::PositionAttitudeTransform::computeLocalToWorldMatrix          230
0x10009c70  osg::AnimationPathCallback::setPause                               225
0x10002e00  osg::Matrixf::preMult                                              210

12 functions, 846 instructions, Total: 6332 samples, 51.35% of samples in the module, 15.83% of total session samples


Here is with both matrixf and invert4x4 optimized:
=================================
CS:EIP      Symbol + Offset                                            64-bit  Timer samples
0x10083cb0  osg::Group::traverse                                               1362
0x10083d50  osg::Group::computeBound                                           1142
0x1009a180  osg::Matrixf::mult                                                 922
0x1001ac70  osg::PositionAttitudeTransform::accept                             381
0x1000e650  osg::AnimationPathCallback::update                                 354
0x100dcf30  osg::Transform::computeBound                                       306
0x1009bcf0  osg::Node::dirtyBound                                              274
0x100124f0  osg::AnimationPath::getInterpolatedControlPoint                    257
0x1009a340  osg::Matrixf::invert_4x3                                           252
0x10009bb0  osg::GraphicsContext::ScreenIdentifier::~ScreenIdentifier          248
0x100a9b20  osg::PositionAttitudeTransform::computeLocalToWorldMatrix          245
0x10002d00  osg::Matrixf::preMult                                              214
0x10002c70  osg::Matrixf::preMult                                              197
0x1000c6b0  osg::StateSet::requiresUpdateTraversal                             178

14 functions, 829 instructions, Total: 6332 samples, 54.18% of samples in the module, 15.84% of total session samples

For the optimized profile, it did push Invert4x4 way down to the bottom (I did not show that here). If you want the complete list, let me know and I'll resend it as attachments. You cannot really use these profiles to see how much better the performance is, because the Matrixf mult is still needed just as much; the real way to tell would be to show the framerate of the game. However, here is where I can show the optimization:
Avarage time using the D3DXMATRIX class:  402.54
Avarage time using the SPMatrix class:    277.69
Avarage time using the Matrixf class:    297.40
Avarage time using the ScalarDP class:    400.21
Avarage time using the DPMatrix class:    1418.11
Avarage time using the Matrixd class:    471.69

Those are the results for postMult, where Matrixf used to be the same as Matrixd. The 277.69 (the SPMatrix figure) is what Matrixf would have been if it were aligned.
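
For readers wondering what alignment buys in the postMult case, here is a minimal sketch, assuming an x86 build with SSE and a 16-byte-aligned, row-major float matrix. `Mat4` and `mult` are hypothetical names made up for illustration; this is not the osg::Matrixf or matlib2 code. With 16-byte alignment each row can be moved with the aligned intrinsics `_mm_load_ps` and `_mm_store_ps`; data without that guarantee would have to use `_mm_loadu_ps` and `_mm_storeu_ps` instead.

```cpp
#include <xmmintrin.h>  // SSE intrinsics; assumes an x86/x86-64 build
#include <cassert>
#include <cstdint>

// Hypothetical 16-byte-aligned, row-major 4x4 float matrix (not an OSG type).
struct alignas(16) Mat4 {
    float m[4][4];
};

// r = a * b. Row i of r is the sum over k of a[i][k] * (row k of b).
inline void mult(Mat4& r, const Mat4& a, const Mat4& b)
{
    // Aligned loads and stores require 16-byte-aligned addresses.
    assert(reinterpret_cast<std::uintptr_t>(&b) % 16 == 0);
    assert(reinterpret_cast<std::uintptr_t>(&r) % 16 == 0);

    // Load all rows of b up front, so r may alias b.
    const __m128 b0 = _mm_load_ps(b.m[0]);
    const __m128 b1 = _mm_load_ps(b.m[1]);
    const __m128 b2 = _mm_load_ps(b.m[2]);
    const __m128 b3 = _mm_load_ps(b.m[3]);

    for (int i = 0; i < 4; ++i) {
        // Broadcast each element of row i of a and accumulate the products.
        __m128 row = _mm_mul_ps(_mm_set1_ps(a.m[i][0]), b0);
        row = _mm_add_ps(row, _mm_mul_ps(_mm_set1_ps(a.m[i][1]), b1));
        row = _mm_add_ps(row, _mm_mul_ps(_mm_set1_ps(a.m[i][2]), b2));
        row = _mm_add_ps(row, _mm_mul_ps(_mm_set1_ps(a.m[i][3]), b3));
        _mm_store_ps(r.m[i], row);  // aligned store of the finished row
    }
}
```

The `alignas(16)` on the struct is what makes the aligned loads legal for stack and static instances; heap allocations need an allocator that honors the alignment as well.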

Avarage time using the D3DXMATRIX class:  1035.63
Avarage time using the SPMatrix class:    365.36
Avarage time using the Matrixf class:    706.09
Avarage time using the ScalarDP class:    664.13
Avarage time using the DPMatrix class:    2052.29
Avarage time using the Matrixd class:    2125.93

Those are the results for Invert4x4, where Matrixf was also the same as Matrixd; the 365 is what it would have been if the data were aligned.
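
To read the two tables as ratios, here is a small sketch of the arithmetic. It assumes lower times are better and takes the Matrixd row as a stand-in for the old Matrixf behaviour, since the message says the two used to be the same. The helper `speedup` and the constant names are made up for illustration; the figures are copied from the tables above.

```cpp
#include <cassert>

// Ratio of the slower time to the faster time (both in the same unit).
inline double speedup(double slowerTime, double fasterTime)
{
    assert(fasterTime > 0.0);
    return slowerTime / fasterTime;
}

// postMult table
const double postMultMatrixd  = 471.69;   // old behaviour (same as old Matrixf)
const double postMultMatrixf  = 297.40;   // optimized, unaligned
const double postMultSPMatrix = 277.69;   // what aligned data would give

// Invert4x4 table
const double invertMatrixd  = 2125.93;
const double invertMatrixf  = 706.09;
const double invertSPMatrix = 365.36;

// speedup(postMultMatrixd, postMultMatrixf)  is about 1.59
// speedup(postMultMatrixd, postMultSPMatrix) is about 1.70
// speedup(invertMatrixd,   invertMatrixf)    is about 3.01
// speedup(invertMatrixd,   invertSPMatrix)   is about 5.82
```

On these numbers the larger gain is in Invert4x4, and alignment would roughly double it again there, while for postMult alignment adds comparatively little.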

This stress code is part of matlib2, with a little tweaking to add the osg code into the mix.
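
For anyone who wants to reproduce this kind of comparison without matlib2, a minimal sketch of an averaging timer follows. It is not the matlib2 stress code: `averageMicroseconds` is a hypothetical helper, and the unit (microseconds) is an assumption, since the tables above do not state theirs.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper: average wall-clock time per call of `work`,
// in microseconds, over `runs` repetitions.
template <typename Work>
double averageMicroseconds(Work work, std::size_t runs)
{
    using Clock = std::chrono::steady_clock;
    if (runs == 0)
        return 0.0;  // nothing to average

    const Clock::time_point start = Clock::now();
    for (std::size_t i = 0; i < runs; ++i)
        work();
    const std::chrono::duration<double, std::micro> elapsed = Clock::now() - start;

    return elapsed.count() / static_cast<double>(runs);
}
```

When timing something as small as a matrix multiply, the work passed in should leave its result somewhere the compiler cannot discard (for example a `volatile` sink), otherwise an optimized build may remove the loop body entirely.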








James Killian
----- Original Message -----
From: "Mathias Fröhlich" <[EMAIL PROTECTED]>
To: "OpenSceneGraph Users" <[email protected]>
Sent: Tuesday, July 29, 2008 10:14 AM
Subject: Re: [osg-users] Using SSE within OSG



James,

On Tuesday 29 July 2008 16:59, James Killian wrote:
Paul asked me the same question a few days ago, and I just realized that we
took that offline, so I'll repost here:
One of the things I should add is the actual profile dump, since that shows
a more comprehensive picture. The actual game demo is free to download and
play here:
http://www.fringe-online.com/

The current installer of the game does not have my optimization in it yet,
but it should be noted even with the optimization the postmult is still at
the top.  The Invert4x4() however got pushed way down to the bottom (which
is great).  I'll post my profiles when I get home.


---------------------------------snip---------------------------------
That is a good question, and I believe the answer is collision detection.
I should disable it and run the numbers again to confirm.  All ships fire
machine guns at a fast rate, and each bullet that gets close enough to a
bounding box/sphere region has to go through the osg code to get the
precise point where it hit.  Rick would probably have a better explanation
of this and other factors since he coded the bulk of the collision
detection (and osg integration).  Most of my development time on the game
has been spent on the physics and flight dynamics (and now optimization).

It may turn out that we could find some caching technique to reduce the
collision stress (like the KBDtree), but in the meantime, matrix
optimizations can benefit the whole community if we do them right, and I
would like to make some contribution to the community.

OK, you can do a lot here for the collision detection.
I expect that you could optimize that algorithmically and gain orders of
magnitude without SSE.

So the question is rather whether such optimizations will bring performance
improvements for the usual scenegraph case.

Greetings

Mathias

--
Dr. Mathias Fröhlich, science + computing ag, Software Solutions
Hagellocher Weg 71-75, D-72070 Tuebingen, Germany
Phone: +49 7071 9457-268, Fax: +49 7071 9457-511
--
Vorstand/Board of Management:
Dr. Bernd Finkbeiner, Dr. Florian Geyer,
Dr. Roland Niemeier, Dr. Arno Steitz, Dr. Ingrid Zech
Vorsitzender des Aufsichtsrats/
Chairman of the Supervisory Board:
Prof. Dr. Hanns Ruder
Sitz/Registered Office: Tuebingen
Registergericht/Registration Court: Stuttgart
Registernummer/Commercial Register No.: HRB 382196


_______________________________________________
osg-users mailing list
[email protected]
http://lists.openscenegraph.org/listinfo.cgi/osg-users-openscenegraph.org
