OK, I thought it was the collision detection, but that is not the case.
Here are some of the numbers with collision disabled:
CS:EIP      Symbol + Offset                                            64-bit Timer samples
0x10083cc0  osg::Group::traverse                                       1434
0x10083d60  osg::Group::computeBound                                   1391
0x10099ca0  osg::Matrixf::mult                                         833
0x1001a9d0  osg::PositionAttitudeTransform::accept                     409
0x10099370  osg::Matrixf::preMult                                      407
0x1000e840  osg::AnimationPathCallback::update                         352
0x1009bb50  osg::Node::dirtyBound                                      340
0x100dcee0  osg::Transform::computeBound                               318
0x100a9df0  osg::PositionAttitudeTransform::computeLocalToWorldMatrix  294
0x100126f0  osg::AnimationPath::getInterpolatedControlPoint            285
0x10009c70  osg::AnimationPathCallback::setPause                       251
0x1000c8e0  osg::StateSet::requiresUpdateTraversal                     228
12 functions, 806 instructions, Total: 6542 samples, 50.85% of samples in
the module, 16.36% of total session samples
OK, here it is with collision detection enabled:
=======================
CS:EIP      Symbol + Offset                                            64-bit Timer samples
0x10083cc0  osg::Group::traverse                                       1382
0x10083d60  osg::Group::computeBound                                   1237
0x10099ca0  osg::Matrixf::mult                                         924
0x10099370  osg::Matrixf::preMult                                      600
0x1001a9d0  osg::PositionAttitudeTransform::accept                     394
0x1000e840  osg::AnimationPathCallback::update                         292
0x100dcee0  osg::Transform::computeBound                               284
0x1009bb50  osg::Node::dirtyBound                                      280
0x100126f0  osg::AnimationPath::getInterpolatedControlPoint            274
0x100a9df0  osg::PositionAttitudeTransform::computeLocalToWorldMatrix  230
0x10009c70  osg::AnimationPathCallback::setPause                       225
0x10002e00  osg::Matrixf::preMult                                      210
12 functions, 846 instructions, Total: 6332 samples, 51.35% of samples in
the module, 15.83% of total session samples
Here it is with both Matrixf and Invert4x4 optimized:
=================================
CS:EIP      Symbol + Offset                                            64-bit Timer samples
0x10083cb0  osg::Group::traverse                                       1362
0x10083d50  osg::Group::computeBound                                   1142
0x1009a180  osg::Matrixf::mult                                         922
0x1001ac70  osg::PositionAttitudeTransform::accept                     381
0x1000e650  osg::AnimationPathCallback::update                         354
0x100dcf30  osg::Transform::computeBound                               306
0x1009bcf0  osg::Node::dirtyBound                                      274
0x100124f0  osg::AnimationPath::getInterpolatedControlPoint            257
0x1009a340  osg::Matrixf::invert_4x3                                   252
0x10009bb0  osg::GraphicsContext::ScreenIdentifier::~ScreenIdentifier  248
0x100a9b20  osg::PositionAttitudeTransform::computeLocalToWorldMatrix  245
0x10002d00  osg::Matrixf::preMult                                      214
0x10002c70  osg::Matrixf::preMult                                      197
0x1000c6b0  osg::StateSet::requiresUpdateTraversal                     178
14 functions, 829 instructions, Total: 6332 samples, 54.18% of samples in
the module, 15.84% of total session samples
In the optimized profile, Invert4x4 was pushed way down toward the bottom
of the list (further down than I wanted to show here). If you want the
complete list, let me know and I'll resend it as attachments. You cannot
really use these profiles to see how much better the performance is,
because the Matrixf mult is still needed just as much; the real way to
tell would be to compare the frame rate of the game. However, here is
where I can show the optimization:
Average time using the D3DXMATRIX class: 402.54
Average time using the SPMatrix class: 277.69
Average time using the Matrixf class: 297.40
Average time using the ScalarDP class: 400.21
Average time using the DPMatrix class: 1418.11
Average time using the Matrixd class: 471.69
These are the results for postMult, where Matrixf used to be the same as
Matrixd. The 277.69 is what Matrixf would have been if it were aligned.
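To illustrate why alignment matters for these timings, here is a minimal sketch of a 4x4 single-precision multiply that uses aligned SSE loads and stores. It is not the matlib2 or osg code: AlignedMat4, multScalar and multSSE are hypothetical names, the row-major layout is an assumption, and the function falls back to the scalar loop where SSE is not available.

```cpp
#include <cassert>
#include <cstdint>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SKETCH_HAVE_SSE 1
#else
#define SKETCH_HAVE_SSE 0
#endif

// Hypothetical stand-in for an aligned float matrix; this is NOT osg::Matrixf.
// alignas(16) guarantees that every row starts on a 16-byte boundary.
struct alignas(16) AlignedMat4 {
    float m[4][4];  // row-major (assumption); one row fills one SSE register
};

// Reference scalar product: out = a * b.
inline void multScalar(const AlignedMat4& a, const AlignedMat4& b, AlignedMat4& out) {
    AlignedMat4 tmp;
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c) {
            float sum = 0.0f;
            for (int k = 0; k < 4; ++k) sum += a.m[r][k] * b.m[k][c];
            tmp.m[r][c] = sum;
        }
    out = tmp;
}

// SSE product: out = a * b, one output row per iteration.
inline void multSSE(const AlignedMat4& a, const AlignedMat4& b, AlignedMat4& out) {
#if SKETCH_HAVE_SSE
    // _mm_load_ps and _mm_store_ps require 16-byte aligned addresses.
    assert(reinterpret_cast<std::uintptr_t>(&b) % 16 == 0);
    assert(reinterpret_cast<std::uintptr_t>(&out) % 16 == 0);

    const __m128 b0 = _mm_load_ps(b.m[0]);
    const __m128 b1 = _mm_load_ps(b.m[1]);
    const __m128 b2 = _mm_load_ps(b.m[2]);
    const __m128 b3 = _mm_load_ps(b.m[3]);

    for (int r = 0; r < 4; ++r) {
        // Row r of the result is a linear combination of the rows of b.
        __m128 row = _mm_mul_ps(_mm_set1_ps(a.m[r][0]), b0);
        row = _mm_add_ps(row, _mm_mul_ps(_mm_set1_ps(a.m[r][1]), b1));
        row = _mm_add_ps(row, _mm_mul_ps(_mm_set1_ps(a.m[r][2]), b2));
        row = _mm_add_ps(row, _mm_mul_ps(_mm_set1_ps(a.m[r][3]), b3));
        _mm_store_ps(out.m[r], row);
    }
#else
    multScalar(a, b, out);  // portable fallback when SSE is unavailable
#endif
}
```

If the data cannot be guaranteed to be aligned, the loads and stores would have to become _mm_loadu_ps and _mm_storeu_ps, which is typically the slower path on hardware of this era.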
Average time using the D3DXMATRIX class: 1035.63
Average time using the SPMatrix class: 365.36
Average time using the Matrixf class: 706.09
Average time using the ScalarDP class: 664.13
Average time using the DPMatrix class: 2052.29
Average time using the Matrixd class: 2125.93
These are the results for Invert4x4, where Matrixf was also the same as
Matrixd, and the 365 is what it would have been if the data were aligned.
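To put the two timing lists into ratios, here is a small sketch that divides a baseline time by an optimized time. The helper name speedup is hypothetical, the timings are the ones quoted above (their unit is not stated), and Matrixd is taken as the baseline only because Matrixf used to match it.

```cpp
#include <cassert>

// Ratio of baseline time to optimized time; a value above 1 means the
// optimized path is faster. Hypothetical helper, not part of matlib2 or osg.
inline double speedup(double baselineTime, double optimizedTime) {
    assert(optimizedTime > 0.0);
    return baselineTime / optimizedTime;
}

// Timings copied from the two lists above.
inline double postMultGain()        { return speedup(471.69, 297.40); }   // Matrixd vs Matrixf
inline double invertGain()          { return speedup(2125.93, 706.09); }  // Matrixd vs Matrixf
inline double postMultAlignedGain() { return speedup(297.40, 277.69); }   // Matrixf vs SPMatrix
inline double invertAlignedGain()   { return speedup(706.09, 365.36); }   // Matrixf vs SPMatrix
```

With these figures, Matrixf comes out roughly 1.6 times faster than Matrixd for postMult and roughly 3 times faster for Invert4x4. Aligned data (the SPMatrix numbers) would add a further factor of about 1.07 for postMult and about 1.9 for Invert4x4.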
This stress code is part of matlib2, with a little tweaking to add the osg
code into the mix.
James Killian
----- Original Message -----
From: "Mathias Fröhlich" <[EMAIL PROTECTED]>
To: "OpenSceneGraph Users" <[email protected]>
Sent: Tuesday, July 29, 2008 10:14 AM
Subject: Re: [osg-users] Using SSE within OSG
James,
On Tuesday 29 July 2008 16:59, James Killian wrote:
Paul asked me the same question a few days ago, and I just realized that
we took that offline, so I'll repost here:
One of the things I should add is the actual profile dump, since that
shows a more comprehensive picture. The actual game demo is free to
download and play here:
http://www.fringe-online.com/
The current installer of the game does not have my optimization in it yet,
but it should be noted even with the optimization the postmult is still at
the top. The Invert4x4() however got pushed way down to the bottom (which
is great). I'll post my profiles when I get home.
---------------------------------snip-----------------------------------------
That is a good question, and I believe the answer is collision detection.
I should disable it and run the numbers again to confirm. All ships fire
machine guns at a fast rate, and each bullet that gets close enough to a
bounding box/sphere region has to go through the osg code to get the
precise point where it hit. Rick would probably have a better explanation
of this and other factors since he coded the bulk of the collision
detection (and osg integration). Most of my development time on the game
has been spent on the physics and flight dynamics (and now optimization).
It may turn out that we could find some caching technique to reduce the
collision stress (like the KBDtree), but in the meantime, matrix
optimizations can benefit the whole community if we do them right, and I
would like to make some contribution to the community.
OK, there is a lot you can do here for the collision detection.
I expect that you should optimize that algorithmically and gain orders of
magnitude without SSE.
So the question is more whether such optimizations will bring performance
improvements for the usual scene graph case.
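As one illustration of cutting collision cost algorithmically, here is a minimal sketch of a cheap bounding-sphere rejection test of the kind described in the quoted text, run before any precise intersection. Treating a bullet's per-frame path as a line segment is an assumption, and Vec3 and segmentHitsSphere are hypothetical names, not osg types or the game's code.

```cpp
#include <cassert>

// Hypothetical minimal vector type; this is NOT osg::Vec3.
struct Vec3 { double x, y, z; };

inline Vec3 sub(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline double dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Returns true if the segment p0->p1 (assumed: the bullet's path this frame)
// passes within `radius` of `center`. Only segments that pass this test would
// need the expensive precise intersection.
inline bool segmentHitsSphere(const Vec3& p0, const Vec3& p1,
                              const Vec3& center, double radius) {
    assert(radius >= 0.0);
    const Vec3 d = sub(p1, p0);      // segment direction
    const Vec3 m = sub(center, p0);  // from segment start to sphere center
    const double len2 = dot(d, d);

    // Parameter of the point on the segment closest to the center, clamped
    // to [0, 1]; a zero-length segment degenerates to the point p0.
    double t = 0.0;
    if (len2 > 0.0) {
        t = dot(m, d) / len2;
        if (t < 0.0) t = 0.0;
        if (t > 1.0) t = 1.0;
    }

    const Vec3 closest = {p0.x + t * d.x, p0.y + t * d.y, p0.z + t * d.z};
    const Vec3 diff = sub(center, closest);
    return dot(diff, diff) <= radius * radius;  // compare squared distances
}
```

The test needs no matrix work at all, so every bullet it rejects is a precise intersection that never has to run.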
Greetings
Mathias
--
Dr. Mathias Fröhlich, science + computing ag, Software Solutions
Hagellocher Weg 71-75, D-72070 Tuebingen, Germany
Phone: +49 7071 9457-268, Fax: +49 7071 9457-511
--
Vorstand/Board of Management:
Dr. Bernd Finkbeiner, Dr. Florian Geyer,
Dr. Roland Niemeier, Dr. Arno Steitz, Dr. Ingrid Zech
Vorsitzender des Aufsichtsrats/
Chairman of the Supervisory Board:
Prof. Dr. Hanns Ruder
Sitz/Registered Office: Tuebingen
Registergericht/Registration Court: Stuttgart
Registernummer/Commercial Register No.: HRB 382196
_______________________________________________
osg-users mailing list
[email protected]
http://lists.openscenegraph.org/listinfo.cgi/osg-users-openscenegraph.org