Hi,

I investigated this a bit more. I believe I have arrived at the bottleneck.

I stepped through the BufferObject code (which ImageStream uses via the derived 
class PixelBufferObject) and found that every time the dirty() method is called 
on an osg::Image, the osg::Texture that uses this osg::Image calls 
applyTexImage2D_load, which in turn calls BufferObject::compileBuffer. 

There, I found that each time the texture's data is updated, the PBO is updated 
as well, by calling _extensions->glBufferSubData. 

This works fine for texture updates that are infrequent or that change only 
part of the buffer. For updates as frequent as 25-50 FPS (as in my case) that 
replace the whole buffer (1920x1080 pixels), however, glBufferSubData performs 
drastically worse than mapping the PBO with glMapBuffer and writing to the 
mapped buffer directly. 
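
To make the difference concrete, here is a minimal sketch of the mapped upload 
path. It is not the code from my patch, only an illustration of the idea. The 
PboApi struct and uploadFrameMapped are hypothetical names; the three function 
pointers stand in for the GL entry points so the sketch compiles without a GL 
header. In real code they would roughly correspond to glBufferData with a null 
data pointer (to re-specify the storage, an assumption on my part and not 
something the current OSG code does), glMapBuffer on GL_PIXEL_UNPACK_BUFFER with 
GL_WRITE_ONLY, and glUnmapBuffer.

Code:
```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-ins for the GL entry points (assumption: the PBO is
// already bound to GL_PIXEL_UNPACK_BUFFER when these are called).
struct PboApi {
    void  (*orphan)(std::size_t bytes); // re-specify storage, old contents dropped
    void* (*map)();                     // writable pointer, or nullptr on failure
    bool  (*unmap)();                   // false if the data store was lost
};

// Copy one decoded frame into the bound PBO through a mapping instead of
// handing it to glBufferSubData.
// Returns true only if the frame was written and unmapped successfully.
bool uploadFrameMapped(const PboApi& gl,
                       const unsigned char* frame,
                       std::size_t bytes)
{
    gl.orphan(bytes);               // ask for fresh storage of the right size
    void* dst = gl.map();           // get a pointer into the buffer
    if (!dst)
        return false;               // mapping failed, nothing was written
    std::memcpy(dst, frame, bytes); // write the whole frame in one go
    return gl.unmap();              // release the mapping before the texture upload
}
```

A decoder that can write straight into the mapped pointer could skip the 
memcpy as well; the sketch keeps the copy so that it stays self-contained.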

There are some posts out there which compare the performance of writing to a 
glMapBuffer mapping directly with using glBufferSubData. One notable post, 
from the Ogre3D dev lead, is this one: 
http://www.stevestreeting.com/2007/03/17/glmapbuffer-vs-glbuffersubdata-the-return/

So, I decided to do a quick and dirty hack of the same into OSG. Please find 
attached a patch that does two things:
a) (Hackily) exposes the buffer object's buffer
b) Modifies the FFMPEG plugin to write to this buffer directly

You should be able to apply the patch by running the following in a folder 
containing the source of the OSG 3.2.0 release:

Code:
patch -p1 < pboPatch

The speed improvements are quite spectacular!

On the multimovie sample using the osgDB FFMPEG plugin, if we just remove all 
the nodevisitors and rendercallbacks that show stats (since these kill 
framerates), and simply log FPS while rendering all 10 videos, the frame rate 
improvement for 10 Full HD (1920 x 1080) videos is around 2000%(!!!).

You can try the same by toggling the m_useDirectPBOMapping flag that I added to 
the FFMPEGImageStream class from false to true.

Without direct buffer mapping (i.e. using glBufferSubData) I get a meager 
~20 FPS on an Intel Xeon (2.4 GHz), 8 GB RAM with AMD Radeon HD 6800. 

With the direct buffer mapping turned on, I get over 500 FPS. 
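
For a rough sense of scale, here is a small sketch that works out the size of 
one frame and the speedup from the two FPS figures above. The helper names are 
hypothetical, and the 4 bytes per pixel is an assumption (I have not stated the 
pixel format here; a 3-byte RGB format would give 6,220,800 bytes instead).

Code:
```cpp
#include <cstddef>

// Bytes in one uncompressed frame. bytesPerPixel is an assumption:
// 4 for an RGBA/BGRA layout, 3 for RGB.
constexpr std::size_t frameBytes(std::size_t width,
                                 std::size_t height,
                                 std::size_t bytesPerPixel)
{
    return width * height * bytesPerPixel;
}

// How many times faster the "after" frame rate is than the "before" one.
constexpr double speedupFactor(double fpsBefore, double fpsAfter)
{
    return fpsAfter / fpsBefore;
}

// The same change expressed as a percentage increase over "before".
constexpr double percentIncrease(double fpsBefore, double fpsAfter)
{
    return (fpsAfter - fpsBefore) / fpsBefore * 100.0;
}
```

With the figures above, frameBytes(1920, 1080, 4) is 8,294,400 bytes (about 
7.9 MiB) per frame per video, and going from ~20 FPS to 500 FPS is a factor of 
25, i.e. a 2400% increase.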

So what I would like to know now is whether there is a way of accomplishing 
the same direct mapping to the PBO's buffer (the result of glMapBuffer) in OSG 
without having to resort to the hacks that I have used. 

Thank you!

Cheers,
Abhishek

------------------
Read this topic online here:
http://forum.openscenegraph.org/viewtopic.php?p=57681#57681




Attachments: 
http://forum.openscenegraph.org//files/3155_1387188273._194.

