Re: [osg-users] RTT slave views and multi-threading

2010-01-14 Thread Tugkan Calapoglu
Hi Robert, Wojciech

my initial guess was that the lengthy draw dispatch of the master view
and the failing cull & draw parallelism were results of the same problem.

However, they actually seem to be different problems and I'll focus
first on the draw dispatch.

The master camera draws only a screen aligned quad and nothing else
(scene with shadows is rendered by the slave camera). Also no dynamic
geometry. But, I indeed have a read buffer operation: a glGetTexImage
call in the postdraw callback of the master camera. This call takes ~12 ms.

I read back a small texture that is rendered by a camera in the current
frame. The camera uses FRAME_BUFFER_OBJECT as render target implementation.

It looks like using glReadPixels to read directly from the FBO is the
advised method for getting data back to system memory. How do I get
the FBO that the camera is rendering to? Or is there a better method to
get the texture data back to system memory?


cheers,
tugkan

 Hi Tugkan,
 
 Robert mentioned a lengthy read operation. It may be related to the read
 buffer operation that's used to compute the shadow volume in
 LightSpacePerspectiveShadowMapDB. If your slave view uses
 osgShadow::LightSpacePerspectiveShadowMapDB then you may check whether
 osgShadow::LightSpacePerspectiveShadowMapCB (cull bounds flavour) has
 the same problem.
 
 I am aware of the LightSpacePerspectiveShadowMapDB glReadBuffer limitation,
 but I could not find a quick and easy workaround that would avoid
 scanning the image on the CPU. I allocate a small 64x64
 texture and render the scene there, then read it into CPU memory and use
 the CPU to scan pixels to optimize the shadow volume from the depths and
 pixel locations stored in this prerender image.
 
 Wojtek
 
 - Original Message - From: Robert Osfield
 robert.osfi...@gmail.com
 To: OpenSceneGraph Users osg-users@lists.openscenegraph.org
 Sent: Wednesday, January 13, 2010 1:04 PM
 Subject: Re: [osg-users] RTT slave views and multi-threading
 
 
 Hi Tugkan,
 
 The osgdistortion example works a bit like what you are describing;
 could you try it to see what performance it gets.
 
 As for general notes about threading: if you are working with a single
 graphics context, then all the draw dispatch and the GPU draw
 can only be done by a single graphics thread, so there is little
 opportunity to make it more parallel without using another graphics
 card/graphics context and interleaving frames.
 
 As for why the second camera is very expensive on draw dispatch, this
 suggests to me that it's blocking, either because the OpenGL fifo is
 full or because it contains a GL read-back operation of some kind.
 
 Robert.
 
 On Wed, Jan 13, 2010 at 11:34 AM, Tugkan Calapoglu tug...@vires.com
 wrote:
 Hi All,

 I am using a slave view for rendering the scene to a texture. Initially
 I tried with a camera node, however, this did not work well due to a
 problem in LiSPSM shadows and I was suggested to use RTT slave views.

 My setup is as follows: There is a single main view and I attach a slave
 view to it. This slave view is attached with addSlave( slave , false );
 so that it does *not* automatically use the master scene.

 I attach a texture to the slave view and make my scene child of this
 view. I attach a screen aligned quad to the main view. This quad
 visualizes the RTT texture from the slave view.

 Now I have a threading problem which can be seen on the snapshot I
 attached. There are two issues:
 1- The main view (cam1) has a very large draw time even though it only
 renders the screen aligned quad. I double checked to see whether it also
 renders the actual scene but this is not the case.

 2- Slave view does not run cull and draw in parallel. Cull and draw do
 run in parallel if they are not rendered with the slave view. Moreover,
 if I change the render order of the slave camera from PRE_RENDER to
 POST_RENDER it is ok.

 I could simply use POST_RENDER, but I am afraid it introduces an extra
 frame of latency. If I render the screen aligned quad first and the
 scene later, then what I see on the quad is the texture from the previous
 frame (right?).

 Any ideas?

 cheers,
 tugkan




 ___
 osg-users mailing list
 osg-users@lists.openscenegraph.org
 http://lists.openscenegraph.org/listinfo.cgi/osg-users-openscenegraph.org


 


-- 
Tugkan Calapoglu

-
VIRES Simulationstechnologie GmbH
Oberaustrasse 34
83026 Rosenheim
Germany
phone+49.8031.463641
fax  +49.8031.463645
emailtug...@vires.com
internet www.vires.com

Re: [osg-users] RTT slave views and multi-threading

2010-01-14 Thread J.P. Delport

Hi Tugkan,


Simplest is to just attach an osg::Image to the RTT (FBO) camera. See 
the attach method of osg::Camera. I think there is an example in osgprerender.


Also see here:
http://thread.gmane.org/gmane.comp.graphics.openscenegraph.user/52651
and
http://thread.gmane.org/gmane.comp.graphics.openscenegraph.user/53432

rgds
jp
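A minimal sketch of the attach approach described above, assuming the OSG 2.x osg::Camera API; the helper name is mine, not from the thread:

```cpp
// Sketch: attaching an osg::Image to the RTT camera's color buffer makes
// OSG read the pixels back (via glReadPixels, while the FBO is still
// bound) into CPU memory each frame.
#include <osg/Camera>
#include <osg/Image>

// Hypothetical helper, for illustration only.
osg::Image* attachReadbackImage(osg::Camera* rttCamera, int width, int height)
{
    osg::ref_ptr<osg::Image> image = new osg::Image;
    image->allocateImage(width, height, 1, GL_RGBA, GL_UNSIGNED_BYTE);

    // The camera must already use FRAME_BUFFER_OBJECT as its render
    // target implementation; OSG fills the image during rendering.
    rttCamera->attach(osg::Camera::COLOR_BUFFER, image.get());

    return image.release();
}
```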




Re: [osg-users] RTT slave views and multi-threading

2010-01-14 Thread Tugkan Calapoglu
hi Jp,

unfortunately that method is easy but very slow. I think it also uses
glGetTexImage.

cheers,
tugkan



Re: [osg-users] RTT slave views and multi-threading

2010-01-14 Thread Robert Osfield
Hi Tugkan,

On Thu, Jan 14, 2010 at 12:00 PM, Tugkan Calapoglu tug...@vires.com wrote:
 unfortunately that method is easy but very slow. I think it also uses
 glGetTexImage.

Operations like glReadPixels and glGetTexImage involve the fifo
being flushed and the data being copied back into main memory.  These two
things together make them slow, and there isn't much you can do about it
directly.

The best way to deal with the high cost of these operations is to
avoid them completely.  Try to use algorithms that render to
texture using FBOs and read those textures directly in other
shaders.  This does force you to do more work on the GPU and rely on more
complex shaders, but in the end it means that you don't have to force a
round trip to the GPU.

Robert.


Re: [osg-users] RTT slave views and multi-threading

2010-01-14 Thread J.P. Delport

Hi,

Tugkan Calapoglu wrote:

hi Jp,

unfortunately that method is easy but very slow. I think it also uses
glGetTexImage.


You might be surprised. Have you read the threads I linked to? Attach 
uses glReadPixels (while doing the FBO rendering, so you don't have to 
bind anything yourself later), and in many cases this is the fastest. If 
you want something more elaborate, such as async PBO use, see the 
osgscreencapture example. Also, test whatever you use on your setup: 
all sorts of things can change the efficiency of reading data back to 
the CPU. YMMV.


Like Robert said, though, not reading anything back to the CPU, if you 
can help it, is best.


rgds
jp




Re: [osg-users] RTT slave views and multi-threading

2010-01-14 Thread Tugkan Calapoglu
Hi Robert,

I am working on an HDR implementation which should work across multiple
channels. The method I use requires the average luminance of the scene. If I
use different average luminances for different channels, the colors will
simply not match. E.g., in a tunnel the front channel will see the tunnel
exit and have a higher average luminance than the side channels, which
only see the dark tunnel walls.

So, I need a way to collect the current average luminances of all
channels and compute a single average that can be used for all (by
channel I mean separate computers that are connected to separate
projectors).

I know that getting data back from the GPU is slow, but 12 ms for a 4x4
texture seems extreme.

glReadPixels seems to be faster, because we are able to make full screen
grabs (800x600) and still keep 60 Hz (even without a PBO). Some GPGPU people
suggest using glReadPixels to read directly from an FBO rather than
glGetTexImage, so I was wondering if there is a way to obtain the
osg::FBO pointer from the camera?

cheers,
tugkan








Re: [osg-users] RTT slave views and multi-threading

2010-01-14 Thread Robert Osfield
Hi Tugkan,

On Thu, Jan 14, 2010 at 12:31 PM, Tugkan Calapoglu tug...@vires.com wrote:
 I know that getting data back from GPU is slow but 12ms for a 4x4
 texture seems extreme.

It's the flushing of the fifo that is the problem; that's why it's so
slow, not the data transfer itself.  Once you flush the fifo you lose
the parallelism between the CPU and the GPU.

The only way to hide this is to use PBOs to do the read back, and to do
the actual read back on the next frame rather than in the current
frame.  In your case you might be able to get away with this: a frame's
latency might not be a big issue if you can keep to a solid 60Hz and
the values you are reading back aren't changing drastically between
frames.

Robert.
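The double-buffered PBO scheme described here could look roughly like the following post-draw callback. This is a sketch under the assumption that GL 2.1-style PBO entry points and tokens are available in the headers (e.g. via GLEW); it is not code from the thread:

```cpp
#include <osg/Camera>
#include <osg/GL>

// Frame N starts an async glReadPixels into one PBO while mapping the PBO
// filled in frame N-1, so the CPU never stalls on the current frame's fifo.
// (On the very first frame the "read" PBO is still empty.)
struct AsyncReadCallback : public osg::Camera::DrawCallback
{
    mutable GLuint pbo[2];
    mutable bool initialized;
    mutable unsigned frame;
    int width, height;

    AsyncReadCallback(int w, int h)
        : initialized(false), frame(0), width(w), height(h) {}

    virtual void operator()(osg::RenderInfo& /*renderInfo*/) const
    {
        const GLsizeiptr bytes = GLsizeiptr(width) * height * 4;
        if (!initialized)
        {
            glGenBuffers(2, pbo);
            for (int i = 0; i < 2; ++i)
            {
                glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo[i]);
                glBufferData(GL_PIXEL_PACK_BUFFER, bytes, 0, GL_STREAM_READ);
            }
            initialized = true;
        }

        GLuint writeTo  = pbo[frame % 2];        // async copy starts now
        GLuint readFrom = pbo[(frame + 1) % 2];  // filled last frame

        // Kick off the asynchronous read into the bound PBO (returns fast).
        glBindBuffer(GL_PIXEL_PACK_BUFFER, writeTo);
        glReadPixels(0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, 0);

        // Map last frame's PBO; by now its transfer should have completed.
        glBindBuffer(GL_PIXEL_PACK_BUFFER, readFrom);
        if (const void* pixels = glMapBuffer(GL_PIXEL_PACK_BUFFER, GL_READ_ONLY))
        {
            // ... consume last frame's pixels (e.g. average luminance) ...
            (void)pixels;
            glUnmapBuffer(GL_PIXEL_PACK_BUFFER);
        }
        glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
        ++frame;
    }
};
```

Installed with something like `camera->setPostDrawCallback(new AsyncReadCallback(w, h));`, this trades one frame of latency for keeping the CPU and GPU parallel.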


Re: [osg-users] RTT slave views and multi-threading

2010-01-14 Thread Tugkan Calapoglu
Hi Jp,

my initial implementation used an osg::Image attached to a camera and it was
just as slow.

I will see what I can do with PBOs.


regards,
tugkan


Re: [osg-users] RTT slave views and multi-threading

2010-01-14 Thread Tugkan Calapoglu
Hi Robert,

yes, one frame of latency is OK. Is there an example of the PBO usage?
osgscreencapture seems to be about getting the data from the frame buffer,
not from an RTT texture.

tugkan





Re: [osg-users] RTT slave views and multi-threading

2010-01-14 Thread J.P. Delport

Hi,

Tugkan Calapoglu wrote:

Hi Jp,

my initial implementation used an osg::Image attached to a camera and it was
just as slow.


OK.



I will see what I can do with PBO's.


There is some code in the threads I linked to earlier that shows how to 
get data into a PBO using osg's PixelDataBufferObject. It does not do 
the async reading, but see here for more details:

http://www.songho.ca/opengl/gl_pbo.html

regards
jp




regards,
tugkan


Hi,

Tugkan Calapoglu wrote:

hi Jp,

unfortunately that method is easy but very slow. I think it also uses
glGetTexImage.

You might be surprised. Have you read the threads I linked to? Attach
uses glReadPixels (while doing the FBO rendering, so you don't have to
bind anything yourself later), and in many cases this is the fastest. If
you want something more elaborate, such as async PBO use, see the
osgscreencapture example. Also, test whatever you use on your own setup;
all sorts of things can change the efficiency of reading data back to
the CPU. YMMV.

Like Robert said though, not reading anything back to the CPU at all, if
you can help it, is best.

rgds
jp


cheers,
tugkan



Hi Tugkan,

Tugkan Calapoglu wrote:

Hi Robert, Wojciech

my initial guess was that the lengthy draw dispatch of the master view
and the failing cull & draw parallelism were results of the same problem.

However, they actually seem to be different problems and I'll focus
first on the draw dispatch.

The master camera draws only a screen aligned quad and nothing else
(scene with shadows is rendered by the slave camera). Also no dynamic
geometry. But, I indeed have a read buffer operation: a glGetTexImage
call in the postdraw callback of the master camera. This call takes
~12 ms.

I read back a small texture that is rendered by a camera in the current
frame. The camera uses FRAME_BUFFER_OBJECT as render target
implementation.

It looks like using glReadPixels to read directly from the FBO is the
advised method for getting data back to system memory. How do I get
the FBO that the camera is rendering to? Or, is there a better method to
get the texture data back to system memory?


Simplest is to just attach an osg::Image to the RTT (to FBO) camera. See
the attach method of osg::Camera. Think there is an example in
osgprerender.
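
A minimal sketch of that suggestion (variable names and the image size
are assumptions; see osg::Camera::attach and the osgprerender example
for the real usage):

```cpp
// Sketch: have OSG read the RTT result back into CPU memory each frame.
// With an osg::Image attached, OSG performs the read back itself as part
// of the FBO rendering, so no manual FBO binding is needed afterwards.
#include <osg/Camera>
#include <osg/Image>

osg::ref_ptr<osg::Camera> rttCamera = new osg::Camera;
osg::ref_ptr<osg::Image>  image     = new osg::Image;

void setupReadback()
{
    rttCamera->setRenderTargetImplementation(osg::Camera::FRAME_BUFFER_OBJECT);
    image->allocateImage(64, 64, 1, GL_RGBA, GL_UNSIGNED_BYTE); // assumed size
    // Attach the image to the color buffer; it is refilled every frame.
    rttCamera->attach(osg::Camera::COLOR_BUFFER, image.get());
}
```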

Also see here:
http://thread.gmane.org/gmane.comp.graphics.openscenegraph.user/52651
and
http://thread.gmane.org/gmane.comp.graphics.openscenegraph.user/53432

rgds
jp


cheers,
tugkan


Hi Tugkan,

Robert mentioned the lengthy read operation. It may be related to the
read buffer operation that's used to compute the shadow volume in
LightSpacePerspectiveShadowMapDB. If your slave view uses
osgShadow::LightSpacePerspectiveShadowMapDB then you may check whether
osgShadow::LightSpacePerspectiveShadowMapCB (cull bounds flavour) has
the same problem.

I am aware of the LightSpacePerspectiveShadowMapDB glReadBuffer
limitation, but I could not find a quick, easy-to-implement workaround
that would do this without scanning the image on the CPU. I allocate a
small 64x64 texture and render the scene there, then read it into CPU
memory and use the CPU to scan pixels to optimize the shadow volume from
the depths and pixel locations stored in this prerender image.

Wojtek

- Original Message - From: Robert Osfield
robert.osfi...@gmail.com
To: OpenSceneGraph Users osg-users@lists.openscenegraph.org
Sent: Wednesday, January 13, 2010 1:04 PM
Subject: Re: [osg-users] RTT slave views and multi-threading


Hi Tugkan,

The osgdistortion example works a bit like what you are describing,
could you try this to see what performance it's getting.

As for general notes about threading: if you are working within a single
graphics context, then all the draw dispatch and the GPU draw can only
be done by a single graphics thread, so there is little opportunity to
make it more parallel without using another graphics card/graphics
context and interleaving frames.

As for why the second camera is very expensive on draw dispatch, this
suggests to me that it's blocking, either because the OpenGL fifo is
full or because it contains a GL read back operation of some kind.

Robert.

On Wed, Jan 13, 2010 at 11:34 AM, Tugkan Calapoglu tug...@vires.com
wrote:

Hi All,

I am using a slave view for rendering the scene to a texture.
Initially
I tried with a camera node, however, this did not work well due to a
problem in LiSPSM shadows and I was suggested to use RTT slave views.

My setup is as follows: There is a single main view and I attach a
slave
view to it. This slave view is attached with addSlave( slave ,
false );
so that it does *not* automatically use the master scene.

I attach a texture to the slave view and make my scene child of this
view. I attach a screen aligned quad to the main view. This quad
visualizes the RTT texture from the slave view.
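
The setup described above can be sketched as follows (variable names,
the texture size, and the exact quad/state wiring are assumptions):

```cpp
// Sketch: RTT slave view rendering the scene to a texture, with the
// master view drawing only a screen-aligned quad sampling that texture.
#include <osgViewer/Viewer>
#include <osg/Camera>
#include <osg/Texture2D>

void setupSlaveRTT(osgViewer::Viewer& viewer, osg::Node* scene, osg::Node* quad)
{
    osg::ref_ptr<osg::Texture2D> texture = new osg::Texture2D;
    texture->setTextureSize(1024, 1024);                  // assumed size
    texture->setInternalFormat(GL_RGBA);

    osg::ref_ptr<osg::Camera> slave = new osg::Camera;
    slave->setRenderOrder(osg::Camera::PRE_RENDER);
    slave->setRenderTargetImplementation(osg::Camera::FRAME_BUFFER_OBJECT);
    slave->attach(osg::Camera::COLOR_BUFFER, texture.get());
    slave->addChild(scene);                               // slave renders the scene

    // false: do *not* use the master scene data for this slave.
    viewer.addSlave(slave.get(), false);

    // The master view draws only the quad, textured with the RTT result.
    quad->getOrCreateStateSet()->setTextureAttributeAndModes(0, texture.get());
    viewer.setSceneData(quad);
}
```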

Now I have a threading problem which can be seen on the snapshot I
attached. There are two issues:
1- The main view (cam1) has a very large draw time even though it only
renders the screen aligned quad. I double checked to see whether it also
renders the actual scene but this is not the case.

Re: [osg-users] RTT slave views and multi-threading

2010-01-14 Thread Robert Osfield
Hi Tugkan,

On Thu, Jan 14, 2010 at 12:51 PM, Tugkan Calapoglu tug...@vires.com wrote:
 yes one frame latency is OK. Is there an example about the PBO usage?
 osgscreencapture seems to be about getting the data from frame buffer
 not from an RTT texture.

osgscreencapture uses a frame of latency when it double buffers the
PBOs.  It doesn't matter whether it's the frame buffer or an FBO; the
PBO is only related to memory management.

Robert.


Re: [osg-users] RTT slave views and multi-threading

2010-01-13 Thread Robert Osfield
Hi Tugkan,

The osgdistortion example works a bit like what you are describing,
could you try this to see what performance it's getting.

As for general notes about threading: if you are working within a single
graphics context, then all the draw dispatch and the GPU draw can only
be done by a single graphics thread, so there is little opportunity to
make it more parallel without using another graphics card/graphics
context and interleaving frames.

As for why the second camera is very expensive on draw dispatch, this
suggests to me that it's blocking, either because the OpenGL fifo is
full or because it contains a GL read back operation of some kind.

Robert.

On Wed, Jan 13, 2010 at 11:34 AM, Tugkan Calapoglu tug...@vires.com wrote:
 Hi All,

 I am using a slave view for rendering the scene to a texture. Initially
  I tried with a camera node, however, this did not work well due to a
 problem in LiSPSM shadows and I was suggested to use RTT slave views.

 My setup is as follows: There is a single main view and I attach a slave
 view to it. This slave view is attached with addSlave( slave , false );
  so that it does *not* automatically use the master scene.

 I attach a texture to the slave view and make my scene child of this
 view. I attach a screen aligned quad to the main view. This quad
 visualizes the RTT texture from the slave view.

 Now I have a threading problem which can be seen on the snapshot I
 attached. There are two issues:
 1- The main view (cam1) has a very large draw time even though it only
 renders the screen aligned quad. I double checked to see whether it also
 renders the actual scene but this is not the case.

 2- Slave view does not run cull and draw in parallel. Cull and draw do
 run in parallel if they are not rendered with the slave view. Moreover,
 if I change the render order of the slave camera from PRE_RENDER to
 POST_RENDER it is ok.

 I could simply use POST_RENDER but I am afraid it introduces an extra
 one frame latency. If I render the screen aligned quad first and the
 scene later than what I see on the quad is the texture from previous
 frame (right?).

 Any ideas?

 cheers,
 tugkan






Re: [osg-users] RTT slave views and multi-threading

2010-01-13 Thread Tugkan Calapoglu
Hi Robert,
 Hi Tugkan,
 
 The osgdistortion example works a bit like what you are describing,
 could you try this to see what performance it's getting.
 
osgdistortion's threading model is set to SingleThreaded in the code. I
changed it to DrawThreadPerContext and now I can see that draw starts
after cull, i.e. they do not run in parallel.

 As for general notes about threading: if you are working within a single
 graphics context, then all the draw dispatch and the GPU draw can only
 be done by a single graphics thread, so there is little opportunity to
 make it more parallel without using another graphics card/graphics
 context and interleaving frames.
 
Sure. I do not expect the two cameras to render in parallel onto a single
window, but cull and draw of a given camera should run in parallel.
Indeed they normally do so with the exact same scene and application. It
breaks only if the second camera (the slave) has PRE_RENDER render order.


tugkan

 As for why the second camera is very expensive on draw dispatch, this
 suggests to me that it's blocking, either because the OpenGL fifo is
 full or because it contains a GL read back operation of some kind.
 
 Robert.
 
 On Wed, Jan 13, 2010 at 11:34 AM, Tugkan Calapoglu tug...@vires.com wrote:
 Hi All,

 I am using a slave view for rendering the scene to a texture. Initially
  I tried with a camera node, however, this did not work well due to a
 problem in LiSPSM shadows and I was suggested to use RTT slave views.

 My setup is as follows: There is a single main view and I attach a slave
 view to it. This slave view is attached with addSlave( slave , false );
  so that it does *not* automatically use the master scene.

 I attach a texture to the slave view and make my scene child of this
 view. I attach a screen aligned quad to the main view. This quad
 visualizes the RTT texture from the slave view.

 Now I have a threading problem which can be seen on the snapshot I
 attached. There are two issues:
 1- The main view (cam1) has a very large draw time even though it only
 renders the screen aligned quad. I double checked to see whether it also
 renders the actual scene but this is not the case.

 2- Slave view does not run cull and draw in parallel. Cull and draw do
 run in parallel if they are not rendered with the slave view. Moreover,
 if I change the render order of the slave camera from PRE_RENDER to
 POST_RENDER it is ok.

 I could simply use POST_RENDER but I am afraid it introduces an extra
 one frame latency. If I render the screen aligned quad first and the
 scene later than what I see on the quad is the texture from previous
 frame (right?).

 Any ideas?

 cheers,
 tugkan




 




Re: [osg-users] RTT slave views and multi-threading

2010-01-13 Thread Robert Osfield
HI Tugkan,

On Wed, Jan 13, 2010 at 12:20 PM, Tugkan Calapoglu tug...@vires.com wrote:
 Sure. I do not expect the two cameras to render in parallel onto a single
 window, but cull and draw of a given camera should run in parallel.
 Indeed they normally do so with the exact same scene and application. It
 breaks only if the second camera (the slave) has PRE_RENDER render order.

Cull and draw can only run in parallel once all the dynamic geometry
has been dispatched, otherwise the draw will be dispatching data that
is being modified by the next frame's update and cull traversals.
Perhaps you have some dynamic geometry or StateSets that are holding
back the next frame.

Regardless of the threading of cull, your problem is draw dispatch, not
cull; you need to look into why the draw dispatch of the second draw
is taking so long.  Please look at my last email.

Robert.


Re: [osg-users] RTT slave views and multi-threading

2010-01-13 Thread Wojciech Lewandowski

Hi Tugkan,

Robert mentioned the lengthy read operation. It may be related to the read 
buffer operation that's used to compute the shadow volume in 
LightSpacePerspectiveShadowMapDB. If your slave view uses 
osgShadow::LightSpacePerspectiveShadowMapDB then you may check whether 
osgShadow::LightSpacePerspectiveShadowMapCB (cull bounds flavour) has the 
same problem.

I am aware of the LightSpacePerspectiveShadowMapDB glReadBuffer limitation, 
but I could not find a quick, easy-to-implement workaround that would do this 
without scanning the image on the CPU. I allocate a small 64x64 texture and 
render the scene there, then read it into CPU memory and use the CPU to scan 
pixels to optimize the shadow volume from the depths and pixel locations 
stored in this prerender image.
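
That CPU pass can be sketched as pure C++: scan a small depth image read
back from the prerender pass and compute the pixel bounding box and
depth range of the covered pixels. The 64x64 size, the far-plane
sentinel value of 1.0, and all names here are assumptions for
illustration, not code from osgShadow.

```cpp
// Sketch: scan a 64x64 depth image on the CPU to find the bounding
// rectangle and depth range of all pixels actually covered by geometry.
// Pixels still at the far-plane clear value (1.0) are treated as empty.
#include <vector>
#include <algorithm>

struct Bounds {
    int minX, minY, maxX, maxY;   // pixel bounding box (inclusive)
    float minZ, maxZ;             // depth range of covered pixels
    bool empty;
};

Bounds scanDepthImage(const std::vector<float>& depth, int width, int height)
{
    Bounds b{width, height, -1, -1, 1.0f, 0.0f, true};
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
        {
            float z = depth[y * width + x];
            if (z >= 1.0f) continue;          // far plane: nothing rendered here
            b.empty = false;
            b.minX = std::min(b.minX, x);  b.maxX = std::max(b.maxX, x);
            b.minY = std::min(b.minY, y);  b.maxY = std::max(b.maxY, y);
            b.minZ = std::min(b.minZ, z);  b.maxZ = std::max(b.maxZ, z);
        }
    return b;
}
```

The resulting bounds could then be used to tighten the shadow camera's
frustum around the geometry that is actually visible.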


Wojtek

- Original Message - 
From: Robert Osfield robert.osfi...@gmail.com

To: OpenSceneGraph Users osg-users@lists.openscenegraph.org
Sent: Wednesday, January 13, 2010 1:04 PM
Subject: Re: [osg-users] RTT slave views and multi-threading


Hi Tugkan,

The osgdistortion example works a bit like what you are describing,
could you try this to see what performance it's getting.

As for general notes about threading: if you are working within a single
graphics context, then all the draw dispatch and the GPU draw can only
be done by a single graphics thread, so there is little opportunity to
make it more parallel without using another graphics card/graphics
context and interleaving frames.

As for why the second camera is very expensive on draw dispatch, this
suggests to me that it's blocking, either because the OpenGL fifo is
full or because it contains a GL read back operation of some kind.

Robert.

On Wed, Jan 13, 2010 at 11:34 AM, Tugkan Calapoglu tug...@vires.com wrote:

Hi All,

I am using a slave view for rendering the scene to a texture. Initially
I tried with a camera node, however, this did not work well due to a
problem in LiSPSM shadows and I was suggested to use RTT slave views.

My setup is as follows: There is a single main view and I attach a slave
view to it. This slave view is attached with addSlave( slave , false );
so that it does *not* automatically use the master scene.

I attach a texture to the slave view and make my scene child of this
view. I attach a screen aligned quad to the main view. This quad
visualizes the RTT texture from the slave view.

Now I have a threading problem which can be seen on the snapshot I
attached. There are two issues:
1- The main view (cam1) has a very large draw time even though it only
renders the screen aligned quad. I double checked to see whether it also
renders the actual scene but this is not the case.

2- Slave view does not run cull and draw in parallel. Cull and draw do
run in parallel if they are not rendered with the slave view. Moreover,
if I change the render order of the slave camera from PRE_RENDER to
POST_RENDER it is ok.

I could simply use POST_RENDER but I am afraid it introduces an extra
one frame latency. If I render the screen aligned quad first and the
scene later than what I see on the quad is the texture from previous
frame (right?).

Any ideas?

cheers,
tugkan





