Re: [osg-users] infinite block with CullThreadPerCameraDrawThreadPerContext
Hi Csaba,

I have finally cleared through my inbox and looked closely at this issue and the suggested bug fix.

I believe your suggested change is safe in normal execution for CullThreadPerCameraDrawThreadPerContext and DrawThreadPerContext, as the _startRenderingBarrier barrier being joined will be holding back the graphics threads, and calling the reset of the block before joining this barrier will only release the graphics threads to join this barrier - they can't go on to the next frame till the main thread joins the barrier.

With your patch applied all the OSG examples still run fine even when I set OSG_THREADING to CullThreadPerCameraDrawThreadPerContext, but then these ran OK before anyway... so it's not 100% confirmation of a fix, but it's also as close as I can get without ever having reproduced the hang myself. Such is the way with threading bugs...

My thanks for your perseverance on this. The fix is now checked into svn and will be part of the 2.7.5 dev release. I also fixed the typo in ThreadSafeQueue.

Cheers,
Robert.

On Fri, Oct 17, 2008 at 4:02 PM, Csaba Halász <[EMAIL PROTECTED]> wrote:
> On Thu, Oct 16, 2008 at 2:42 PM, Robert Osfield
> <[EMAIL PROTECTED]> wrote:
>> Hi Csaba,
>>
>> I suspect the particular problem you are seeing is not directly driver
>> related, but is an OSG bug; differences in drivers might change the
>> timing slightly, which leads to the problem not becoming visible, but
>> it may well still be lurking.
>
> Hi Robert,
>
> Huh, took me two days, but I think I have traced this to a race condition.
> Apparently the _endDynamicDrawBlock was already reached by a graphics
> thread before the viewer got a chance to reset it.
> So that draw thread was released (and subsequently blocked on the
> sceneview queue), and the viewer later went into an infinite wait on
> the _endDynamicDrawBlock.
>
> This simple patch (that doesn't even hint at how difficult it is to
> trace such bugs) seems to fix the issue here:
>
> Index: src/osgViewer/ViewerBase.cpp
> ===
> --- src/osgViewer/ViewerBase.cpp (revision 9034)
> +++ src/osgViewer/ViewerBase.cpp (working copy)
> @@ -674,14 +674,14 @@
>
>      bool doneMakeCurrentInThisThread = false;
>
> -    // dispatch the the rendering threads
> -    if (_startRenderingBarrier.valid()) _startRenderingBarrier->block();
> -
>      if (_endDynamicDrawBlock.valid())
>      {
>          _endDynamicDrawBlock->reset();
>      }
>
> +    // dispatch the the rendering threads
> +    if (_startRenderingBarrier.valid()) _startRenderingBarrier->block();
> +
>      // reset any double buffer graphics objects
>      for(Cameras::iterator camItr = cameras.begin();
>          camItr != cameras.end();
>
> I am not even sure why the _endDynamicDrawBlock has to be reset. Comments?
>
> --
> Csaba

___
osg-users mailing list
osg-users@lists.openscenegraph.org
http://lists.openscenegraph.org/listinfo.cgi/osg-users-openscenegraph.org
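The ordering argument in Robert's reply can be sketched with the C++ standard library alone. This is not the osgViewer code: `Barrier`, `CountingBlock` and `runFrames` are hypothetical stand-ins for `_startRenderingBarrier`, `_endDynamicDrawBlock` and the frame loop, and the sketch assumes a single draw thread. It only aims to show that, when the block is reset before the barrier is joined, the draw thread cannot signal completion for a frame until the block has been re-armed.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical reusable barrier: arrive() returns once `parties` threads have arrived.
class Barrier {
public:
    explicit Barrier(unsigned parties) : parties_(parties), waiting_(0), generation_(0) {}

    void arrive() {
        std::unique_lock<std::mutex> lock(mutex_);
        const unsigned gen = generation_;
        if (++waiting_ == parties_) {
            waiting_ = 0;
            ++generation_;          // open the barrier for this generation
            cond_.notify_all();
        } else {
            cond_.wait(lock, [&] { return gen != generation_; });
        }
    }

private:
    std::mutex mutex_;
    std::condition_variable cond_;
    unsigned parties_, waiting_, generation_;
};

// Hypothetical counting block: block() returns once completed() has been
// called `total` times since the last reset().
class CountingBlock {
public:
    explicit CountingBlock(unsigned total) : total_(total), remaining_(total) {}

    void completed() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (remaining_ > 0) --remaining_;
        if (remaining_ == 0) cond_.notify_all();
    }
    void reset() {
        std::lock_guard<std::mutex> lock(mutex_);
        remaining_ = total_;
    }
    void block() {
        std::unique_lock<std::mutex> lock(mutex_);
        cond_.wait(lock, [this] { return remaining_ == 0; });
    }

private:
    std::mutex mutex_;
    std::condition_variable cond_;
    unsigned total_, remaining_;
};

// Runs `frames` frames with one draw thread and returns how many were drawn.
int runFrames(int frames) {
    Barrier startRendering(2);      // main thread + one draw thread
    CountingBlock endDraw(1);
    int drawn = 0;

    std::thread drawThread([&] {
        for (int i = 0; i < frames; ++i) {
            startRendering.arrive();    // held here until the main thread joins
            ++drawn;                    // stand-in for the draw traversal
            endDraw.completed();
        }
    });

    for (int i = 0; i < frames; ++i) {
        endDraw.reset();                // re-arm BEFORE releasing the draw thread
        startRendering.arrive();
        endDraw.block();                // wait for this frame's draw to finish
    }
    drawThread.join();
    return drawn;
}
```

Calling `runFrames(200)` should return 200 without hanging. Swapping the `reset()` and `arrive()` calls in the main loop reintroduces the window in which a fast draw thread can call `completed()` before the reset.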
Re: [osg-users] infinite block with CullThreadPerCameraDrawThreadPerContext
On Thu, Oct 16, 2008 at 2:42 PM, Robert Osfield <[EMAIL PROTECTED]> wrote:
> Hi Csaba,
>
> I suspect the particular problem you are seeing is not directly driver
> related, but is an OSG bug; differences in drivers might change the
> timing slightly, which leads to the problem not becoming visible, but
> it may well still be lurking.

Hi Robert,

Huh, took me two days, but I think I have traced this to a race condition.
Apparently the _endDynamicDrawBlock was already reached by a graphics thread before the viewer got a chance to reset it.
So that draw thread was released (and subsequently blocked on the sceneview queue), and the viewer later went into an infinite wait on the _endDynamicDrawBlock.

This simple patch (that doesn't even hint at how difficult it is to trace such bugs) seems to fix the issue here:

Index: src/osgViewer/ViewerBase.cpp
===
--- src/osgViewer/ViewerBase.cpp (revision 9034)
+++ src/osgViewer/ViewerBase.cpp (working copy)
@@ -674,14 +674,14 @@

     bool doneMakeCurrentInThisThread = false;

-    // dispatch the the rendering threads
-    if (_startRenderingBarrier.valid()) _startRenderingBarrier->block();
-
     if (_endDynamicDrawBlock.valid())
     {
         _endDynamicDrawBlock->reset();
     }

+    // dispatch the the rendering threads
+    if (_startRenderingBarrier.valid()) _startRenderingBarrier->block();
+
     // reset any double buffer graphics objects
     for(Cameras::iterator camItr = cameras.begin();
         camItr != cameras.end();

I am not even sure why the _endDynamicDrawBlock has to be reset. Comments?

--
Csaba
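The lost signal described in this message can be shown deterministically in a small single-threaded sketch. `CountingBlock` here is a hypothetical stand-in, not the OSG class behind `_endDynamicDrawBlock`, and it assumes the block behaves like a counter that is released when it reaches zero. The timed wait exists only so the demonstration terminates; the real viewer would wait indefinitely.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical counting block: the wait succeeds once completed() has been
// called `total` times since the last reset().
class CountingBlock {
public:
    explicit CountingBlock(unsigned total) : total_(total), remaining_(total) {}

    // Called by a worker when it has finished its share of the frame.
    void completed() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (remaining_ > 0) --remaining_;
        if (remaining_ == 0) cond_.notify_all();
    }

    // Re-arm the block for the next frame.
    void reset() {
        std::lock_guard<std::mutex> lock(mutex_);
        remaining_ = total_;
    }

    // Wait until all workers have completed; returns false on timeout.
    bool blockFor(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        return cond_.wait_for(lock, timeout, [this] { return remaining_ == 0; });
    }

private:
    std::mutex mutex_;
    std::condition_variable cond_;
    unsigned total_;
    unsigned remaining_;
};

// Worker signals first, viewer resets afterwards: the signal is lost.
bool lateResetLosesSignal() {
    CountingBlock block(1);
    block.completed();   // draw thread reaches the block early
    block.reset();       // viewer re-arms it too late
    return !block.blockFor(std::chrono::milliseconds(50));   // true: the wait timed out
}

// Viewer resets first, worker signals afterwards: the wait returns.
bool earlyResetKeepsSignal() {
    CountingBlock block(1);
    block.reset();
    block.completed();
    return block.blockFor(std::chrono::milliseconds(50));    // true: the wait succeeded
}
```

Both functions should return true. The first corresponds to the hang reported in this thread, the second to the ordering after the patch.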
Re: [osg-users] infinite block with CullThreadPerCameraDrawThreadPerContext
On Thu, Oct 16, 2008 at 2:42 PM, Robert Osfield <[EMAIL PROTECTED]> wrote:
>
> CullThreadPerCameraDrawThreadPerContext is the most complex of all the
> osgViewer threading models, and with it comes the widest range of
> different thread/barrier combinations. It could be that you are using a
> combination of cameras/contexts that hasn't been tested before, or
> simply that the timing of threads in your case breaks the setup.
> Without being able to reproduce the problem at my end there isn't too
> much I can do to help debug it. Could you have a bash at
> recreating the problem with an existing OSG example such as osgcamera
> or osgwindows?

I can bash all I want, those work :)

My investigation seems to indicate that draw() is called twice for a camera:

Start frame3
cull() for camera GUI 0xf15aa0
cull() for camera GUI 0xf15aa0 got SceneView 0xf12c00
cull() for camera unnamed1 0xf15fd0
cull() for camera unnamed1 0xf15fd0 got SceneView 0xf1aa50
cull() for camera GUI 0xf15aa0 done for SceneView 0xf12c00
end cull() for camera GUI 0xf15aa0
_clampProjectionMatrix not applied, invalid depth range, znear = 3.40282e+38 zfar = -3.40282e+38
cull() for camera unnamed1 0xf15fd0 done for SceneView 0xf1aa50
end cull() for camera unnamed1 0xf15fd0
draw() for camera unnamed1 0xf15fd0
draw() for camera unnamed1 0xf15fd0 got SceneView 0xf1aa50
glGetString(GL_RENDERER)==GeForce 8600M GT/PCI/SSE2
draw() for camera unnamed1 0xf15fd0 done for SceneView 0xf1aa50
end draw() for camera unnamed1 0xf15fd0
draw() for camera GUI 0xf15aa0
draw() for camera GUI 0xf15aa0 got SceneView 0xf12c00
draw() for camera GUI 0xf15aa0 done for SceneView 0xf12c00
end draw() for camera GUI 0xf15aa0
draw() for camera unnamed1 0xf15fd0

Huh, unnamed1 was already drawn during this frame. I guess that is not normal?
I will try to get a backtrace now.

--
Csaba
Re: [osg-users] infinite block with CullThreadPerCameraDrawThreadPerContext
Hi Csaba,

On Thu, Oct 16, 2008 at 1:29 PM, Csaba Halász <[EMAIL PROTECTED]> wrote:
> The other threading models seem to work, and the osgcamera example
> works with CullThreadPerCameraDrawThreadPerContext too.
> I wonder if the problem reported by Paul in the mail thread "Please
> test SVN of OpenSceneGraph in prep for 2.7.3 dev release" may be
> related to this one. I have been able to reproduce that with an older
> nvidia driver, but since I upgraded to 177.80 osgcamera works ok.

I suspect the particular problem you are seeing is not directly driver related, but is an OSG bug; differences in drivers might change the timing slightly, which leads to the problem not becoming visible, but it may well still be lurking.

> If I read the code right, during a frame all threads enter through the
> _startRenderingBarrier, then the cull threads do the work while the
> matching draw threads are blocked, then finally the draw is done. The
> way I see it, one of the draw threads is still blocked waiting for the
> cull to happen. I'll try to add some more debug printouts, but I am
> still open to ideas :)

CullThreadPerCameraDrawThreadPerContext is the most complex of all the osgViewer threading models, and with it comes the widest range of different thread/barrier combinations. It could be that you are using a combination of cameras/contexts that hasn't been tested before, or simply that the timing of threads in your case breaks the setup. Without being able to reproduce the problem at my end there isn't too much I can do to help debug it. Could you have a bash at recreating the problem with an existing OSG example such as osgcamera or osgwindows?

Robert.
Re: [osg-users] infinite block with CullThreadPerCameraDrawThreadPerContext
On Thu, Oct 16, 2008 at 10:43 AM, Robert Osfield <[EMAIL PROTECTED]> wrote:
> Hi Csaba,
>
> Given the provided details I don't have enough to provide any clues to
> what might be up.
>
> Try different threading models to see what results you get.
>
> Also try the standard OSG examples to see if any of them hang.

Hi Robert,

The other threading models seem to work, and the osgcamera example works with CullThreadPerCameraDrawThreadPerContext too.
I wonder if the problem reported by Paul in the mail thread "Please test SVN of OpenSceneGraph in prep for 2.7.3 dev release" may be related to this one. I have been able to reproduce that with an older nvidia driver, but since I upgraded to 177.80 osgcamera works ok.

If I read the code right, during a frame all threads enter through the _startRenderingBarrier, then the cull threads do the work while the matching draw threads are blocked, then finally the draw is done. The way I see it, one of the draw threads is still blocked waiting for the cull to happen. I'll try to add some more debug printouts, but I am still open to ideas :)

--
Greets,
Csaba
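The cull-then-draw handoff described in this message can be sketched as a blocking queue between two threads. This is an illustration under assumptions, not the osgViewer implementation: `BlockingQueue` and `cullThenDraw` are hypothetical names, and plain integers stand in for the SceneView objects that the cull thread hands to the draw thread.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical blocking queue: takeFront() waits until an item is available.
template <typename T>
class BlockingQueue {
public:
    void add(T item) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push_back(std::move(item));
        }
        cond_.notify_one();
    }

    T takeFront() {
        std::unique_lock<std::mutex> lock(mutex_);
        cond_.wait(lock, [this] { return !items_.empty(); });
        T front = std::move(items_.front());
        items_.pop_front();
        return front;
    }

private:
    std::mutex mutex_;
    std::condition_variable cond_;
    std::deque<T> items_;
};

// One cull thread produces frame ids; one draw thread consumes them in order.
std::vector<int> cullThenDraw(int frames) {
    BlockingQueue<int> ready;       // culled frames waiting to be drawn
    std::vector<int> drawn;

    std::thread draw([&] {
        for (int i = 0; i < frames; ++i) {
            drawn.push_back(ready.takeFront());   // blocks until cull delivers
        }
    });
    std::thread cull([&] {
        for (int i = 0; i < frames; ++i) {
            ready.add(i);                         // stand-in for a culled SceneView
        }
    });

    cull.join();
    draw.join();
    return drawn;
}
```

If the draw side calls `takeFront()` once more than the cull side calls `add()`, the extra call never returns. That matches the shape of the backtrace posted in this thread, where the draw thread sits in a condition wait underneath `takeFront`.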
Re: [osg-users] infinite block with CullThreadPerCameraDrawThreadPerContext
Hi Csaba,

Given the provided details I don't have enough to provide any clues to what might be up.

Try different threading models to see what results you get.

Also try the standard OSG examples to see if any of them hang.

Robert.

On Wed, Oct 15, 2008 at 9:58 PM, Csaba Halász <[EMAIL PROTECTED]> wrote:
> Hi again,
>
> Now that I got threading running, I have run into this trouble:
>
> draw() 0xee0b20
> draw() got SceneView 0xf18a80
> end draw() 0xee0b20
> draw() 0xda1f80
> cull()
> cull() got SceneView 0xf03e90
> end cull() 0xee0b20
> cull()
> cull() got SceneView 0xf7b020
> _clampProjectionMatrix not applied, invalid depth range, znear = 3.40282e+38 zfar = -3.40282e+38
> end cull() 0xda1f80
> draw() got SceneView 0xf7b020
> end draw() 0xda1f80
> draw() 0xee0b20
> draw() got SceneView 0xf03e90
> end draw() 0xee0b20
> draw() 0xda1f80
> cull()
> cull() got SceneView 0xf84a20
> _clampProjectionMatrix not applied, invalid depth range, znear = 3.40282e+38 zfar = -3.40282e+38
> end cull() 0xda1f80
> draw() got SceneView 0xf84a20
> cull()
> cull() got SceneView 0xf18a80
> end cull() 0xee0b20
> end draw() 0xda1f80
> draw() 0xee0b20
> draw() got SceneView 0xf18a80
> end draw() 0xee0b20
> draw() 0xda1f80
>
> It stops here.
>
> (gdb) thr 5
> [Switching to thread 5 (Thread 0x420a5950 (LWP 28584))]#0 0x2ae3f4c89b99 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
> (gdb) bt
> #0 0x2ae3f4c89b99 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
> #1 0x2ae3f7a55c56 in OpenThreads::Condition::wait (this=<value optimized out>, mutex=<value optimized out>)
>    at /home/hcs/src/OpenSceneGraph/src/OpenThreads/pthreads/PThreadCondition.c++:137
> #2 0x2ae3f68d0bd3 in osgViewer::Renderer::TheadSafeQueue::takeFront (this=0x10f8a78)
>    at /home/hcs/src/OpenSceneGraph/include/OpenThreads/Block:42
> #3 0x2ae3f68d25de in osgViewer::Renderer::draw (this=0x10f8980)
>    at /home/hcs/src/OpenSceneGraph/src/osgViewer/Renderer.cpp:340
> #4 0x2ae3f772eac4 in osg::GraphicsContext::runOperations (this=0x10f99a0)
>    at /home/hcs/src/OpenSceneGraph/src/osg/GraphicsContext.cpp:688
> #5 0x2ae3f776c688 in osg::OperationThread::run (this=0x11cd6c0)
>    at /home/hcs/src/OpenSceneGraph/src/osg/OperationThread.cpp:413
> #6 0x2ae3f77351c0 in osg::GraphicsThread::run (this=0x11cd6c0)
>    at /home/hcs/src/OpenSceneGraph/src/osg/GraphicsThread.cpp:38
> #7 0x2ae3f7a55537 in OpenThreads::ThreadPrivateActions::StartThread (data=<value optimized out>)
>    at /home/hcs/src/OpenSceneGraph/src/OpenThreads/pthreads/PThread.c++:172
> #8 0x2ae3f4c853f7 in start_thread () from /lib/libpthread.so.0
> #9 0x2ae3f839497d in clone () from /lib/libc.so.6
> #10 0x in ?? ()
>
> Any ideas?
>
> --
> Thanks,
> Csaba
[osg-users] infinite block with CullThreadPerCameraDrawThreadPerContext
Hi again,

Now that I got threading running, I have run into this trouble:

draw() 0xee0b20
draw() got SceneView 0xf18a80
end draw() 0xee0b20
draw() 0xda1f80
cull()
cull() got SceneView 0xf03e90
end cull() 0xee0b20
cull()
cull() got SceneView 0xf7b020
_clampProjectionMatrix not applied, invalid depth range, znear = 3.40282e+38 zfar = -3.40282e+38
end cull() 0xda1f80
draw() got SceneView 0xf7b020
end draw() 0xda1f80
draw() 0xee0b20
draw() got SceneView 0xf03e90
end draw() 0xee0b20
draw() 0xda1f80
cull()
cull() got SceneView 0xf84a20
_clampProjectionMatrix not applied, invalid depth range, znear = 3.40282e+38 zfar = -3.40282e+38
end cull() 0xda1f80
draw() got SceneView 0xf84a20
cull()
cull() got SceneView 0xf18a80
end cull() 0xee0b20
end draw() 0xda1f80
draw() 0xee0b20
draw() got SceneView 0xf18a80
end draw() 0xee0b20
draw() 0xda1f80

It stops here.

(gdb) thr 5
[Switching to thread 5 (Thread 0x420a5950 (LWP 28584))]#0 0x2ae3f4c89b99 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
(gdb) bt
#0 0x2ae3f4c89b99 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#1 0x2ae3f7a55c56 in OpenThreads::Condition::wait (this=<value optimized out>, mutex=<value optimized out>)
   at /home/hcs/src/OpenSceneGraph/src/OpenThreads/pthreads/PThreadCondition.c++:137
#2 0x2ae3f68d0bd3 in osgViewer::Renderer::TheadSafeQueue::takeFront (this=0x10f8a78)
   at /home/hcs/src/OpenSceneGraph/include/OpenThreads/Block:42
#3 0x2ae3f68d25de in osgViewer::Renderer::draw (this=0x10f8980)
   at /home/hcs/src/OpenSceneGraph/src/osgViewer/Renderer.cpp:340
#4 0x2ae3f772eac4 in osg::GraphicsContext::runOperations (this=0x10f99a0)
   at /home/hcs/src/OpenSceneGraph/src/osg/GraphicsContext.cpp:688
#5 0x2ae3f776c688 in osg::OperationThread::run (this=0x11cd6c0)
   at /home/hcs/src/OpenSceneGraph/src/osg/OperationThread.cpp:413
#6 0x2ae3f77351c0 in osg::GraphicsThread::run (this=0x11cd6c0)
   at /home/hcs/src/OpenSceneGraph/src/osg/GraphicsThread.cpp:38
#7 0x2ae3f7a55537 in OpenThreads::ThreadPrivateActions::StartThread (data=<value optimized out>)
   at /home/hcs/src/OpenSceneGraph/src/OpenThreads/pthreads/PThread.c++:172
#8 0x2ae3f4c853f7 in start_thread () from /lib/libpthread.so.0
#9 0x2ae3f839497d in clone () from /lib/libc.so.6
#10 0x in ?? ()

Any ideas?

--
Thanks,
Csaba