Re: [osg-users] osg(terrain) krasches on a double delete
Hi Robert,

Good. This also works for me in the osgterrain-crash-program. I've updated our patches to match this fix, and our internal tests also pass. Thanks for taking the time and sharing this great software!

Ola

On Fri, 31 Aug 2012 18:03:33 +0200, Robert Osfield robert.osfi...@gmail.com wrote:

> Hi Ola, Brad et al.,
>
> I have just checked in a fix, similar to Ola's suggested fix. With this fix I can't get Ola's modified osgterrain to crash, and when the code had debugging code in place it showed that it was detecting the problem TerrainTiles that were being deleted and correctly handling them. The code block of interest is now:
>
>     {
>         OpenThreads::ScopedLock<OpenThreads::ReentrantMutex> lock(_mutex);
>         for(TerrainTileSet::iterator itr = _updateTerrainTileSet.begin();
>             itr != _updateTerrainTileSet.end();
>             ++itr)
>         {
>             // take a reference first to make sure that the referenceCount can be safely read
>             // without another thread decrementing it to zero.
>             (*itr)->ref();
>
>             // only if referenceCount is 2 or more, indicating there is still a reference held
>             // elsewhere, is it safe to add it to the list of tiles to be updated
>             if ((*itr)->referenceCount()>1) tiles.push_back(*itr);
>
>             // use unref_nodelete to avoid any issues when the *itr TerrainTile has been deleted
>             // by another thread while this for loop has been running.
>             (*itr)->unref_nodelete();
>         }
>         _updateTerrainTileSet.clear();
>     }
>
> Could you please try out the svn/trunk version of the OSG and let me know how you get on,
> Robert.
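The ref, check, unref_nodelete sequence in the fix above can be tried outside OSG. What follows is only a minimal sketch in standard C++: `Counted`, `collectLive` and `demoCollectLive` are hypothetical names that model nothing but the counting logic, and the sketch keeps raw pointers in the result, which says nothing about how the real `tiles` container is declared.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a reference-counted object; it only models the counter.
class Counted {
public:
    explicit Counted(int initialCount) : _refCount(initialCount) {}
    // Atomically increment and return the new count.
    int ref() const { return ++_refCount; }
    // Atomically decrement and return the new count, never deleting the object.
    int unref_nodelete() const { return --_refCount; }
    int referenceCount() const { return _refCount.load(); }
private:
    mutable std::atomic<int> _refCount;
};

// Keep only the candidates that somebody else still references.
std::vector<Counted*> collectLive(const std::vector<Counted*>& candidates) {
    std::vector<Counted*> live;
    for (Counted* candidate : candidates) {
        candidate->ref();                       // pin first, then inspect
        if (candidate->referenceCount() > 1) {  // 2 or more: another owner exists
            live.push_back(candidate);
        }
        candidate->unref_nodelete();            // unpin without triggering a delete
    }
    return live;
}

// One object with an external owner, one whose count has already reached zero.
std::size_t demoCollectLive() {
    Counted alive(1);
    Counted dying(0);
    std::vector<Counted*> candidates;
    candidates.push_back(&alive);
    candidates.push_back(&dying);
    std::vector<Counted*> live = collectLive(candidates);
    assert(live.size() == 1 && live[0] == &alive);
    assert(alive.referenceCount() == 1);  // counts are restored after the pass
    assert(dying.referenceCount() == 0);
    return live.size();
}
```

Calling `demoCollectLive()` keeps the object that has an external owner and skips the one whose count was already zero, and both counters end where they started.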
Re: [osg-users] osg(terrain) krasches on a double delete
Hi Robert et al.,

I finally got the time to investigate the code (osgTerrain/Terrain.cpp). As I see it, it is not necessary to introduce a new data structure to hold the list of update tiles.

The issue was that a TerrainTile could be partly destructed but not yet unregistered (via unregisterTile) when a Terrain::update call was made. A TerrainTile could then get its refcount bumped while being destructed and be deleted, leaving a dangling ref_ptr to deleted memory.

My solution is inspired by the code in ObserverSet::addRefLock and only modifies Terrain::traverse. I promote the raw TerrainTile pointers in the update list to ref_ptr and check their status before doing any work on them. TerrainTiles with a refcount of 1 should be discarded (but not deleted).

I have attached the complete file. Should this conversation be continued at osg-submissions also/instead?

Cheers,
Ola

> Hi Ola et al.,
>
> I've looked at the problem Terrain containers and they present an interesting issue - the std::set<TerrainTile*> that is used can't easily be converted into a std::set<osg::observer_ptr<TerrainTile> >, as the observer_ptr can have its value changed to NULL by another thread when destructing the observed TerrainTile, and std::set requires its values to be const.
>
> What will be required is a custom container of osg::observer_ptr along the lines of osg::ObserverNodePath but written to manage its contents in a similar way to a std::set.
>
> Robert.
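The std::set constraint mentioned in the quoted reply has a rough analogue in standard C++, sketched below. This is only an analogy under stated assumptions and not OSG code: `Tile`, `WeakTileSet`, `countLive` and `demoWeakTileSet` are hypothetical names, and `std::weak_ptr` with `std::owner_less` stands in for an observer pointer. The ordering key here is the control block, which does not change when the observed object dies, so the set stays consistent and dead entries are simply skipped.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <set>

// Hypothetical tile type, used only for illustration.
struct Tile {
    int id;
    explicit Tile(int i) : id(i) {}
};

// Ordered by control block (owner_less), which stays fixed when a tile expires.
typedef std::set<std::weak_ptr<Tile>, std::owner_less<std::weak_ptr<Tile> > > WeakTileSet;

// Count the entries whose tile can still be promoted to a strong reference.
std::size_t countLive(const WeakTileSet& tiles) {
    std::size_t live = 0;
    for (WeakTileSet::const_iterator it = tiles.begin(); it != tiles.end(); ++it) {
        std::shared_ptr<Tile> strong = it->lock();  // empty if the tile is gone
        if (strong) ++live;
    }
    return live;
}

std::size_t demoWeakTileSet() {
    std::shared_ptr<Tile> a = std::make_shared<Tile>(1);
    std::shared_ptr<Tile> b = std::make_shared<Tile>(2);
    WeakTileSet tiles;
    tiles.insert(std::weak_ptr<Tile>(a));
    tiles.insert(std::weak_ptr<Tile>(b));
    b.reset();                  // tile 2 is destroyed; its set entry remains
    assert(tiles.size() == 2);  // the container itself is unchanged
    return countLive(tiles);    // only tile 1 can still be locked
}
```

The design point is that the key used for ordering must not depend on whether the observed object is still alive, which is the property a plain std::set of nullable observer pointers would lack.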
Re: [osg-users] osg(terrain) krasches on a double delete
Hi Robert,

Hm, there is probably something I don't see. I am new to the code...

I assumed that a TerrainTile cannot be _completely_ deleted in the Terrain::update call, because if it were it _must_ have passed through Terrain::unregisterTile and then it wouldn't be in the update set to begin with. Therefore it is still safe to investigate the raw pointers in the _updateTerrainTileSet in Terrain::update.

Do you agree with this assumption? If not, can you give an example of when it doesn't hold?

Cheers,
Ola

On Mon, 27 Aug 2012 15:49:47 +0200, Robert Osfield robert.osfi...@gmail.com wrote:

> Hi Ola,
>
> I don't think your suggested change will help as you are taking a reference to a potentially deleted object. The Terrain::_mutex that is locked doesn't affect the children, just the management of the Terrain::_terrainTileMap and _updateTerrainTileSet.
>
> My current thought is that we'll need to introduce a form of set that manages a list of observer_ptr much in the same way that the ObserverNodePath is managed, so we'll need an equivalent ObserverNodeSet class.
>
> Robert.
>
> On 27 August 2012 12:46, Ola Nilsson o...@weatherone.tv wrote:
>> Hi Robert et al.,
>>
>> I finally got the time to investigate the code (osgTerrain/Terrain.cpp). As I see it, it is not necessary to introduce a new data structure to hold the list of update tiles. The issue was that a TerrainTile could be partly destructed but not unregisterTile:ed, when an Terrain::update call was made. Then, a TerrainTile could get its refcount bumped while being destructed and be deleted with a dangling ref_ptr to deleted memory.
>>
>> My solution is inspired by the code in ObserverSet::addRefLock and only modifies Terrain::traverse. I bump the raw Terrain-pointers in the update list to ref_ptr and check their status before doing any work on them. TerrainTiles with a refcount of 1 should be discarded (but not deleted).
>>
>> I have attached the complete file. Should this conversation be continued at osg-submissions also/instead?
>>
>> Cheers,
>> Ola
Re: [osg-users] osg(terrain) krasches on a double delete
On Mon, 27 Aug 2012 16:40:39 +0200, Robert Osfield robert.osfi...@gmail.com wrote:

> Hi Ola,
>
> On 27 August 2012 15:23, Ola Nilsson o...@weatherone.tv wrote:
>> Hm, there is probably something I don't see. I am new to the code... I assumed that a TerrainTile cannot be _completely_ deleted in the Terrain::update call, because if it were it _must_ have passed through Terrain::unregisterTile and then it wouldn't be in the update set to begin with.
>
> unregisterTile only gets called in the TerrainTile destructor after its ref count has gone to zero. It'd be safe if unregisterTile was called prior to the TerrainTile object's ref count going to zero.

Yes, the ref count is zero but the destructor chain has not yet completed; by induction it hasn't passed ~TerrainTile. (If it had, unregisterTile would have been called and completed.) Right?

In this situation memory is not yet recycled and the object's parent data (Referenced) should be safe to access and even manipulate. Right?

Ola

Btw. the new code passes my modified osgterrain test, which of course doesn't prove anything...

>> Therefore it is still safe to investigate the raw pointers in the _updateTerrainTileSet in Terrain::update. Do you agree on the assumption? If not can you exemplify when it doesn't hold?
>
> Nope it's not safe as explained above.
>
> Robert.
Re: [osg-users] osg(terrain) krasches on a double delete
Hi Robert,

On Mon, 27 Aug 2012 17:58:58 +0200, Robert Osfield robert.osfi...@gmail.com wrote:

> Hi Ola,
>
> On 27 August 2012 16:31, Ola Nilsson o...@weatherone.tv wrote:
>> Yes, the ref count is zero but the destructor chain has not yet completed, by induction it hasn't passed ~TerrainTile. (If it had unregisterTile would have been called and completed.) Right?
>
> Thinking about your explanation, I think you are correct: Terrain::unregisterTile won't be allowed to complete because Terrain::traverse() will be holding the Terrain::_mutex, so the destructor won't be able to finish and have the memory fully deleted and recycled.
>
>> In this situation memory is not yet recycled and the object's parents data (Referenced) should be safe to access and even manipulate. Right?
>
> Safe... possible, it still doesn't feel robust though.

Agree. If someone registered a tile that wasn't ref counted it wouldn't work. There are some assumptions on usage, but I think most of them apply with the current code. Please advise if you have any specific scenarios that should be respected.

>> Btw. the new code passes my modified osgterrain test, which of course doesn't prove anything...
>
> It's an encouraging start though. I do wonder if in your changes there is a need to take a ref_ptr and then check for a referenceCount() of 1, as if the ref count is zero to start with then just checking against 0 without the ref_ptr should be a sufficient test to see if the object has been deleted.

That was also my initial version. But thinking about the worst case, I realized that between a call to referenceCount() and the call to ref() or the ref_ptr constructor, another thread might drop the reference count and trigger destruction. Unlikely, but I think possible. By taking the ref_ptr first, the increment is atomic and the subsequent check is safe.

Cheers,
Ola

> Robert.
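The locking argument in this exchange (a tile's destructor has to unregister itself under the same mutex that the traversal holds, so a tile still present in the set has not finished destructing) can be modelled in standard C++. The sketch below is single-threaded and shows only the bookkeeping. `Registry`, `Tile`, `visitLocked` and `demoRegistry` are hypothetical stand-ins, not OSG classes, and `std::recursive_mutex` is merely assumed to play the role of the reentrant mutex.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <set>

class Tile;

// Hypothetical stand-in for the tile bookkeeping kept by the terrain.
class Registry {
public:
    void registerTile(Tile* tile) {
        std::lock_guard<std::recursive_mutex> lock(_mutex);
        _tiles.insert(tile);
    }
    void unregisterTile(Tile* tile) {
        std::lock_guard<std::recursive_mutex> lock(_mutex);
        _tiles.erase(tile);
    }
    // Visit every registered tile while holding the lock. A destructor running
    // on another thread would block in unregisterTile() until this returns.
    template <typename Visitor>
    void visitLocked(Visitor visit) {
        std::lock_guard<std::recursive_mutex> lock(_mutex);
        for (Tile* tile : _tiles) visit(tile);
    }
    std::size_t size() {
        std::lock_guard<std::recursive_mutex> lock(_mutex);
        return _tiles.size();
    }
private:
    std::recursive_mutex _mutex;
    std::set<Tile*> _tiles;
};

class Tile {
public:
    Tile(Registry& registry, int id) : _registry(registry), _id(id) {
        _registry.registerTile(this);
    }
    // Unregistering happens inside the destructor, as described in the thread.
    ~Tile() { _registry.unregisterTile(this); }
    int id() const { return _id; }
private:
    Registry& _registry;
    int _id;
};

int demoRegistry() {
    Registry registry;
    int idSum = 0;
    {
        Tile kept(registry, 1);
        { Tile expired(registry, 2); }  // destructor removes it from the set here
        registry.visitLocked([&idSum](Tile* tile) { idSum += tile->id(); });
    }
    assert(registry.size() == 0);  // every tile unregistered itself
    return idSum;                  // only the tile with id 1 was visited
}
```

The sketch does not cover the remaining concern raised above: an object whose count is already zero can still be in the set, which is why the count is inspected only after taking a reference.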
Re: [osg-users] osg(terrain) krasches on a double delete
Hi Robert,

Ok, I see the problem. If I get the time later today I will look into the code in depth.

/Ola

On Tue, 21 Aug 2012 20:43:44 +0200, Robert Osfield robert.osfi...@gmail.com wrote:

> Hi Ola et al.,
>
> I've looked at the problem Terrain containers and they present an interesting issue - the std::set<TerrainTile*> that is used can't easily be converted into a std::set<osg::observer_ptr<TerrainTile> >, as the observer_ptr can have its value changed to NULL by another thread when destructing the observed TerrainTile, and std::set requires its values to be const.
>
> What will be required is a custom container of osg::observer_ptr along the lines of osg::ObserverNodePath but written to manage its contents in a similar way to a std::set.
>
> Robert.
Re: [osg-users] osg(terrain) krasches on a double delete
Interesting. Did you notice any slowdown? In my experience, even if the crash doesn't happen for a long time, a significant slowdown is almost immediate.

Ola

On Tue, 21 Aug 2012 07:41:27 +0200, Christiansen, Brad brad.christian...@thalesgroup.com.au wrote:

> Hi,
>
> As another data point for you, I tried to reproduce the crash on a Win 7 machine (VS2010) using a trunk build from a few weeks ago, and couldn't reproduce the crash. I left the example running for about 10 minutes with the camera spinning.
>
> Cheers,
> Brad
>
> -----Original Message-----
> From: osg-users-boun...@lists.openscenegraph.org [mailto:osg-users-boun...@lists.openscenegraph.org] On Behalf Of Ola Nilsson
> Sent: Thursday, 9 August 2012 11:41 PM
> To: OpenSceneGraph Users
> Subject: Re: [osg-users] osg(terrain) krasches on a double delete
>
> Here is some more info on how to reproduce our crash.
>
> 1. Compile the attached osgterrain program by overwriting examples/osgterrain/osgterrain.cpp and recompile.
> 2. Run the program with http://www.openscenegraph.org/data/earth_bayarea/earth.ive as input.
> 3. Wait (no further input is necessary). However, camera movement (tile loading) produces earlier crashes.
>
> The attached image shows performance of a typical run (without user input) where the frame time cyclically increases until the program crashes.
>
> Cheers,
> Ola
>
> ps. The output from the program is: iteration sample_ratio frame_time
>
> On Tue, 07 Aug 2012 11:59:00 +0200, Ola Nilsson o...@weatherone.tv wrote:
>> Hi,
>>
>> We have been looking for a hard-to-reproduce crash in our software that seems to originate from a double delete inside osg. I have (finally) been able to reproduce the crash using a version of the osgterrain-example that _exaggerates_ the usage pattern that crashes our application.
>>
>> In examples/osgterrain.cpp remove
>>
>>     return viewer.run();
>>
>> and exchange it with:
>>
>>     while(!viewer.done())
>>     {
>>         osg::Timer_t start_tick = osg::Timer::instance()->tick();
>>         float sr = rand() * 1.0 / RAND_MAX;
>>         std::cerr << sr;
>>         viewer.getCamera()->setLODScale(sr*10);
>>         terrain->setSampleRatio( sr );
>>         osg::Timer_t middle_tick = osg::Timer::instance()->tick();
>>         std::cerr << osg::Timer::instance()->delta_m(start_tick, middle_tick) << std::flush;
>>         viewer.frame();
>>         std::cerr << ' ' << osg::Timer::instance()->delta_m(middle_tick, osg::Timer::instance()->tick()) << std::endl;
>>     }
>>     return 0;
>>
>> When run (tested on ive earth models generated with osgdem) the frame time slowly increases, and, after a while, it warns about deleting a still referenced object and then (after arbitrary time) crashes with a glibc error.
>>
>> Is this usage (setLODScale + setSampleRatio) safe? If not how should these functions be called? If it's a bug, we would be _very_ happy to have it fixed or pointers about where to look in the code. We've previously submitted a patch that switched to a ReentrantMutex in osgTerrain/Terrain.cpp (changeset 12904), could this be a similar issue?
>>
>> My system is running Centos 6.3 (x86_64) and I compiled osg in debug mode with gcc 4.4.6. I have tested both against the 3.0.1 tag and trunk (r13106).
>>
>> Since I suspect a threading issue; OpenThreads/Config looks like this:
>>
>>     #define _OPENTHREADS_ATOMIC_USE_GCC_BUILTINS
>>     /* #undef _OPENTHREADS_ATOMIC_USE_MIPOSPRO_BUILTINS */
>>     /* #undef _OPENTHREADS_ATOMIC_USE_SUN */
>>     /* #undef _OPENTHREADS_ATOMIC_USE_WIN32_INTERLOCKED */
>>     /* #undef _OPENTHREADS_ATOMIC_USE_BSD_ATOMIC */
>>     /* #undef _OPENTHREADS_ATOMIC_USE_MUTEX */
>>     /* #undef OT_LIBRARY_STATIC */
>>
>> If I set a break point in the warning for deleting still referenced I get the following stack trace (using the osg 3.0.1 tag):
>>
>>     Breakpoint 1, osg::Referenced::~Referenced (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osg/Referenced.cpp:236
>>     236 OSG_WARN<<"Warning: deleting still referenced object "<<this<<" of type '"<<typeid(this).name()<<"'"<<std::endl;
>>     (gdb) bt
>>     #0 osg::Referenced::~Referenced (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osg/Referenced.cpp:236
>>     #1 0x779e467c in osg::Object::~Object (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osg/Object.cpp:45
>>     #2 0x779dd71d in osg::Node::~Node (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osg/Node.cpp:94
>>     #3 0x77993953 in osg::Group::~Group (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osg/Group.cpp:53
>>     #4 0x75df4ae8 in osgTerrain::TerrainTile::~TerrainTile (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osgTerrain/TerrainTile.cpp:95
>>     #5 0x75df4b1e in osgTerrain::TerrainTile::~TerrainTile (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osgTerrain/TerrainTile.cpp:95
>>     #6 0x77a131a6 in osg::Referenced::signalObserversAndDelete (this=0xaec2160
Re: [osg-users] osg(terrain) krasches on a double delete
Bump.

I would really appreciate some info on this problem. The double delete originates from the DatabasePager, but the destruction of these objects should be safe (ref counted as well as threadsafe), shouldn't it? I would be grateful if someone could take a look at the stacktrace (prev post, below) and see if there is anything suspicious going on.

Cheers,
Ola

On Fri, 10 Aug 2012 11:01:49 +0200, Ola Nilsson o...@weatherone.tv wrote:

> I have now also reproduced the crash on darwin/osx in i386 mode.
>
> Cheers,
> Ola
>
> On Tue, 07 Aug 2012 11:59:00 +0200, Ola Nilsson o...@weatherone.tv wrote:
>> If I set a break point in the warning for deleting still referenced I get the following stack trace (using the osg 3.0.1 tag):
>>
>>     Breakpoint 1, osg::Referenced::~Referenced (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osg/Referenced.cpp:236
>>     236 OSG_WARN<<"Warning: deleting still referenced object "<<this<<" of type '"<<typeid(this).name()<<"'"<<std::endl;
>>     (gdb) bt
>>     #0 osg::Referenced::~Referenced (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osg/Referenced.cpp:236
>>     #1 0x779e467c in osg::Object::~Object (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osg/Object.cpp:45
>>     #2 0x779dd71d in osg::Node::~Node (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osg/Node.cpp:94
>>     #3 0x77993953 in osg::Group::~Group (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osg/Group.cpp:53
>>     #4 0x75df4ae8 in osgTerrain::TerrainTile::~TerrainTile (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osgTerrain/TerrainTile.cpp:95
>>     #5 0x75df4b1e in osgTerrain::TerrainTile::~TerrainTile (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osgTerrain/TerrainTile.cpp:95
>>     #6 0x77a131a6 in osg::Referenced::signalObserversAndDelete (this=0xaec2160, signalDelete=true, doDelete=true) at /home/ola/src/OpenSceneGraph-3.0.1/src/osg/Referenced.cpp:323
>>     #7 0x0040b07d in osg::Referenced::unref (this=0xaec2160) at /home/ola/src
Re: [osg-users] osg(terrain) krasches on a double delete
Hi,

On 20 aug 2012, at 17.47, Robert Osfield wrote:

> Hi Ola,
>
> On 20 August 2012 16:29, Ola Nilsson o...@weatherone.tv wrote:
>> The double delete originates from the DatabasePager, but the destruction of these objects should be safe(ref counted as well as threadsafe) shouldn't it? I would be grateful if someone could take a look at the stacktrace (prev post, below) and see if there is anything suspicious going on.
>
> I have just tried your modified osgterrain and it crashes for me with a seg fault during the update traversal. I am just doing a debug build to see if this reveals anything more useful.

Great news!

> While the usage in your modified osgterrain shouldn't cause a crash, it's certainly not the way I would recommend anyone to use osgTerrain - the setSampleRatio() feature is meant to be set once in an application rather than thrashed continuously, as changing the sample ratio forces a rebuild of the terrain geometry. The Camera::setLODScale() is however a lightweight operation without any performance or stability consequences and can be changed every frame without problem.
>
> The question has to be why you are changing the Terrain SampleRatio.

As I tried to convey in my first post, the sample program just exaggerates a usage scenario that _very_ seldom crashed our app. We obviously do not change the sample ratio on every frame call in our app. It can however be set per flight in a project. We will revise this usage.

Ola

> Robert.
Re: [osg-users] osg(terrain) krasches on a double delete
Hi Robert,

This explains the behavior we have experienced; the crashes we observed were notoriously difficult to reproduce in a debug environment (gdb, valgrind, etc.). Thanks for taking your time to investigate this!

Ola

On 20 aug 2012, at 18.41, Robert Osfield wrote:

> Hi Ola,
>
> I've been doing a code review of Terrain and TerrainTile, looking at how deletion of a TerrainTile will affect the internal TerrainTile pointers held in Terrain that track all the tiles. These internal TerrainTile pointers in Terrain are protected by a reentrant mutex, but I believe a problem can occur when the TerrainTile destructor calls Terrain::unregisterTerrainTile(TerrainTile*) at the same time as the Terrain::traverse(NodeVisitor) method is taking a copy of the _updateTerrainTileSet. When this occurs the update traversal would increment the ref count on a TerrainTile being destructed by the DatabasePager thread that is cleaning up expired tiles.
>
> This issue boils down to an attempt to make a thread safe list of TerrainTile observers in Terrain that doesn't quite achieve what it intends. I haven't thought deeply enough about the issue yet to know what the best solution will be. My head isn't quite in the zone that will allow me to solve this one, so I'll have to come back to it at a later date.
>
> A short term solution would be to set the pager so that it doesn't delete objects in a background pager thread, via DatabasePager::setDeleteRemovedSubgraphsInDatabaseThread(false). This won't fix the thread safety issue but will at least prevent the particular instance where the update traversal is occurring at the same time as the deletion of a TerrainTile.
>
> Robert.
Re: [osg-users] osg(terrain) krasches on a double delete
I have now also reproduced the crash on darwin/osx in i386 mode.

Cheers,
Ola

On Tue, 07 Aug 2012 11:59:00 +0200, Ola Nilsson o...@weatherone.tv wrote:

> If I set a break point in the warning for deleting still referenced I get the following stack trace (using the osg 3.0.1 tag):
>
>     Breakpoint 1, osg::Referenced::~Referenced (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osg/Referenced.cpp:236
>     236 OSG_WARN<<"Warning: deleting still referenced object "<<this<<" of type '"<<typeid(this).name()<<"'"<<std::endl;
>     (gdb) bt
>     #0 osg::Referenced::~Referenced (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osg/Referenced.cpp:236
>     #1 0x779e467c in osg::Object::~Object (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osg/Object.cpp:45
>     #2 0x779dd71d in osg::Node::~Node (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osg/Node.cpp:94
>     #3 0x77993953 in osg::Group::~Group (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osg/Group.cpp:53
>     #4 0x75df4ae8 in osgTerrain::TerrainTile::~TerrainTile (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osgTerrain/TerrainTile.cpp:95
>     #5 0x75df4b1e in osgTerrain::TerrainTile::~TerrainTile (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osgTerrain/TerrainTile.cpp:95
>     #6 0x77a131a6 in osg::Referenced::signalObserversAndDelete (this=0xaec2160, signalDelete=true, doDelete=true) at /home/ola/src/OpenSceneGraph-3.0.1/src/osg/Referenced.cpp:323
>     #7 0x0040b07d in osg::Referenced::unref (this=0xaec2160) at /home/ola/src/OpenSceneGraph-3.0.1/include/osg/Referenced:198
>     #8 0x0040cd3b in osg::ref_ptr<osg::Node>::~ref_ptr (this=0x478d410, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/include/osg/ref_ptr:35
>     #9 0x7799627e in std::_Destroy<osg::ref_ptr<osg::Node> > (__pointer=0x478d410) at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_construct.h:83
>     #10 0x77996084 in std::_Destroy_aux<false>::__destroy<osg
Re: [osg-users] osg(terrain) krasches on a double delete
Here is some more info on how to reproduce our crash. 1. Compile the attached osgterrain program by overwriting examples/osgterrain/osgterrain.cpp and recompile. 2. Run the program with http://www.openscenegraph.org/data/earth_bayarea/earth.ive as input. 3. Wait (no further input is necessary). However, camera movement (tile loading) produces earlier crashes. The attached image shows performance of a typical run (without user input) where the frame time cyclically increases until the program crashes. Cheers, Ola ps. The output from the program is: iteration sample_ratio frame_time On Tue, 07 Aug 2012 11:59:00 +0200, Ola Nilsson o...@weatherone.tv wrote: Hi, We have been looking for a hard-to-reproduce crash in our software that seems to originate from a double delete inside osg. I have (finally) been able to reproduce the crash using a version of the osgterrain-example that _exaggerates_ the usage pattern that crashes our application. In examples/osgterrain.cpp remove return viwer.run(); an exchange it with: while(!viewer.done()) { osg::Timer_t start_tick = osg::Timer::instance()-tick(); float sr = rand() * 1.0 / RAND_MAX ; std::cerr sr; viewer.getCamera()-setLODScale(sr*10); terrain-setSampleRatio( sr ); osg::Timer_t middle_tick = osg::Timer::instance()-tick(); std::cerr osg::Timer::instance()-delta_m(start_tick, middle_tick) std::flush; viewer.frame(); std::cerr ' ' osg::Timer::instance()-delta_m(middle_tick, osg::Timer::instance()-tick()) std::endl; } return 0; When run (tested on ive earth models generated with osgdem) the frame time slowly increases, and, after a while, it warns about deleting a still referenced object and then (after arbitrary time) crashes with a glibc error. Is this usage (setLODScale + setSampleRatio) safe? If not how should these functions be called? If it's a bug, we would be _very_ happy to have it fixed or pointers about where to look in the code. 
We've previously submitted a patch that switched to a ReentrantMutex in osgTerrain/Terrain.cpp (changeset 12904); could this be a similar issue?

My system is running CentOS 6.3 (x86_64) and I compiled osg in debug mode with gcc 4.4.6. I have tested against both the 3.0.1 tag and trunk (r13106).

Since I suspect a threading issue, here is what OpenThreads/Config looks like:

    #define _OPENTHREADS_ATOMIC_USE_GCC_BUILTINS
    /* #undef _OPENTHREADS_ATOMIC_USE_MIPOSPRO_BUILTINS */
    /* #undef _OPENTHREADS_ATOMIC_USE_SUN */
    /* #undef _OPENTHREADS_ATOMIC_USE_WIN32_INTERLOCKED */
    /* #undef _OPENTHREADS_ATOMIC_USE_BSD_ATOMIC */
    /* #undef _OPENTHREADS_ATOMIC_USE_MUTEX */
    /* #undef OT_LIBRARY_STATIC */

If I set a breakpoint on the warning for deleting a still referenced object, I get the following stack trace (using the osg 3.0.1 tag):

Breakpoint 1, osg::Referenced::~Referenced (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osg/Referenced.cpp:236
236 OSG_WARN<<"Warning: deleting still referenced object "<<this<<" of type '"<<typeid(this).name()<<"'"<<std::endl;
(gdb) bt
#0 osg::Referenced::~Referenced (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osg/Referenced.cpp:236
#1 0x779e467c in osg::Object::~Object (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osg/Object.cpp:45
#2 0x779dd71d in osg::Node::~Node (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osg/Node.cpp:94
#3 0x77993953 in osg::Group::~Group (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osg/Group.cpp:53
#4 0x75df4ae8 in osgTerrain::TerrainTile::~TerrainTile (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osgTerrain/TerrainTile.cpp:95
#5 0x75df4b1e in osgTerrain::TerrainTile::~TerrainTile (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osgTerrain/TerrainTile.cpp:95
#6 0x77a131a6 in osg::Referenced::signalObserversAndDelete (this=0xaec2160, signalDelete=true, doDelete=true) at /home/ola/src/OpenSceneGraph-3.0.1/src/osg/Referenced.cpp:323
#7 0x0040b07d in osg::Referenced::unref (this=0xaec2160) at /home/ola/src/OpenSceneGraph-3.0.1/include/osg/Referenced:198
#8 0x0040cd3b in osg::ref_ptr<osg::Node>::~ref_ptr (this=0x478d410, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/include/osg/ref_ptr:35
#9 0x7799627e in std::_Destroy<osg::ref_ptr<osg::Node> > (__pointer=0x478d410) at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_construct.h:83
#10 0x77996084 in std::_Destroy_aux<false>::__destroy<osg::ref_ptr<osg::Node>*> (__first=0x478d410, __last=0x478d418) at /usr/lib/gcc
[osg-users] osg(terrain) krasches on a double delete
Hi,

We have been looking for a hard-to-reproduce crash in our software that seems to originate from a double delete inside osg. I have (finally) been able to reproduce the crash using a version of the osgterrain example that _exaggerates_ the usage pattern that crashes our application.

In examples/osgterrain/osgterrain.cpp, remove

    return viewer.run();

and exchange it with:

    while(!viewer.done())
    {
        osg::Timer_t start_tick = osg::Timer::instance()->tick();

        // pick a new random sample ratio in [0,1] every frame
        float sr = rand() * 1.0 / RAND_MAX;
        std::cerr << sr;
        viewer.getCamera()->setLODScale(sr*10);
        terrain->setSampleRatio( sr );

        // time spent in the two setter calls (milliseconds)
        osg::Timer_t middle_tick = osg::Timer::instance()->tick();
        std::cerr << osg::Timer::instance()->delta_m(start_tick, middle_tick) << std::flush;

        viewer.frame();

        // time spent rendering the frame (milliseconds)
        std::cerr << ' ' << osg::Timer::instance()->delta_m(middle_tick, osg::Timer::instance()->tick()) << std::endl;
    }
    return 0;

When run (tested on .ive earth models generated with osgdem), the frame time slowly increases and, after a while, the program warns about deleting a still referenced object and then (after an arbitrary time) crashes with a glibc error.

Is this usage (setLODScale + setSampleRatio) safe? If not, how should these functions be called? If it's a bug, we would be _very_ happy to have it fixed, or to get pointers about where to look in the code.

We've previously submitted a patch that switched to a ReentrantMutex in osgTerrain/Terrain.cpp (changeset 12904); could this be a similar issue?

My system is running CentOS 6.3 (x86_64) and I compiled osg in debug mode with gcc 4.4.6. I have tested against both the 3.0.1 tag and trunk (r13106).
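As an illustration of how a "deleting still referenced object" warning can arise with intrusive reference counting, here is a minimal single-threaded sketch. It is not OSG code: Counted, Tile, the duringDestruction hook and the three helper functions are hypothetical names made up for this example, and the hook merely stands in for a second thread that still holds a raw pointer to a tile while that tile is being destroyed. The guarded variant follows the general idea of taking a reference, checking that the count is greater than one, and releasing without deleting; whether this matches the actual cause in osgTerrain is an assumption.

```cpp
#include <atomic>
#include <cassert>
#include <cstdio>
#include <functional>

// Hypothetical stand-in for an intrusive reference-counted base class
// (NOT the real osg::Referenced).
class Counted {
public:
    void ref() { ++count_; }
    // Drop a reference and delete the object when the count reaches zero.
    void unref() { if (--count_ == 0) delete this; }
    // Drop a reference without ever deleting the object.
    void unrefNoDelete() { --count_; }
    int referenceCount() const { return count_.load(); }

    static int warnings;  // how many objects were deleted while still referenced

protected:
    virtual ~Counted() {
        if (count_.load() > 0) {
            ++warnings;
            std::fprintf(stderr, "Warning: deleting still referenced object %p\n",
                         static_cast<void*>(this));
        }
    }

private:
    std::atomic<int> count_{0};
};

int Counted::warnings = 0;

class Tile : public Counted {
public:
    // Hypothetical hook: runs in the middle of destruction and simulates
    // another thread that still has this tile's raw pointer in its update set.
    std::function<void(Tile*)> duringDestruction;

protected:
    ~Tile() override {
        if (duringDestruction) duringDestruction(this);
    }
};

// Unguarded: the "other thread" just takes a reference to the dying tile.
int warningsWithoutGuard() {
    Counted::warnings = 0;
    Tile* tile = new Tile;
    tile->ref();                                          // single owner, count 1
    tile->duringDestruction = [](Tile* t) { t->ref(); };  // bumps count 0 -> 1
    tile->unref();                                        // count 0, destruction starts
    return Counted::warnings;
}

// Guarded: take a reference, use the tile only if someone else still holds
// one (count > 1), then release without deleting.
int warningsWithGuard(bool* usable) {
    Counted::warnings = 0;
    Tile* tile = new Tile;
    tile->ref();
    tile->duringDestruction = [usable](Tile* t) {
        t->ref();
        *usable = t->referenceCount() > 1;  // false: only our temporary reference
        t->unrefNoDelete();                 // back to 0, no second delete
    };
    tile->unref();
    return Counted::warnings;
}

// A tile that is still owned elsewhere passes the same check.
bool liveTileIsUsable() {
    Tile* tile = new Tile;
    tile->ref();                                 // owner's reference
    tile->ref();                                 // updater's temporary reference
    bool usable = tile->referenceCount() > 1;
    tile->unrefNoDelete();                       // updater releases
    tile->unref();                               // owner releases, tile is deleted
    return usable;
}
```

Calling warningsWithoutGuard() from a main() triggers the warning once, while warningsWithGuard() completes without it and reports the dying tile as not usable. In a real multi-threaded program the same interleaving would depend on timing, which would be consistent with the crash being hard to reproduce.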
Since I suspect a threading issue, here is what OpenThreads/Config looks like:

    #define _OPENTHREADS_ATOMIC_USE_GCC_BUILTINS
    /* #undef _OPENTHREADS_ATOMIC_USE_MIPOSPRO_BUILTINS */
    /* #undef _OPENTHREADS_ATOMIC_USE_SUN */
    /* #undef _OPENTHREADS_ATOMIC_USE_WIN32_INTERLOCKED */
    /* #undef _OPENTHREADS_ATOMIC_USE_BSD_ATOMIC */
    /* #undef _OPENTHREADS_ATOMIC_USE_MUTEX */
    /* #undef OT_LIBRARY_STATIC */

If I set a breakpoint on the warning for deleting a still referenced object, I get the following stack trace (using the osg 3.0.1 tag):

Breakpoint 1, osg::Referenced::~Referenced (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osg/Referenced.cpp:236
236 OSG_WARN<<"Warning: deleting still referenced object "<<this<<" of type '"<<typeid(this).name()<<"'"<<std::endl;
(gdb) bt
#0 osg::Referenced::~Referenced (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osg/Referenced.cpp:236
#1 0x779e467c in osg::Object::~Object (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osg/Object.cpp:45
#2 0x779dd71d in osg::Node::~Node (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osg/Node.cpp:94
#3 0x77993953 in osg::Group::~Group (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osg/Group.cpp:53
#4 0x75df4ae8 in osgTerrain::TerrainTile::~TerrainTile (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osgTerrain/TerrainTile.cpp:95
#5 0x75df4b1e in osgTerrain::TerrainTile::~TerrainTile (this=0xaec2160, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/src/osgTerrain/TerrainTile.cpp:95
#6 0x77a131a6 in osg::Referenced::signalObserversAndDelete (this=0xaec2160, signalDelete=true, doDelete=true) at /home/ola/src/OpenSceneGraph-3.0.1/src/osg/Referenced.cpp:323
#7 0x0040b07d in osg::Referenced::unref (this=0xaec2160) at /home/ola/src/OpenSceneGraph-3.0.1/include/osg/Referenced:198
#8 0x0040cd3b in osg::ref_ptr<osg::Node>::~ref_ptr (this=0x478d410, __in_chrg=<value optimized out>) at /home/ola/src/OpenSceneGraph-3.0.1/include/osg/ref_ptr:35
#9 0x7799627e in std::_Destroy<osg::ref_ptr<osg::Node> > (__pointer=0x478d410) at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_construct.h:83
#10 0x77996084 in std::_Destroy_aux<false>::__destroy<osg::ref_ptr<osg::Node>*> (__first=0x478d410, __last=0x478d418) at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_construct.h:93
#11 0x77995dcf in std::_Destroy<osg::ref_ptr<osg::Node>*> (__first=0x478d410, __last=0x478d418) at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_construct.h:116
#12 0x77995847 in std::_Destroy<osg::ref_ptr<osg::Node>*, osg::ref_ptr<osg::Node> > (__first=0x478d410, __last=0x478d418) at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_construct.h:142
#13 0x77994e56 in std::vector<osg::ref_ptr<osg::Node>, std::allocator<osg::ref_ptr<osg::Node> > >::~vector (this=0xaec1fd8, __in_chrg=<value optimized out>) at