Re: [osg-users] osg(terrain) krasches on a double delete

2012-09-05 Thread Ola Nilsson

Hi Robert,

Good. This also works for me in the osgterrain-crash-program.

I've updated our patches to match this fix, and our internal tests also  
pass.


Thanks for taking the time and sharing this great software!

Ola




On Fri, 31 Aug 2012 18:03:33 +0200, Robert Osfield  
robert.osfi...@gmail.com wrote:



Hi Ola, Brad et al.,

I have just checked in a fix, similar to Ola's suggested fix. With
this fix I can't get Ola's modified osgterrain to crash, and when the
code had debugging code in place it showed that it was detecting the
problem TerrainTiles that were being deleted and handling them
correctly.  The code block of interest is now:

    {
        OpenThreads::ScopedLock<OpenThreads::ReentrantMutex> lock(_mutex);
        for(TerrainTileSet::iterator itr = _updateTerrainTileSet.begin();
            itr != _updateTerrainTileSet.end();
            ++itr)
        {
            // take a reference first to make sure that the referenceCount
            // can be safely read without another thread decrementing it
            // to zero.
            (*itr)->ref();

            // only if referenceCount is 2 or more, indicating there is
            // still a reference held elsewhere, is it safe to add it to
            // the list of tiles to be updated
            if ((*itr)->referenceCount()>1) tiles.push_back(*itr);

            // use unref_nodelete to avoid any issues when the *itr
            // TerrainTile has been deleted by another thread while this
            // for loop has been running.
            (*itr)->unref_nodelete();
        }
        _updateTerrainTileSet.clear();
    }
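
For readers without an OSG build at hand, the ref / check / unref_nodelete sequence above can be tried in isolation. The following is only a minimal sketch under stated assumptions: Tile is a toy stand-in for osg::Referenced (not the OSG class), collectLiveTiles is a hypothetical name, and the result is a vector of raw pointers rather than whatever container `tiles` is in Terrain.cpp.

```cpp
#include <atomic>
#include <cassert>
#include <set>
#include <vector>

// Toy stand-in for osg::Referenced (assumption: only the three calls used
// in the loop above are modelled, and nothing is ever deleted here).
class Tile {
public:
    void ref() { ++_count; }
    // Drops a reference without deleting the object, even at zero.
    void unref_nodelete() { --_count; }
    int referenceCount() const { return _count.load(); }
private:
    std::atomic<int> _count{0};
};

// Hypothetical helper mirroring the loop: keep only tiles that are still
// referenced by someone other than this function, then clear the set.
std::vector<Tile*> collectLiveTiles(std::set<Tile*>& updateSet) {
    std::vector<Tile*> tiles;
    for (Tile* tile : updateSet) {
        tile->ref();                       // take our own reference first
        if (tile->referenceCount() > 1) {  // someone else still holds one
            tiles.push_back(tile);
        }
        tile->unref_nodelete();            // give ours back, never delete
    }
    updateSet.clear();
    return tiles;
}
```

A tile whose count was already zero (its destructor is running elsewhere) reaches a count of 1, fails the test, and drops back to zero without a second delete being triggered, which is the case the fix is aimed at.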

Could you please try out the svn/trunk version of the OSG and let me
know how you get on,
Robert.
___
osg-users mailing list
osg-users@lists.openscenegraph.org
http://lists.openscenegraph.org/listinfo.cgi/osg-users-openscenegraph.org



--
Using Opera's revolutionary email client: http://www.opera.com/mail/


Re: [osg-users] osg(terrain) krasches on a double delete

2012-08-27 Thread Ola Nilsson

Hi Robert et al.,

I finally got the time to investigate the code (osgTerrain/Terrain.cpp).

As I see it, it is not necessary to introduce a new data structure to hold
the list of update tiles. The issue was that a TerrainTile could be partly
destructed but not yet unregistered (via unregisterTile) when a
Terrain::update call was made. Then, a TerrainTile could get its refcount
bumped while being destructed and be deleted, leaving a dangling ref_ptr
to deleted memory.


My solution is inspired by the code in ObserverSet::addRefLock and only  
modifies Terrain::traverse. I bump the raw Terrain-pointers in the update  
list to ref_ptr and check their status before doing any work on them.  
TerrainTiles with a refcount of 1 should be discarded (but not deleted).


I have attached the complete file. Should this conversation be continued  
at osg-submissions also/instead?


Cheers,

Ola


Hi Ola et al.,

I've looked at the problem Terrain containers and they present an
interesting issue - the std::set<TerrainTile*> that is used can't
easily be converted into a std::set< osg::observer_ptr<TerrainTile> >
as the observer_ptr can have its value changed to NULL by another
thread when destructing the observed TerrainTile, and std::set
requires its values to be const.

What will be required is a custom container of osg::observer_ptr
along the lines of osg::ObserverNodePath but written to manage its
contents in a similar way to a std::set.

Robert.
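
The constraint described above (a std::set key must not change in a way that alters its ordering) can be illustrated with the standard library alone. This is an analogy, not OSG code: std::weak_ptr plays the role of osg::observer_ptr, Tile is a toy struct, and pruneExpired is a hypothetical helper. Ordering with std::owner_less compares control blocks rather than the observed pointer, so an entry keeps its position even after the observed object is gone. Whether a comparable ordering would suit an observer_ptr container in OSG is not something this thread establishes.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <set>

struct Tile { int id; };  // toy type, not osgTerrain::TerrainTile

// Owner-based ordering stays stable when the observed object expires,
// unlike an ordering based on the (possibly nulled) pointer value.
using WeakTileSet =
    std::set<std::weak_ptr<Tile>, std::owner_less<std::weak_ptr<Tile>>>;

// Hypothetical helper: erase expired observers, return the live count.
std::size_t pruneExpired(WeakTileSet& tiles) {
    for (WeakTileSet::iterator it = tiles.begin(); it != tiles.end();) {
        if (it->expired()) {
            it = tiles.erase(it);
        } else {
            ++it;
        }
    }
    return tiles.size();
}
```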

Terrain.cpp
Description: Binary data


Re: [osg-users] osg(terrain) krasches on a double delete

2012-08-27 Thread Ola Nilsson

Hi Robert,

Hm, there is probably something I don't see. I am new to the code...

I assumed that a TerrainTile cannot be _completely_ deleted in the  
Terrain::update call, because if it were it _must_ have passed through  
Terrain::unregisterTile and then it wouldn't be in the update set to begin  
with. Therefore it is still safe to investigate the raw pointers in the  
_updateTerrainTileSet in Terrain::update.


Do you agree on the assumption? If not can you exemplify when it doesn't  
hold?


Cheers,

Ola



On Mon, 27 Aug 2012 15:49:47 +0200, Robert Osfield  
robert.osfi...@gmail.com wrote:



Hi Ola,

I don't think your suggested change will help as you are taking a
reference to a potentially deleted object; the Terrain::_mutex that is
locked doesn't affect the children, just the management of the
Terrain::_terrainTileMap and _updateTerrainTileSet.

My current thought is that we'll need to introduce a form of set that
manages a list of observer_ptr much in the same way that the
ObserverNodePath is managed, so we'll need an equivalent
ObserverNodeSet class.

Robert.



Re: [osg-users] osg(terrain) krasches on a double delete

2012-08-27 Thread Ola Nilsson
On Mon, 27 Aug 2012 16:40:39 +0200, Robert Osfield  
robert.osfi...@gmail.com wrote:



Hi Ola,

On 27 August 2012 15:23, Ola Nilsson o...@weatherone.tv wrote:

Hm, there is probably something I don't see. I am new to the code...

I assumed that a TerrainTile cannot be _completely_ deleted in the
Terrain::update call, because if it were it _must_ have passed through
Terrain::unregisterTile and then it wouldn't be in the update set to  
begin

with.


unregisterTile only gets called in the TerrainTile destructor after
its ref count has gone to zero.  It'd be safe if unregisterTile was
called before the TerrainTile object's ref count goes to zero.


Yes, the ref count is zero but the destructor chain has not yet completed;
by induction it hasn't passed ~TerrainTile. (If it had, unregisterTile
would have been called and completed.) Right?

In this situation memory is not yet recycled and the data of the object's
parent class (Referenced) should be safe to access and even manipulate.
Right?


Ola

Btw. the new code passes my modified osgterrain test, which of course
doesn't prove anything...






Therefore it is still safe to investigate the raw pointers in the
_updateTerrainTileSet in Terrain::update.

Do you agree on the assumption? If not can you exemplify when it doesn't
hold?


Nope it's not safe as explained above.

Robert.


Re: [osg-users] osg(terrain) krasches on a double delete

2012-08-27 Thread Ola Nilsson

Hi Robert,

On Mon, 27 Aug 2012 17:58:58 +0200, Robert Osfield  
robert.osfi...@gmail.com wrote:



Hi Ola,

On 27 August 2012 16:31, Ola Nilsson o...@weatherone.tv wrote:
Yes, the ref count is zero but the destructor chain has not yet  
completed,
by induction it hasn't passed ~TerrainTile. (If it had unregisterTile  
would

have been called and completed.) Right?


Thinking about your explanation, I think you are correct: the
Terrain::unregisterTile won't be allowed to complete because
Terrain::traverse() will be holding the Terrain::_mutex, so the
destructor won't be able to finish and have the memory fully deleted
and available to be recycled.
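
The ordering argument can be checked with a small stand-alone sketch. It is single-threaded and uses toy types (Base for osg::Referenced, Registry for Terrain, Tile for TerrainTile, all assumptions of this example), so it only shows the C++ destruction order: the derived destructor runs first, the unregister call happens inside it under the registry mutex, and the base-class data is still intact at that point. A thread already holding that mutex would therefore stall the rest of the destructor chain, which is the situation reasoned about above; the sketch does not demonstrate thread safety.

```cpp
#include <cassert>
#include <mutex>
#include <set>
#include <string>
#include <vector>

std::vector<std::string> events;  // records the order of calls

struct Base {                     // toy stand-in for osg::Referenced
    int refCount = 0;
    virtual ~Base() { events.push_back("~Base"); }
};

struct Registry {                 // toy stand-in for osgTerrain::Terrain
    std::recursive_mutex mutex;
    std::set<Base*> updateSet;

    void unregisterTile(Base* tile) {
        std::lock_guard<std::recursive_mutex> lock(mutex);
        // The Base subobject has not been destroyed yet, so its data
        // can still be read here.
        events.push_back("unregister refCount=" +
                         std::to_string(tile->refCount));
        updateSet.erase(tile);
    }
};

struct Tile : Base {              // toy stand-in for TerrainTile
    explicit Tile(Registry& r) : registry(r) {
        registry.updateSet.insert(this);
    }
    ~Tile() override {
        events.push_back("~Tile");
        registry.unregisterTile(this);  // would block if mutex were held
    }
    Registry& registry;
};
```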

In this situation memory is not yet recycled and the object's parents  
data

(Referenced) should be safe to access and even manipulate. Right?


Safe... possible, it still doesn't feel robust though.



Agree. If someone registered a tile that wasn't ref counted it wouldn't  
work. There are some assumptions on usage, but I think most of them apply  
with the current code.


Please advise if you have any specific scenarios that should be respected.


Btw. the new code passes my modified osgterrain test, which of course
doesn't prove anything...


It's an encouraging start though.  I do wonder if in your changes
there is a need to take a ref_ptr and then check for a
referenceCount() of 1, as if the ref count is zero to start with then
just checking against 0 without the ref_ptr should be sufficient a
test to see if the object has been deleted.


That was also my initial version. But thinking about worst case I realized  
that there is a possibility that between a call to getReferenceCount() and  
the call to ref() or ref_ptr constructor another thread might drop the  
reference count and trigger destruction. Unlikely, but I think possible.  
By using the ref_ptr the operations should be atomic and safe.


Cheers,

Ola



Robert.


Re: [osg-users] osg(terrain) krasches on a double delete

2012-08-22 Thread Ola Nilsson

Hi Robert,

Ok, I see the problem. If I get the time later today I will look into the  
code in depth.


/Ola


On Tue, 21 Aug 2012 20:43:44 +0200, Robert Osfield  
robert.osfi...@gmail.com wrote:



Hi Ola et al.,

I've looked at the problem Terrain containers and they present an
interesting issue - the std::set<TerrainTile*> that is used can't
easily be converted into a std::set< osg::observer_ptr<TerrainTile> >
as the observer_ptr can have its value changed to NULL by another
thread when destructing the observed TerrainTile, and std::set
requires its values to be const.

What will be required is a custom container of osg::observer_ptr
along the lines of osg::ObserverNodePath but written to manage its
contents in a similar way to a std::set.

Robert.


Re: [osg-users] osg(terrain) krasches on a double delete

2012-08-21 Thread Ola Nilsson
Interesting. Did you notice any slowdown? In my experience, even if the  
crash doesn't happen for a long time, a significant slowdown is almost  
immediate.


Ola

On Tue, 21 Aug 2012 07:41:27 +0200, Christiansen, Brad  
brad.christian...@thalesgroup.com.au wrote:



Hi,

As another data point for you, I tried to reproduce the crash on a Win 7  
machine (VS2010) using a trunk build from a few weeks ago, and couldn't  
reproduce the crash. I left the example running for about 10 minutes  
with the camera spinning.


Cheers,
Brad


Re: [osg-users] osg(terrain) krasches on a double delete

2012-08-20 Thread Ola Nilsson

Bump. I would really appreciate some info on this problem.

The double delete originates from the DatabasePager, but the destruction
of these objects should be safe (ref counted as well as thread-safe),
shouldn't it? I would be grateful if someone could take a look at the
stack trace (previous post, below) and see if there is anything suspicious
going on.


Cheers,

Ola


On Fri, 10 Aug 2012 11:01:49 +0200, Ola Nilsson o...@weatherone.tv wrote:


I have now also reproduced the crash on darwin/osx in i386 mode.

Cheers,

Ola


On Thu, 09 Aug 2012 17:41:12 +0200, Ola Nilsson o...@weatherone.tv  
wrote:



Here is some more info on how to reproduce our crash.

1. Compile the attached osgterrain program by overwriting
examples/osgterrain/osgterrain.cpp and recompile.
2. Run the program with
http://www.openscenegraph.org/data/earth_bayarea/earth.ive as input.
3. Wait (no further input is necessary). However, camera movement (tile
loading) produces earlier crashes. The attached image shows performance  
of

a typical run (without user input) where the frame time cyclically
increases until the program crashes.

Cheers,

Ola

ps. The output from the program is: iteration sample_ratio frame_time



On Tue, 07 Aug 2012 11:59:00 +0200, Ola Nilsson o...@weatherone.tv  
wrote:



Hi,

We have been looking for a hard-to-reproduce crash in our software that
seems to originate from a double delete inside osg. I have (finally)
been able to reproduce the crash using a version of the
osgterrain-example that _exaggerates_ the usage pattern that crashes  
our

application.

In examples/osgterrain/osgterrain.cpp remove "return viewer.run();" and
replace it with:

    while(!viewer.done())
    {
        osg::Timer_t start_tick = osg::Timer::instance()->tick();
        float sr = rand() * 1.0 / RAND_MAX;
        std::cerr << sr;

        viewer.getCamera()->setLODScale(sr*10);
        terrain->setSampleRatio( sr );

        osg::Timer_t middle_tick = osg::Timer::instance()->tick();

        std::cerr << osg::Timer::instance()->delta_m(start_tick, middle_tick)
                  << std::flush;

        viewer.frame();
        std::cerr << ' ' << osg::Timer::instance()->delta_m(middle_tick,
            osg::Timer::instance()->tick()) << std::endl;
    }
    return 0;

When run (tested on ive earth models generated with osgdem) the frame
time slowly increases, and, after a while, it warns about deleting a
still referenced object and then (after arbitrary time) crashes with a
glibc error.

Is this usage (setLODScale + setSampleRatio) safe? If not how should
these functions be called?

If it's a bug, we would be _very_ happy to have it fixed or pointers
about where to look in the code. We've previously submitted a patch  
that

switched to a ReentrantMutex in osgTerrain/Terrain.cpp (changeset
12904), could this be a similar issue?

My system is running Centos 6.3 (x86_64) and I compiled osg in debug
mode with gcc 4.4.6. I have tested both against the 3.0.1 tag and trunk
(r13106).

Since I suspect a threading issue; OpenThreads/Config looks like this:
#define _OPENTHREADS_ATOMIC_USE_GCC_BUILTINS
/* #undef _OPENTHREADS_ATOMIC_USE_MIPOSPRO_BUILTINS */
/* #undef _OPENTHREADS_ATOMIC_USE_SUN */
/* #undef _OPENTHREADS_ATOMIC_USE_WIN32_INTERLOCKED */
/* #undef _OPENTHREADS_ATOMIC_USE_BSD_ATOMIC */
/* #undef _OPENTHREADS_ATOMIC_USE_MUTEX */
/* #undef OT_LIBRARY_STATIC */

If I set a break point in the warning for deleting still referenced I
get the following stack trace (using the osg 3.0.1 tag):

Breakpoint 1, osg::Referenced::~Referenced (this=0xaec2160,
__in_chrg=<value optimized out>) at
/home/ola/src/OpenSceneGraph-3.0.1/src/osg/Referenced.cpp:236
236 OSG_WARN<<"Warning: deleting still referenced object "<<this<<" of type '"<<typeid(this).name()<<"'"<<std::endl;
(gdb) bt
#0  osg::Referenced::~Referenced (this=0xaec2160, __in_chrg=<value
optimized out>) at
/home/ola/src/OpenSceneGraph-3.0.1/src/osg/Referenced.cpp:236
#1  0x779e467c in osg::Object::~Object (this=0xaec2160,
__in_chrg=<value optimized out>) at
/home/ola/src/OpenSceneGraph-3.0.1/src/osg/Object.cpp:45
#2  0x779dd71d in osg::Node::~Node (this=0xaec2160,
__in_chrg=<value optimized out>) at
/home/ola/src/OpenSceneGraph-3.0.1/src/osg/Node.cpp:94
#3  0x77993953 in osg::Group::~Group (this=0xaec2160,
__in_chrg=<value optimized out>) at
/home/ola/src/OpenSceneGraph-3.0.1/src/osg/Group.cpp:53
#4  0x75df4ae8 in osgTerrain::TerrainTile::~TerrainTile
(this=0xaec2160, __in_chrg=<value optimized out>) at
/home/ola/src/OpenSceneGraph-3.0.1/src/osgTerrain/TerrainTile.cpp:95
#5  0x75df4b1e in osgTerrain::TerrainTile::~TerrainTile
(this=0xaec2160, __in_chrg=<value optimized out>) at
/home/ola/src/OpenSceneGraph-3.0.1/src/osgTerrain/TerrainTile.cpp:95
#6  0x77a131a6 in osg::Referenced::signalObserversAndDelete
(this=0xaec2160, signalDelete=true, doDelete=true) at
/home/ola/src/OpenSceneGraph-3.0.1/src/osg/Referenced.cpp:323
#7  0x0040b07d in osg::Referenced::unref (this=0xaec2160) at
/home/ola/src

Re: [osg-users] osg(terrain) krasches on a double delete

2012-08-20 Thread Ola Nilsson
Hi,

On 20 aug 2012, at 17.47, Robert Osfield wrote:

 Hi Ola,
 
 On 20 August 2012 16:29, Ola Nilsson o...@weatherone.tv wrote:
 The double delete originates from the DatabasePager, but the destruction of
 these objects should be safe(ref counted as well as threadsafe) shouldn't
 it? I would be grateful if someone could take a look at the stacktrace (prev
 post, below) and see if there is anything suspicious going on.
 
 I have just tried your modified osgterrain and it crashes for me with
 a seg fault during the update traversal.  I am just doing a debug
 build to see if this reveals anything more useful.
 

Great news!

 While the usage in your modified osgterrain shouldn't cause a crash, it's
 certainly not the way I would recommend anyone use osgTerrain - the
 setSampleRatio() feature is meant to be set once in an application
 rather than thrashed continuously, as changing the sample ratio forces a
 rebuild of the terrain geometry.  The Camera::setLODScale() is however
 a lightweight operation without any performance or stability
 consequences and can be changed every frame without problem.
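
One way an application could avoid re-applying a sample ratio needlessly is to forward the value only when it actually changes. The sketch below is purely illustrative: SampleRatioGate is a hypothetical helper, the callback stands in for a call such as terrain->setSampleRatio(ratio), and the epsilon of 1e-3 is an arbitrary assumption. It says nothing about what setSampleRatio() itself checks internally.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <utility>

// Hypothetical helper: forwards a new ratio only if it differs from the
// last forwarded value by at least epsilon.
class SampleRatioGate {
public:
    explicit SampleRatioGate(std::function<void(float)> apply,
                             float epsilon = 1e-3f)
        : _apply(std::move(apply)), _epsilon(epsilon) {}

    // Returns true when the value was forwarded to the callback.
    bool set(float ratio) {
        if (_hasValue && std::fabs(ratio - _last) < _epsilon) {
            return false;  // unchanged: skip the expensive rebuild
        }
        _last = ratio;
        _hasValue = true;
        _apply(ratio);
        return true;
    }

private:
    std::function<void(float)> _apply;
    float _epsilon;
    float _last = 0.0f;
    bool _hasValue = false;
};
```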
 
 The question has to be why you are changing the Terrain SampleRatio.
 

As I tried to convey in my first post, the sample program just exaggerates a 
usage scenario that _very_ seldom crashed our app. We obviously do not change 
sampleratio per frame call in our app. It can however be set per flight in a 
project. We will revise this usage.

Ola

 Robert.


Re: [osg-users] osg(terrain) krasches on a double delete

2012-08-20 Thread Ola Nilsson
Hi Robert,

This explains the behavior we have experienced, the crashes we observed were 
notoriously difficult to reproduce in a debug environment (gdb, valgrind, 
etc.). 

Thanks for taking your time to investigate this!

Ola 



On 20 aug 2012, at 18.41, Robert Osfield wrote:

 Hi Ola,
 
 I've been doing a code review of Terrain and TerrainTile looking at
 how deletion of TerrainTile will affect the internal TerrainTile
 pointers held in Terrain that track all the tiles.  These internal
 TerrainTile pointers in Terrain are protected by a reentrant mutex but
 I believe a problem can occur when the TerrainTile destructor calls
 Terrain::unregisterTerrainTile(TerrainTile*) at the same time as the
 Terrain::traverse(NodeVisitor) method is taking a copy of the
 _updateTerrainTileSet. When this occurs the update traversal would
 increment the ref count on a TerrainTile being destructed by the
 DatabasePager thread that is cleaning up expired tiles.
 
 This issue boils down to an attempt to make a thread-safe list of
 TerrainTile observers in Terrain that doesn't quite achieve what it
 intends.  I haven't thought deeply enough about the issue yet to know
 what the best solution will be.  My head isn't quite in the zone that
 will allow me to solve this one so I'll have to come back to it at a
 later date.  A short-term solution would be to set the pager so that it
 doesn't delete objects in a background pager thread via
 DatabasePager::setDeleteRemovedSubgraphsInDatabaseThread(false). This
 won't fix the thread safety issue but will at least prevent the
 particular instance where the update traversal is occurring at the
 same time as the deletion of TerrainTile.
 
 Robert.


Re: [osg-users] osg(terrain) krasches on a double delete

2012-08-10 Thread Ola Nilsson

I have now also reproduced the crash on darwin/osx in i386 mode.

Cheers,

Ola


On Thu, 09 Aug 2012 17:41:12 +0200, Ola Nilsson o...@weatherone.tv wrote:


Here is some more info on how to reproduce our crash.

1. Compile the attached osgterrain program by overwriting
examples/osgterrain/osgterrain.cpp and recompile.
2. Run the program with
http://www.openscenegraph.org/data/earth_bayarea/earth.ive as input.
3. Wait (no further input is necessary). However, camera movement (tile
loading) produces earlier crashes. The attached image shows performance  
of

a typical run (without user input) where the frame time cyclically
increases until the program crashes.

Cheers,

Ola

ps. The output from the program is: iteration sample_ratio frame_time



On Tue, 07 Aug 2012 11:59:00 +0200, Ola Nilsson o...@weatherone.tv  
wrote:



Hi,

We have been looking for a hard-to-reproduce crash in our software that
seems to originate from a double delete inside osg. I have (finally)
been able to reproduce the crash using a version of the
osgterrain-example that _exaggerates_ the usage pattern that crashes our
application.

In examples/osgterrain/osgterrain.cpp remove "return viewer.run();" and
replace it with:

    while(!viewer.done())
    {
        osg::Timer_t start_tick = osg::Timer::instance()->tick();
        float sr = rand() * 1.0 / RAND_MAX;
        std::cerr << sr;

        viewer.getCamera()->setLODScale(sr*10);
        terrain->setSampleRatio( sr );

        osg::Timer_t middle_tick = osg::Timer::instance()->tick();

        std::cerr << osg::Timer::instance()->delta_m(start_tick, middle_tick)
                  << std::flush;

        viewer.frame();
        std::cerr << ' ' << osg::Timer::instance()->delta_m(middle_tick,
            osg::Timer::instance()->tick()) << std::endl;
    }
    return 0;

When run (tested on ive earth models generated with osgdem) the frame
time slowly increases, and, after a while, it warns about deleting a
still referenced object and then (after arbitrary time) crashes with a
glibc error.

Is this usage (setLODScale + setSampleRatio) safe? If not how should
these functions be called?

If it's a bug, we would be _very_ happy to have it fixed or pointers
about where to look in the code. We've previously submitted a patch that
switched to a ReentrantMutex in osgTerrain/Terrain.cpp (changeset
12904), could this be a similar issue?

My system is running Centos 6.3 (x86_64) and I compiled osg in debug
mode with gcc 4.4.6. I have tested both against the 3.0.1 tag and trunk
(r13106).

Since I suspect a threading issue; OpenThreads/Config looks like this:
#define _OPENTHREADS_ATOMIC_USE_GCC_BUILTINS
/* #undef _OPENTHREADS_ATOMIC_USE_MIPOSPRO_BUILTINS */
/* #undef _OPENTHREADS_ATOMIC_USE_SUN */
/* #undef _OPENTHREADS_ATOMIC_USE_WIN32_INTERLOCKED */
/* #undef _OPENTHREADS_ATOMIC_USE_BSD_ATOMIC */
/* #undef _OPENTHREADS_ATOMIC_USE_MUTEX */
/* #undef OT_LIBRARY_STATIC */

If I set a break point in the warning for deleting still referenced I
get the following stack trace (using the osg 3.0.1 tag):

Breakpoint 1, osg::Referenced::~Referenced (this=0xaec2160,
__in_chrg=<value optimized out>) at
/home/ola/src/OpenSceneGraph-3.0.1/src/osg/Referenced.cpp:236
236 OSG_WARN<<"Warning: deleting still referenced object "<<this<<" of type '"<<typeid(this).name()<<"'"<<std::endl;
(gdb) bt
#0  osg::Referenced::~Referenced (this=0xaec2160, __in_chrg=<value
optimized out>) at
/home/ola/src/OpenSceneGraph-3.0.1/src/osg/Referenced.cpp:236
#1  0x779e467c in osg::Object::~Object (this=0xaec2160,
__in_chrg=<value optimized out>) at
/home/ola/src/OpenSceneGraph-3.0.1/src/osg/Object.cpp:45
#2  0x779dd71d in osg::Node::~Node (this=0xaec2160,
__in_chrg=<value optimized out>) at
/home/ola/src/OpenSceneGraph-3.0.1/src/osg/Node.cpp:94
#3  0x77993953 in osg::Group::~Group (this=0xaec2160,
__in_chrg=<value optimized out>) at
/home/ola/src/OpenSceneGraph-3.0.1/src/osg/Group.cpp:53
#4  0x75df4ae8 in osgTerrain::TerrainTile::~TerrainTile
(this=0xaec2160, __in_chrg=<value optimized out>) at
/home/ola/src/OpenSceneGraph-3.0.1/src/osgTerrain/TerrainTile.cpp:95
#5  0x75df4b1e in osgTerrain::TerrainTile::~TerrainTile
(this=0xaec2160, __in_chrg=<value optimized out>) at
/home/ola/src/OpenSceneGraph-3.0.1/src/osgTerrain/TerrainTile.cpp:95
#6  0x77a131a6 in osg::Referenced::signalObserversAndDelete
(this=0xaec2160, signalDelete=true, doDelete=true) at
/home/ola/src/OpenSceneGraph-3.0.1/src/osg/Referenced.cpp:323
#7  0x0040b07d in osg::Referenced::unref (this=0xaec2160) at
/home/ola/src/OpenSceneGraph-3.0.1/include/osg/Referenced:198
#8  0x0040cd3b in osg::ref_ptr<osg::Node>::~ref_ptr
(this=0x478d410, __in_chrg=<value optimized out>) at
/home/ola/src/OpenSceneGraph-3.0.1/include/osg/ref_ptr:35
#9  0x7799627e in std::_Destroy<osg::ref_ptr<osg::Node> >
(__pointer=0x478d410) at
/usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_construct.h:83
#10 0x77996084 in
std::_Destroy_aux<false>::__destroy<osg

Re: [osg-users] osg(terrain) krasches on a double delete

2012-08-09 Thread Ola Nilsson

Here is some more info on how to reproduce our crash.

1. Compile the attached osgterrain program by overwriting  
examples/osgterrain/osgterrain.cpp and recompile.
2. Run the program with  
http://www.openscenegraph.org/data/earth_bayarea/earth.ive as input.
3. Wait (no further input is necessary). However, camera movement (tile  
loading) produces earlier crashes. The attached image shows performance of  
a typical run (without user input) where the frame time cyclically  
increases until the program crashes.


Cheers,

Ola

ps. The output from the program is: iteration sample_ratio frame_time



On Tue, 07 Aug 2012 11:59:00 +0200, Ola Nilsson o...@weatherone.tv wrote:


Hi,

We have been looking for a hard-to-reproduce crash in our software that  
seems to originate from a double delete inside osg. I have (finally)  
been able to reproduce the crash using a version of the  
osgterrain-example that _exaggerates_ the usage pattern that crashes our  
application.


In examples/osgterrain/osgterrain.cpp, remove "return viewer.run();" and
replace it with:


    while (!viewer.done())
    {
        osg::Timer_t start_tick = osg::Timer::instance()->tick();

        // Pick a new random sample ratio in [0, 1] every frame.
        float sr = rand() * 1.0 / RAND_MAX;
        std::cerr << sr << ' ';

        viewer.getCamera()->setLODScale(sr * 10);
        terrain->setSampleRatio(sr);

        osg::Timer_t middle_tick = osg::Timer::instance()->tick();

        // Time spent changing the settings, in milliseconds.
        std::cerr << osg::Timer::instance()->delta_m(start_tick, middle_tick)
                  << std::flush;

        viewer.frame();

        // Time spent in viewer.frame(), in milliseconds.
        std::cerr << ' ' << osg::Timer::instance()->delta_m(middle_tick,
                         osg::Timer::instance()->tick()) << std::endl;
    }
    return 0;

When run (tested on .ive earth models generated with osgdem), the frame
time slowly increases. After a while the program warns about deleting a
still-referenced object and then, after an arbitrary amount of time,
crashes with a glibc error.


Is this usage (setLODScale + setSampleRatio) safe? If not, how should
these functions be called?


If it's a bug, we would be _very_ happy to have it fixed, or to get
pointers on where to look in the code. We previously submitted a patch
that switched to a ReentrantMutex in osgTerrain/Terrain.cpp (changeset
12904). Could this be a similar issue?


My system is running CentOS 6.3 (x86_64), and I compiled osg in debug
mode with gcc 4.4.6. I have tested against both the 3.0.1 tag and trunk
(r13106).


Since I suspect a threading issue, here is what OpenThreads/Config looks like:
#define _OPENTHREADS_ATOMIC_USE_GCC_BUILTINS
/* #undef _OPENTHREADS_ATOMIC_USE_MIPOSPRO_BUILTINS */
/* #undef _OPENTHREADS_ATOMIC_USE_SUN */
/* #undef _OPENTHREADS_ATOMIC_USE_WIN32_INTERLOCKED */
/* #undef _OPENTHREADS_ATOMIC_USE_BSD_ATOMIC */
/* #undef _OPENTHREADS_ATOMIC_USE_MUTEX */
/* #undef OT_LIBRARY_STATIC */

If I set a breakpoint on the warning about deleting a still-referenced
object, I get the following stack trace (using the osg 3.0.1 tag):


Breakpoint 1, osg::Referenced::~Referenced (this=0xaec2160,
__in_chrg=<value optimized out>) at
/home/ola/src/OpenSceneGraph-3.0.1/src/osg/Referenced.cpp:236
236 OSG_WARN<<"Warning: deleting still referenced object "
<<this<<" of type '"<<typeid(this).name()<<"'"<<std::endl;

(gdb) bt
#0  osg::Referenced::~Referenced (this=0xaec2160,
__in_chrg=<value optimized out>) at
/home/ola/src/OpenSceneGraph-3.0.1/src/osg/Referenced.cpp:236
#1  0x779e467c in osg::Object::~Object (this=0xaec2160,
__in_chrg=<value optimized out>) at
/home/ola/src/OpenSceneGraph-3.0.1/src/osg/Object.cpp:45
#2  0x779dd71d in osg::Node::~Node (this=0xaec2160,
__in_chrg=<value optimized out>) at
/home/ola/src/OpenSceneGraph-3.0.1/src/osg/Node.cpp:94
#3  0x77993953 in osg::Group::~Group (this=0xaec2160,
__in_chrg=<value optimized out>) at
/home/ola/src/OpenSceneGraph-3.0.1/src/osg/Group.cpp:53
#4  0x75df4ae8 in osgTerrain::TerrainTile::~TerrainTile
(this=0xaec2160, __in_chrg=<value optimized out>) at
/home/ola/src/OpenSceneGraph-3.0.1/src/osgTerrain/TerrainTile.cpp:95
#5  0x75df4b1e in osgTerrain::TerrainTile::~TerrainTile
(this=0xaec2160, __in_chrg=<value optimized out>) at
/home/ola/src/OpenSceneGraph-3.0.1/src/osgTerrain/TerrainTile.cpp:95
#6  0x77a131a6 in osg::Referenced::signalObserversAndDelete
(this=0xaec2160, signalDelete=true, doDelete=true) at
/home/ola/src/OpenSceneGraph-3.0.1/src/osg/Referenced.cpp:323
#7  0x0040b07d in osg::Referenced::unref (this=0xaec2160) at
/home/ola/src/OpenSceneGraph-3.0.1/include/osg/Referenced:198
#8  0x0040cd3b in osg::ref_ptr<osg::Node>::~ref_ptr
(this=0x478d410, __in_chrg=<value optimized out>) at
/home/ola/src/OpenSceneGraph-3.0.1/include/osg/ref_ptr:35
#9  0x7799627e in std::_Destroy<osg::ref_ptr<osg::Node> >
(__pointer=0x478d410) at
/usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_construct.h:83
#10 0x77996084 in
std::_Destroy_aux<false>::__destroy<osg::ref_ptr<osg::Node>*>
(__first=0x478d410, __last=0x478d418) at
/usr/lib/gcc

[osg-users] osg(terrain) krasches on a double delete

2012-08-07 Thread Ola Nilsson

Hi,

We have been looking for a hard-to-reproduce crash in our software that  
seems to originate from a double delete inside osg. I have (finally) been  
able to reproduce the crash using a version of the osgterrain-example that  
_exaggerates_ the usage pattern that crashes our application.


In examples/osgterrain/osgterrain.cpp, remove "return viewer.run();" and
replace it with:


    while (!viewer.done())
    {
        osg::Timer_t start_tick = osg::Timer::instance()->tick();

        // Pick a new random sample ratio in [0, 1] every frame.
        float sr = rand() * 1.0 / RAND_MAX;
        std::cerr << sr << ' ';

        viewer.getCamera()->setLODScale(sr * 10);
        terrain->setSampleRatio(sr);

        osg::Timer_t middle_tick = osg::Timer::instance()->tick();

        // Time spent changing the settings, in milliseconds.
        std::cerr << osg::Timer::instance()->delta_m(start_tick, middle_tick)
                  << std::flush;

        viewer.frame();

        // Time spent in viewer.frame(), in milliseconds.
        std::cerr << ' ' << osg::Timer::instance()->delta_m(middle_tick,
                         osg::Timer::instance()->tick()) << std::endl;
    }
    return 0;

When run (tested on .ive earth models generated with osgdem), the frame
time slowly increases. After a while the program warns about deleting a
still-referenced object and then, after an arbitrary amount of time,
crashes with a glibc error.


Is this usage (setLODScale + setSampleRatio) safe? If not, how should
these functions be called?


If it's a bug, we would be _very_ happy to have it fixed, or to get
pointers on where to look in the code. We previously submitted a patch
that switched to a ReentrantMutex in osgTerrain/Terrain.cpp (changeset
12904). Could this be a similar issue?


My system is running CentOS 6.3 (x86_64), and I compiled osg in debug mode
with gcc 4.4.6. I have tested against both the 3.0.1 tag and trunk
(r13106).


Since I suspect a threading issue, here is what OpenThreads/Config looks like:
#define _OPENTHREADS_ATOMIC_USE_GCC_BUILTINS
/* #undef _OPENTHREADS_ATOMIC_USE_MIPOSPRO_BUILTINS */
/* #undef _OPENTHREADS_ATOMIC_USE_SUN */
/* #undef _OPENTHREADS_ATOMIC_USE_WIN32_INTERLOCKED */
/* #undef _OPENTHREADS_ATOMIC_USE_BSD_ATOMIC */
/* #undef _OPENTHREADS_ATOMIC_USE_MUTEX */
/* #undef OT_LIBRARY_STATIC */

If I set a breakpoint on the warning about deleting a still-referenced
object, I get the following stack trace (using the osg 3.0.1 tag):


Breakpoint 1, osg::Referenced::~Referenced (this=0xaec2160,
__in_chrg=<value optimized out>) at
/home/ola/src/OpenSceneGraph-3.0.1/src/osg/Referenced.cpp:236
236 OSG_WARN<<"Warning: deleting still referenced object "
<<this<<" of type '"<<typeid(this).name()<<"'"<<std::endl;

(gdb) bt
#0  osg::Referenced::~Referenced (this=0xaec2160,
__in_chrg=<value optimized out>) at
/home/ola/src/OpenSceneGraph-3.0.1/src/osg/Referenced.cpp:236
#1  0x779e467c in osg::Object::~Object (this=0xaec2160,
__in_chrg=<value optimized out>) at
/home/ola/src/OpenSceneGraph-3.0.1/src/osg/Object.cpp:45
#2  0x779dd71d in osg::Node::~Node (this=0xaec2160,
__in_chrg=<value optimized out>) at
/home/ola/src/OpenSceneGraph-3.0.1/src/osg/Node.cpp:94
#3  0x77993953 in osg::Group::~Group (this=0xaec2160,
__in_chrg=<value optimized out>) at
/home/ola/src/OpenSceneGraph-3.0.1/src/osg/Group.cpp:53
#4  0x75df4ae8 in osgTerrain::TerrainTile::~TerrainTile
(this=0xaec2160, __in_chrg=<value optimized out>) at
/home/ola/src/OpenSceneGraph-3.0.1/src/osgTerrain/TerrainTile.cpp:95
#5  0x75df4b1e in osgTerrain::TerrainTile::~TerrainTile
(this=0xaec2160, __in_chrg=<value optimized out>) at
/home/ola/src/OpenSceneGraph-3.0.1/src/osgTerrain/TerrainTile.cpp:95
#6  0x77a131a6 in osg::Referenced::signalObserversAndDelete
(this=0xaec2160, signalDelete=true, doDelete=true) at
/home/ola/src/OpenSceneGraph-3.0.1/src/osg/Referenced.cpp:323
#7  0x0040b07d in osg::Referenced::unref (this=0xaec2160) at
/home/ola/src/OpenSceneGraph-3.0.1/include/osg/Referenced:198
#8  0x0040cd3b in osg::ref_ptr<osg::Node>::~ref_ptr
(this=0x478d410, __in_chrg=<value optimized out>) at
/home/ola/src/OpenSceneGraph-3.0.1/include/osg/ref_ptr:35
#9  0x7799627e in std::_Destroy<osg::ref_ptr<osg::Node> >
(__pointer=0x478d410) at
/usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_construct.h:83
#10 0x77996084 in
std::_Destroy_aux<false>::__destroy<osg::ref_ptr<osg::Node>*>
(__first=0x478d410, __last=0x478d418) at
/usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_construct.h:93
#11 0x77995dcf in std::_Destroy<osg::ref_ptr<osg::Node>*>
(__first=0x478d410, __last=0x478d418) at
/usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_construct.h:116
#12 0x77995847 in std::_Destroy<osg::ref_ptr<osg::Node>*,
osg::ref_ptr<osg::Node> > (__first=0x478d410, __last=0x478d418)
at
/usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_construct.h:142
#13 0x77994e56 in std::vector<osg::ref_ptr<osg::Node>,
std::allocator<osg::ref_ptr<osg::Node> > >::~vector (this=0xaec1fd8,
__in_chrg=<value optimized out>)
at