Re: [osg-users] Multi-threading bug related to Registry

2008-11-28 Thread Smeenk, R.J.M. (Roland)
Robert,

Note that Rick responded to your question although to the osg-users
Digest mail so it will probably not show as a follow up in your e-mail
box. See
http://lists.openscenegraph.org/pipermail/osg-users-openscenegraph.org/2
008-November/019505.html

Sorry for pushing you around, but I happen to be interested in the
outcome of this thread.

--
Roland 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf 
 Of Robert Osfield
 Sent: dinsdag 25 november 2008 15:08
 To: OpenSceneGraph Users
 Subject: Re: [osg-users] Multi-threading bug related to Registry
 
 Hi Rick,
 
 Just to confirm the issue, if you enable Registry caching of 
 imagery and use paged databases that have external image 
 files, then you get
 the below crash.   If you don't use the Registry cache what happens?
 
 How reliably does this crash occur?
 
 My hypothisis is that the object cache is being cleared by 
 another thread just at the wrong time for this particular 
 section of .ive plugin.  If it is correct then use of  
 Registry::readImage(...) that returns a Registry::ReadResult 
 rather than a Image* would be safe as ReadResult uses 
 ref_ptr internally.
 
 Robert.
 
 On Tue, Nov 25, 2008 at 1:49 PM, Rick Appleton 
 [EMAIL PROTECTED] wrote:
  Sorry for the delay on this.
 
  Here's some more information. Below is a complete callstack 
 which should be similar to when the issue happened. Basically 
 we are loading a Paged terrain database which uses IVE files 
 for data, and DDS textures for images. A quick search in OSG 
 reveals that osgDB::readImageFile is called from many plugin 
 loaders, so I suspect they would have the same issue.
 
 
 osg48-osgDBd.dll!osgDB::Registry::readImageImplementation(cons
 t 
 std::basic_stringchar,std::char_traitschar,std::allocatorc
 har   fileName=textures/TxMosaic.dds, const 
 osgDB::ReaderWriter::Options * options=0x01f0ceb8)  Line 1706C++
 osg48-osgDBd.dll!osgDB::Registry::readImage(const 
 std::basic_stringchar,std::char_traitschar,std::allocatorc
 har   fileName=textures/TxMosaic.dds, const 
 osgDB::ReaderWriter::Options * options=0x01f0ceb8)  Line 217 
 + 0x14 bytes  C++
osg48-osgDBd.dll!osgDB::readImageFile(const 
 std::basic_stringchar,std::char_traitschar,std::allocatorc
 har   filename=textures/TxMosaic.dds, const 
 osgDB::ReaderWriter::Options * options=0x01f0ceb8)  Line 43 + 
 0x1d bytes C++
 
 osgdb_ived.dll!ive::DataInputStream::readImage(std::basic_stri
ngchar,std::char_traitschar,std::allocatorchar  
filename=textures/TxMosaic.dds)  Line 950 + 0x44 bytesC++
 
 osgdb_ived.dll!ive::DataInputStream::readImage(ive::IncludeIma
 geMode mode=IMAGE_REFERENCE_FILE)  Line 1003 + 0x2c bytes C++
 osgdb_ived.dll!ive::DataInputStream::readImage()  
 Line 976 + 0xc bytes  C++
 
 osgdb_ived.dll!ive::Texture2D::read(ive::DataInputStream * 
 in=0x00dbec2c)  Line 54 + 0x8 bytes  C++
 
 osgdb_ived.dll!ive::DataInputStream::readStateAttribute()  
 Line 1159 + 0xc bytesC++
 
 osgdb_ived.dll!ive::StateSet::read(ive::DataInputStream * 
 in=0x00dbec2c)  Line 173 + 0x8 bytes  C++
 osgdb_ived.dll!ive::DataInputStream::readStateSet()  
 Line 1066  C++
 osgdb_ived.dll!ive::Node::read(ive::DataInputStream 
 * in=0x00dbec2c)  Line 124 + 0x8 bytes  C++
 osgdb_ived.dll!ive::Geode::read(ive::DataInputStream 
 * in=0x00dbec2c)  Line 67  C++
 osgdb_ived.dll!ive::DataInputStream::readNode()  
 Line 1507 + 0x2d bytes C++
 osgdb_ived.dll!ive::Group::read(ive::DataInputStream 
 * in=0x00dbec2c)  Line 74 + 0x8 bytes  C++
 osgdb_ived.dll!ive::DataInputStream::readNode()  
 Line 1499 + 0xc bytes  C++
 
 osgdb_ived.dll!ReaderWriterIVE::readNode(std::basic_istreamch
ar,std::char_traitschar   fin={...}, const 
osgDB::ReaderWriter::Options * options=0x01f0ceb8)  Line 98 + 
 0xb bytesC++
 osgdb_ived.dll!ReaderWriterIVE::readNode(const 
 std::basic_stringchar,std::char_traitschar,std::allocatorc
 har   file=veluwe.ive, const 
 osgDB::ReaderWriter::Options * options=0x01efeb88)  Line 69 + 
 0x23 bytes C++
 
 osg48-osgDBd.dll!osgDB::Registry::ReadNodeFunctor::doRead(osgD
B::ReaderWriter  rw={...})  Line 1374 + 0x40 bytes   C++
 osg48-osgDBd.dll!osgDB::Registry::read(const 
 osgDB::Registry::ReadFunctor  readFunctor={...})  Line 1527 
 + 0x22 bytes  C++
 
 osg48-osgDBd.dll!osgDB::Registry::readImplementation(const 
 osgDB::Registry::ReadFunctor  readFunctor={...}, bool 
 useObjectCache=true)  Line 1596 + 0x13 bytes  C++
 
 osg48-osgDBd.dll!osgDB::Registry::readNodeImplementation(const
  
 std::basic_stringchar,std::char_traitschar,std::allocatorc
 har   fileName=veluwe.ive, const 
 osgDB::ReaderWriter::Options * options=0x01efeb88)  Line 1814 
 + 0x57 bytes   C++
 osg48-osgDBd.dll!osgDB::Registry::readNode(const 
 std::basic_stringchar,std::char_traitschar,std::allocatorc
 har

Re: [osg-users] Multi-threading bug related to Registry

2008-11-28 Thread Robert Osfield
HI Roland,

I did read the reply, and I have already reponded.

Robert.

On Fri, Nov 28, 2008 at 8:19 AM, Smeenk, R.J.M. (Roland)
[EMAIL PROTECTED] wrote:
 Robert,

 Note that Rick responded to your question although to the osg-users
 Digest mail so it will probably not show as a follow up in your e-mail
 box. See
 http://lists.openscenegraph.org/pipermail/osg-users-openscenegraph.org/2
 008-November/019505.html

 Sorry for pushing you around, but I happen to be interested in the
 outcome of this thread.

 --
 Roland

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf
 Of Robert Osfield
 Sent: dinsdag 25 november 2008 15:08
 To: OpenSceneGraph Users
 Subject: Re: [osg-users] Multi-threading bug related to Registry

 Hi Rick,

 Just to confirm the issue, if you enable Registry caching of
 imagery and use paged databases that have external image
 files, then you get
 the below crash.   If you don't use the Registry cache what happens?

 How reliably does this crash occur?

 My hypothisis is that the object cache is being cleared by
 another thread just at the wrong time for this particular
 section of .ive plugin.  If it is correct then use of
 Registry::readImage(...) that returns a Registry::ReadResult
 rather than a Image* would be safe as ReadResult uses
 ref_ptr internally.

 Robert.

 On Tue, Nov 25, 2008 at 1:49 PM, Rick Appleton
 [EMAIL PROTECTED] wrote:
  Sorry for the delay on this.
 
  Here's some more information. Below is a complete callstack
 which should be similar to when the issue happened. Basically
 we are loading a Paged terrain database which uses IVE files
 for data, and DDS textures for images. A quick search in OSG
 reveals that osgDB::readImageFile is called from many plugin
 loaders, so I suspect they would have the same issue.
 
 
 osg48-osgDBd.dll!osgDB::Registry::readImageImplementation(cons
 t
 std::basic_stringchar,std::char_traitschar,std::allocatorc
 har   fileName=textures/TxMosaic.dds, const
 osgDB::ReaderWriter::Options * options=0x01f0ceb8)  Line 1706C++
 osg48-osgDBd.dll!osgDB::Registry::readImage(const
 std::basic_stringchar,std::char_traitschar,std::allocatorc
 har   fileName=textures/TxMosaic.dds, const
 osgDB::ReaderWriter::Options * options=0x01f0ceb8)  Line 217
 + 0x14 bytes  C++
osg48-osgDBd.dll!osgDB::readImageFile(const
 std::basic_stringchar,std::char_traitschar,std::allocatorc
 har   filename=textures/TxMosaic.dds, const
 osgDB::ReaderWriter::Options * options=0x01f0ceb8)  Line 43 +
 0x1d bytes C++
 
 osgdb_ived.dll!ive::DataInputStream::readImage(std::basic_stri
 ngchar,std::char_traitschar,std::allocatorchar  
 filename=textures/TxMosaic.dds)  Line 950 + 0x44 bytesC++
 
 osgdb_ived.dll!ive::DataInputStream::readImage(ive::IncludeIma
 geMode mode=IMAGE_REFERENCE_FILE)  Line 1003 + 0x2c bytes C++
 osgdb_ived.dll!ive::DataInputStream::readImage()
 Line 976 + 0xc bytes  C++
 
 osgdb_ived.dll!ive::Texture2D::read(ive::DataInputStream *
 in=0x00dbec2c)  Line 54 + 0x8 bytes  C++
 
 osgdb_ived.dll!ive::DataInputStream::readStateAttribute()
 Line 1159 + 0xc bytesC++
 
 osgdb_ived.dll!ive::StateSet::read(ive::DataInputStream *
 in=0x00dbec2c)  Line 173 + 0x8 bytes  C++
 osgdb_ived.dll!ive::DataInputStream::readStateSet()
 Line 1066  C++
 osgdb_ived.dll!ive::Node::read(ive::DataInputStream
 * in=0x00dbec2c)  Line 124 + 0x8 bytes  C++
 osgdb_ived.dll!ive::Geode::read(ive::DataInputStream
 * in=0x00dbec2c)  Line 67  C++
 osgdb_ived.dll!ive::DataInputStream::readNode()
 Line 1507 + 0x2d bytes C++
 osgdb_ived.dll!ive::Group::read(ive::DataInputStream
 * in=0x00dbec2c)  Line 74 + 0x8 bytes  C++
 osgdb_ived.dll!ive::DataInputStream::readNode()
 Line 1499 + 0xc bytes  C++
 
 osgdb_ived.dll!ReaderWriterIVE::readNode(std::basic_istreamch
 ar,std::char_traitschar   fin={...}, const 
 osgDB::ReaderWriter::Options * options=0x01f0ceb8)  Line 98 +
 0xb bytesC++
 osgdb_ived.dll!ReaderWriterIVE::readNode(const
 std::basic_stringchar,std::char_traitschar,std::allocatorc
 har   file=veluwe.ive, const
 osgDB::ReaderWriter::Options * options=0x01efeb88)  Line 69 +
 0x23 bytes C++
 
 osg48-osgDBd.dll!osgDB::Registry::ReadNodeFunctor::doRead(osgD
 B::ReaderWriter  rw={...})  Line 1374 + 0x40 bytes   C++
 osg48-osgDBd.dll!osgDB::Registry::read(const
 osgDB::Registry::ReadFunctor  readFunctor={...})  Line 1527
 + 0x22 bytes  C++
 
 osg48-osgDBd.dll!osgDB::Registry::readImplementation(const
 osgDB::Registry::ReadFunctor  readFunctor={...}, bool
 useObjectCache=true)  Line 1596 + 0x13 bytes  C++
 
 osg48-osgDBd.dll!osgDB::Registry::readNodeImplementation(const

 std::basic_stringchar,std::char_traitschar,std::allocatorc
 har   fileName=veluwe.ive, const
 osgDB::ReaderWriter::Options * options=0x01efeb88)  Line 1814
 + 0x57 bytes   C++
 osg48-osgDBd.dll!osgDB::Registry::readNode(const
 std::basic_stringchar,std

Re: [osg-users] Multi-threading bug related to Registry

2008-11-25 Thread Rick Appleton
Sorry for the delay on this.

Here's some more information. Below is a complete callstack which should be 
similar to when the issue happened. Basically we are loading a Paged terrain 
database which uses IVE files for data, and DDS textures for images. A quick 
search in OSG reveals that osgDB::readImageFile is called from many plugin 
loaders, so I suspect they would have the same issue.

osg48-osgDBd.dll!osgDB::Registry::readImageImplementation(const 
std::basic_stringchar,std::char_traitschar,std::allocatorchar   
fileName=textures/TxMosaic.dds, const osgDB::ReaderWriter::Options * 
options=0x01f0ceb8)  Line 1706C++
osg48-osgDBd.dll!osgDB::Registry::readImage(const 
std::basic_stringchar,std::char_traitschar,std::allocatorchar   
fileName=textures/TxMosaic.dds, const osgDB::ReaderWriter::Options * 
options=0x01f0ceb8)  Line 217 + 0x14 bytes  C++
   osg48-osgDBd.dll!osgDB::readImageFile(const 
 std::basic_stringchar,std::char_traitschar,std::allocatorchar   
 filename=textures/TxMosaic.dds, const osgDB::ReaderWriter::Options * 
 options=0x01f0ceb8)  Line 43 + 0x1d bytes C++

osgdb_ived.dll!ive::DataInputStream::readImage(std::basic_stringchar,std::char_traitschar,std::allocatorchar
  filename=textures/TxMosaic.dds)  Line 950 + 0x44 bytesC++
osgdb_ived.dll!ive::DataInputStream::readImage(ive::IncludeImageMode 
mode=IMAGE_REFERENCE_FILE)  Line 1003 + 0x2c bytes C++
osgdb_ived.dll!ive::DataInputStream::readImage()  Line 976 + 0xc bytes  
C++
osgdb_ived.dll!ive::Texture2D::read(ive::DataInputStream * 
in=0x00dbec2c)  Line 54 + 0x8 bytes  C++
osgdb_ived.dll!ive::DataInputStream::readStateAttribute()  Line 1159 + 
0xc bytesC++
osgdb_ived.dll!ive::StateSet::read(ive::DataInputStream * 
in=0x00dbec2c)  Line 173 + 0x8 bytes  C++
osgdb_ived.dll!ive::DataInputStream::readStateSet()  Line 1066  C++
osgdb_ived.dll!ive::Node::read(ive::DataInputStream * in=0x00dbec2c)  
Line 124 + 0x8 bytes  C++
osgdb_ived.dll!ive::Geode::read(ive::DataInputStream * in=0x00dbec2c)  
Line 67  C++
osgdb_ived.dll!ive::DataInputStream::readNode()  Line 1507 + 0x2d bytes 
C++
osgdb_ived.dll!ive::Group::read(ive::DataInputStream * in=0x00dbec2c)  
Line 74 + 0x8 bytes  C++
osgdb_ived.dll!ive::DataInputStream::readNode()  Line 1499 + 0xc bytes  
C++

osgdb_ived.dll!ReaderWriterIVE::readNode(std::basic_istreamchar,std::char_traitschar
   fin={...}, const osgDB::ReaderWriter::Options * options=0x01f0ceb8)  Line 
98 + 0xb bytesC++
osgdb_ived.dll!ReaderWriterIVE::readNode(const 
std::basic_stringchar,std::char_traitschar,std::allocatorchar   
file=veluwe.ive, const osgDB::ReaderWriter::Options * options=0x01efeb88)  
Line 69 + 0x23 bytes C++

osg48-osgDBd.dll!osgDB::Registry::ReadNodeFunctor::doRead(osgDB::ReaderWriter  
rw={...})  Line 1374 + 0x40 bytes   C++
osg48-osgDBd.dll!osgDB::Registry::read(const 
osgDB::Registry::ReadFunctor  readFunctor={...})  Line 1527 + 0x22 bytes  C++
osg48-osgDBd.dll!osgDB::Registry::readImplementation(const 
osgDB::Registry::ReadFunctor  readFunctor={...}, bool useObjectCache=true)  
Line 1596 + 0x13 bytes  C++
osg48-osgDBd.dll!osgDB::Registry::readNodeImplementation(const 
std::basic_stringchar,std::char_traitschar,std::allocatorchar   
fileName=veluwe.ive, const osgDB::ReaderWriter::Options * options=0x01efeb88) 
 Line 1814 + 0x57 bytes   C++
osg48-osgDBd.dll!osgDB::Registry::readNode(const 
std::basic_stringchar,std::char_traitschar,std::allocatorchar   
fileName=veluwe.ive, const osgDB::ReaderWriter::Options * options=0x01efeb88) 
 Line 232 + 0x98 bytes  C++
osg48-osgDBd.dll!osgDB::readNodeFile(const 
std::basic_stringchar,std::char_traitschar,std::allocatorchar   
filename=veluwe.ive, const osgDB::ReaderWriter::Options * options=0x01efeb88) 
 Line 69 + 0x1d bytes C++
osg48-osgDBd.dll!osgDB::readNodeFiles(osg::ArgumentParser  
arguments={...}, const osgDB::ReaderWriter::Options * options=0x01efeb88)  Line 
224 + 0x43 bytesC++
osgviewerd.exe!osgDB::readNodeFiles(osg::ArgumentParser  parser={...}) 
 Line 132 + 0x32 bytes  C++
osgviewerd.exe!main(int argc=2, char * * argv=0x01eee120)  Line 134 + 
0x9 bytes C++
osgviewerd.exe!__tmainCRTStartup()  Line 597 + 0x19 bytes   C
osgviewerd.exe!mainCRTStartup()  Line 414   C
kernel32.dll!7c816d4f() 
[Frames below may be incorrect and/or missing, no symbols loaded for 
kernel32.dll]  
kernel32.dll!7c8399f3() 



-Original Message-
From: [EMAIL PROTECTED] on behalf of Robert Osfield
Sent: Tue 18/11/2008 17:35
To: OpenSceneGraph Users
Subject: Re: [osg-users] Multi-threading bug related to Registry
 
Hi Rick,

The solution is for ref_ptrImage to be used, but where I can't say
as you don't provide full details on what code is at fault.  Which
plugins

Re: [osg-users] Multi-threading bug related to Registry

2008-11-25 Thread Robert Osfield
 134 + 
 0x9 bytes C++
osgviewerd.exe!__tmainCRTStartup()  Line 597 + 0x19 bytes   C
osgviewerd.exe!mainCRTStartup()  Line 414   C
kernel32.dll!7c816d4f()
[Frames below may be incorrect and/or missing, no symbols loaded for 
 kernel32.dll]
kernel32.dll!7c8399f3()



 -Original Message-
 From: [EMAIL PROTECTED] on behalf of Robert Osfield
 Sent: Tue 18/11/2008 17:35
 To: OpenSceneGraph Users
 Subject: Re: [osg-users] Multi-threading bug related to Registry

 Hi Rick,

 The solution is for ref_ptrImage to be used, but where I can't say
 as you don't provide full details on what code is at fault.  Which
 plugins are you invoking in this instance?  .osg? .ive?  Others?

 Robert.

 On Mon, Nov 17, 2008 at 1:23 PM, Rick Appleton
 [EMAIL PROTECTED] wrote:
 There is a nasty multi-threading bug related to osgDB::DatabasePager and the
 _objectCache in osgDB::Registry. This is the situation.

 * Thread 2 stack (DatabasePager thread):
 ReaderWriter::ReadResult Registry::readImplementation
 ReaderWriter::ReadResult Registry::readImageImplementation
 ReaderWriter::ReadResult Registry::readImage
 osg::Image* osgDB::readImageFile
 osg::Image* DataInputStream::readImage(std::string filename)
 osg::Image* DataInputStream::readImage(IncludeImageMode mode)
 osg::Image* DataInputStream::readImage()
 void Texture2D::read(DataInputStream* in)
 ..

 - After reading in the file in Registry::readImplementation the refCount of
 the new Image is 1 (ReadResult contains a ref_ptr).
 - This Image is then inserted into Registry::_objectCache with the default
 timestamp (0.0), which increments the recCount to 2.
 - The ReadResult object is returned, so no net change in refCount, until the
 stack is unwound up to osgDB::readImageFile.
 - osgDB::readImageFile takes a normal pointer to the Image, which does not
 change the refCount. When osgDB::readImageFile returns, the ReadResult
 object goes out of scope, deleting a reference to the Image. refCount is now
 1.
 - Stack unwinds until back in Texture2D::read at which point the normal
 Image* is passed into Texture2D::setImage which assigns the Image* to the
 internal ref_ptr. At this moment the refCount is increased back to 2.

 Imagine that somewhere during the stack unwind between osgDB::readImageFile
 and Texture2D::read a thread switch takes place.

 * Thread 1:
 The main thread then reaches DatabasePager::removeExpiredSubgraphs which
 does the following at the end of the function.

 - Calls Registry::updateTimeStampOfObjectsInCacheWithExternalReferences,
 which goes through the objects and increments the timestamp if the
 refCount1 (this is not true for just loaded Image, so timestamp remains
 0.0).
 - Calls Registry::removeExpiredObjectsInCache, which goes through all the
 objects and compares the timestamp with the expiry time. Because the Image
 was added with timestamp 0.0 (and the timestamp was not updated), it is
 added to objectsToRemove.
 - Registry::removeExpiredObjectsInCache then removes all the objects in
 objectsToRemove from Registry::_objectCache. This causes the refCount on the
 Image to decrement to 0, causing the actual Image object to be freed.

 At this point Thread 2 has an Image* to freed memory, but does not realize
 the object has been freed. In our case this eventually led to crashes.

 There are two solutions to this:
 - Instead of using the default timestamp of 0.0, set the time explicitly to
 the current time. This makes sure the item is not immediately discarded from
 the Registry. Even if not all loads are finished before the new item times
 out, the item will have been acquired by some other place in the code, so
 the refCount will be larger than 1 when the timeout occurs. This requires
 the current time to be passed into readImplementation. A proper solution
 would be to pass in the timestamp of the DatabaseRequest into the read*
 functions. That requires modification of a lot of functions though.
 - Changing the return types of the sequence above so there aren't any normal
 osg::Image* passed around, but only ref_ptr. This makes sure that refCount
 never goes to during the loading process. This option also requires the
 modification of a large number of functions.

 We chose a slightly less proper solution following the first option. We've
 added a function to set the time in Registry once a frame, and call that
 just before DatabasePager::updateSceneGraph (which calls
 DatabasePager::removeExpiredSubgraphs).

 Please feel free to contact me for more info as needed,
 Rick



 ___
 osg-users mailing list
 osg-users@lists.openscenegraph.org
 http://lists.openscenegraph.org/listinfo.cgi/osg-users-openscenegraph.org


 ___
 osg-users mailing list
 osg-users@lists.openscenegraph.org
 http://lists.openscenegraph.org/listinfo.cgi/osg-users-openscenegraph.org


 ___
 osg-users

Re: [osg-users] Multi-threading bug related to Registry

2008-11-18 Thread Smeenk, R.J.M. (Roland)
Hi Rick,
 
it looks like Robert Osfield missed this post otherwise he already would
have reacted to it. ;)
 
--
Roland




From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Rick
Appleton
Sent: maandag 17 november 2008 14:24
To: osg-users@lists.openscenegraph.org
Subject: [osg-users] Multi-threading bug related to Registry



There is a nasty multi-threading bug related to
osgDB::DatabasePager and the _objectCache in osgDB::Registry. This is
the situation.

* Thread 2 stack (DatabasePager thread):
ReaderWriter::ReadResult Registry::readImplementation
ReaderWriter::ReadResult Registry::readImageImplementation
ReaderWriter::ReadResult Registry::readImage
osg::Image* osgDB::readImageFile
osg::Image* DataInputStream::readImage(std::string filename)
osg::Image* DataInputStream::readImage(IncludeImageMode mode)
osg::Image* DataInputStream::readImage()
void Texture2D::read(DataInputStream* in)
..

- After reading in the file in Registry::readImplementation the
refCount of the new Image is 1 (ReadResult contains a ref_ptr).
- This Image is then inserted into Registry::_objectCache with
the default timestamp (0.0), which increments the recCount to 2.
- The ReadResult object is returned, so no net change in
refCount, until the stack is unwound up to osgDB::readImageFile.
- osgDB::readImageFile takes a normal pointer to the Image,
which does not change the refCount. When osgDB::readImageFile returns,
the ReadResult object goes out of scope, deleting a reference to the
Image. refCount is now 1.
- Stack unwinds until back in Texture2D::read at which point the
normal Image* is passed into Texture2D::setImage which assigns the
Image* to the internal ref_ptr. At this moment the refCount is increased
back to 2.

Imagine that somewhere during the stack unwind between
osgDB::readImageFile and Texture2D::read a thread switch takes place.

* Thread 1:
The main thread then reaches
DatabasePager::removeExpiredSubgraphs which does the following at the
end of the function.

- Calls
Registry::updateTimeStampOfObjectsInCacheWithExternalReferences, which
goes through the objects and increments the timestamp if the refCount1
(this is not true for just loaded Image, so timestamp remains 0.0).
- Calls Registry::removeExpiredObjectsInCache, which goes
through all the objects and compares the timestamp with the expiry time.
Because the Image was added with timestamp 0.0 (and the timestamp was
not updated), it is added to objectsToRemove.
- Registry::removeExpiredObjectsInCache then removes all the
objects in objectsToRemove from Registry::_objectCache. This causes the
refCount on the Image to decrement to 0, causing the actual Image object
to be freed.

At this point Thread 2 has an Image* to freed memory, but does
not realize the object has been freed. In our case this eventually led
to crashes.

There are two solutions to this:
- Instead of using the default timestamp of 0.0, set the time
explicitly to the current time. This makes sure the item is not
immediately discarded from the Registry. Even if not all loads are
finished before the new item times out, the item will have been acquired
by some other place in the code, so the refCount will be larger than 1
when the timeout occurs. This requires the current time to be passed
into readImplementation. A proper solution would be to pass in the
timestamp of the DatabaseRequest into the read* functions. That requires
modification of a lot of functions though.
- Changing the return types of the sequence above so there
aren't any normal osg::Image* passed around, but only ref_ptr. This
makes sure that refCount never goes to during the loading process. This
option also requires the modification of a large number of functions.

We chose a slightly less proper solution following the first
option. We've added a function to set the time in Registry once a frame,
and call that just before DatabasePager::updateSceneGraph (which calls
DatabasePager::removeExpiredSubgraphs).

Please feel free to contact me for more info as needed,
Rick




This e-mail and its contents are subject to the DISCLAIMER at 
http://www.tno.nl/disclaimer/email.html
___
osg-users mailing list
osg-users@lists.openscenegraph.org
http://lists.openscenegraph.org/listinfo.cgi/osg-users-openscenegraph.org


Re: [osg-users] Multi-threading bug related to Registry

2008-11-18 Thread Robert Osfield
Hi Rick,

The solution is for ref_ptrImage to be used, but where I can't say
as you don't provide full details on what code is at fault.  Which
plugins are you invoking in this instance?  .osg? .ive?  Others?

Robert.

On Mon, Nov 17, 2008 at 1:23 PM, Rick Appleton
[EMAIL PROTECTED] wrote:
 There is a nasty multi-threading bug related to osgDB::DatabasePager and the
 _objectCache in osgDB::Registry. This is the situation.

 * Thread 2 stack (DatabasePager thread):
 ReaderWriter::ReadResult Registry::readImplementation
 ReaderWriter::ReadResult Registry::readImageImplementation
 ReaderWriter::ReadResult Registry::readImage
 osg::Image* osgDB::readImageFile
 osg::Image* DataInputStream::readImage(std::string filename)
 osg::Image* DataInputStream::readImage(IncludeImageMode mode)
 osg::Image* DataInputStream::readImage()
 void Texture2D::read(DataInputStream* in)
 ..

 - After reading in the file in Registry::readImplementation the refCount of
 the new Image is 1 (ReadResult contains a ref_ptr).
 - This Image is then inserted into Registry::_objectCache with the default
 timestamp (0.0), which increments the recCount to 2.
 - The ReadResult object is returned, so no net change in refCount, until the
 stack is unwound up to osgDB::readImageFile.
 - osgDB::readImageFile takes a normal pointer to the Image, which does not
 change the refCount. When osgDB::readImageFile returns, the ReadResult
 object goes out of scope, deleting a reference to the Image. refCount is now
 1.
 - Stack unwinds until back in Texture2D::read at which point the normal
 Image* is passed into Texture2D::setImage which assigns the Image* to the
 internal ref_ptr. At this moment the refCount is increased back to 2.

 Imagine that somewhere during the stack unwind between osgDB::readImageFile
 and Texture2D::read a thread switch takes place.

 * Thread 1:
 The main thread then reaches DatabasePager::removeExpiredSubgraphs which
 does the following at the end of the function.

 - Calls Registry::updateTimeStampOfObjectsInCacheWithExternalReferences,
 which goes through the objects and increments the timestamp if the
 refCount1 (this is not true for just loaded Image, so timestamp remains
 0.0).
 - Calls Registry::removeExpiredObjectsInCache, which goes through all the
 objects and compares the timestamp with the expiry time. Because the Image
 was added with timestamp 0.0 (and the timestamp was not updated), it is
 added to objectsToRemove.
 - Registry::removeExpiredObjectsInCache then removes all the objects in
 objectsToRemove from Registry::_objectCache. This causes the refCount on the
 Image to decrement to 0, causing the actual Image object to be freed.

 At this point Thread 2 has an Image* to freed memory, but does not realize
 the object has been freed. In our case this eventually led to crashes.

 There are two solutions to this:
 - Instead of using the default timestamp of 0.0, set the time explicitly to
 the current time. This makes sure the item is not immediately discarded from
 the Registry. Even if not all loads are finished before the new item times
 out, the item will have been acquired by some other place in the code, so
 the refCount will be larger than 1 when the timeout occurs. This requires
 the current time to be passed into readImplementation. A proper solution
 would be to pass in the timestamp of the DatabaseRequest into the read*
 functions. That requires modification of a lot of functions though.
 - Changing the return types of the sequence above so there aren't any normal
 osg::Image* passed around, but only ref_ptr. This makes sure that refCount
 never goes to during the loading process. This option also requires the
 modification of a large number of functions.

 We chose a slightly less proper solution following the first option. We've
 added a function to set the time in Registry once a frame, and call that
 just before DatabasePager::updateSceneGraph (which calls
 DatabasePager::removeExpiredSubgraphs).

 Please feel free to contact me for more info as needed,
 Rick



 ___
 osg-users mailing list
 osg-users@lists.openscenegraph.org
 http://lists.openscenegraph.org/listinfo.cgi/osg-users-openscenegraph.org


___
osg-users mailing list
osg-users@lists.openscenegraph.org
http://lists.openscenegraph.org/listinfo.cgi/osg-users-openscenegraph.org


[osg-users] Multi-threading bug related to Registry

2008-11-17 Thread Rick Appleton
There is a nasty multi-threading bug related to osgDB::DatabasePager and the 
_objectCache in osgDB::Registry. This is the situation.

* Thread 2 stack (DatabasePager thread):
ReaderWriter::ReadResult Registry::readImplementation
ReaderWriter::ReadResult Registry::readImageImplementation
ReaderWriter::ReadResult Registry::readImage
osg::Image* osgDB::readImageFile
osg::Image* DataInputStream::readImage(std::string filename)
osg::Image* DataInputStream::readImage(IncludeImageMode mode)
osg::Image* DataInputStream::readImage()
void Texture2D::read(DataInputStream* in)
..

- After reading in the file in Registry::readImplementation the refCount of the 
new Image is 1 (ReadResult contains a ref_ptr).
- This Image is then inserted into Registry::_objectCache with the default 
timestamp (0.0), which increments the recCount to 2.
- The ReadResult object is returned, so no net change in refCount, until the 
stack is unwound up to osgDB::readImageFile.
- osgDB::readImageFile takes a normal pointer to the Image, which does not 
change the refCount. When osgDB::readImageFile returns, the ReadResult object 
goes out of scope, deleting a reference to the Image. refCount is now 1.
- Stack unwinds until back in Texture2D::read at which point the normal Image* 
is passed into Texture2D::setImage which assigns the Image* to the internal 
ref_ptr. At this moment the refCount is increased back to 2.

Imagine that somewhere during the stack unwind between osgDB::readImageFile and 
Texture2D::read a thread switch takes place.

* Thread 1:
The main thread then reaches DatabasePager::removeExpiredSubgraphs which does 
the following at the end of the function.

- Calls Registry::updateTimeStampOfObjectsInCacheWithExternalReferences, which 
goes through the objects and increments the timestamp if the refCount1 (this 
is not true for just loaded Image, so timestamp remains 0.0).
- Calls Registry::removeExpiredObjectsInCache, which goes through all the 
objects and compares the timestamp with the expiry time. Because the Image was 
added with timestamp 0.0 (and the timestamp was not updated), it is added to 
objectsToRemove.
- Registry::removeExpiredObjectsInCache then removes all the objects in 
objectsToRemove from Registry::_objectCache. This causes the refCount on the 
Image to decrement to 0, causing the actual Image object to be freed.

At this point Thread 2 has an Image* to freed memory, but does not realize the 
object has been freed. In our case this eventually led to crashes.

There are two solutions to this:
- Instead of using the default timestamp of 0.0, set the time explicitly to the 
current time. This makes sure the item is not immediately discarded from the 
Registry. Even if not all loads are finished before the new item times out, the 
item will have been acquired by some other place in the code, so the refCount 
will be larger than 1 when the timeout occurs. This requires the current time 
to be passed into readImplementation. A proper solution would be to pass in the 
timestamp of the DatabaseRequest into the read* functions. That requires 
modification of a lot of functions though.
- Changing the return types of the sequence above so there aren't any normal 
osg::Image* passed around, but only ref_ptr. This makes sure that refCount 
never goes to during the loading process. This option also requires the 
modification of a large number of functions.

We chose a slightly less proper solution following the first option. We've 
added a function to set the time in Registry once a frame, and call that just 
before DatabasePager::updateSceneGraph (which calls 
DatabasePager::removeExpiredSubgraphs).

Please feel free to contact me for more info as needed,
Rick 


___
osg-users mailing list
osg-users@lists.openscenegraph.org
http://lists.openscenegraph.org/listinfo.cgi/osg-users-openscenegraph.org