See below...

On Mon, Mar 23, 2009 at 8:16 PM, Gerrit Voß <[email protected]> wrote:
>
> Hi,
>
> short comment,
>
> On Mon, 2009-03-23 at 19:12 -0600, Allen Bierbaum wrote:
>> Hello all:
>>
>> I think I have finally tracked down some of the issues with our
>> application, but I wanted to write to ask if anyone can see any others
>> and comment on one issue I think may need a fix/workaround in OpenSG.
>>
>> For purposes of this discussion, our application functions as follows:
>> - There is a primordial thread P that takes care of all rendering from
>> a single scene graph
>> - There are 2 background threads (BG1, BG2) that load data from disk
>> and periodically re-sync with the primordial thread to connect up the
>> data they loaded.
>> - All threads share the same aspect
>> - No threads make changes to the same FC at the same time.  (BG1 and
>> BG2 create data, P connects that data to the scene graph)
>
> that condition does not seem to hold.
>
>> - References to FCs are held using MTRecPtr's
>>
>> 1) The first question is, what issues do the experts on the list see
>> with this type of arrangement.  Our simplistic thinking had been that
>> as long as we don't write to the same FC at the same time in the same
>> thread we would be ok.  Now that we are into it further, we are
>> worried about a couple of specific issues.
>>
>>     * Change lists: As discussed in another posting, we think that the
>> change list should be thread safe and that as long as we call
>> commitChanges() we should be good.
>
> I would expect so too.
>
>>     * Aspect store: Each FC has an aspect store.  If two threads read
>> from an FC in parallel, is it possible to corrupt the aspect store?
>> (we have seen a few crashes on testing size of the aspect store of an
>> fc)
>
> not as long as both only read.
>
>>     * FCFactory: All new field containers get registered with the
>> field container factory.  Is the registration process thread safe?
>
> these tables should be locked by a proper lock.
>
>>     * Other thread safety issues:  Is there anywhere else we are not
>> thinking of that may have some race conditions when sharing the same
>> aspect across multiple threads?
>>
>> 2) XxxPtr creation is not thread safe
>>
>> We think this one is killing us.  To put it more concretely, imagine a
>> scenario like this...
>>
>> ---- P ------
>> ChunkMaterialMTRecPtr mat = OSG::ChunkMaterial::create()
>> ....
>> SharedDataObject.setSharedMaterial(mat);
>>
>> ---- BG1 ----------
>> GeomMTRecPtr geom = ...
>> ChunkMaterialMTRecPtr mat = SharedDataObject.getSharedMaterial();
>> geom->setMaterial(mat)
>>
>> ---- BG2 ----------
>> GeomMTRecPtr geom = ...
>> ChunkMaterialMTRecPtr mat = SharedDataObject.getSharedMaterial();
>> geom->setMaterial(mat)
>
> here you modify the same object (mat) in more than one thread.

Maybe this is the part we are missing.  How does this modify the
Material object?  Or do you mean the ref count on the field container
of the material object?

>
>> There is a race condition here on the reference counting.  More
>> specifically, if multiple ChunkMaterialMTRecPtr's are created to the
>> same FC at close to the same time, the reference count can get out of
>> sync (we think).  This can lead to dangling references and/or access
>> to destroyed objects.
>>
>> This may seem like a no-brainer issue, just don't do it.  But it gets
>> more complex than that.  For example, this issue makes it so the simple
>> geom utilities in OpenSG are not thread safe because they call
>> getDefaultMaterial, which in turn creates RecPtrs that can get out of
>> sync.
>>
>> More generally, the problem is that RefPtrs can't be treated the same
>> as C++ primitive types.  They can not be read from in parallel and
>> stay consistent.
>
> ?? why is reading from them a problem. If at all it should be writing
> to them.

They can't be read from to create new copies of the same object
because in so doing the ref count can get out of sync.  This is
because as you pointed out, there is a write (inc/dec) on the
underlying ref count and that is a race condition.  This all makes
sense when thinking through the implementation, but when using them in
application code it just seems strange because what looks like a read
(i.e. making a copy or passing by copy) on a refptr is actually a write
hazard.

I don't know how to solve this for sure, just wanted to point it out.
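
To make the hazard concrete, here is a minimal sketch of an intrusive
ref count where the inc/dec is atomic. It is not OpenSG code: Counted,
Handle and count_after_parallel_copies are hypothetical names made up
for illustration, and std::atomic is assumed to be available as the
atomic primitive. The point is only that the copy constructor, which
looks like a read of the source handle, performs a write on the shared
counter.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for a field container carrying its own ref count.
struct Counted {
    std::atomic<int> refs{0};
};

// Hypothetical stand-in for an MTRecPtr-style handle (not OpenSG code).
class Handle {
public:
    explicit Handle(Counted* p) : p_(p) {
        assert(p_ != nullptr);
        p_->refs.fetch_add(1, std::memory_order_relaxed);
    }
    // Copying only "reads" `other`, yet it writes to the shared counter.
    Handle(const Handle& other) : p_(other.p_) {
        p_->refs.fetch_add(1, std::memory_order_relaxed);
    }
    Handle& operator=(const Handle&) = delete;  // kept out of the sketch
    ~Handle() {
        // fetch_sub returns the previous value; 1 means this was the last handle.
        if (p_->refs.fetch_sub(1, std::memory_order_acq_rel) == 1) {
            delete p_;
        }
    }
    int count() const { return p_->refs.load(std::memory_order_acquire); }

private:
    Counted* p_;
};

// Several threads copy the same handle at once, as BG1 and BG2 do with the
// shared material. Returns the ref count once all copies are gone again.
int count_after_parallel_copies(int threads, int copies_per_thread) {
    Handle shared(new Counted);
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&shared, copies_per_thread] {
            for (int i = 0; i < copies_per_thread; ++i) {
                Handle local(shared);  // +1 here, -1 at the end of the scope
            }
        });
    }
    for (auto& th : pool) {
        th.join();
    }
    return shared.count();
}
```

With a plain int counter instead of std::atomic<int>, the same loop
is a data race and the final count is not guaranteed to come back to
1, which would match the dangling references we think we are seeing.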

>> The thing that really tripped us up is that we are
>> used to using the boost shared_ptr implementation and it does allow
>> for this degree of thread safety.  They explain it much better than I
>> can:
>> http://www.boost.org/doc/libs/1_38_0/libs/smart_ptr/shared_ptr.htm#ThreadSafety
>> (copied here for easy access)
>>
>
> you compare the wrong thing, intrusive_ptr is closest to what we do.

I will take a look.
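
For reference, the usage pattern we had been relying on looks roughly
like the sketch below. It uses std::shared_ptr as a stand-in for
boost::shared_ptr, on the assumption that it gives the same guarantee
for simultaneous const access to one instance; the function name is
made up for illustration.

```cpp
#include <cassert>
#include <memory>
#include <thread>
#include <vector>

// Several threads copy one shared_ptr instance at the same time. Only const
// operations touch `shared`, so no extra locking is used in this sketch.
long use_count_after_parallel_reads(int threads, int copies_per_thread) {
    const std::shared_ptr<int> shared = std::make_shared<int>(42);
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&shared, copies_per_thread] {
            for (int i = 0; i < copies_per_thread; ++i) {
                std::shared_ptr<int> local = shared;  // copy made in parallel
                assert(*local == 42);
            }
        });
    }
    for (auto& th : pool) {
        th.join();
    }
    // All thread-local copies are destroyed, so only `shared` owns the int.
    return shared.use_count();
}
```

This is the "looks like a read, behaves like a read" behaviour we
expected from the OpenSG pointers as well.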

> It was somewhere on my list to look for atomic operations to make the
> ref counting safer. Looks like I have to push it up a little bit. Will
> have a look at it.

boost/smart_ptr/detail has an implementation that is cross-platform
and is lock free on the primary OpenSG platforms.  May be worth
reusing directly out of boost now that OpenSG uses boost.

Thanks,
Allen

_______________________________________________
Opensg-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensg-users