Sure answer is: we have to be thread-safe, otherwise it's just waiting for nasty heisenbug to happen.
Now, our atomic_ops.h is based on Jemalloc from 3 years ago. There are some interesting twists happened in the upstream's atomic.h supporting C11 atomics and such. We can totally: 1. Optionally update our atomic ops 2. Switch away from OSAtomic on OSX and use different implementation On Tue, Jan 26, 2016 at 11:49 PM, Brecht Van Lommel < [email protected]> wrote: > It's fine to use __sync_fetch_and_and on OS X, no need to use > OSAtomic* functions. Using 8 bit __sync_fetch_and_and on OS X compiles > fine for me. > > OpenImageIO removed OSAtomic* almost 3 years ago, and we've had no > problem with that: > > https://github.com/OpenImageIO/oiio/commit/14d5d8e572322e5e0decdd1b7d3f4d0484819473 > > On Tue, Jan 26, 2016 at 9:57 PM, Bastien Montagne <[email protected]> > wrote: > > Hi devs, > > > > So, while working on switching from OMP to BLI_task, I've hit today an > > rather hairy issue in pbvh_update_normals() (in BKE's pbvh.c). > > > > Current OMP-based code of that function executes in (roughly) 28ms. > > However, reading it, it's obvious it's lacking a `#pragma omp critical` > > section > > around main part of the second OMP loop, since affected MVert may be > > used by several nodes, and hence suffer concurrency here. > > > > Now, if I add that critical section, code now takes over 100ms to run! > > > > Changing to BLI_task parrallelized code, with same correct protection > > (using a spinlock) I can get about 70ms. Using some gcc atomic op > > (__sync_fetch_and_and), since the atomic ME_VERT_PBVH_UPDATE flag > > checking and clearing is enough to prevent concurrency here, > > it can go down to 20ms. > > > > Sad part of the story: because MVert->flag is and 8bit var, there is no > > clean way to do the same under OSX, we can only hack around > > OSAtomicTestAndClear but it's far from nice and clean (need to retrieve > > bit number from bitmask eg.). > > > > Now comes the funny question: since the probability of thread > > concurrency here is very low (most vertices are used by only one bvhnode, > > and the 'fragile' part of the code is *very* short, two bitflag > > operations), do we want to care about absolute safety of computed > > normals here? > > Missing 'critical section' in second loop never hurted us so far it > seems... > > > > Need your thoughts here, before I loose more time on this hell. ;) > > _______________________________________________ > > Bf-committers mailing list > > [email protected] > > http://lists.blender.org/mailman/listinfo/bf-committers > _______________________________________________ > Bf-committers mailing list > [email protected] > http://lists.blender.org/mailman/listinfo/bf-committers > -- With best regards, Sergey Sharybin _______________________________________________ Bf-committers mailing list [email protected] http://lists.blender.org/mailman/listinfo/bf-committers
