Hi Larry, I have no objections, and am happy to see one less dependency, plus the better performance.
Best, Erich On Jun 21, 2012, at 11:05 AM, Larry Gritz wrote: > Erich, did you have any objections to switching OIIO and OSL to defaulting to > USE_TBB=0? (Or anybody else?) > > I'm not going to rip the TBB code out quite yet, so anybody is still free to > compile with USE_TBB=1 if they think they'll get better performance that way > or don't trust our locks. > > It's not a slam-dunk win in all circumstances, but mostly what I'm seeing > across several platforms is that the our new code (USE_TBB=0) is slightly > slower than TBB for a small number of threads, but MUCH faster than TBB for > high thread counts. > > -- lg > > > On Jun 14, 2012, at 5:17 PM, Erich Ocean wrote: > >> Hi Larry, >> >> Would this also affect OSL? Or would TBB still be needed there? >> >> Best, Erich >> >> On Jun 14, 2012, at 5:13 PM, Larry Gritz wrote: >> >>> Anybody else want to comment or report timings? >>> >>> Are there any objections to my committing this to the master (i.e. improve >>> the non-TBB case, but not yet change the default for USE_TBB)? >>> >>> I'm also still open to changing USE_TBB to default to 0, or even removing >>> TBB entirely, if everybody's timings show that non-TBB is not significantly >>> worse than TBB across a wide variety of hardware and OS configurations. >>> >>> >>> >>> >>> On Jun 12, 2012, at 12:23 AM, Larry Gritz wrote: >>> >>>> https://github.com/OpenImageIO/oiio/pull/273 >>>> >>>> This was from a few months back, I didn't have enough success to push for >>>> a commit, but now I have a major update: >>>> >>>> After some weekend reading, I had some ideas for how to improve the spin >>>> locks and just pushed an update in which, on my two machines (a Mac OS X >>>> laptop, and a 12-core Linux box), my non-TBB spinlock now outperforms the >>>> TBB spinlocks (by a significant margin on Linux!). >>>> >>>> Could those of you on this thread please try this out? Check out the >>>> branch for this pull request, do "make nuke ; make USE_TBB=1 ; make test >>>> TEST=spinlock" and then the same thing again with USE_TBB=0 (be sure to do >>>> a full make nuke first). >>>> >>>> If the rest of you find the same thing -- that the USE_TBB=0 benchmark is >>>> no slower than the USE_TBB=1 -- then perhaps we can finally retire TBB. >>>> >>>> For the curious, the three improvements were: >>>> >>>> (a) exponential backoff for spin lock contention in the same manner that >>>> TBB was doing; >>>> >>>> (b) try_lock can spin more efficiency with read-only (non-bus-locking) >>>> check until there's a good chance that the lock was released. In >>>> particular, you should not spin like this: >>>> >>>> while (! try_lock(&thelock)) ; >>>> >>>> but instead >>>> >>>> while (! try_lock(&thelock)) >>>> while (thelock) ; >>>> >>>> This is because the try_lock does a compare-and-swap, which is a writing >>>> operation, which will lock the bus. In the latter version, if the initial >>>> (bus-locking) try-lock fails, it will do a read-only (non-bus-locking!) >>>> check until the lock appears to be free, then do the safe/locking one >>>> again. >>>> >>>> (c) on x86_64, it's safe for spin_lock::unlock() to do a normal unlocked >>>> write. The memory ordering of these chips is such that it doesn't need >>>> the memory fences of a full atomic write. I found numerous sources for >>>> this on the net, and it is a big win. Apparently, it also works on some >>>> 32 bit x86's, most of the recent chips, but I am not very interested in >>>> sorting out which x86 chips it's safe to do it on, especially since >>>> anybody heavily threaded enough to be concerned with this optimization is >>>> on a 64 bit system. >>>> >>>> OK, so let me know! If it works, it's a performance improvement as well >>>> as a way for us to shed the pesky TBB dependency. Win, win. >>>> >>>> >>>> >>>> >>>> On Apr 3, 2012, at 10:34 AM, Larry Gritz wrote: >>>> >>>>> I would very much appreciate if people could grab the code from this pull >>>>> request, and run: >>>>> >>>>> build/ARCH/libOpenImageIO/atomic_test >>>>> build/ARCH/libOpenImageIO/spinlock_test >>>>> >>>>> and report the timings after a fresh compile with USE_TBB ON, and again >>>>> with USE_TBB OFF. (On Linux and OSX, if you prefer running from 'make', >>>>> then just run 'make nuke ; make USE_TBB=1' or 'make nuke ; make >>>>> USE_TBB=0'. On Windows, or if you prefer otherwise, just set the CMake >>>>> variable USE_TBB to ON or OFF, respectively.) >>>>> >>>>> I'm especially keen to hear the results from people on Windows, as that >>>>> is the major platform that I have no way to test myself. >>>>> >>>>> If this makes it appear that there is no speed penalty from falling back >>>>> on the gcc & win32 intrinsics, compared to TBB, then I think we should >>>>> simplify our lives by removing TBB use entirely from OIIO. >>>>> >>>>> >>>>> >>>>> On Apr 3, 2012, at 10:25 AM, Larry Gritz wrote: >>>>> >>>>>> Make atomic_test and spinlock_test run for many more iterations, and >>>>>> time their results. >>>>>> >>>>>> This allows us to rigorously compare the speed of our atomics and spin >>>>>> locks. >>>>>> Also make their output a little neater by locking around the status >>>>>> printouts, and >>>>>> eliminate the #if's that provide safety for Boost < 1.35, which is no >>>>>> longer supported. >>>>>> >>>>>> You can merge this Pull Request by running: >>>>>> >>>>>> git pull https://github.com/lgritz/oiio lg-atomic >>>>>> >>>>>> Or you can view, comment on it, or merge it online at: >>>>>> >>>>>> https://github.com/OpenImageIO/oiio/pull/273 >>>>>> >>>>>> -- Commit Summary -- >>>>>> >>>>>> * Make atomic_test and spinlock_test run for many more iterations, and >>>>>> time their results. >>>>>> >>>>>> -- File Changes -- >>>>>> >>>>>> M src/libOpenImageIO/atomic_test.cpp (38) >>>>>> M src/libOpenImageIO/spinlock_test.cpp (29) >>>>>> >>>>>> -- Patch Links -- >>>>>> >>>>>> https://github.com/OpenImageIO/oiio/pull/273.patch >>>>>> https://github.com/OpenImageIO/oiio/pull/273.diff >>>>>> >>>>>> --- >>>>>> Reply to this email directly or view it on GitHub: >>>>>> https://github.com/OpenImageIO/oiio/pull/273 >>>>>> _______________________________________________ >>>>>> Oiio-dev mailing list >>>>>> [email protected] >>>>>> http://lists.openimageio.org/listinfo.cgi/oiio-dev-openimageio.org >>>>> >>>>> -- >>>>> Larry Gritz >>>>> [email protected] >>>>> >>>>> >>>>> _______________________________________________ >>>>> Oiio-dev mailing list >>>>> [email protected] >>>>> http://lists.openimageio.org/listinfo.cgi/oiio-dev-openimageio.org >>>> >>>> -- >>>> Larry Gritz >>>> [email protected] >>>> >>>> >>>> _______________________________________________ >>>> Oiio-dev mailing list >>>> [email protected] >>>> http://lists.openimageio.org/listinfo.cgi/oiio-dev-openimageio.org >>> >>> -- >>> Larry Gritz >>> [email protected] >>> >>> >>> _______________________________________________ >>> Oiio-dev mailing list >>> [email protected] >>> http://lists.openimageio.org/listinfo.cgi/oiio-dev-openimageio.org >> >> _______________________________________________ >> Oiio-dev mailing list >> [email protected] >> http://lists.openimageio.org/listinfo.cgi/oiio-dev-openimageio.org >> > > -- > Larry Gritz > [email protected] > > > _______________________________________________ > Oiio-dev mailing list > [email protected] > http://lists.openimageio.org/listinfo.cgi/oiio-dev-openimageio.org _______________________________________________ Oiio-dev mailing list [email protected] http://lists.openimageio.org/listinfo.cgi/oiio-dev-openimageio.org
