OK, I can confirm that on my 64-bit Mac, both clang and gcc use cmpxchgl rather than cmpxchg.

I'll whip up a strawman patch on HEAD that can be cherry-picked and tested out by Ryan et al.
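A minimal sketch of the kind of check discussed in this thread, assuming the GCC/clang `__sync` builtins are available. The helper names `cas64` and `rejects_low32_match` are hypothetical and are not taken from GHC's SMP.h. The stored word and the expected value agree in their low 32 bits but differ in the high 32 bits, so a CAS that compares the full 64-bit word must refuse to swap.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical helper (not GHC's cas()): full-width CAS on a 64-bit
 * location using the GCC/clang builtin. Returns true if the swap happened. */
bool cas64(volatile uint64_t *p, uint64_t expected, uint64_t desired)
{
    return __sync_bool_compare_and_swap(p, expected, desired);
}

/* Returns true if a CAS whose expected value matches only the low 32 bits
 * of the stored word is rejected and leaves the word unchanged. */
bool rejects_low32_match(void)
{
    volatile uint64_t cell = 0x0000000100000042ULL; /* high half = 1 */
    uint64_t expected      = 0x0000000000000042ULL; /* high half = 0 */

    bool swapped = cas64(&cell, expected, 7);

    return !swapped && cell == 0x0000000100000042ULL;
}
```

Compiling a file like this with `-S` under gcc or clang shows which cmpxchg variant the compiler picks for an 8-byte operand, which can then be compared against the instruction used by the hand-written asm.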
On Sat, Feb 1, 2014 at 1:12 AM, Carter Schonwald <[email protected]> wrote:

> Hey Ryan,
> looking at this closely
> Why isn't CAS using CMPXCHG8B on 64bit architectures? Could that be the
> culprit?
>
> Could the issue be that we've not had a good stress test that would create
> values that are equal on the 32bit range, but differ on the 64bit range,
> and you're hitting that?
>
> Could you try seeing if doing that change fixes things up?
> (I may be completely wrong, but just throwing this out as a naive
> "obvious" guess)
>
>
> On Sat, Feb 1, 2014 at 12:58 AM, Ryan Newton <[email protected]> wrote:
>
>> Then again... I'm having trouble seeing how the spec on page 3-149 of the
>> Intel manual would allow the behavior I'm seeing:
>>
>> http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf
>>
>> Nevertheless, this is exactly the behavior we're seeing with the current
>> Haskell primops. Two threads simultaneously performing the same CAS(p,a,b)
>> can both think that they succeeded.
>>
>>
>> On Sat, Feb 1, 2014 at 12:33 AM, Ryan Newton <[email protected]> wrote:
>>
>>> I commented on the commit here:
>>>
>>> https://github.com/ghc/ghc/commit/521b792553bacbdb0eec138b150ab0626ea6f36b
>>>
>>> The problem is that our "cas" routine in SMP.h is similar to the C
>>> compiler intrinsic __sync_val_compare_and_swap, in that it returns the old
>>> value. But it seems we cannot use a comparison against that old value to
>>> determine whether or not the CAS succeeded. (I believe the CAS may fail
>>> due to contention, but the old value may happen to look like our old value.)
>>>
>>> Unfortunately, this didn't occur to me until it started causing bugs [1]
>>> [2]. Fixing casMutVar# fixes these bugs. However, the way I'm currently
>>> fixing CAS in the "atomic-primops" package is by using
>>> __sync_bool_compare_and_swap:
>>>
>>> https://github.com/rrnewton/haskell-lockfree/commit/f9716ddd94d5eff7420256de22cbf38c02322d7a#diff-be3304b3ecdd8e1f9ed316cd844d711aR200
>>>
>>> What is the best fix for GHC itself? Would it be ok for GHC to include
>>> a C compiler intrinsic like __sync_val_compare_and_swap? Otherwise we need
>>> another big ifdef'd function like "cas" in SMP.h that has the
>>> architecture-specific inline asm across all architectures. I can write the
>>> x86 one, but I'm not eager to try the others.
>>>
>>> Best,
>>> -Ryan
>>>
>>> [1] https://github.com/iu-parfunc/lvars/issues/70
>>> [2] https://github.com/rrnewton/haskell-lockfree/issues/15
>>>
>>
>> _______________________________________________
>> ghc-devs mailing list
>> [email protected]
>> http://www.haskell.org/mailman/listinfo/ghc-devs
>>
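For reference, the two GCC/clang intrinsics named in the quoted message report success differently. The sketch below shows both styles side by side. The wrapper names (`cas_val`, `cas_bool`, `atomic_incr`) are hypothetical and are not taken from GHC or from the atomic-primops package; this only illustrates the two calling conventions and is not the fix that was applied.

```c
#include <stdint.h>
#include <stdbool.h>

/* Style 1: return the value found at *p before the operation. The caller
 * has to infer success by comparing the result against `expected`. */
uintptr_t cas_val(volatile uintptr_t *p, uintptr_t expected, uintptr_t desired)
{
    return __sync_val_compare_and_swap(p, expected, desired);
}

/* Style 2: the builtin itself reports whether the swap happened. */
bool cas_bool(volatile uintptr_t *p, uintptr_t expected, uintptr_t desired)
{
    return __sync_bool_compare_and_swap(p, expected, desired);
}

/* A retry loop written against style 2: read the current value, try to
 * install value + 1, and repeat if another thread got there first. */
uintptr_t atomic_incr(volatile uintptr_t *p)
{
    uintptr_t old;
    do {
        old = *p;
    } while (!cas_bool(p, old, old + 1));
    return old + 1;
}
```

With style 2 the caller never compares words itself, so the width of the comparison is decided entirely by the operand type passed to the builtin.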
