2017-12-10 2:19 GMT+01:00 Petr Vorel <petr.vo...@gmail.com>:
> I test it (in my github fork of) LTP project [1], but unfortunately some
> tests which heavy test it fails.

With a large number of threads, the test-and-test-and-set option will
significantly cut down the coherency traffic between caches. It's a much
better solution than a pure test-and-set.

> The tests just get slightly fewer incrementation than it should get (6373821
> instead of 6400000).
>
> The implementation in github is for SPARC32, but I adjusted the code to test
> it on SPARC64 and (of course) it behaves the same.
>
> Any idea what can be wrong?

Are you testing on true v8 hardware or on v9?

On v9 (and v8+) you might be running in RMO (Relaxed Memory Order), in
which case you probably need some extra "membar" to force the update of
the variable [1]. By default, the code probably only works for the PSO and
TSO modes (the only two that exist in v8).

If adding a "membar #MemIssue" before and after the update of the variable
solves the problem, then memory ordering is the culprit. (This forces way
too strong an ordering, but is useful for a test.)

"membar" is v9/v8+ only. If you're on true v8 HW - darn. PSO could still be
the problem. Try with "stbar" surrounding the variable update. "stbar" is a
bit weaker than some variants of "membar", but it is in v8.

You might want to take a look at how many ordering instructions the kernel
uses, after all :-(

Low-level parallelism is hard :-)

Cordially,

Romain

[1] e.g. <http://www.oracle.com/technetwork/server-storage/solaris10/index-142944.html>,
    <https://cr.yp.to/2005-590/sparcv9.pdf>

-- 
Romain Dolbeau
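
For illustration, here is a minimal sketch of the test-and-test-and-set idea
mentioned above, written with portable C11 atomics rather than SPARC
assembly. All names (ttas_lock, ttas_acquire, ttas_release,
run_counter_test) are hypothetical and not taken from the LTP code. The
acquire/release orderings state the ordering requirement at the language
level; on a relaxed-memory target the compiler is expected to emit whatever
barrier instructions that requires, but that is an assumption worth checking
against the generated assembly.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_THREADS 64

/* Hypothetical lock type: false = free, true = held. */
typedef struct { atomic_bool locked; } ttas_lock;

static void ttas_acquire(ttas_lock *l)
{
    for (;;) {
        /* "Test" phase: spin on plain loads, which can be served from
         * the local cache and generate no coherency traffic. */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed))
            ;
        /* "Test-and-set" phase: try the atomic swap only once the lock
         * looked free. Acquire ordering keeps the critical section from
         * being reordered before the lock is taken. */
        if (!atomic_exchange_explicit(&l->locked, true, memory_order_acquire))
            return;
    }
}

static void ttas_release(ttas_lock *l)
{
    /* Release ordering makes the protected update visible before the
     * lock is seen as free by other threads. */
    atomic_store_explicit(&l->locked, false, memory_order_release);
}

static ttas_lock counter_lock;  /* static storage: starts unlocked */
static long counter;            /* protected by counter_lock */

static void *worker(void *arg)
{
    long iterations = *(long *)arg;
    for (long i = 0; i < iterations; i++) {
        ttas_acquire(&counter_lock);
        counter++;
        ttas_release(&counter_lock);
    }
    return NULL;
}

/* Runs nthreads workers, each incrementing the shared counter
 * `iterations` times, and returns the final counter value. */
static long run_counter_test(int nthreads, long iterations)
{
    pthread_t tid[MAX_THREADS];
    int started = 0;

    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    counter = 0;
    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tid[i], NULL, worker, &iterations) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return counter;
}
```

With correct locking and ordering, run_counter_test(4, 100000) should return
exactly 400000; a lower value would point to lost increments of the kind
described in the quoted report.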