You'll have to tell us which algorithm you're trying to implement, or give us 
some context, because the right answer depends on how your threads access the array:

  1. If your threads only rarely need concurrent access to the array, you'll be 
fine with a lock, but you might as well use `fetchAdd` from `std/atomics` and skip 
locks entirely.
  2. If your threads constantly access the array concurrently, the lock will be a 
bottleneck and your code might be slower than serial code. `fetchAdd` may help, 
but I would expect it to still be slower, because each update invalidates that 
cache line in the other cores' caches, leading to "cache thrashing".
  3. If each thread only accesses its own distinct cells in the array, you can 
parallelize without locks or atomics and enjoy a good speedup.
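
To make cases 1 and 3 concrete, here is a minimal sketch, not your actual algorithm. The array names, sizes and the two worker procs are made up for illustration. It assumes the shared data lives in global arrays of plain `int`s and that threads are enabled (`--threads:on`, which is the default from Nim 2.0 as far as I know).

```nim
# Compile with: nim c --threads:on -r sketch.nim
import std/atomics

const
  NumThreads = 4
  ChunkSize = 1_000                  # cells owned by each thread (case 3)
  NumCells = NumThreads * ChunkSize
  Iterations = 10_000

var
  hits: array[8, Atomic[int]]        # case 1: any thread may update any cell
  cells: array[NumCells, int]        # case 3: each thread owns one contiguous chunk

proc atomicWorker(id: int) {.thread.} =
  # Threads can collide on the same cell, so every update is an
  # atomic read-modify-write instead of a lock + plain increment.
  for i in 0 ..< Iterations:
    discard hits[(id + i) mod hits.len].fetchAdd(1)

proc sliceWorker(id: int) {.thread.} =
  # Thread `id` only touches cells[lo ..< lo + ChunkSize]; no other
  # thread writes there, so plain writes are enough.
  let lo = id * ChunkSize
  for i in lo ..< lo + ChunkSize:
    cells[i] += 1

var atomicThreads: array[NumThreads, Thread[int]]
for t in 0 ..< NumThreads:
  createThread(atomicThreads[t], atomicWorker, t)
joinThreads(atomicThreads)

var sliceThreads: array[NumThreads, Thread[int]]
for t in 0 ..< NumThreads:
  createThread(sliceThreads[t], sliceWorker, t)
joinThreads(sliceThreads)

# No update was lost in either version.
var total = 0
for i in 0 ..< hits.len:
  total += hits[i].load
doAssert total == NumThreads * Iterations

var sum = 0
for v in cells:
  sum += v
doAssert sum == NumCells
```

Giving each thread a contiguous chunk, rather than interleaving cells between threads, also keeps most of their writes on different cache lines, so the cores are not fighting over the same lines the way they do in case 2.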

