Don't update the same data structure concurrently from multiple threads: the heavy synchronization creates a huge contention bottleneck, and the program may even end up slower than a single-threaded one due to cache thrashing.
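To make the contention point concrete, here is a small illustrative sketch (in Go, just for demonstration; the names `sharedCounter`/`localCounters` are mine, not from any library): both functions compute the same total, but the first makes every worker hammer one shared atomic, while the second gives each worker private state and merges once at the end.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// sharedCounter: every goroutine increments one shared counter.
// The cache line holding `shared` ping-pongs between cores, so
// adding workers adds contention instead of speedup.
func sharedCounter(workers, n int) int64 {
	var shared int64
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < n; i++ {
				atomic.AddInt64(&shared, 1)
			}
		}()
	}
	wg.Wait()
	return shared
}

// localCounters: each goroutine counts into its own local variable
// and results are merged once at the end, so there is no
// synchronization on the hot path.
func localCounters(workers, n int) int64 {
	results := make([]int64, workers)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			var local int64
			for i := 0; i < n; i++ {
				local++
			}
			results[id] = local
		}(w)
	}
	wg.Wait()
	var total int64
	for _, r := range results {
		total += r
	}
	return total
}

func main() {
	fmt.Println(sharedCounter(4, 100000))  // 400000
	fmt.Println(localCounters(4, 100000)) // 400000
}
```

(In real code you would also pad the `results` slots to avoid false sharing, but the structural point stands: keep the hot path synchronization-free.)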
Assuming your algorithm is tree-like (I'm familiar with Go bots but not chess bots), ideally you have a tree data structure and either:

* you launch a thread per separate branch and keep synchronization restricted to the 2 threads sharing a sub-branch, a.k.a. branch-parallelism, or
* you duplicate your data structure per thread to avoid paying the synchronization cost at all, a.k.a. tree-parallelism.

If you want to know how big the synchronization cost can be, my Fibonacci benchmark of GCC's OpenMP implementation versus LLVM's shows a factor of 100..000x ([https://github.com/mratsim/weave/tree/master/benchmarks/fibonacci](https://github.com/mratsim/weave/tree/master/benchmarks/fibonacci)).

Today you can use my experimental Weave code, which gives you async-like semantics and very efficient multithreading. For usage see:

* [https://github.com/mratsim/weave/blob/master/e04_channel_based_work_stealing/async_internal.nim#L138-L180](https://github.com/mratsim/weave/blob/master/e04_channel_based_work_stealing/async_internal.nim#L138-L180)
* [https://github.com/mratsim/weave/blob/master/e04_channel_based_work_stealing/async_for_internal.nim#L129-L151](https://github.com/mratsim/weave/blob/master/e04_channel_based_work_stealing/async_for_internal.nim#L129-L151)

It's experimental code; nonetheless it works and is backed by 3 years of PhD research. I suggest you submodule the library. In the future I hope to make it a proper high-level library, see the [Project Picasso RFC](https://forum.nim-lang.org/t/5083).

Alternatively you can use Nim's threadpool, but it suffers from the same issues as GCC's OpenMP:

* it uses a single global queue that enqueues/dequeues all tasks;
* consequently it cannot do load balancing (work-stealing);
* it chokes if tasks are small (say 1 ms per task), and the queue data structure becomes the contention point.
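The branch-parallelism idea above can be sketched as follows (again in Go for illustration; the `Node` type, the max-score `search`, and `parallelSearch` are toy stand-ins I invented, not the Weave API): one worker per top-level branch, each reading only its own subtree, with the only synchronization being the final merge of per-branch results.

```go
package main

import (
	"fmt"
	"sync"
)

// Node is a toy game tree; Score stands in for a position evaluation.
type Node struct {
	Score    int
	Children []*Node
}

// search does a plain sequential max-score walk over a subtree.
func search(n *Node) int {
	best := n.Score
	for _, c := range n.Children {
		if s := search(c); s > best {
			best = s
		}
	}
	return best
}

// parallelSearch launches one goroutine per top-level branch.
// Each goroutine touches only its own subtree, so no locking is
// needed on the hot path; results are merged after all branches join.
func parallelSearch(root *Node) int {
	results := make([]int, len(root.Children))
	var wg sync.WaitGroup
	for i, c := range root.Children {
		wg.Add(1)
		go func(i int, c *Node) {
			defer wg.Done()
			results[i] = search(c)
		}(i, c)
	}
	wg.Wait()
	best := root.Score
	for _, r := range results {
		if r > best {
			best = r
		}
	}
	return best
}

func main() {
	root := &Node{Score: 1, Children: []*Node{
		{Score: 5, Children: []*Node{{Score: 9}}},
		{Score: 3, Children: []*Node{{Score: 7}, {Score: 2}}},
	}}
	fmt.Println(parallelSearch(root)) // 9
}
```

For tree-parallelism you would instead give each worker its own full copy of the tree (or transposition table) and reconcile the copies periodically; the trade-off is memory for the complete absence of hot-path synchronization.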
