I need to use some per-thread, pre-allocated data-structures. The point is to
avoid re-creating large arrays except just once, the first time each thread
uses one.
With Python+C multiprocessing, I simply use a global. Each process has its own
global, and Python manages a pool of processes which each persists its only
global data. Very convenient.
My Nim version, with multi-threading, is 3x faster than the PyC version on my
Mac. But it crashes on Linux, or runs _very_ slowly and consumes quickly
exploding amounts of memory. I _think_ this is because I am using
`{.threadvar.}` for the thread-local variable, and on Mac it persists while on
Linux ... well I'm really not sure what's going on. But it runs fine on Linux
single-threaded iff I drop threadvar.
So I'm wondering if threadvar works differently on Mac than on Linux. And
beyond that pure curiosity, I'm wondering how best to handle per-thread
persistence with a threadpool.
This is what works on Mac but not reliably (apparently) on Linux:
var top {.threadvar.}: ref Foo
proc go() {.thread.} =
if top.isNil:
top = makeNewFoo() # pre-allocate large mutable sequences etc.
...