A way to do this without spawning threads manually:
```d
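import std.parallelism : TaskPool, parallel, taskPool, defaultPoolThreads;
import std.stdio : writeln;
import std.range : iota;

enum NSWEPT = 1_000_000; // number of increments to perform
enum NCPU = 4;           // worker threads in the default pool

void main() {
    import core.atomic : atomicLoad, atomicOp;

    shared(uint) value;

    // Set this before the first use of taskPool, which creates the pool lazily.
    defaultPoolThreads(NCPU);
    TaskPool pool = taskPool();

    // The iterations are spread over the pool, so the increment must be atomic.
    foreach (_; pool.parallel(iota(NSWEPT))) {
        atomicOp!"+="(value, 1);
    }

    writeln(pool.size);         // number of worker threads
    writeln(atomicLoad(value)); // final counter value
}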
```
Unfortunately, I could only use the default task pool; creating a new one
took too long on run.dlang.io.
I also had to decrease NSWEPT because anything larger would take too long.
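
For comparison, here is a minimal sketch of the same counter with a dedicated pool instead of the default one. I have not timed it on run.dlang.io, and `countWithPrivatePool` is just a name I made up for illustration. The workers of a manually created `TaskPool` are not daemon threads by default, so the pool has to be shut down with `finish()` (or `stop()`) before the program can exit.

```d
import core.atomic : atomicLoad, atomicOp;
import std.parallelism : TaskPool;
import std.range : iota;
import std.stdio : writeln;

// Hypothetical helper: performs `n` atomic increments on a private pool
// with `workers` worker threads and returns the final count.
uint countWithPrivatePool(uint n, uint workers)
{
    shared uint value;

    auto pool = new TaskPool(workers);
    // Let the workers terminate once the queue is empty, even on early exit.
    scope (exit) pool.finish();

    foreach (_; pool.parallel(iota(n)))
        atomicOp!"+="(value, 1);

    return atomicLoad(value);
}

void main()
{
    // Smaller count than NSWEPT, chosen arbitrarily to keep the run short.
    writeln(countWithPrivatePool(100_000, 4));
}
```

The `scope (exit)` guard keeps the shutdown next to the construction, so the pool is finished no matter how the function returns.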