Simple parallel foreach and summation/reduction

Chris Katko via Digitalmars-d-learn Wed, 19 Sep 2018 22:36:02 -0700

All I want to do is loop from 0 to [constant] with a for or foreach, and have it split up across however many cores I have.


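    import std.random : uniform;
    import std.stdio : writefln;

    ulong sum;
    foreach (i; 0 .. 1_000_000_000_000)   // 0 to 1 trillion
    {
        // flip some dice
        float die_value = uniform(0F, 12F);
        if (die_value > [constant]) sum++;   // [constant] is a placeholder threshold
    }
    writefln("The sum is %d", sum);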


However, there are two caveats:

- One: I can't throw a range of values into an array and foreach on that like many examples use, because 1 trillion (counting from zero) might be a little big for an array. (I'm using 1 trillion to illustrate a specific bottleneck / problem form.)


- Two: I want to merge the results at the end.

Which means I either need to use mutexes (BAD. NO. BOO. HISS.) or each "thread" would need to know that it's separate, and then store its sum in, say, a thread[#].sum variable, and then once all were completed, add those sums together.

I know this is an incredibly simple conceptual problem to solve. So I feel like I'm missing some huge, obvious answer for doing it elegantly in D.

And this just occurred to me: if I had a trillion-iteration foreach, will that make 1 trillion threads? What I want is, IIRC, what OpenMP does. It divides up your range (blocks of sequential numbers) by the number of threads. So a domain of [1 to 1000] with ten threads would become workloads on the indexes of [1-100], [101-200], [201-300], and so on, one for each CPU. They each get a 100-element chunk.
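
A minimal sketch of that block division, to make the arithmetic concrete. The names `Block` and `blockFor` are made up for illustration, and the blocks are zero-based and half-open ([0, 100), [100, 200), ...) rather than the one-based ranges above. When the total does not divide evenly, the first few workers get one extra index.

```d
import std.algorithm : min;
import std.stdio : writefln;

/// Half-open index range [lo, hi) assigned to one worker (hypothetical helper type).
struct Block { ulong lo, hi; }

/// Split [0, n) into `workers` contiguous blocks whose sizes differ by at most one,
/// and return the block belonging to worker number `w` (0-based).
Block blockFor(ulong n, ulong workers, ulong w)
{
    immutable ulong base  = n / workers;   // minimum block size
    immutable ulong extra = n % workers;   // this many blocks get one more index
    immutable ulong lo = w * base + min(w, extra);
    immutable ulong hi = lo + base + (w < extra ? 1 : 0);
    return Block(lo, hi);
}

void main()
{
    // 1000 indexes over ten workers: ten blocks of 100 each.
    foreach (w; 0 .. 10)
    {
        auto b = blockFor(1000, 10, w);
        writefln("worker %s: [%s, %s)", w, b.lo, b.hi);
    }
}
```

Only the two bounds are stored per worker, so nothing the size of the full range is ever allocated.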


So I guess foreach won't work here for that, will it? Hmmm...

----> But again, conceptually this is simple: I have, say, 1 trillion sequential numbers. I want to assign a "block" (or "range") to each CPU core. And since their math does not actually interfere with each other, I can simply sum each core's results at the end.
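
One way this could look in D, as a sketch rather than a definitive answer: it assumes `std.parallelism`'s `parallel` over a small `iota` of block numbers, with one result slot per block so no mutex is needed. `countHits`, `threshold`, and `blocks` are hypothetical names (`threshold` stands in for [constant]), and each block seeds its own generator so the blocks share no random-number state.

```d
import std.algorithm : min, sum;
import std.parallelism : parallel, totalCPUs;
import std.random : Random, uniform, unpredictableSeed;
import std.range : iota;
import std.stdio : writefln;

/// Count how many of `n` die rolls exceed `threshold`, splitting [0, n)
/// into `blocks` contiguous chunks that are processed in parallel.
ulong countHits(ulong n, float threshold, size_t blocks)
{
    auto partial = new ulong[blocks];          // one slot per block, never shared

    // Work unit size 1: each block number is handed out as its own unit of work.
    foreach (b; parallel(iota(blocks), 1))
    {
        immutable ulong w     = b;
        immutable ulong base  = n / blocks;
        immutable ulong extra = n % blocks;
        immutable ulong lo = w * base + min(w, extra);
        immutable ulong hi = lo + base + (w < extra ? 1 : 0);

        auto rng = Random(unpredictableSeed);  // private generator for this block
        ulong local = 0;                       // accumulate locally, write once
        foreach (i; lo .. hi)
            if (uniform(0.0f, 12.0f, rng) > threshold)
                ++local;
        partial[b] = local;
    }
    return partial.sum;                        // merge the per-block results
}

void main()
{
    // Ten million rolls keeps the demo quick; the index range is never materialized.
    immutable total = countHits(10_000_000, 6.0f, totalCPUs);
    writefln("The sum is %d", total);
}
```

The number of threads is bounded by the task pool, not by the iteration count, so a trillion iterations would not mean a trillion threads. `std.parallelism` also provides `taskPool.reduce`, which may be a shorter route when the per-element work can be expressed as a range.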


Thanks,
--Chris
