All I want to do is loop from 0 to [constant] with a for or foreach, and have it split up across however many cores I have.

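    import std.random : uniform;
    import std.stdio : writefln;

    ulong sum;
    foreach (i; 0 .. 1_000_000_000_000UL) // 0 to 1 trillion
    {
        // flip some dice
        float die_value = uniform(0F, 12F);
        if (die_value > threshold) sum++; // threshold stands for [constant]
    }
    writefln("The sum is %d", sum); // writefln, not writeln, for %d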

However, there are two caveats:

- One: I can't throw a range of values into an array and foreach over that, as many examples do, because 1 trillion elements (counting from zero) might be a little big for an array. (I'm using 1 trillion to illustrate a specific bottleneck / problem form.)

- Two: I want to merge the results at the end.

That means I either need to use mutexes (BAD. NO. BOO. HISS.), or each "thread" would need to know which one it is and store its sum in, say, a thread[#].sum variable. Once all were completed, I'd add those sums together.
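A minimal sketch of that per-thread-sum idea, assuming std.parallelism is acceptable here. It uses taskPool.workerLocalStorage so each worker gets its own counter and no mutex is needed. The function name countHitsPerWorker, the work-unit size of 100_000, and the values passed in main are hypothetical stand-ins (the real [constant] and the 1 trillion bound are not filled in). As far as I understand it, parallel() runs on a fixed pool of worker threads and hands out iterations in work units, and iota generates indices lazily, so no array of indices is allocated.

```d
import std.parallelism : parallel, taskPool;
import std.random : uniform;
import std.range : iota;
import std.stdio : writefln;

// Hypothetical helper: counts how many of n dice rolls exceed threshold.
ulong countHitsPerWorker(ulong n, float threshold)
{
    // One counter per worker thread (plus the calling thread), so no locking.
    auto sums = taskPool.workerLocalStorage(0UL);

    // iota(n) is lazy; 100_000 is an assumed work-unit size.
    foreach (i; parallel(iota(n), 100_000))
    {
        // uniform() uses the thread-local default generator.
        if (uniform(0F, 12F) > threshold)
            sums.get += 1;
    }

    // Merge the per-thread counters once the parallel loop has finished.
    ulong total = 0;
    foreach (s; sums.toRange)
        total += s;
    return total;
}

void main()
{
    // 10 million iterations and 6F are stand-ins, not the real values.
    writefln("The sum is %d", countHitsPerWorker(10_000_000, 6F));
}
```

The merge loop only runs after the parallel foreach has completed, which is what makes reading every thread's counter safe.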

I know this is an incredibly simple conceptual problem to solve, so I feel like I'm missing some huge, obvious answer for doing it elegantly in D.

And this just occurred to me: if I had a trillion-iteration foreach, would that make 1 trillion threads? What I want is, IIRC, what OpenMP does: it divides up your range (blocks of sequential numbers) by the number of threads. So a domain of [1 to 1000] with ten threads would become workloads on the indexes [1-100], [101-200], [201-300], and so on, one for each CPU. They each get a 100-element chunk.
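The OpenMP-style block split described above could be sketched like this, again assuming std.parallelism. The range [0, n) is cut into a fixed number of contiguous blocks, each block is a plain sequential loop, and the parallel foreach runs over the small array of per-block results rather than over the indices themselves. The names countBlock and countHitsInBlocks, the block count of taskPool.size + 1, and the values in main are hypothetical choices for illustration.

```d
import std.algorithm : min, sum;
import std.parallelism : parallel, taskPool;
import std.random : uniform;
import std.stdio : writefln;

// Sequentially count hits over the half-open index block [lo, hi).
ulong countBlock(ulong lo, ulong hi, float threshold)
{
    ulong hits = 0;
    foreach (i; lo .. hi)
        if (uniform(0F, 12F) > threshold)
            hits++;
    return hits;
}

// Split [0, n) into `blocks` contiguous chunks (blocks must be > 0).
ulong countHitsInBlocks(ulong n, float threshold, size_t blocks)
{
    immutable ulong blockSize = (n + blocks - 1) / blocks; // round up
    auto partial = new ulong[blocks]; // one result slot per block

    // Work-unit size 1: each block is handed out as its own task.
    foreach (b, ref slot; parallel(partial, 1))
    {
        immutable ulong lo = min(n, b * blockSize);
        immutable ulong hi = min(n, lo + blockSize);
        slot = countBlock(lo, hi, threshold);
    }

    // Each slot was written by exactly one task, so a plain sum suffices.
    return partial.sum;
}

void main()
{
    // Worker threads plus the calling thread; 6F stands in for [constant].
    immutable blocks = taskPool.size + 1;
    writefln("The sum is %d", countHitsInBlocks(10_000_000, 6F, blocks));
}
```

With n = 1000 and ten blocks this gives [0, 100), [100, 200), and so on, the zero-based version of the chunks above. Using more blocks than threads may balance load better if some blocks finish early.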

So I guess foreach won't work here for that, will it? Hmmm...

----> But again, conceptually this is simple: I have, say, 1 trillion sequential numbers. I want to assign a "block" (or "range") to each CPU core. And since their math does not actually interfere with each other, I can simply sum each core's results at the end.

Thanks,
--Chris
