Re: Concurrency message passing

2021-08-17 Thread Steven Schveighoffer via Digitalmars-d-learn

On 8/17/21 2:36 PM, JG wrote:
Thanks for the suggestions and explanations. I am not sure what to do in 
my case though. The situation is as follows. I have a struct that is 
populated via user input, not necessarily all at once (so that seems to 
rule out immutable). On the other hand, while it is being populated it is 
only accessible from one thread, so that makes using shared messy. After 
being populated it should be passed to the other thread and no references 
are kept.


You are allowed to cast to immutable if no other mutable references are 
used. I recommend using `assumeUnique`.
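
Something along these lines, for instance (just a sketch; the `Data` struct and its fields are made up, with an array standing in for whatever references yours holds):

```d
import std.concurrency;
import std.exception : assumeUnique;

// Hypothetical stand-in for the struct being populated; the slice is
// the only reference it holds.
struct Data {
    immutable(int)[] values;
    int total;
}

void worker() {
    // Accepted by receive/receiveOnly: Data has no mutable aliasing.
    auto data = receiveOnly!Data;
    // ... read data.values and data.total here ...
}

void main() {
    int[] buffer;
    buffer ~= 1;    // populated piece by piece, e.g. from user input
    buffer ~= 2;

    // No other mutable reference to buffer survives past this point,
    // so assumeUnique can label it immutable without copying.
    auto data = Data(assumeUnique(buffer), 3);
    spawn(&worker).send(data);
}
```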


What I am doing currently is populating the struct and casting to shared 
when I push into a synchronized queue (no references to its data are 
kept in the first thread). Is what I am doing wrong and can it be 
achieved using message passing?


Yeah, build it like Ali says, and then cast in order to pass it.

-Steve


Re: Concurrency message passing

2021-08-17 Thread Ali Çehreli via Digitalmars-d-learn

On 8/17/21 11:36 AM, JG wrote:

>>> Maybe message passing isn't the
>>> correct solution?

I use message passing in many of my programs.

> After being populated it should be passed to
> the other thread and no references are kept.

Then you simply cast to-and-from 'shared' and be happy with it. :) I 
explain what I do here:


  https://www.youtube.com/watch?v=dRORNQIB2wA&t=1735s
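
In code, the pattern boils down to something like this (a sketch; the `Data` struct is made up, and casting back is only valid because the sender keeps no references):

```d
import std.concurrency;

// Hypothetical payload; the array means a plain Data can't be sent.
struct Data {
    int[] values;
}

void worker() {
    // Cast back from shared; fine because the producer dropped all of
    // its references after sending.
    Data* data = cast(Data*) receiveOnly!(shared(Data)*);
    // ... use data.values as ordinary thread-local data ...
}

void main() {
    auto data = new Data;
    data.values ~= 1;   // populated by one thread only
    data.values ~= 2;

    // Cast to shared just to get it through send's checks...
    spawn(&worker).send(cast(shared(Data)*) data);
    // ...and never touch *data from this thread again.
}
```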

Ali



Re: Concurrency message passing

2021-08-17 Thread JG via Digitalmars-d-learn
On Tuesday, 17 August 2021 at 12:24:14 UTC, Steven Schveighoffer 
wrote:

On 8/17/21 7:05 AM, JG wrote:

Hi

I have a program with two threads. One thread produces data 
that is put in a queue
and then consumed by the other thread. I initially built a 
custom queue to do this, but thought this should have some 
standard solution in D? I looked at std.concurrency and 
thought that message passing could be used. However, the 
problem is that I get the following error.


Error: static assert:  "Aliases to mutable thread-local data 
not allowed."


I am not sure how to solve this.  Maybe message passing isn't 
the correct solution?

Is there some standard solution to this in D?


Data with references needs to be marked either immutable or 
shared in order to be passed using std.concurrency. D is strict 
about not sharing thread-local data, because then you can use 
the type system to prove lock-free code is valid.


However, data that has no references (aliases) should be 
passable regardless of mutability, because you are passing a 
copy.


-Steve


Thanks for the suggestions and explanations. I am not sure what 
to do in my case though. The situation is as follows. I have a 
struct that is populated via user input, not necessarily all at 
once (so that seems to rule out immutable). On the other hand, 
while it is being populated it is only accessible from one thread, 
so that makes using shared messy. After being populated it should 
be passed to the other thread and no references are kept.


What I am doing currently is populating the struct and casting to 
shared when I push into a synchronized queue (no references to 
its data are kept in the first thread). Is what I am doing wrong 
and can it be achieved using message passing?
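
Roughly, it looks like this (a simplified sketch, not my actual code; `Data` stands in for the real struct):

```d
import core.sync.mutex : Mutex;
import core.sync.condition : Condition;

struct Data { int[] values; }   // stand-in for the real struct

// Minimal blocking queue. Both threads reach it through a __gshared
// reference, which is part of why this feels messier than send/receive.
class SyncQueue {
    private Mutex mtx;
    private Condition cond;
    private shared(Data)*[] items;

    this() {
        mtx = new Mutex;
        cond = new Condition(mtx);
    }

    // Producer side: called once the struct is fully populated and no
    // other references to it are kept.
    void push(shared(Data)* item) {
        synchronized (mtx) {
            items ~= item;
            cond.notify();
        }
    }

    // Consumer side: blocks until something arrives.
    shared(Data)* pop() {
        synchronized (mtx) {
            while (items.length == 0)
                cond.wait();
            auto item = items[0];
            items = items[1 .. $];
            return item;
        }
    }
}

__gshared SyncQueue queue;   // visible to both threads
```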






Re: Concurrency message passing

2021-08-17 Thread Steven Schveighoffer via Digitalmars-d-learn

On 8/17/21 7:05 AM, JG wrote:

Hi

I have a program with two threads. One thread produces data that is put 
in a queue
and then consumed by the other thread. I initially built a custom queue 
to do this, but thought this should have some standard solution in D? I 
looked at std.concurrency and thought that message passing could be 
used. However, the problem is that I get the following error.


Error: static assert:  "Aliases to mutable thread-local data not allowed."

I am not sure how to solve this.  Maybe message passing isn't the 
correct solution?

Is there some standard solution to this in D?


Data with references needs to be marked either immutable or shared in 
order to be passed using std.concurrency. D is strict about not sharing 
thread-local data, because then you can use the type system to prove 
lock-free code is valid.


However, data that has no references (aliases) should be passable 
regardless of mutability, because you are passing a copy.
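
To make the distinction concrete (a quick sketch with made-up struct names):

```d
import std.concurrency;

struct Point { int x, y; }                    // plain values, no references
struct Record { string name; int[] samples; } // int[] is a mutable alias

void worker() {
    auto p = receiveOnly!Point;                // arrives as a copy
    auto r = receiveOnly!(immutable Record);   // arrives safely shared
    // ... use p and r ...
}

void main() {
    auto tid = spawn(&worker);

    // Fine: Point has no aliases, so a mutable value is simply copied.
    tid.send(Point(1, 2));

    // tid.send(Record("a", [1, 2, 3])) would trip the static assert
    // ("Aliases to mutable thread-local data not allowed"); making the
    // whole value immutable makes it sendable.
    auto rec = cast(immutable Record) Record("a", [1, 2, 3]);
    tid.send(rec);
}
```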


-Steve


Re: Concurrency message passing

2021-08-17 Thread jfondren via Digitalmars-d-learn

On Tuesday, 17 August 2021 at 11:05:09 UTC, JG wrote:

Hi

I have a program with two threads. One thread produces data 
that is put in a queue
and then consumed by the other thread. I initially built a 
custom queue to do this, but thought this should have some 
standard solution in D? I looked at std.concurrency and thought 
that message passing could be used. However, the problem is 
that I get the following error.


Error: static assert:  "Aliases to mutable thread-local data 
not allowed."


I am not sure how to solve this.


The error tells you what you can't do. You can do anything other 
than that. What you *should* do depends on what exactly you're 
trying to do. Here are some examples:


Sending shared mutable data:

```d
import std.concurrency, std.stdio, core.atomic;

void incr() {
    auto counts = receiveOnly!(shared(int)[]);
    foreach (ref n; counts)
        atomicOp!"+="(n, 1); // shared(int) can't just be +='d
    ownerTid.send(true);
}

void main() {
    shared(int)[] counts = [0, 0, 0]; // no issues passing this
    spawn(&incr).send(counts);
    receiveOnly!bool;
    writeln(counts);
}
```

Sending immutable data:

```d
import std.concurrency, std.stdio, std.typecons;

void greeter() {
    auto who = receiveOnly!(string);
    writeln("Hello, ", who);
}

void main() {
    char[] who = "John".dup; // mutable & thread-local, can't be sent
    spawn(&greeter).send(who.idup); // create immutable copy to send
}
```

Sending scalar data:

```d
import std.concurrency, std.stdio;

__gshared int[3] counts;

void incr() {
    auto indexes = receiveOnly!(int, int);
    foreach (ref n; counts[indexes[0] .. indexes[1]])
        n++;
    ownerTid.send(true);
}

void main() {
    spawn(&incr).send(1, 2);
    receiveOnly!bool;
    writeln(counts);
}
```


Concurrency message passing

2021-08-17 Thread JG via Digitalmars-d-learn

Hi

I have a program with two threads. One thread produces data that 
is put in a queue
and then consumed by the other thread. I initially built a custom 
queue to do this, but thought this should have some standard 
solution in D? I looked at std.concurrency and thought that 
message passing could be used. However, the problem is that I get 
the following error.


Error: static assert:  "Aliases to mutable thread-local data not 
allowed."


I am not sure how to solve this.  Maybe message passing isn't the 
correct solution?

Is there some standard solution to this in D?


Re: Grokking concurrency, message passing and Co

2010-07-12 Thread sybrandy

I'd expect databases to have quite odd performance characteristics
compared to a more normal application though; you'd expect them to be IO
bound most of the time, so having more threads than cores sounds like a
reasonable thing to do.

If you aren't waiting on the disc, then more threads aren't going to
help; they'll just be contending for cpu time, as BCS said.

Still, that's one of the drawbacks of multithreading: there really
aren't many absolute statements you can make or things you can assume.

You'll have to profile your application and fiddle with it to find out
how to get the best performance out of it.



Yeah, that was my assumption.  It just sounded like there was more to it.

Thanks.

Casey


Re: Grokking concurrency, message passing and Co

2010-07-12 Thread div0

On 12/07/2010 02:50, sybrandy wrote:

The rule of thumb is don't bother spawning more threads than you
have cpus. You're just wasting resources mostly.


You REALLY don't want more threads trying to run than you have cores.
Threads in a wait state are less of an issue, but they still use up
resources.




Personally I'd never use more threads than cores as a program design,
but I'm a performance whore. ;)



Could you explain why you believe this a bit more? I'm curious because
I've worked with databases and the rule of thumb there was you should
configure them to use twice as many threads as number of cores/CPUs. The
reasoning, I believe, is that if one thread on the CPU is stalled,
another thread can execute in its place. And yes, I believe this was
before hyperthreading.

Though I wasn't informed of the specifics as to why, I'm guessing it's
mainly because with a database, there's a good chance you have to
perform some sort of disk I/O. Therefore instead of having the CPU wait
while the I/O is occurring, another thread can start executing while the
original is waiting.

Casey


I'd expect databases to have quite odd performance characteristics 
compared to a more normal application though; you'd expect them to be IO 
bound most of the time, so having more threads than cores sounds like a 
reasonable thing to do.


If you aren't waiting on the disc, then more threads aren't going to 
help; they'll just be contending for cpu time, as BCS said.


Still, that's one of the drawbacks of multithreading: there really 
aren't many absolute statements you can make or things you can assume.


You'll have to profile your application and fiddle with it to find out 
how to get the best performance out of it.


--
My enormous talent is exceeded only by my outrageous laziness.
http://www.ssTk.co.uk


Re: Grokking concurrency, message passing and Co

2010-07-12 Thread BLS

On 12/07/2010 14:18, Lars T. Kyllingstad wrote:

I'm pretty sure they will be soon, perhaps even in the next release:

http://www.dsource.org/projects/druntime/changeset/321

-Lars


Thanks Lars. Good news! Hope the auto x = whatever(); // thing for 
ddoc is also solved then.


Re: Grokking concurrency, message passing and Co

2010-07-12 Thread Lars T. Kyllingstad
On Mon, 12 Jul 2010 13:56:53 +0200, BLS wrote:

> On 12/07/2010 10:35, Justin Spahr-Summers wrote:
>> On Mon, 12 Jul 2010 00:03:53 +0200, BLS  wrote:
>>>
>>> On 11/07/2010 21:29, Philippe Sigaud wrote:
 I tried this because I was reading an article on Scala's actors,
 where they talk about millions of actors. I guess they are quite
 different.
>>>
>>> Google for fibers or have a look at the dreactor project on dsource.
>>> Tango has fiber support (afaik).
>>>
>>> hth bjoern
>>
>> Phobos2 has a Fiber class in core.thread as well.
> 
> Thanks Justin,
> just wish that the core.* docs were comparable to the std.* ones. (At
> least.) Besides, why is core not documented at all? bjoern


I'm pretty sure they will be soon, perhaps even in the next release:

http://www.dsource.org/projects/druntime/changeset/321

-Lars


Re: Grokking concurrency, message passing and Co

2010-07-12 Thread BLS

On 12/07/2010 10:35, Justin Spahr-Summers wrote:

On Mon, 12 Jul 2010 00:03:53 +0200, BLS  wrote:


On 11/07/2010 21:29, Philippe Sigaud wrote:

I tried this because I was reading an article on Scala's actors, where
they talk about millions of actors. I guess they are quite different.


Google for fibers or have a look at the dreactor project on dsource.
Tango has fiber support (afaik).

hth bjoern


Phobos2 has a Fiber class in core.thread as well.


Thanks Justin,
just wish that the core.* docs were comparable to the std.* ones. (At least.)
Besides, why is core not documented at all?
bjoern


Re: Grokking concurrency, message passing and Co

2010-07-12 Thread Justin Spahr-Summers
On Mon, 12 Jul 2010 00:03:53 +0200, BLS  wrote:
> 
> On 11/07/2010 21:29, Philippe Sigaud wrote:
> > I tried this because I was reading an article on Scala's actors, where
> > they talk about millions of actors. I guess they are quite different.
> 
> Google for fibers or have a look at the dreactor project on dsource.
> Tango has fiber support (afaik).
> 
> hth bjoern

Phobos2 has a Fiber class in core.thread as well.
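
A tiny example of the class, for reference (fibers are cooperatively scheduled, so Fiber.yield hands control back to whoever called them):

```d
import core.thread : Fiber;
import std.stdio;

void main() {
    // A fiber runs its delegate until it yields or returns; calling it
    // again resumes where it left off.
    auto fib = new Fiber({
        writeln("step 1");
        Fiber.yield();
        writeln("step 2");
    });

    fib.call();                 // prints "step 1", then yields back here
    writeln("between steps");
    fib.call();                 // resumes and prints "step 2"
}
```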


Re: Grokking concurrency, message passing and Co

2010-07-11 Thread sybrandy

The rule of thumb is don't bother spawning more threads than you
have cpus. You're just wasting resources mostly.


You REALLY don't want more threads trying to run than you have cores.
Threads in a wait state are less of an issue, but they still use up
resources.




Personally I'd never use more threads than cores as a program design,
but I'm a performance whore. ;)



Could you explain why you believe this a bit more?  I'm curious because 
I've worked with databases and the rule of thumb there was you should 
configure them to use twice as many threads as number of cores/CPUs. 
The reasoning, I believe, is that if one thread on the CPU is stalled, 
another thread can execute in its place.  And yes, I believe this was 
before hyperthreading.


Though I wasn't informed of the specifics as to why, I'm guessing it's 
mainly because with a database, there's a good chance you have to 
perform some sort of disk I/O.  Therefore instead of having the CPU wait 
while the I/O is occurring, another thread can start executing while the 
original is waiting.


Casey


Re: Grokking concurrency, message passing and Co

2010-07-11 Thread BCS

Hello div0,


On 11/07/2010 21:43, BCS wrote:


In what way?


Sometimes it just makes your program design easier to fork a
process / spawn a thread than to try to manage a thread pool and
allocate work to a fixed number of threads. Programmer time is more
expensive than cpu time, and it depends who's paying the bill.



Ah, yes. I agree with that. But then, in most of the cases I can think of 
where that's a good idea, the forked thread spends most of its time waiting 
and thus contributes almost nothing (or does so only very rarely, depending 
on how you look at it) to the number of threads trying to run.



(hmm, haven't you said that before?)


Actually, I've argued almost the reverse. The end user's time is very valuable, 
his CPU's only slightly less so, the programmer's time much less so, and his 
build farm's time is practically free. But then you have to keep in mind 
the 80/20 rule, premature optimization, and all that stuff.


My thinking is along the lines of: if a programmer can, in 10 minutes, save 
every user 1 second a day, then you only need 5 users using the program for 
6 months to recoup his time in full. And if you don't have even that many 
users, go work on something that does!



And in re-reading my post, I noticed it can be read two ways. The
way I intended it is as a "yes, and" statement: while often there is
little to be gained by having more threads than cpus/cores, there is
almost always nothing to be gained by having more threads trying to do
stuff at the same time than you have cpus/cores to run them on.


Personally I'd never use more threads than cores as a program design,
but I'm a performance whore. ;)


--
... <





Re: Grokking concurrency, message passing and Co

2010-07-11 Thread BLS

On 11/07/2010 21:29, Philippe Sigaud wrote:

I tried this because I was reading an article on Scala's actors, where
they talk about millions of actors. I guess they are quite different.


Google for fibers or have a look at the dreactor project on dsource.
Tango has fiber support (afaik).

hth bjoern



Re: Grokking concurrency, message passing and Co

2010-07-11 Thread div0

On 11/07/2010 21:43, BCS wrote:

Hello div0,


On 11/07/2010 20:00, BCS wrote:


Hello div0,


The rule of thumb is don't bother spawning more threads than you
have cpus. You're just wasting resources mostly.


You REALLY don't want more threads trying to run than you have cores.
Threads in a wait state are less of an issue, but they still use up
resources.


I think you're going a bit far there.


In what way?


Sometimes it just makes your program design easier to fork a process 
/ spawn a thread than to try to manage a thread pool and allocate 
work to a fixed number of threads. Programmer time is more expensive 
than cpu time, and it depends who's paying the bill.


(hmm, haven't you said that before?)


And in re-reading my post, I noticed it can be read two ways. The way
I intended it is as a "yes, and" statement: while often there is little
to be gained by having more threads than cpus/cores, there is almost
always nothing to be gained by having more threads trying to do stuff at
the same time than you have cpus/cores to run them on.



Personally I'd never use more threads than cores as a program design, 
but I'm a performance whore. ;)


--
My enormous talent is exceeded only by my outrageous laziness.
http://www.ssTk.co.uk


Re: Grokking concurrency, message passing and Co

2010-07-11 Thread BCS

Hello div0,


On 11/07/2010 20:00, BCS wrote:


Hello div0,


The rule of thumb is don't bother spawning more threads than you
have cpus. You're just wasting resources mostly.


You REALLY don't want more threads trying to run than you have cores.
Threads in a wait state are less of an issue, but they still use up
resources.


I think you're going a bit far there.


In what way?

And in re-reading my post, I noticed it can be read two ways. The way 
I intended it is as a "yes, and" statement: while often there is little to 
be gained by having more threads than cpus/cores, there is almost always 
nothing to be gained by having more threads trying to do stuff at the same 
time than you have cpus/cores to run them on.


--
... <





Re: Grokking concurrency, message passing and Co

2010-07-11 Thread div0

On 11/07/2010 20:29, Philippe Sigaud wrote:


On Sun, Jul 11, 2010 at 20:00, div0 <d...@users.sourceforge.net> wrote:

On 11/07/2010 15:28, Philippe Sigaud wrote:

- Why is a 2 threads version repeatedly thrice as fast as a no
thread version?
I thought it'd be only twice as fast.


Well if you are running on windows, my guess is that your 2nd cpu is
completely free of tasks, so the thread running on that one is more
efficient. I'm pretty sure that the windows task scheduler tries as
much as possible to keep threads running on the last cpu they were
on, to reduce cache flushes and memory paging.

On your first cpu you'll be paying for the task switching amongst
the OS threads & programs. Even if they aren't using much actual cpu
time, there's still a very high cost to perform the task swaps.

Your program is obviously very artificial; with a more normal
program you'll see the ratio drop back down to less than twice as fast.


OK, I get it. Thanks for the explanation!


- It's fun to see the process running under windows: first only 50% CPU (1
thread), then it jumps to ~100%, while the memory is slowly growing. Then
brutally back to 50% CPU (no thread).
- 1024 threads are OK, but I cannot reach 2048. Why? What is the limit for the
number of spawns I can do? Would that be different if each thread spawned two
sub-threads instead of the main one generating 2048?


How many did you expect to run? Under 'doze each thread by default
gets a megabyte of virtual address space for its stack. So at about
1000 threads you'll be using all of your program's 2GB of address
space, and then the thread spawn will fail.


That's it: I can't get much higher than 1000, for 2 GB of RAM.
Then it fails with a core.thread.ThreadException: could not create thread.


There's always an OS specific limit on the number of real threads 
available to a process. It varies a lot but you probably shouldn't count 
on any more than a thousand or so. See the docs for whatever platform 
you want to run on.




I tried this because I was reading an article on Scala's actors, where
they talk about millions of actors. I guess they are quite different.


Yes, a lot of languages use (what I call) fake threads. That is, you have 
a real thread which the language runtime uses to simulate multithreading.


So Scala, Erlang and various other languages do that, and you can 
obviously roll your own implementation if you really need to.


However, when using fake threads it's basically cooperative 
multithreading, and unless the runtime specifically handles it (i.e. Python 
or some other interpreted language), a fake thread could stall all the 
other threads that are running in the same real thread.





Not sure about linux, but a similar limitation must apply.

The rule of thumb is don't bother spawning more threads than you
have cpus. You're just wasting resources mostly.


OK. So it means for joe average not much more than 2-8 threads?


Well, query for the number of cpus at runtime and then spawn that many 
threads. Even today, you can buy a PC with anywhere from 1 to 8 cores 
(and therefore 1 to 16 real threads running at the same time) so making 
a hard coded limit seems like a poor idea.


I'd have a configuration file that specifies the number of threads so people 
can tune the performance for their specific machine and requirements. 
Even if you've got many cores, your end user might not actually want your 
application to hammer them all.
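
Querying the hardware at run time is cheap, e.g. (a sketch; coresPerCPU/threadsPerCPU come from core.cpuid, and newer Phobos also exposes totalCPUs in std.parallelism):

```d
import core.cpuid : coresPerCPU, threadsPerCPU;
import std.concurrency;
import std.stdio;

void worker(int id) {
    // ... do one slice of the work ...
}

void main() {
    // Ask the machine instead of hard-coding a thread count.
    auto n = threadsPerCPU();   // hardware threads on this CPU
    writeln("spawning ", n, " workers (", coresPerCPU(), " cores)");
    foreach (i; 0 .. n)
        spawn(&worker, cast(int) i);
}
```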


--
My enormous talent is exceeded only by my outrageous laziness.
http://www.ssTk.co.uk


Re: Grokking concurrency, message passing and Co

2010-07-11 Thread div0

On 11/07/2010 20:00, BCS wrote:

Hello div0,


The rule of thumb is don't bother spawning more threads than you have
cpus. You're just wasting resources mostly.



You REALLY don't want more threads trying to run than you have cores.
Threads in a wait state are less of an issue, but they still use up
resources.



I think you're going a bit far there.

How many h/w threads does a core support?
1, 2, 4 or even 8 today (on POWER7 machines).

And tomorrow?

You can only use basic rules of thumb with multithreading unless you 
are going to spend time and effort profiling your application on a 
specific machine or do some kind of run-time profiling & configuration.


--
My enormous talent is exceeded only by my outrageous laziness.
http://www.ssTk.co.uk


Re: Grokking concurrency, message passing and Co

2010-07-11 Thread Philippe Sigaud
On Sun, Jul 11, 2010 at 20:00, div0  wrote:

> On 11/07/2010 15:28, Philippe Sigaud wrote:
>
>  - Why is a 2 threads version repeatedly thrice as fast as a no thread
>> version?
>> I thought it'd be only twice as fast.
>>
>
> Well if you are running on windows, my guess is that your 2nd cpu is
> completely free of tasks, so the thread running on that one is more
> efficient. I'm pretty sure that the windows task scheduler tries as much as
> possible to keep threads running on the last cpu they were on, to reduce
> cache flushes and memory paging.
>
> On your first cpu you'll be paying for the task switching amongst the OS
> threads & programs. Even if they aren't using much actual cpu time, there's
> still a very high cost to perform the task swaps.
>
> Your program is obviously very artificial; with a more normal program
> you'll see the ratio drop back down to less than twice as fast.
>
>
OK, I get it. Thanks for the explanation!


>
>  - It's fun to see the process running under windows: first only 50% CPU (1
>> thread), then it jumps to ~100%, while the memory is slowly growing. Then
>> brutally back to 50% CPU (no thread).
>> - 1024 threads are OK, but I cannot reach 2048. Why? What is the limit for the
>> number of spawns I can do? Would that be different if each thread spawned two
>> sub-threads instead of the main one generating 2048?
>>
>
> How many did you expect to run? Under 'doze each thread by default gets a
> megabyte of virtual address space for its stack. So at about 1000 threads
> you'll be using all of your program's 2GB of address space, and then the thread spawn
> will fail.
>

That's it: I can't get much higher than 1000, for 2 GB of RAM.
Then it fails with a core.thread.ThreadException: could not create thread.

I tried this because I was reading an article on Scala's actors, where they
talk about millions of actors. I guess they are quite different.



>
> Not sure about linux, but a similar limitation must apply.
>
> The rule of thumb is don't bother spawning more threads than you have cpus.
> You're just wasting resources mostly.
>

OK. So it means for joe average not much more than 2-8 threads?

Thanks!

Philippe


Re: Grokking concurrency, message passing and Co

2010-07-11 Thread BCS

Hello div0,


The rule of thumb is don't bother spawning more threads than you have
cpus. You're just wasting resources mostly.



You REALLY don't want more threads trying to run than you have cores. Threads 
in a wait state are less of an issue, but they still use up resources.


--
... <





Re: Grokking concurrency, message passing and Co

2010-07-11 Thread div0

On 11/07/2010 15:28, Philippe Sigaud wrote:


- Why is a 2 threads version repeatedly thrice as fast as a no thread version?
I thought it'd be only twice as fast.


Well if you are running on windows, my guess is that your 2nd cpu is 
completely free of tasks, so the thread running on that one is more 
efficient. I'm pretty sure that the windows task scheduler tries as much 
as possible to keep threads running on the last cpu they were on, to 
reduce cache flushes and memory paging.


On your first cpu you'll be paying for the task switching amongst the OS 
threads & programs. Even if they aren't using much actual cpu time, 
there's still a very high cost to perform the task swaps.


Your program is obviously very artificial; with a more normal program 
you'll see the ratio drop back down to less than twice as fast.



- It's fun to see the process running under windows: first only 50% CPU (1
thread), then it jumps to ~100%, while the memory is slowly growing. Then
brutally back to 50% CPU (no thread).
- 1024 threads are OK, but I cannot reach 2048. Why? What is the limit for the
number of spawns I can do? Would that be different if each thread spawned two
sub-threads instead of the main one generating 2048?


How many did you expect to run? Under 'doze each thread by default gets 
a megabyte of virtual address space for its stack. So at about 1000 
threads you'll be using all of your program's 2GB of address space, and then the 
thread spawn will fail.
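
If the default stack size is what's biting, core.thread lets you ask for a smaller one per thread (a sketch; std.concurrency's spawn doesn't take a stack size, so this drops down to Thread directly):

```d
import core.thread;

void work() {
    // ... a small task that doesn't need much stack ...
}

void main() {
    Thread[] threads;
    foreach (i; 0 .. 2048) {
        // Request a 64 KiB stack instead of the ~1 MiB default, so far
        // more threads fit into a 32-bit address space.
        auto t = new Thread(&work, 64 * 1024);
        t.start();
        threads ~= t;
    }
    foreach (t; threads)
        t.join();
}
```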


Not sure about linux, but a similar limitation must apply.

The rule of thumb is don't bother spawning more threads than you have 
cpus. You're just wasting resources mostly.


Though of course if it makes the program easier to write and maintain 
you may well decide to hell with the wasted resources.


--
My enormous talent is exceeded only by my outrageous laziness.
http://www.ssTk.co.uk


Re: Grokking concurrency, message passing and Co

2010-07-11 Thread Simen kjaeraas

Philippe Sigaud  wrote:

- Why is a 2 threads version repeatedly thrice as fast as a no thread  
version?

I thought it'd be only twice as fast.


No idea.


- 1024 threads are OK, but I cannot reach 2048. Why? What is the limit for the
number of spawns I can do? Would that be different if each thread spawned two
sub-threads instead of the main one generating 2048?


How many can you do, and what happens when you can't make more?


- What is the official way to verify that all threads are done? As you  
can

see, I used a while loop, but I'm reaching in the dark, there.


Tids pass a message of type MsgType.LinkDead when they close.
I'm not entirely sure how to check this, as I have no experience with
std.concurrency.

There is also thread_joinAll in core.thread, which "Joins all non-daemon
threads that are currently running."
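
The thread_joinAll route looks like this (a sketch):

```d
import core.thread : thread_joinAll;
import std.concurrency;
import std.stdio;

void worker(int id) {
    writeln("worker ", id, " done");
}

void main() {
    foreach (i; 0 .. 4)
        spawn(&worker, i);

    // Blocks until every non-daemon thread, including the spawned
    // workers, has finished.
    thread_joinAll();
    writeln("all workers finished");
}
```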


- I have to read TDPL again: in thread, is there any way to get the Tid  
of the

thread that spawned you apart from waiting for it to send its Tid?


No. However, there is a private TLS variable in std.concurrency that
refers to the Tid of the owner thread.

Also, you could consider sending it as part of the function parameters.
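
The parameter route is straightforward, since a Tid can be passed to spawn like any other message value (a sketch; note that newer versions of std.concurrency also expose the owner publicly as ownerTid):

```d
import std.concurrency;
import std.stdio;

void worker(Tid owner, int lo, int hi) {
    ulong sum;
    foreach (i; lo .. hi)
        sum += i;
    owner.send(sum);   // no need to wait for the owner to send its Tid
}

void main() {
    spawn(&worker, thisTid, 0, 1000);
    writeln("partial sum: ", receiveOnly!ulong);
}
```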



- is there any way to broadcast a message to a list of threads?


Nope. Well, of course, one could make a function that does it.
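
Such a function is only a few lines (a sketch of a hypothetical helper):

```d
import std.concurrency;

// Send the same message to every Tid in the list; each listener gets
// its own copy of the values.
void broadcast(T...)(Tid[] listeners, T message) {
    foreach (tid; listeners)
        tid.send(message);
}

// Usage:
//   Tid[] workers = [spawn(&fn1), spawn(&fn2)];
//   broadcast(workers, "stop");
```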

--
Simen


Grokking concurrency, message passing and Co

2010-07-11 Thread Philippe Sigaud
Hi,

so, after reading TDPL, I decided to test my comprehension of message-passing
and such. Please note that before this day, I never wrote any concurrent code,
ever.

I wanted to begin with a simple problem: summing all numbers in a range. So I
used the following code:

import std.concurrency, std.stdio, std.perf;

void main() {
    immutable low = 0;
    immutable high = 1_000_000_000;

    auto c = new PerformanceCounter();

    foreach(NN; 0..10) { // 1 to 512 threads
        immutable N = 2^^NN;
        int C = 0;
        immutable step = (high-low)/N;
        writeln("# of threads: ", N);

        c.start;
        foreach(n; 0..N)
        {
            auto tid = spawn(&foo, low+n*step, low+(n+1)*step);
            tid.send(thisTid);
        }

        ulong sum;
        while(C < N) {
            auto end = receiveOnly!(bool, ulong)();
            if (end._0 == true) {
                C++;
                sum += end._1;
            }
        }
        c.stop;
        writefln("%s ms. Sum = %s", c.milliseconds, sum);
    }

    writeln("No thread: ");
    c.start;
    ulong count;
    foreach(i; low..high) count += i;
    c.stop;
    writefln("%s ms. Sum = %s", c.milliseconds, count);

    //writeln(Minus!((S!Z), S!(S!Z)).stringof);
    //writeln(Times!(S!(S!Z), S!(S!Z)).stringof);
}

void foo(int low, int high)
{
    ulong sum;
    auto msg = receiveOnly!(Tid)();
    foreach(i; low..high) sum += i;
    msg.send(true, sum);
}

It cuts the range into N parts and generates N parallel threads that will each sum
1/Nth of the range and return the result. The main thread sums it all.

Just for fun, I had to do that for N being 1, 2, 4, 8, ..., 512 threads.

The results:

The no-thread version takes about 6 to 9 s to complete on my machine.
The 2- to 512-thread versions take 1.8 s repeatedly. I can understand that, as this
machine has two cores.

So:

- It works! Woo hoo! Look Ma, I'm using message-passing!
- Maybe what I'm doing is more parallelism than concurrency. But I admit being
more interested in the former than the latter.
- Why is a 2 threads version repeatedly thrice as fast as a no thread version?
I thought it'd be only twice as fast.
- It's fun to see the process running under windows: first only 50% CPU (1
thread), then it jumps to ~100%, while the memory is slowly growing. Then
brutally back to 50% CPU (no thread).
- 1024 threads are OK, but I cannot reach 2048. Why? What is the limit for the
number of spawns I can do? Would that be different if each thread spawned two
sub-threads instead of the main one generating 2048?

- What is the official way to verify that all threads are done? As you can
see, I used a while loop, but I'm reaching in the dark, there.
- I have to read TDPL again: in a thread, is there any way to get the Tid of the
thread that spawned you apart from waiting for it to send its Tid?
- is there any way to broadcast a message to a list of threads?

Philippe