Thanks for the advice. You mentioned in the past that the documentation was inadequate, but until now you didn't give specifics about how. As the author of the library, things seem obvious to me that aren't obvious to anyone else, so I don't feel I'm in a good position to judge the quality of the documentation or where it needs improvement. I plan to fix most of the issues you raised, but below I've left comments on the few that I can't/won't fix or that I believe are based on misunderstandings.

On 3/18/2011 11:29 PM, Andrei Alexandrescu wrote:
1. Library proper:

* "In the case of non-random access ranges, parallel foreach is still
usable but buffers lazily to an array..." Wouldn't strided processing
help? If e.g. 4 threads the first works on 0, 4, 8, ... second works on
1, 5, 9, ... and so on.

You can have this if you want, by setting the work unit size to 1. Setting it to a larger size just causes more elements to be buffered, which may be more efficient in some cases.
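The relationship between work unit size and element assignment can be sketched like this (a hand-rolled Python illustration of the general technique, not std.parallelism's actual scheduler):

```python
def work_units(num_elements, work_unit_size):
    """Split indices 0..num_elements-1 into contiguous work units.

    Idle workers claim whole units off a shared queue, so with a work
    unit size of 1 each worker grabs one element at a time -- roughly
    the strided interleaving described above, but dynamically balanced.
    """
    return [list(range(start, min(start + work_unit_size, num_elements)))
            for start in range(0, num_elements, work_unit_size)]

# Work unit size 1: one element per unit, maximum interleaving.
print(work_units(8, 1))  # [[0], [1], [2], [3], [4], [5], [6], [7]]

# Larger units amortize queue synchronization over more elements.
print(work_units(8, 3))  # [[0, 1, 2], [3, 4, 5], [6, 7]]
```

The trade-off is exactly the one described above: smaller units give finer-grained load balancing, larger units reduce per-element synchronization overhead.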


* I'm unclear on the tactics used by lazyMap. I'm thinking the obvious
method should be better: just use one circular buffer. The presence of
two dependent parameters makes this abstraction difficult to operate with.

* Same question about asyncBuf. What is wrong with a circular buffer
filled on one side by threads and consumed from the other by the
client? I can think of a couple of answers but it would be great if they
were part of the documentation.

Are you really suggesting I give detailed rationales for implementation decisions in the documentation? Anyhow, there are two reasons for this choice. First, it avoids needing synchronization/atomic ops/etc. on every write to the buffer, which a circular buffer would require since it can be read and written concurrently and we'd need to track whether there is space left to write. Second, parallel map works best when it operates on relatively large buffers, which minimizes synchronization overhead per element. (Under the hood, the buffer is filled and then eager parallel map is called.)
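The fill-then-hand-off idea can be sketched with a toy double-buffered pipeline (a hypothetical Python stand-in for illustration, not the library's implementation): the producer fills a whole buffer, maps it eagerly, and hands the entire batch to the consumer with a single synchronization, instead of synchronizing on every element as a shared circular buffer would.

```python
import threading
from queue import Queue

def buffered_lazy_map(fn, source, buf_size):
    """Toy double-buffered lazy map: fill a buffer, map it eagerly,
    then hand the whole batch over with one synchronization point."""
    done = object()           # sentinel marking the end of the stream
    q = Queue(maxsize=2)      # at most two buffers in flight at once

    def producer():
        buf = []
        for item in source:
            buf.append(item)
            if len(buf) == buf_size:
                # One queue operation per buffer, not per element.
                q.put([fn(x) for x in buf])
                buf = []
        if buf:
            q.put([fn(x) for x in buf])
        q.put(done)

    threading.Thread(target=producer, daemon=True).start()
    while (buf := q.get()) is not done:
        yield from buf        # consumer drains the batch lock-free

result = list(buffered_lazy_map(lambda x: x * x, range(10), buf_size=4))
print(result)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

With buffers of 4 elements, 10 elements cost 3 queue synchronizations instead of 10; a shared circular buffer would pay that cost on every element.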

* Why not make workerIndex a ulong and be done with it?

I doubt anyone's really going to create anywhere near 4 billion TaskPool threads over the lifetime of a program. Part of the point of TaskPool is recycling threads rather than paying the overhead of creating and destroying them. Using a ulong on a 32-bit architecture would make worker-local storage substantially slower. workerIndex is how worker-local storage works under the hood, so it needs to be fast.
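The reason the index must be small and fast is that worker-local storage boils down to indexing an array with it: each worker gets a small integer at creation and only ever touches its own slot, so no locking is needed on access. A rough Python sketch of that technique (all names here are hypothetical stand-ins, not the TaskPool API):

```python
import threading

class WorkerPool:
    """Toy pool where each worker thread is assigned a small integer
    index at creation (a stand-in for the workerIndex idea)."""

    def __init__(self, num_workers):
        self.num_workers = num_workers
        self._local = threading.local()  # holds each thread's index

    def run(self, fn):
        threads = [threading.Thread(target=self._worker, args=(i, fn))
                   for i in range(self.num_workers)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

    def _worker(self, index, fn):
        self._local.index = index  # fixed for the thread's lifetime
        fn()

    def worker_local(self, initial):
        # Worker-local storage is just an array indexed by worker index.
        # Each worker reads/writes only its own slot, so no locks needed.
        storage = [initial] * self.num_workers
        pool = self

        class Slot:
            def get(self):
                return storage[pool._local.index]
            def set(self, value):
                storage[pool._local.index] = value
            def all(self):
                return storage
        return Slot()

pool = WorkerPool(4)
counters = pool.worker_local(0)
pool.run(lambda: counters.set(counters.get() + 1))
print(counters.all())  # [1, 1, 1, 1]
```

Since every worker-local access performs that array indexing, widening the index to 64 bits on a 32-bit architecture would slow down the hot path for no practical gain.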

* No example for workerIndex and why it's useful.

It should just be private. The fact that it's public is an artifact of when I was designing worker-local storage and didn't know how it was going to work yet. I never thought to revisit this until now. It really isn't useful to client code.

* Is stop() really trusted or just unsafe? If it's forcibly killing
threads then it's unsafe.

It's not forcibly killing threads. As the documentation states, it has no effect on jobs already executing, only ones in the queue. Furthermore, it's needed unless makeDaemon is called. Speaking of which, makeDaemon and makeAngel should probably be trusted, too.

* defaultPoolThreads - should it be a @property?

Yes. In spirit it's a global variable, but it requires some extra machinations to be thread-safe, which is why it's not implemented as a simple global variable.

* No example for task().

???? Yes there is: both of the other flavors have examples, though these could admittedly be improved. Only the safe version doesn't have one, and since it's just a more restricted version of the function pointer case, it seems silly to write a separate example for it.

* What is 'run' in the definition of safe task()?

It's just the run() adapter function. Isn't that obvious?
