Re: Yet another parallel foreach + continue question

2021-07-21 Thread Steven Schveighoffer via Digitalmars-d-learn

On 7/19/21 10:58 PM, H. S. Teoh wrote:

> I didn't check the implementation to verify this, but I'm pretty sure
> `break`, `continue`, etc., in the parallel foreach body does not change
> which iteration gets run or not.


`break` should be undefined behavior (it is impossible to know which
iterations have already executed by that point). `continue` should be fine.


Noted in the [docs](https://dlang.org/phobos/std_parallelism.html#.TaskPool.parallel):

> Breaking from a parallel foreach loop via a break, labeled break,
> labeled continue, return or goto statement throws a ParallelForeachError.


I would say `continue` is ok (probably just implemented as an early 
return), but all those others are going to throw an error (unrecoverable).
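
For anyone who wants to check the `continue` part empirically, here is a minimal sketch. The helper name `countProcessed` and the "skip every multiple of 3" rule are made up for illustration; only `parallel`, `iota`, `atomicOp` and `atomicLoad` are real library calls. It counts how many iterations got past the `continue`:

```d
import core.atomic : atomicLoad, atomicOp;
import std.parallelism : parallel;
import std.range : iota;

// Hypothetical helper: runs n iterations in parallel and counts the ones
// that were not skipped by `continue`.
size_t countProcessed(size_t n, size_t workUnitSize)
{
    shared size_t processed;
    foreach (j; parallel(iota(n), workUnitSize))
    {
        if (j % 3 == 0)
            continue; // ends only this iteration

        atomicOp!"+="(processed, 1); // the counter is shared between threads
    }
    return atomicLoad(processed);
}

void main()
{
    import std.stdio : writeln;

    // 20 indices, 7 of them multiples of 3, so 13 should be processed.
    writeln(countProcessed(20, 11)); // prints 13
}
```

Swapping the `continue` for a `break` is the case the docs quote above warns about.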


-Steve


Re: Yet another parallel foreach + continue question

2021-07-20 Thread Ali Çehreli via Digitalmars-d-learn

On 7/19/21 5:07 PM, seany wrote:

> Consider :
>
> for (int i = 0; i < max_value_of_i; i++) {
>     foreach (j, dummyVar; myTaskPool.parallel(array_to_get_j_from,
>                                               my_workunitSize)) {
>         if (boolean_function(i, j)) continue;
>         double d = expensiveFunction(i, j);
>         // ... stuff ...
>     }
> }

Arranging the code to its equivalent may reveal the answer:

if (!boolean_function(i, j)) {
  double d = expensiveFunction(i, j);
  // ... stuff ...
}

We removed 'continue' and nothing changed and your question disappeared. :)

> I understand, that the parallel iterator will pick lazily values of `j`
> (up to `my_workunitsize`), and execute the for loop for those values in
> its own thread.

Yes.

> Say values of `j` from `10` to `20` are picked, where `my_workunitSize` =
> 11. Say, at `j = 13` the `boolean_function` returns true.
>
> Will the loop then just jump to the next value, `j = 14`, like a
> normal for loop?

Yes.

> I am having a bit of difficulty to understand this.
> Thank you.

parallel is only for performance gain. The 2 knobs that it provides are 
also for performance reasons:


1) "Use just this many cores, not all"

2) "Process this many elements, not 100 (the default)" because otherwise 
context switches are too expensive


Other than that, it shouldn't be any different from running the loop 
regularly.
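
A minimal sketch of those two knobs, assuming a private pool rather than the global `taskPool`. The function name `sumOfSquares` and the numbers are made up for illustration; `TaskPool`, its `parallel` method and `finish` are the real std.parallelism API:

```d
import core.atomic : atomicLoad, atomicOp;
import std.parallelism : TaskPool;

// Hypothetical helper: sums the squares of data on a private pool.
long sumOfSquares(int[] data, size_t workers, size_t workUnitSize)
{
    // Knob 1: how many worker threads this pool gets.
    auto pool = new TaskPool(workers);
    scope (exit) pool.finish(); // let the workers terminate when we are done

    shared long total;
    // Knob 2: how many elements a thread takes per work unit.
    foreach (x; pool.parallel(data, workUnitSize))
        atomicOp!"+="(total, cast(long) x * x);

    return atomicLoad(total);
}

void main()
{
    import std.stdio : writeln;

    writeln(sumOfSquares([1, 2, 3, 4], 2, 2)); // prints 30
}
```

Changing `workers` or `workUnitSize` should only change how the work is distributed, not the result.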


Ali



Re: Yet another parallel foreach + continue question

2021-07-19 Thread seany via Digitalmars-d-learn

On Tuesday, 20 July 2021 at 02:58:50 UTC, H. S. Teoh wrote:
> On Tue, Jul 20, 2021 at 02:39:58AM +, seany via Digitalmars-d-learn wrote:
> > [...]
>
> [...]
>
> [...]
>
> Logically speaking, the size of the work unit should not change
> the semantics of the loop. That's just an implementation detail
> that should not affect the semantics of the overall
> computation.  In order to maintain consistency, loop iterations
> should not affect each other (unless they deliberately do so,
> e.g., read/write from a shared variable -- but parallel foreach
> itself should not introduce such a dependency).
>
> [...]



Okay, thank you.

If you later have some time to look into the exact implementation
and can help me understand it, I would be most grateful.

I have checked [this link](https://github.com/dlang/phobos/blob/master/std/parallelism.d),
but did not understand it completely.


Re: Yet another parallel foreach + continue question

2021-07-19 Thread H. S. Teoh via Digitalmars-d-learn
On Tue, Jul 20, 2021 at 02:39:58AM +, seany via Digitalmars-d-learn wrote:
> On Tuesday, 20 July 2021 at 02:31:14 UTC, H. S. Teoh wrote:
> > On Tue, Jul 20, 2021 at 01:07:22AM +, seany via Digitalmars-d-learn
> > wrote:
> > > On Tuesday, 20 July 2021 at 00:37:56 UTC, H. S. Teoh wrote:
> > > > [...]
> > > 
> > > Ok, therefore it means that, if at `j = 13` I use a continue, then
> > > the thread where I had `10` ... `20` as values of `j` will only
> > > execute for `j = 10, 11, 12` and will not reach `14` or later?
> > 
> > No, it will.
> > 
> > Since each iteration is running in parallel, the fact that one of
> > them terminated early should not affect the others.
[...]
> Even though the work unit assigned 11 values to a single thread?

Logically speaking, the size of the work unit should not change the
semantics of the loop. That's just an implementation detail that should
not affect the semantics of the overall computation.  In order to
maintain consistency, loop iterations should not affect each other
(unless they deliberately do so, e.g., read/write from a shared variable
-- but parallel foreach itself should not introduce such a dependency).

I didn't check the implementation to verify this, but I'm pretty sure
`break`, `continue`, etc., in the parallel foreach body does not change
which iteration gets run or not.

Think of it this way: when you use a parallel foreach, what you're
essentially asking for is that, logically speaking, *all* loop
iterations start in parallel (even though in actual implementation that
doesn't actually happen unless you have as many CPUs as you have
iterations). Meaning that by the time a thread gets to the `continue` in
a particular iteration, *all* of the other iterations may already have
started executing.  So it doesn't make sense for any of them to get
interrupted just because this particular iteration executes a
`continue`.  Doing otherwise would introduce all sorts of weird
inconsistent semantics that are hard (if not impossible) to reason
about.

While I'm not 100% sure this is what the current parallel foreach
implementation actually does, I'm pretty sure that's the case. It
doesn't make sense to do it any other way.
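
For what it's worth, here is a minimal sketch of how one could test this directly, using the numbers from the question (work unit size 11, `continue` at `j = 13`). The function name `whichIterationsRan` is made up for illustration. Each iteration writes only to its own array element, so no synchronization is needed:

```d
import std.parallelism : parallel;

// Hypothetical helper: records which iterations got past the `continue`.
bool[] whichIterationsRan(size_t n, size_t workUnitSize, size_t skipped)
{
    auto ran = new bool[n];
    foreach (j, ref slot; parallel(ran, workUnitSize))
    {
        if (j == skipped)
            continue; // only iteration `skipped` ends early

        slot = true; // each iteration touches only its own element
    }
    return ran;
}

void main()
{
    import std.stdio : writeln;

    auto ran = whichIterationsRan(21, 11, 13);
    assert(!ran[13]);            // the skipped iteration did no work
    assert(ran[12] && ran[14]);  // its neighbours in the same work unit did
    writeln(ran[10 .. 21]);
}
```

If the asserts pass, `j = 14` and later were reached even though they were in the same work unit as `j = 13`.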


T

-- 
Ph.D. = Permanent head Damage


Re: Yet another parallel foreach + continue question

2021-07-19 Thread seany via Digitalmars-d-learn

On Tuesday, 20 July 2021 at 02:31:14 UTC, H. S. Teoh wrote:
> On Tue, Jul 20, 2021 at 01:07:22AM +, seany via Digitalmars-d-learn wrote:
> > On Tuesday, 20 July 2021 at 00:37:56 UTC, H. S. Teoh wrote:
> > > [...]
> >
> > Ok, therefore it means that, if at `j = 13` I use a continue,
> > then the thread where I had `10` ... `20` as values of `j`
> > will only execute for `j = 10, 11, 12` and will not reach
> > `14` or later?
>
> No, it will.
>
> Since each iteration is running in parallel, the fact that one
> of them terminated early should not affect the others.
>
> T

Even though the work unit assigned 11 values to a single thread?


Re: Yet another parallel foreach + continue question

2021-07-19 Thread H. S. Teoh via Digitalmars-d-learn
On Tue, Jul 20, 2021 at 01:07:22AM +, seany via Digitalmars-d-learn wrote:
> On Tuesday, 20 July 2021 at 00:37:56 UTC, H. S. Teoh wrote:
> > On Tue, Jul 20, 2021 at 12:07:10AM +, seany via Digitalmars-d-learn
> > wrote:
> > > [...]
> > [...]
> > 
> > I didn't test this, but I'm pretty sure `continue` inside a parallel
> > foreach loop simply terminates that iteration early; I don't think
> > it will skip to the next iteration.
> > 
> > [...]
> 
> Ok, therefore it means that, if at `j = 13` I use a continue, then the
> thread where I had `10` ... `20` as values of `j` will only execute
> for `j = 10, 11, 12` and will not reach `14` or later?

No, it will.

Since each iteration is running in parallel, the fact that one of them
terminated early should not affect the others.


T

-- 
Skill without imagination is craftsmanship and gives us many useful objects 
such as wickerwork picnic baskets.  Imagination without skill gives us modern 
art. -- Tom Stoppard


Re: Yet another parallel foreach + continue question

2021-07-19 Thread seany via Digitalmars-d-learn

On Tuesday, 20 July 2021 at 00:37:56 UTC, H. S. Teoh wrote:
> On Tue, Jul 20, 2021 at 12:07:10AM +, seany via Digitalmars-d-learn wrote:
> > [...]
>
> [...]
>
> I didn't test this, but I'm pretty sure `continue` inside a
> parallel foreach loop simply terminates that iteration early; I
> don't think it will skip to the next iteration.
>
> [...]

Ok, therefore it means that, if at `j = 13` I use a continue,
then the thread where I had `10` ... `20` as values of `j` will
only execute for `j = 10, 11, 12` and will not reach `14` or
later?




Re: Yet another parallel foreach + continue question

2021-07-19 Thread H. S. Teoh via Digitalmars-d-learn
On Tue, Jul 20, 2021 at 12:07:10AM +, seany via Digitalmars-d-learn wrote:
> Consider :
> 
> for (int i = 0; i < max_value_of_i; i++) {
>     foreach (j, dummyVar; myTaskPool.parallel(array_to_get_j_from,
>                                               my_workunitSize)) {
>         if (boolean_function(i, j)) continue;
>         double d = expensiveFunction(i, j);
>         // ... stuff ...
>     }
> }
> 
> I understand, that the parallel iterator will pick lazily values of
> `j` (up to `my_workunitsize`), and execute the for loop for those
> values in its own thread.
> 
> Say values of `j` from `10` to `20` are picked, where `my_workunitSize`
> = 11.  Say, at `j = 13` the `boolean_function` returns true.
> 
> Will the loop then just jump to the next value, `j = 14`, like a
> normal for loop? I am having a bit of difficulty understanding this.
[...]

I didn't test this, but I'm pretty sure `continue` inside a parallel
foreach loop simply terminates that iteration early; I don't think it
will skip to the next iteration.

Basically, what .parallel does under the hood is to create N jobs (where
N is the number of items to iterate over), representing N instances of
the loop body, and assign them to M worker threads to execute. Then it
waits until all N jobs have been completed before it returns.  Which
order the worker threads will pick up the loop body instances is not
specified, and generally is not predictable from user code.

The loop body in this case is translated into a delegate that gets
passed to the task pool's .opApply method; each worker thread that picks
up a job simply invokes the delegate with the right value of the loop
variable. A `continue` translates to returning a specific magic value
from the delegate that tells .opApply that the loop body finished early.
AFAIK, the task pool does not act on this return value, i.e., the other
instances of the loop body will execute regardless.
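
To make the delegate translation concrete, here is a minimal single-threaded sketch. `Recorder` and `run` are made up for illustration and are not how std.parallelism is written; they only show the general `opApply` protocol, in which the loop body becomes a delegate returning `int`. As far as the language rules go, `continue` makes the delegate return 0 (the same as finishing the body normally), while `break` makes it return a nonzero value:

```d
// Hypothetical type, for illustration only: calls the loop body for
// 0 .. count and logs what each call returned.
struct Recorder
{
    int count;
    int[]* log;

    int opApply(scope int delegate(int) dg)
    {
        foreach (i; 0 .. count)
        {
            immutable ret = dg(i); // run one instance of the loop body
            *log ~= ret;
            if (ret != 0)
                return ret; // nonzero: the body asked to leave the loop
        }
        return 0;
    }
}

// Runs a 4-iteration loop that either continues or breaks at i == 1.
int[] run(bool useBreak)
{
    int[] log;
    auto rec = Recorder(4, &log);
    foreach (i; rec)
    {
        if (i == 1)
        {
            if (useBreak)
                break;
            continue;
        }
    }
    return log;
}

void main()
{
    import std.stdio : writeln;

    writeln(run(false)); // continue: four calls, each returned 0
    writeln(run(true));  // break: stops after the second call
}
```

A sequential `opApply` like this one stops on a nonzero return. A parallel one cannot meaningfully do that, because other instances may already be running, which fits the documented behaviour of throwing on `break`.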


T

-- 
Time flies like an arrow. Fruit flies like a banana.