Hi!
I'm new to Julia and I find it wonderful (especially for maths
programming), but I have difficulties writing concurrent programs in it -
it seems that I don't understand some basic concepts here.
I usually work with languages which support thread-based concurrency and
allow spawning a number of threads/tasks sharing the same memory space
(e.g. Scala, Rust, etc.). However, now I need to write a few math programs
in Julia. The algorithms are quite CPU-heavy but can easily be parallelized
(most of them boil down to for loops with independent iterations), and
Julia does support writing parallel programs. However, I just
don't understand *how* to write them properly.
Here is an outline of one of the programs (not in any particular language):
    (1) coeffs <- compute approximation of function
    (2) f(s) = a function which uses coeffs
    (3) data <- compute some data using f
    (4) plot data
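
For concreteness, the outline above could look roughly like this as a purely sequential Julia sketch. The bodies of `approx`, `f` and `compute_data` are made-up stand-ins, not the real code from the gist, and `coeffs` is passed to `f` explicitly only to keep the sketch self-contained:

```julia
# Hypothetical stand-ins; the real computations are different.

# (1) Compute n coefficients; every iteration is independent of the others.
function approx(n)
    coeffs = zeros(n)
    for k in 1:n
        coeffs[k] = 1 / factorial(k - 1)   # placeholder values (Taylor coefficients of exp)
    end
    return coeffs
end

# (2) A function that uses the coefficients: evaluates the polynomial at s.
f(coeffs, s) = sum(coeffs[k] * s^(k - 1) for k in eachindex(coeffs))

# (3) Fill a two-dimensional array; again every entry is independent.
function compute_data(coeffs, xs, ys)
    data = zeros(length(xs), length(ys))
    for j in eachindex(ys), i in eachindex(xs)
        data[i, j] = f(coeffs, xs[i] + ys[j])
    end
    return data
end

coeffs = approx(10)
xs = 0:0.25:1
ys = 0:0.25:1
data = compute_data(coeffs, xs, ys)
# (4) plotting of data would go here
```

The two `for` loops in `approx` and `compute_data` are the parts I would like to run in parallel.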
My current implementation (partially working) can be found here:
<https://gist.github.com/netvl/8cfc0fb8dc487b0f536c>.
(1), (2), (3) and (4) must obviously be executed in order because each step
depends on the results of the previous step. However, the algorithms in (1) and
(3), which are the most time-consuming ones, can easily be parallelized
(they are two big for loops filling a one-dimensional and a two-dimensional
array, respectively).
In a "conventional" language I would just spawn a bunch of threads or start a
thread pool that computes these arrays in parallel, and then just use
their data directly in the main thread. This is the most straightforward
and easiest way with regular thread-based concurrency. The thread pool is
only used locally in steps (1) and (3), and spawned threads are not
concerned with the rest of the program *at all*.
Julia, however, does not have a thread abstraction. Instead it provides
workers, which are separate processes, and it also provides means to
transfer data between processes. Unfortunately, I couldn't find proper
documentation on how to use these tools in nontrivial situations (the official
Julia documentation is just not sufficient), and this is a big problem -
the "conventional" approach just doesn't work here. The biggest problem, as far
as I can see, is that these workers are not "local" - they must "know"
about other parts of the program to work correctly, even if they are used
only for several specific computations "locally".
First of all, `f` must be available on all workers in order for (3) to work
at all. The suggested approach is to use the `@everywhere` macro:
    coeffs = approx(...)
    @everywhere function f(s)
        # coeffs are used here
    end
But `f` uses `coeffs` computed in the previous step, so `coeffs` should
also be `@everywhere`'d - otherwise step (3) fails in the workers. But this
means that the approximation will be computed multiple times, once on each
worker, correct? I have some text printed in step (1), and if I write
something like
    @everywhere coeffs = approx(...)
I can see that `approx()` is called from all of the workers (though for
some reason their output is only printed after step (4), just before the
program ends). So, the first problem is that I don't know how to define a
piece of data which should only be computed once, probably in parallel, and
then shared among all workers. `@everywhere` is close, as it defines a
binding in all workers, but it invokes the computation multiple times. I
couldn't come up with a simple workaround for this.
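
Here is a minimal sketch that reproduces the duplicated computation. It assumes a Julia version where `Distributed` is a standard library (in older versions `addprocs` and `@everywhere` are available without the `using` line); `approx` is a dummy stand-in and the worker count is arbitrary:

```julia
using Distributed
addprocs(2)   # two local worker processes; the count is arbitrary

# Hypothetical approx: reports which process runs it, then returns dummy coefficients.
@everywhere function approx(n)
    println("approx running on process ", myid())
    return [1 / k for k in 1:n]
end

# This line is evaluated separately on the master and on every worker,
# so approx runs nprocs() times in total and each process gets its own coeffs.
@everywhere coeffs = approx(4)
```

With two workers the message is printed three times: once by the master and once by each worker.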
Second, as a consequence of the above, I have put `@everywhere` literally
everywhere. You can see in the gist that the main script contains
`@everywhere` on almost every line. It looks weird and I just don't believe
that this is the right way, but without all of these `@everywhere`s the workers
fail with various kinds of "undefined" errors. This *may* be expected,
as the workers are independent programs, but it is rather surprising from
a "thread-based" point of view.
And finally, when I rewrote my program to run in parallel, plotting in step
(4) just stopped working. When I run my program, it hangs in the
`plot()` function after writing "done." and the plot window is never
displayed. It looks like some kind of interference with PyPlot, but I don't
know the exact reason and I don't know how to "debug" it.
So, my question is: how do I write a program which has multiple "local"
parallel sections which should be executed sequentially due to data
dependencies between them? As a subquestion, how do I transfer data between
workers if it should only be computed once and is used
indirectly through other definitions? As another subquestion, how do you
avoid writing `@everywhere` everywhere when there are a lot of definitions
that should be available on all workers? And why did plotting stop working?
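
To make the question concrete, this is roughly the shape I am hoping for, written as a sketch with dummy computations. It assumes the `Distributed` standard library; `coeff`, `f` and `run_pipeline` are hypothetical names, and whether this is the idiomatic way to structure such a program is exactly what I am unsure about:

```julia
using Distributed
addprocs(2)   # arbitrary worker count

# Everything the workers need, grouped in a single @everywhere block.
@everywhere begin
    # Hypothetical stand-ins for the real per-element computations.
    coeff(k) = 1 / k
    f(coeffs, s) = sum(coeffs[k] * s^(k - 1) for k in eachindex(coeffs))
end

function run_pipeline()
    # (1) Parallel section: each coefficient is computed once, on some worker,
    #     and the results are collected on the master.
    coeffs = pmap(coeff, 1:8)

    # (2)+(3) Parallel section: the closure captures the local coeffs, so the
    #     array is serialized and sent with the work instead of being recomputed.
    grid = vec([(x, y) for x in 0:0.25:1, y in 0:0.25:1])
    values = pmap(p -> f(coeffs, p[1] + p[2]), grid)

    return coeffs, reshape(values, 5, 5)
end

coeffs, data = run_pipeline()
# (4) Back on the master, data is an ordinary array that could be plotted.
```

Here the workers only ever see `coeff`, `f` and the data handed to them by `pmap`, which is close to the "local thread pool" model I am used to.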
Thanks in advance,
Vladimir.