Hi,

So I do a lot of batch processing for machine learning.
I have a lot of RAM (45 GB) and 12 cores.

My normal method for parallel processing is to replicate any common shared 
read-only memory across all workers, using @everywhere.
Then I process my data with pmap, or my own pmapreduce variant.
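
In rough code, that usual pattern is something like this (the array below 
is just a small stand-in for the real shared data, and the per-item work 
is a placeholder):

    using Distributed            # not needed on older Julia versions
    addprocs(11)                 # one worker per remaining core

    # Replicate the read-only data on every worker:
    @everywhere const shared = collect(1.0:10^6)   # stand-in data

    # Each worker computes against its own local copy:
    results = pmap(1:1000) do i
        sum(shared) + i          # placeholder per-item computation
    end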

At work yesterday, I could not replicate my common shared memory across 
all workers, as it was too large (17 GB).
So I left it on process 1 and told the workers to do a remotecall_fetch 
to retrieve it.
This seems to have worked very well, as the workers quickly get out of 
sync, so they beg from process 1 at different times.
And process 1 is normally idle until the workers are all done anyway.
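
Roughly what I did, with a small placeholder standing in for the 17 GB 
structure and a made-up accessor function:

    using Distributed
    addprocs(11)

    # The big read-only structure lives only on process 1:
    const bigdata = collect(1.0:10^6)    # stand-in for the 17 GB structure

    # A tiny accessor defined on all processes, but only ever run on
    # process 1, where the data actually is:
    @everywhere get_piece(i) = Main.bigdata[i]

    results = pmap(1:1000) do i
        piece = remotecall_fetch(get_piece, 1, i)   # beg process 1
        piece * 2                                   # placeholder work
    end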

I was thinking about it after I went home, and realised that this was a 
very rough approximation of the Actor model 
<https://en.wikipedia.org/wiki/Actor_model> (if I am recalling the Actor 
model correctly).

What I am thinking is that I could have one worker per service -- where a 
service is a combination of data and the functions that operate on it.
That would mean more than one worker per core.
Whenever a worker needs that data, it remotecall_fetches the function on 
the service's worker.
(Potentially it does something smarter than a remotecall_fetch, so that it 
is non-blocking.)
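
A rough sketch of what I mean, with worker 2 playing the role of the 
"service" and all the names made up:

    using Distributed
    addprocs(12)    # service worker plus compute workers: more workers than cores

    const service_id = first(workers())        # worker 2 will hold the service

    # The service = data plus the functions that operate on it.
    @everywhere function start_service()
        global service_data = collect(1.0:10^6)    # stand-in for the data
    end
    @everywhere query(i) = service_data[i]          # a function the service exposes

    remotecall_fetch(start_service, service_id)     # initialise the service once

    # The remaining workers do the batch work, asking the service as needed:
    pool = WorkerPool(workers()[2:end])
    results = pmap(pool, 1:1000) do i
        # Blocking ask; a smarter version could use remotecall (a Future)
        # and fetch it later, so the compute worker is not blocked:
        remotecall_fetch(query, service_id, i)
    end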


Is this sensible?
