Thinking of a different design:
1. The master python process builds and compiles all theano functions as
normal (for the GPU), and pickles them.
2. Worker processes initialize on other GPUs and unpickle all the functions.
3. User calls wrapped theano functions in the master process, which signals
the workers.
4. Workers run an infinite loop, waiting for a signal telling them what to
do (a switch statement, in effect; see the sketch after this list), e.g.:
a. call some function (inputs can come from multiprocessing shared
variables) and communicate the result back
b. copy multiprocessing shared variables into local theano GPU shared
variables
c. do collective GPU comms.
d. etc.
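Roughly, I picture the worker loop like this. The queue-based protocol, the
command names, and the `shared_arrays` container are just illustrative
placeholders, not an existing API:

    import pickle
    import numpy as np

    def worker_loop(pickled_blob, cmd_queue, result_queue, shared_arrays):
        # Step 2: unpickle the compiled functions and their theano shared
        # variables from one blob. THEANO_FLAGS (device=gpuN) is assumed to
        # have been set for this worker's GPU before theano was imported.
        funcs, shareds = pickle.loads(pickled_blob)  # dicts: name -> object
        while True:
            cmd, payload = cmd_queue.get()       # block until master signals
            if cmd == 'call':                    # 4a: run a compiled function
                name, keys = payload
                # shared_arrays maps keys to numpy views on shared memory
                args = [shared_arrays[k] for k in keys]
                result_queue.put(funcs[name](*args))
            elif cmd == 'set_shared':            # 4b: refresh a GPU shared var
                name, key = payload
                shareds[name].set_value(np.asarray(shared_arrays[key]))
            elif cmd == 'quit':                  # 4d: shut down cleanly
                break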
The workers are "dumb" and never have to bother with any graphs. It's a
bit of a pain to set up the multiprocessing shared variables, since the
data sizes have to be declared ahead of time, but it's not so bad; the
sketch below shows the kind of setup I mean.
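Something like this, with sizes fixed up front and the raw buffers wrapped
as numpy views so master and workers all see the same memory (names are
illustrative; this assumes the fork start method, so the views are
inherited by the workers):

    import multiprocessing as mp
    import numpy as np

    def make_shared_array(shape):
        # Sizes must be declared ahead of time. 'f' is the ctypes code for
        # a 4-byte float, matching float32.
        raw = mp.RawArray('f', int(np.prod(shape)))
        return np.frombuffer(raw, dtype=np.float32).reshape(shape)

    x_batch = make_shared_array((256, 784))  # e.g. one minibatch of inputs
    x_batch[...] = 0.0                       # master writes; workers read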
What I'm running into trouble with now is the theano shared variables.
They get unpickled under each function's input_storage, but each function
ends up with its own separate set of objects there. I can manipulate them
individually, but *is there a way to get multiple unpickled functions to
refer to the same memory for corresponding shared variables?* (Simply
pointing one function's input_storage entries at another function's does
not work.)
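One idea I haven't fully verified: pickle all of the functions in a
*single* dump. Pickle memoizes objects within one dump, so a shared
variable that appears in several graphs should be serialized exactly once,
and the unpickled functions would hopefully end up pointing at the same
container rather than separate copies. A minimal test of whether the
sharing survives:

    import pickle
    import numpy as np
    import theano
    import theano.tensor as T

    W = theano.shared(np.ones((2, 2), dtype=np.float32), name='W')
    x = T.fvector('x')
    f = theano.function([x], T.dot(W, x))
    step = theano.function([], [], updates=[(W, W + 1)])

    # One dump -> pickle memoizes W, so it is serialized exactly once.
    f2, step2 = pickle.loads(pickle.dumps([f, step]))

    v = np.ones(2, dtype=np.float32)
    before = f2(v)
    step2()          # update W through one unpickled function...
    after = f2(v)    # ...the output changes iff f2 and step2 still share W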