On Tue, Jul 9, 2013 at 12:09 PM, <fsald...@gmail.com> wrote:
> I am a beginner with Python, coming from R, and I am having problems with
> parallelization with the multiprocessing module. I know that other people
> have asked similar questions but the answers were mostly over my head.
>
> Here is my problem: I tried to execute code in parallel in two ways:
>
> 1) In a plain xyz.py file without calling main()
> 2) In a xyz.py file that calls main
>
> Under 1) I was able to run parallel processes but:
>
> a) The whole script runs from the beginning up to the line where p1.start()
> or p2.start() is called. That is, if I had 10 processes p1, p2, ..., p10 the
> whole file would be run from the beginning up to the line where the command
> pX.start() is called. Maybe it has to be that way so that these processes get
> the environment they need, but I doubt it.

See the multiprocessing programming guidelines at:

http://docs.python.org/3/library/multiprocessing.html#multiprocessing-programming

In particular, read the section titled "Safe importing of main module"
under "17.2.3.2 Windows". The child process needs to import the main
module, which means that anything that isn't protected by
"if __name__ == '__main__'" is going to get executed in the child
process. This also appears to be the cause of the error that you pasted.

> Under 2) I get problems with pickling. See below
>
>
> from multiprocessing import *
>
>
> def main():
>
>     print('\nRunning ' + __name__ + "\n")
>
>     freeze_support()
>
>     def f(name):
>         print('hello', name)
>
>     p = Process(target=f, args=('bob',))
>     p.start()
>     p.join()
>
> if __name__ == '__main__':
>     main()

Read the section "More picklability" under the above documentation
link. I suspect the problem here is that the target function f is
nested inside main(). When multiprocessing tries to look f up in the
child process it can't find it, because main() hasn't been called there
and so the function was never defined. Try moving it to the module
namespace.
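
For what it's worth, here is a minimal sketch of that rearrangement. It assumes Python 3 and keeps the names f and main from your script; I haven't run it against your setup, so treat it as an illustration rather than a drop-in fix. The only real changes are that f now lives at module level, where the child process can find it after importing the file, and that everything that starts a process sits behind the __main__ guard.

```python
from multiprocessing import Process, freeze_support


def f(name):
    # Defined at module level, so the child process can look it up
    # by name after importing this file.
    print('hello', name)


def main():
    print('\nRunning ' + __name__ + '\n')

    # Only the function reference and its arguments are sent to the child.
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()


if __name__ == '__main__':
    # This block does not run when a child process imports the module,
    # so no extra processes are started from there.
    freeze_support()
    main()
```

The same guard should also take care of point a) above: code outside any function still runs when a child imports the file, so anything expensive or with side effects is better placed inside main().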