On Wed, Oct 18, 2017 at 10:13 AM, Ian Kelly <ian.g.ke...@gmail.com> wrote:
> On Wed, Oct 18, 2017 at 9:46 AM, Jason <jasonh...@gmail.com> wrote:
>> #When I change line19 to True to use the multiprocessing stuff it all slows
>> down.
>>
>> from multiprocessing import Process, Manager, Pool, cpu_count
>> from timeit import default_timer as timer
>>
>> def f(a,b):
>>     return dict_words[a]-b
>
> Since the computation is so simple my suspicion is that the run time
> is dominated by IPC, in other words the cost of sending objects back
> and forth outweighs the gains you get from parallelization.
>
> What happens if you remove dict_words from the Manager and just pass
> dict_words[a] across instead of just a? Also, I'm not sure why
> dict_keys is a managed list to begin with since it only appears to be
> handled by the main process.

Timings from my system:

# Original code without using Manager
$ python test.py
CPUs: 12
<built-in function map> 0.0757319927216 100000 1320445.9094
<bound method Pool.map of <multiprocessing.pool.Pool object at 0x7fb11f93a390>> 0.143120765686 100000 698710.627495

# Original code with Manager
$ python test.py
CPUs: 12
<built-in function map> 5.5354039669 100000 18065.5288391
<bound method Pool.map of <multiprocessing.pool.Pool object at 0x7fdc61f07490>> 4.3253660202 100000 23119.4307101

# Modified code without Manager and avoiding sharing the dict
$ python test.py
CPUs: 12
<built-in function map> 0.0657241344452 100000 1521511.09854
<bound method Pool.map of <multiprocessing.pool.Pool object at 0x7ff29c636350>> 0.0966320037842 100000 1034853.83811
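
For reference, here is a minimal sketch of the kind of change the third run describes: do the dict lookup in the parent process and send only the looked-up value to the workers, with no Manager involved. This is not the actual test.py. The names subtract, build_args and time_map and the sample data are made up for illustration, and it assumes Python 3.3 or later for Pool.starmap.

```python
from itertools import starmap
from multiprocessing import Pool, cpu_count
from timeit import default_timer as timer


def subtract(value, b):
    # The worker receives the looked-up value itself, so it never
    # needs access to a shared (or managed) dict.
    return value - b


def build_args(dict_words, keys, b):
    # Do the dict lookup in the parent process; only small
    # (value, b) tuples are pickled and sent to the workers.
    return [(dict_words[k], b) for k in keys]


def time_map(label, starmap_func, args):
    # Time one starmap-style callable and print label, seconds,
    # item count and items per second.
    start = timer()
    results = list(starmap_func(subtract, args))
    elapsed = timer() - start
    rate = len(results) / elapsed if elapsed else float("inf")
    print(label, elapsed, len(results), rate)
    return results


if __name__ == "__main__":
    # Hypothetical data standing in for the original dict_words.
    dict_words = {"w%d" % i: i for i in range(100000)}
    args = build_args(dict_words, list(dict_words), 1)

    print("CPUs:", cpu_count())
    serial = time_map("itertools.starmap", starmap, args)
    with Pool(cpu_count()) as pool:
        parallel = time_map("Pool.starmap", pool.starmap, args)
    assert serial == parallel
```

Even in this form the per-item work is a single subtraction, so the pickling and pipe traffic can still cost more than the computation itself. That would be consistent with Pool.map remaining slower than the built-in map in all three runs above.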