On 02/09/12 06:48, eryksun wrote:
from multiprocessing import Pool, cpu_count
from itertools import izip_longest, imap

FILE_IN = '...'
FILE_OUT = '...'

NLINES = 1000000 # estimate this for a good chunk_size
BATCH_SIZE = 8

def func(batch):
    """ test func """
    import os, time
    time.sleep(0.001)
    return "%d: %s\n" % (os.getpid(), repr(batch))

if __name__ == '__main__': # <-- required for Windows

Why?
What difference does that make in Windows?

    file_in, file_out = open(FILE_IN), open(FILE_OUT, 'w')
    nworkers = cpu_count() - 1

    with file_in, file_out:
        batches = izip_longest(* [file_in] * BATCH_SIZE)
        if nworkers > 0:
            pool = Pool(nworkers)
            chunk_size = NLINES // BATCH_SIZE // nworkers
            result = pool.imap(func, batches, chunk_size)
        else:
            result = imap(func, batches)
        file_out.writelines(result)

Just curious.
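
For anyone following the thread, here is a minimal sketch of what the izip_longest(* [file_in] * BATCH_SIZE) line in the quoted code appears to do: it repeats one and the same iterator BATCH_SIZE times, so each output tuple pulls that many consecutive lines. The sketch makes a few assumptions that are not in the original post: it uses the Python 3 name zip_longest instead of Python 2's izip_longest, the helper name group_lines is made up for illustration, and a small list stands in for the open file.

```python
from itertools import zip_longest  # Python 3 name; the quoted code uses Python 2's izip_longest


def group_lines(iterable, batch_size, fillvalue=None):
    """Group consecutive items into tuples of length batch_size.

    Hypothetical helper, not part of the quoted script.
    """
    # The list holds the SAME iterator object batch_size times, so
    # zip_longest advances it batch_size steps for every tuple it builds.
    shared = [iter(iterable)] * batch_size
    return zip_longest(*shared, fillvalue=fillvalue)


# A small list stands in for the open input file (assumption for the demo).
lines = ["line %d\n" % i for i in range(1, 6)]
batches = list(group_lines(lines, 2))

for batch in batches:
    print(batch)
```

Note that the last batch is padded with None when the number of lines is not a multiple of the batch size, so whatever function receives the batch has to cope with that.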
--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/