On 02/09/12 06:48, eryksun wrote:


     from multiprocessing import Pool, cpu_count
     from itertools import izip_longest, imap

     FILE_IN = '...'
     FILE_OUT = '...'

     NLINES = 1000000 # estimate this for a good chunk_size
     BATCH_SIZE = 8

     def func(batch):
         """ test func """
         import os, time
         time.sleep(0.001)
         return "%d: %s\n" % (os.getpid(), repr(batch))

     if __name__ == '__main__': # <-- required for Windows

Why is that required?
What difference does it make on Windows?

         file_in, file_out = open(FILE_IN), open(FILE_OUT, 'w')
         nworkers = cpu_count() - 1

         with file_in, file_out:
             batches = izip_longest(* [file_in] * BATCH_SIZE)
             if nworkers > 0:
                 pool = Pool(nworkers)
                 chunk_size = NLINES // BATCH_SIZE // nworkers
                 result = pool.imap(func, batches, chunk_size)
             else:
                 result = imap(func, batches)
             file_out.writelines(result)

Just curious.
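
As an aside, for anyone puzzling over the batching line in the quoted code: izip_longest(* [file_in] * BATCH_SIZE) passes the same file iterator BATCH_SIZE times, so each tuple is filled with consecutive lines. Below is a rough sketch of that idiom, assuming Python 3 (where izip_longest is spelled zip_longest). The helper name batch_lines and the StringIO sample data are made up for illustration and are not part of the original script.

```python
from io import StringIO
from itertools import zip_longest  # Python 3 name for izip_longest


def batch_lines(lines, batch_size):
    """Return an iterator of batch_size-tuples; the last one is padded with None."""
    # [it] * batch_size repeats the SAME iterator object, so each slot of
    # a tuple pulls the next item from the one shared stream.
    it = iter(lines)
    return zip_longest(*[it] * batch_size)


# StringIO stands in for the open input file (hypothetical sample data).
sample = StringIO("a\nb\nc\nd\ne\n")
batches = list(batch_lines(sample, 2))
print(batches)
```

Note that the final batch is padded with None when the number of lines is not a multiple of the batch size, so a worker function like func above would see those None entries unless it filters them out.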

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/

