Pool Module: iterator does not yield consistently with different chunksizes
I've been playing around with custom iterators to map into Pool. When I run the code below:

    def arif(arr):
        return arr

    def permutate(n):
        k = 0
        a = list(range(6))
        while k < n:
            for i in range(6):
                a.insert(0, a.pop(5)+6)
            #yield a[:] -- produces correct results
            yield a
            k += 1
        return

    def main():
        from multiprocessing import Pool
        pool = Pool()
        chksize = 15
        for x in pool.imap_unordered(arif, permutate(100), chksize):
            print(x)

    if __name__ == "__main__":
        main()

it will output something like this:

    [36, 37, 38, 39, 40, 41]
    [36, 37, 38, 39, 40, 41]
    [36, 37, 38, 39, 40, 41]
    [36, 37, 38, 39, 40, 41]
    [36, 37, 38, 39, 40, 41]
    [36, 37, 38, 39, 40, 41]
    [72, 73, 74, 75, 76, 77]
    [72, 73, 74, 75, 76, 77]
    [72, 73, 74, 75, 76, 77]
    [72, 73, 74, 75, 76, 77]
    [72, 73, 74, 75, 76, 77]
    [72, 73, 74, 75, 76, 77]
    [108, 109, 110, 111, 112, 113]
    [108, 109, 110, 111, 112, 113]
    [108, 109, 110, 111, 112, 113]
    [108, 109, 110, 111, 112, 113]
    [108, 109, 110, 111, 112, 113]
    [108, 109, 110, 111, 112, 113]
    [144, 145, 146, 147, 148, 149]
    ...

where each result is duplicated a number of times equal to the chunk size, and the results in between are lost. Using a[:] instead, I get:

    [6, 7, 8, 9, 10, 11]
    [12, 13, 14, 15, 16, 17]
    [18, 19, 20, 21, 22, 23]
    [24, 25, 26, 27, 28, 29]
    [30, 31, 32, 33, 34, 35]
    [36, 37, 38, 39, 40, 41]
    [42, 43, 44, 45, 46, 47]
    [48, 49, 50, 51, 52, 53]

and it comes out okay. Any explanation for such behavior?

Ahmad Syukri
--
http://mail.python.org/mailman/listinfo/python-list
Re: Pool Module: iterator does not yield consistently with different chunksizes
syockit wrote:
> I've been playing around with custom iterators to map into Pool.
> [...]
> where results are duplicated number of times equal to chunk size, and
> the results between the gap are lost. Using a[:] instead [...] it comes
> out okay. Any explanation for such behavior?

Python passes references around, not copies. Consider

    from itertools import islice

    it = permutate(100)
    chunksize = 15
    while True:
        chunk = tuple(islice(it, chunksize))
        if not chunk:
            break
        # dispatch items in chunk
        print(chunk)

All chunksize items are calculated before they are dispatched. When you yield the same list every time in permutate(), the items already in the chunk are that very list, so they see every change you make to it while updating it to the next value.

Peter
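To make the point above concrete without involving multiprocessing at all, here is a minimal sketch that collects one chunk with islice, the same way the loop above does. The copy flag and the first_chunk helper are hypothetical names added only for this illustration; they are not part of the original code or of the Pool API.

```python
from itertools import islice

def permutate(n, copy=False):
    """Yield n successive states of one list that is updated in place.

    The `copy` flag is an assumption added for this demo: when True,
    a fresh copy (a[:]) is yielded instead of the shared list.
    """
    a = list(range(6))
    for _ in range(n):
        for _ in range(6):
            a.insert(0, a.pop(5) + 6)
        yield a[:] if copy else a

def first_chunk(copy, chunksize=3):
    # Collect a whole chunk before looking at any item in it.
    return list(islice(permutate(10, copy=copy), chunksize))

shared = first_chunk(copy=False)  # three references to the same list
copied = first_chunk(copy=True)   # three independent lists

print(shared)
print(copied)
```

With the shared list, every entry in the chunk shows the state reached when the last item of the chunk was produced; with copies, each entry keeps its own state.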
Re: Pool Module: iterator does not yield consistently with different chunksizes
syockit wrote:
> I've been playing around with custom iterators to map into Pool.
> [...]
> where results are duplicated number of times equal to chunk size, and
> the results between the gap are lost. Using a[:] instead [...] it comes
> out okay. Any explanation for such behavior?

While I didn't actually try to follow all your code, I suspect your problem is that when you yield the same object multiple times, the references are all saved, and by the time they are evaluated they all show the final value. If the values are really meant to be independent, somebody has to copy the list, and the [:] does that.

DaveA
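If the generator itself cannot easily be changed, one possible alternative is to copy each item as it passes through a small wrapper. This is only a sketch: reuse_buffer and snapshots are hypothetical names invented for the example, and copy.copy makes a shallow copy, so items containing nested mutable objects would need copy.deepcopy instead.

```python
import copy

def reuse_buffer(n):
    """Hypothetical generator that yields one list object, mutated each step."""
    buf = [0, 0]
    for i in range(n):
        buf[0], buf[1] = i, i * i
        yield buf

def snapshots(iterable):
    """Yield a shallow copy of each item so later mutation cannot affect it."""
    for item in iterable:
        yield copy.copy(item)

# Saving the yielded objects keeps n references to the same list,
# so every entry shows the final value.
saved = list(reuse_buffer(4))

# Copying on the way through keeps each intermediate value.
independent = list(snapshots(reuse_buffer(4)))

print(saved)
print(independent)
```

Under the same assumption, the wrapper could be placed around the original generator, as in pool.imap_unordered(arif, snapshots(permutate(100)), chksize).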