Simon Forman on Thursday 27 Jul 2006 22:47 wrote:
> def run(request, response, func=dummy_func):
>     '''
>     Get items from the request Queue, process them
>     with func(), put the results along with the
>     Thread's name into the response Queue.
>
>     Stop running once an item is None.
>     '''
>     name = currentThread().getName()
>     while 1:
>         item = request.get()
>         if item is None:
>             break
>         response.put((name, func(item)))
>

Meanwhile, instead of sitting idle and waiting for a reply, I tried to understand the code (the example by Simon). The good part is that I was able to use it. :-)

Here's the changed code:

    from Queue import Queue
    from threading import Thread, currentThread

    # Imports for the dummy testing func.
    from time import sleep
    from random import random

    NUMTHREADS = 3

    # download_from_web, stripper, sSourceDir and lRawData come from
    # the rest of my program (not shown here).

    def run(request, response, func=download_from_web):
        '''
        Get items from the request Queue, process them
        with func(), put the results along with the
        Thread's name into the response Queue.

        Stop running once an item is None.
        '''
        name = currentThread().getName()
        while 1:
            item = request.get()
            # Check for the sentinel before unpacking the item;
            # stripper(None) would fail otherwise.
            if item is None:
                break
            (sUrl, sFile, download_size, checksum) = stripper(item)
            response.put((name, func(sUrl, sFile, sSourceDir, None)))

    # Create two Queues for the requests and responses
    requestQueue = Queue()
    responseQueue = Queue()

    # Pool of NUMTHREADS Threads that run run().
    thread_pool = [
        Thread(
            target=run,
            args=(requestQueue, responseQueue)
        )
        for i in range(NUMTHREADS)
    ]

    # Start the threads.
    for t in thread_pool: t.start()

    # Queue up the requests.
    for item in lRawData: requestQueue.put(item)

    # Shut down the threads after all requests end.
    # (Put one None "sentinel" for each thread.)
    for t in thread_pool: requestQueue.put(None)

    # Don't end the program prematurely.
    #
    # (Note that because Queue.get() is blocking by
    # default this isn't strictly necessary. But if
    # you were, say, handling responses in another
    # thread, you'd want something like this in your
    # main thread.)
    for t in thread_pool: t.join()

I'd like to lay out my understanding here and would be happy if people could correct me where I'm wrong. So here it goes:

Firstly, the code sets the number of threads (NUMTHREADS). Then it moves on to creating requestQueue and responseQueue. Then it moves on to thread_pool, where each Thread is told that it has to execute the function run().
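
To check this part of my understanding, here is a minimal stand-alone sketch. It assumes Python 2.6 or later (for current_thread() and is_alive()), and it replaces download_from_web() with a hypothetical fake_download() stub and lRawData with a made-up list of strings, so it is only an approximation of the real program.

```python
try:
    from queue import Queue      # Python 3 module name
except ImportError:
    from Queue import Queue      # Python 2 module name, as in the code above
from threading import Thread, current_thread

NUMTHREADS = 3

def fake_download(item):
    # Hypothetical stand-in for download_from_web(); no network I/O.
    return item.upper()

def run(request, response, func=fake_download):
    name = current_thread().name
    while True:
        item = request.get()      # blocks until something is queued
        if item is None:          # sentinel: this thread should stop
            break
        response.put((name, func(item)))

requestQueue = Queue()
responseQueue = Queue()

thread_pool = [
    Thread(target=run, args=(requestQueue, responseQueue))
    for i in range(NUMTHREADS)
]

# Record the state of each Thread object before start() is called.
alive_before_start = [t.is_alive() for t in thread_pool]

for t in thread_pool:
    t.start()

# Record the state again; nothing has been queued yet at this point.
alive_after_start = [t.is_alive() for t in thread_pool]

items = ["a", "b", "c", "d", "e"]   # made-up data in place of lRawData
for item in items:
    requestQueue.put(item)
for t in thread_pool:
    requestQueue.put(None)          # one sentinel per thread
for t in thread_pool:
    t.join()

# Every item produces one response; drop the thread names and sort.
results = sorted(responseQueue.get()[1] for _ in items)
print(alive_before_start, alive_after_start, results)
```

Comparing alive_before_start with alive_after_start shows what building thread_pool does by itself and what the start() loop adds.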

From NUMTHREADS in the for loop, it knows how many threads it is supposed to run in parallel. So once the thread_pool is populated, it starts the threads. Actually, it doesn't start the threads. Instead, it puts the threads into the queue.

Then the real iteration, which I was talking about in my earlier post, is done. The iteration happens in one go, and requestQueue.put(item) puts all the items from lRawData into the queue that run() reads from. But there, run() already knows its limit on the number of threads. No, I think the above statement is wrong. The actual limit on the number of threads is held by thread_pool. Once its pool (3 at a time in this example) is empty, it again requests more threads using requestQueue.

And in the function run(), when the item it gets is None, the thread stops. Then the cleanup and the check for any remaining threads are done.

Is this all correct?

I also have a couple more questions, related to locks, but I'll post them once I am done with this part.

Thanks,
Ritesh
-- 
Ritesh Raj Sarraf
RESEARCHUT - http://www.researchut.com
"Necessity is the mother of invention."
"Stealing logic from one person is plagiarism, stealing from many is research."
"The great are those who achieve the impossible, the petty are those who cannot - rrs"