I am using multiprocessing.Queue to implement classic producer-consumer
pattern. To let consumers know all the tasks are done. I use a special None
as sentinel. And I use Lock and Value(counter) to judge if current process
is the last producer. If it is the last producer, it will put as many None
as the number of consumers to the queue.

```
def producer(queue, lock, running_counters, num_consumers):
    while task_not_done:
        result = do something
        queue.put(result)

    # I am done
    with lock:
        running_counters.value -= 1
        if running_counters.value == 0: # last one, put sentinel Nones
            for _ in range(num_consumers):
                queue.put(None)


def consumer(queue):
    while True:
        data = queue.get()
        if data is None:
            break
        do something with data

def main():
    queue = mp.Queue(maxsize=1_000_000)
    running_counters = mp.Value('i', NUM_PRODUCER)
    lock = mp.Lock()
    producers = []
    for _ in range(NUM_PRODUCER):
        p = mp.Process(target=producer, args=(queue, lock,
running_counters, NUM_CONSUMER))
        p.start()
        producers.append(p)

    consumers = []
    for _ in range(NUM_CONSUMER):
        p = mp.Process(target=consumer, args=(queue,))
        p.start()
        consumers.append(p)
    [p.join() for p in producers]
    [p.join() for p in consumers]
```

I run this code in one machine, it works. But in another it deadlocks. when
it deadlocks, I find that even all the consumers finished, the queue is
still not empty. So the producers can't finish.


I am very confused about this. Why use lock has synchronization problem. It
seems when the last producer put None to queue, other producers still can
put normal data to queue. I am not familiar with python multiprocessing.
And unlike java/c++, I can't find clear documents about python memory model
which define happens-before or fence or related topic. I know python
multiprocessing is not multithreads which all threads share the memory
space of the same process.
Some web blogs say mp.Queue.put() will use background thread to send
message. Maybe when last producer put None to the queue(it's fast), other
producers' put is still running?  If it's true, how I can know all
background sending is done?
-- 
https://mail.python.org/mailman3//lists/python-list.python.org

Reply via email to