On Thu, Aug 6, 2015 at 10:10 AM, Antonin Houska <[email protected]> wrote:
> During my experiments with parallel workers I sometimes saw both the "master"
> and the worker process blocked. The master uses a shm_mq queue to send data to
> the worker, with nowait == false on both sides. I concluded that the following
> happened:
>
> The worker process set itself as the receiver on the queue after
> shm_mq_wait_internal() had completed its first check of "ptr", so this
> function left the sender's procLatch in the reset state. But before the
> procLatch was reset, the receiver had already read some data, set the
> sender's procLatch to signal the read, and eventually called its own
> (the receiver's) WaitLatch().
>
> So the sender effectively missed the receiver's notification and called
> WaitLatch() too. (If the receiver is already waiting on its latch, it does
> not help for the sender to call shm_mq_notify_receiver(): the receiver won't
> do anything because there's no new data in the queue.)
>
> Below is my patch proposal.
Another good catch. However, I would prefer to fix this without
introducing a "continue" as I think that will make the control flow
clearer. Therefore, I propose the attached variant of your idea.
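
To make the interleaving concrete, here is a rough single-threaded sketch
that replays the sequence described in the report. It is only a toy model
and not code from shm_mq.c: toy_queue, toy_available() and
toy_sender_gets_stuck() are made-up names, and plain booleans stand in for
the process latches. The point it tries to show is that acting on a stale
"available == 0" after the attach wait leaves the sender asleep on a
cleared latch, while rechecking the queue state first does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the shared queue state; not the real shm_mq layout. */
typedef struct
{
	size_t	capacity;			/* ring size in bytes */
	size_t	used;				/* bytes written but not yet read */
	bool	receiver_attached;
	bool	sender_latch;		/* set by the receiver to wake the sender */
	bool	receiver_latch;		/* set by the sender to wake the receiver */
} toy_queue;

static size_t
toy_available(const toy_queue *q)
{
	assert(q->used <= q->capacity);
	return q->capacity - q->used;
}

/*
 * Replay the reported interleaving.  Returns true if the sender ends up
 * sleeping on a cleared latch even though the queue has free space.
 */
bool
toy_sender_gets_stuck(bool recheck_after_attach)
{
	toy_queue	q = {.capacity = 8, .used = 8};	/* queue starts full */
	size_t		available;

	/* Sender: sees a full queue and no receiver, so it waits for attach. */
	available = toy_available(&q);
	assert(available == 0 && !q.receiver_attached);

	/* Receiver: attaches, which sets the sender's latch ... */
	q.receiver_attached = true;
	q.sender_latch = true;

	/* ... then drains the queue and sets the (already set) latch again ... */
	q.used = 0;
	q.sender_latch = true;

	/* ... and, with nothing more to read, goes to sleep on its own latch. */
	q.receiver_latch = false;

	/* Sender: wakes up and resets its latch, consuming both notifications. */
	q.sender_latch = false;
	assert(q.receiver_attached);

	if (recheck_after_attach)
		available = toy_available(&q);	/* look at the queue again */

	if (available > 0)
		return false;			/* sender can write; no wait needed */

	/*
	 * Stale "available == 0": the sender notifies the receiver, which wakes
	 * up, finds no new data, and goes back to sleep without setting the
	 * sender's latch.
	 */
	q.receiver_latch = true;
	q.receiver_latch = false;

	return !q.sender_latch && toy_available(&q) > 0;
}
```

In this model, toy_sender_gets_stuck(false) reports the hang and
toy_sender_gets_stuck(true) does not, which is the behavior the recheck is
meant to provide.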
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/storage/ipc/shm_mq.c b/src/backend/storage/ipc/shm_mq.c
index e765cea..8e10822 100644
--- a/src/backend/storage/ipc/shm_mq.c
+++ b/src/backend/storage/ipc/shm_mq.c
@@ -777,7 +777,7 @@ shm_mq_send_bytes(shm_mq_handle *mqh, Size nbytes, const void *data,
 			return SHM_MQ_DETACHED;
 		}
 
-		if (available == 0)
+		if (available == 0 && !mqh->mqh_counterparty_attached)
 		{
 			shm_mq_result res;
 
@@ -805,6 +805,13 @@ shm_mq_send_bytes(shm_mq_handle *mqh, Size nbytes, const void *data,
 				mqh->mqh_counterparty_attached = true;
 			}
 
+			/*
+			 * The receiver may have read some data after attaching, so we
+			 * must not wait without rechecking the queue state.
+			 */
+		}
+		else if (available == 0)
+		{
 			/* Let the receiver know that we need them to read some data. */
 			res = shm_mq_notify_receiver(mq);
 			if (res != SHM_MQ_SUCCESS)
--
Sent via pgsql-hackers mailing list ([email protected])