On Tue, Dec 13, 2016 at 11:19 AM, Andres Freund <and...@anarazel.de> wrote:
> On 2016-12-12 16:46:38 +0900, Michael Paquier wrote:
>> OK, I think that I have spotted an issue after additional read of the
>> code. When a WSA event is used for read/write activity on a socket,
>> the same WSA event gets reused again and again. That's fine for
>> performance reasons
> It's actually also required to not loose events,
> i.e. correctness. Windows events are edge, not level triggered.

Windows might not lose the event, but PostgreSQL can certainly lose
the event.  In the Windows implementation of WaitEventSetBlock, we'll
ignore WL_SOCKET_READABLE if WL_LATCH_SET also fires.  Now, since the
event is edge-triggered[1], it never fires again and we hang.  Oops.
Even if we rewrote WaitEventSetBlock to always process all of the
events, there's nothing in the latch API that requires the caller to
read from the socket the first time WL_SOCKET_READABLE shows up.  The
caller is perfectly entitled to write:

if (rv & WL_LATCH_SET)
   // do stuff
else if (rv & WL_SOCKET_READABLE)
   // do other stuff

I suppose we could make a coding rule that the caller must not rely on
WL_SOCKET_READABLE being level-triggered, but I think that's likely to
lead to a stream of bugs that only show up on Windows, because most
people are used to the Linux semantics and will naturally write code
like the above which isn't robust in the face of an edge-triggered
event -- just as you yourself did in WaitEventSetBlock.

> The
> previous implementation wasn't correct.  So just resetting things ain't
> ok.

How was it incorrect?  The previous implementation *worked* and the
new one doesn't, at least in the case complained of here.

Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to