On Wednesday, April 17, 2013 4:19 PM Florian Pflug wrote:
> On Apr17, 2013, at 12:22 , Amit Kapila <amit.kap...@huawei.com> wrote:
> > Do you mean to say that as an error has occurred, so it would not be able to
> > flush received WAL, which could result in loss of WAL?
> > I think even if error occurs, it will call flush in WalRcvDie(), before
> > terminating WALReceiver.
>
> Hm, true, but for that to prevent the problem the inner processing
> loop needs to always read up to EOF before it exits and we attempt
> to send a reply. Which I don't think it necessarily does. Assume,
> that the master sends a chunk of data, waits a bit, and finally
> sends the shutdown record and exits. The slave might then receive
> the first chunk, and it might trigger sending a reply. At the time
> the reply is sent, the master has already sent the shutdown record
> and closed the connection, and we'll thus fail to reply and abort.
> Since the shutdown record has never been read from the socket,
> XLogWalRcvFlush won't flush it, and the slave ends up behind the
> master.
>
> Also, since XLogWalRcvProcessMsg responds to keep-alives messages,
> we might also error out of the inner processing loop if the server
> closes the socket after sending a keepalive but before we attempt
> to respond.
>
> Fixing this on the receive side alone seems quite messy and fragile.
> So instead, I think we should let the master send a shutdown message
> after it has sent everything it wants to send, and wait for the client
> to acknowledge it before shutting down the socket.
>
> If the client fails to respond, we could log a fat WARNING.

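
To make the proposed handshake concrete, here is a minimal sketch in Python
using a plain socket pair. It is not the walsender/walreceiver code and does
not use the replication protocol: the DONE and ACK markers, the function
names, and the payload strings are all made up for illustration. It only
shows the ordering being suggested: the sender announces that it is done and
waits for an acknowledgement before closing, so the receiver reads everything
before anything fails.

```python
import socket
import threading

# Hypothetical markers for this sketch only; not real protocol messages.
DONE = b"DONE\n"
ACK = b"ACK\n"


def read_exact(conn, n):
    """Read up to n bytes, stopping early only on EOF."""
    buf = bytearray()
    while len(buf) < n:
        data = conn.recv(n - len(buf))
        if not data:
            break
        buf.extend(data)
    return bytes(buf)


def sender(conn, chunks, result):
    """Send all chunks, then DONE, and wait for ACK before closing."""
    try:
        for chunk in chunks:
            conn.sendall(chunk)
        conn.sendall(DONE)
        conn.settimeout(5.0)
        try:
            result["acked"] = read_exact(conn, len(ACK)) == ACK
        except socket.timeout:
            # This is where the "fat WARNING" would be logged.
            result["acked"] = False
    finally:
        conn.close()


def receiver(conn):
    """Read until DONE is seen (or EOF), then acknowledge and close.

    Assumes the payload itself never ends with the DONE marker.
    """
    buf = bytearray()
    try:
        while not buf.endswith(DONE):
            data = conn.recv(4096)
            if not data:
                return bytes(buf)  # EOF without DONE: nothing to acknowledge
            buf.extend(data)
        payload = bytes(buf[:-len(DONE)])
        # A real receiver would flush the payload to disk before this point.
        conn.sendall(ACK)
        return payload
    finally:
        conn.close()


def run_demo():
    """Run sender and receiver over a local socket pair."""
    a, b = socket.socketpair()
    result = {}
    chunks = [b"chunk-1;", b"shutdown-record;"]
    t = threading.Thread(target=sender, args=(a, chunks, result))
    t.start()
    received = receiver(b)
    t.join()
    return received, result["acked"]
```

Calling run_demo() returns the bytes the receiver saw and whether the sender
got its acknowledgement. The sketch leaves out the replies the receiver sends
mid-stream, which are what trigger the failure in the scenario quoted above.
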
Your explanation seems reasonable, but before discussing the exact solution, I think we should first check whether the actual problem can be reproduced. If it can, it would be better to discuss this solution then.

With Regards,
Amit Kapila.