Re: [HACKERS] Overhauling our interrupt handling
Hello, > > I think I should finilize my commitfest item for this issue, with > > .. "Rejected"? > > Fine with me. done. > > > 0001: Replace walsender's latch with the general shared latch. > > > > > > New patch that removes ImmediateInteruptOK behaviour from > > > walsender. I > > > think that's a rather good idea, because walsender currently seems > > > to > > > assume WaitLatchOrSocket is reentrant - which I don't think is > > > really > > > guaranteed. > > > Hasn't been reviewed yet, but I think it's not far from being > > > committable. > > > > Deesn't this patchset containing per-socket basis non-blocking > > control for win32? It should make the code (above the win32 > > socket layer itself) more simpler. > > I don't think so - we still rely on it unfortunately. Does "it" mean win32_noblock? Or the nonblocking bare win32 socket? The win32-per-sock-blkng-cntl patch in the below message should cover both of them. http://www.postgresql.org/message-id/54060ae5.5020...@vmware.com If you are saying it should be a patch separate from this, I'll do so. regareds, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling our interrupt handling
Hi, On 2015-01-15 15:05:08 +0900, Kyotaro HORIGUCHI wrote: > Hello, I'd synced up this at last. > > I think I should finilize my commitfest item for this issue, with > .. "Rejected"? Fine with me. > > All the patches in the series up to 0008 hav ecommit messages providing > > more detail. A short description of each patch follows: > > > > 0001: Replace walsender's latch with the general shared latch. > > > > New patch that removes ImmediateInteruptOK behaviour from walsender. I > > think that's a rather good idea, because walsender currently seems to > > assume WaitLatchOrSocket is reentrant - which I don't think is really > > guaranteed. > > Hasn't been reviewed yet, but I think it's not far from being > > committable. > > Deesn't this patchset containing per-socket basis non-blocking > control for win32? It should make the code (above the win32 > socket layer itself) more simpler. I don't think so - we still rely on it unfortunately. > > 0004: Process 'die' interrupts while reading/writing from the client socket. > > > > This is the reason Horiguchi-san started this thread. > > > > I think the important debate here is whether we think it's > > acceptable that there are situations (a full socket buffer, but a > > alive connection) where the client previously got an error, but > > not anymore afterwards. I think that's much better than having > > unkillable connections, but I can see others being of a different > > opinion. > > > This patch yields a code a bit confusion like following. > > | secure_raw_write(Port *port, const void *ptr, size_t len) > | { > .. > | w = WaitLatchOrSocket(MyLatch, > | if (w & WL_LATCH_SET) > ... > | errno = EINTR; > | else if (w & WL_SOCKET_WRITEABLE) > | goto wloop; > | > | errno = save_errno; > > The errno set when WL_LATCH_SET always vanishes. Specifically, > the EINTR set by SIGTERM(WL_LATCH_SET) is overwritten by > EAGAIN. As the result, pg_terminte_backend() cannot kill the > backend till the write blocking is released. errno = save_errno > should be the alternative of the line "errno = EINTR" and I > confirmed that the change leads to the desirable (as of me) > behavior. Ugh, that's the result stupid last minute "cleanup". You're right. > > 0006: Don't allow immediate interupts during authentication anymore. > > > > So far we've set ImmediateInterruptOK to true during large parts > > of the client authentication - that's not all that pretty, > > interrupts might arrive while we're in some system routines. > > > > Due to patches 0003/0004 we now are able to safely serve > > interrupts during client communication which is the major are > > where we want to adhere to authentication_timeout. > > > > I additionally placed some CHECK_FOR_INTERRUPTS()s in some > > somewhat randomly chosen places in auth.c. Those don't completely > > get back the previous 'appruptness' (?) of reacting to > > interrupts, but that's partially for the better, because we don't > > interrupt foreign code anymore. > > Simplly as a comment on style, this patch introduces checks of > ClientAuthInProgress twice successively into > ProcessInterrupts(). Isn't it better to make it like following? > > | /* As in quickdie, ... > | if (ClientAuthInProgress) > | { > |if (whereToSendOutput == DestRemote) whereToSendOutput = DestNone; > |ereport(FATAL, Hm, yes. > > 0008: Remove remnants of ImmediateInterruptOK handling. > > > > Now that ImmediateInterruptOK is never set to true anymore, we can > > remove related code and comments. > > > > New and not reviewed. > > walreceiver.c still has WalRcvImmediateInterruptOK as mentioned > below, apart from whether it should be changed or not, the > following comment remains. > | * This is very much like what regular backends do with ImmediateInterruptOK, > | * ProcessInterrupts() etc. Yep. As mentioned below, it doesn't use the same infrastructure, so I'd rather treat this separately. This set is more than big enough. Thanks for looking! Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling our interrupt handling
Hello, I'd synced up this at last. I think I should finilize my commitfest item for this issue, with .. "Rejected"? > This mail is a answer to > http://archives.postgresql.org/message-id/20150110022542.GF12509%40alap3.anarazel.de > but I decided that it's a good go move it into a new thread since the > patchseries has outgrown its original target. It's invasive and touches > many arcane areas of the code, so that I think more eyes than a long > deep thread is likely to have, are a good thing. > > As a short description of the goal of the patchseries: > This started out as steps on the way to be able to safely handle > terminate_backend() et al when we're reading/writing from the > client. E.g. because the client is dead and we haven't noticed yet and > are stuck writing, or because we want to implement a idle_in_transaction > timeout. > > Making these changes allowed me to see that not doing significant work > in signal handlers can make code much simpler and more robust. The > number of bugs postgres had in the past that involved doing too much in > signal handlers is significant. Thus I've since extended the > patchseries to get rid of nearly all nontrivial work in signal > handlers. It sounds pretty nice. > All the patches in the series up to 0008 hav ecommit messages providing > more detail. A short description of each patch follows: > > 0001: Replace walsender's latch with the general shared latch. > > New patch that removes ImmediateInteruptOK behaviour from walsender. I > think that's a rather good idea, because walsender currently seems to > assume WaitLatchOrSocket is reentrant - which I don't think is really > guaranteed. > Hasn't been reviewed yet, but I think it's not far from being > committable. Deesn't this patchset containing per-socket basis non-blocking control for win32? It should make the code (above the win32 socket layer itself) more simpler. > 0002: Use a nonblocking socket for FE/BE communication and block using > latches. > > Has previously been reviewed by Heikki. I think Noah also had a > look, although I'm not sure how close that was. > > I think this can be committed soon. > > 0003: Introduce and use infrastructure for interrupt processing during client > reads. > > From here on ImmediateInterruptOK isn't set during client > communication. Normal interrupts and sinval/async interrupts are > processed outside of signal handlers. Especially the sinval/async > greatly simplify the respective code. > > Has been reviewed by Heikki in an earlier version - I think I > addressed all the review comments. > > I'd like somebody else to look at the sinval/async.c changes > before committing. I really like the simplification this change > allowed. > > I think this patch might not be safe without 0004 because I can't > convince myself that it's safe to interrupt latch waits with work > that actually might also use the same latch (sinval > interrupts). But it's easier to review this way as 0004 is quite > big. > > 0004: Process 'die' interrupts while reading/writing from the client socket. > > This is the reason Horiguchi-san started this thread. > > I think the important debate here is whether we think it's > acceptable that there are situations (a full socket buffer, but a > alive connection) where the client previously got an error, but > not anymore afterwards. I think that's much better than having > unkillable connections, but I can see others being of a different > opinion. This patch yields a code a bit confusion like following. | secure_raw_write(Port *port, const void *ptr, size_t len) | { .. | w = WaitLatchOrSocket(MyLatch, | if (w & WL_LATCH_SET) ... | errno = EINTR; | else if (w & WL_SOCKET_WRITEABLE) | goto wloop; | | errno = save_errno; The errno set when WL_LATCH_SET always vanishes. Specifically, the EINTR set by SIGTERM(WL_LATCH_SET) is overwritten by EAGAIN. As the result, pg_terminte_backend() cannot kill the backend till the write blocking is released. errno = save_errno should be the alternative of the line "errno = EINTR" and I confirmed that the change leads to the desirable (as of me) behavior. > 0005: Move deadlock and other interrupt handling in proc.c out of signal > handlers. > > I'm surprised how comparatively simple this turned out to be. For > robustness and understanding I think this is a great improvement. > > New and not reviewed at all. Needs significant review. But works > in my simple testcases. > > 0006: Don't allow immediate interupts during authentication anymore. > > So far we've set ImmediateInterruptOK to true during large parts > of the client authentication - that's not all that pretty, > interrupts might arrive while we're in so