Re: [PATCH] Comm::TcpAcceptor::doAccept fd limit handling is broken
On 07/16/2014 06:46 PM, Amos Jeffries wrote:
> On 17/07/2014 11:10 a.m., Alex Rousskov wrote:
>> On 07/16/2014 02:38 PM, Rainer Weikusat wrote:
>>> Alex Rousskov writes:
>>>> On 07/16/2014 11:11 AM, Rainer Weikusat wrote:
>>>>> This is broken because it re-enables readable notifications even
>>>>> if no connection was accepted.
>>>>
>>>> In other words, it re-enables a check for new connections when
>>>>
>>>> * we know that we currently cannot accept any new connections,
>>>> * Squid is under stress already, and
>>>> * our code does not handle multiple deferrals of the same
>>>>   TcpAcceptor object well.
>>>
>>> There's an important 4th point: The check is re-enabled when it is
>>> known that it will immediately return true again because the
>>> connection which was not accepted is still pending (hence, the
>>> socket is still readable), more details below.
>>
>> Agreed!
>>
>>> I changed the TcpAcceptor/AcceptLimiter code to act as if only a
>>> single file descriptor was available for client connections and
>>> created two connections via netcat. What it should do in this case
>>> is to ignore the second connection until the first one was closed.
>>> Instead, it went into an endless 'defer this connection' loop, as
>>> mentioned above, because the pending 2nd connection meant the
>>> socket remained readable.
>>
>> OK, I think I now better understand what you meant by "one
>> connection at the same time". And even if I do not, we seem to agree
>> on the fix to solve the problem, which is even more important :-).
>>
>>> updated patch:
>>>
>>> --- mad-squid/src/comm/TcpAcceptor.cc  8 Jan 2013 17:36:44 -  1.1.1.2
>>> +++ mad-squid/src/comm/TcpAcceptor.cc  16 Jul 2014 20:36:35 -
>>> @@ -204,7 +204,6 @@
>>>          } else {
>>>              afd->acceptNext();
>>>          }
>>> -        SetSelect(fd, COMM_SELECT_READ, Comm::TcpAcceptor::doAccept, afd, 0);
>>>
>>>      } catch (const std::exception &e) {
>>>          fatalf("FATAL: error while accepting new client connection: %s\n", e.what());
>>> @@ -262,6 +261,8 @@
>>>      debugs(5, 5, HERE << "Listener: " << conn <<
>>>             " accepted new connection " << newConnDetails <<
>>>             " handler Subscription: " << theCallSub);
>>> +
>>> +    if (!isLimited) SetSelect(conn->fd, COMM_SELECT_READ, Comm::TcpAcceptor::doAccept, this, 0);
>>>      notify(flag, newConnDetails);
>>>  }
>>
>> I am happy to commit the above (after splitting the new if-statement
>> into two lines), but since this is not an emergency, I would prefer
>> to wait a little for any objections or second opinions.
>
> Looks like a reasonable temporary solution to me for the stable
> releases. I have no problems with it going in if we continue to work
> on the queue fixes for trunk.

Hi Amos,

Since I cannot promise to satisfy your "work on the queue fixes for
trunk" precondition, committing this fix to trunk (or removing the
precondition) becomes your call.

> Regarding that queue I think it can be cleaned up once and for all by
> making doAccept() a proper Call and using the new comm/Read.h API.

Yes, but with some adjustments: Calling acceptOne() or a similar method
asynchronously is indeed the right way to do this. However, the method
that AcceptLimiter will call should not use the comm/Read API unless
AcceptLimiter is willing to provide the same arguments that a regular
read callback gets. I do not think this complication (and a slight lie
that the socket was just selected for reading) is worth it. We should
just call a parameterless TcpAcceptor method. acceptOne() may be
directly usable for this purpose.
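For illustration, here is a rough, untested sketch of that approach.
The std::queue storage, the CbcPointer element type, and AcceptLimiter
having access to a void-returning acceptOne() are assumptions made for
the sake of the example, not current trunk code:

    #include "base/AsyncCall.h"
    #include "base/AsyncJobCalls.h"
    #include "comm/AcceptLimiter.h"
    #include "comm/TcpAcceptor.h"

    // resume one deferred acceptor by scheduling an async call to the
    // parameterless acceptOne() instead of invoking it directly
    void
    Comm::AcceptLimiter::kick()
    {
        while (!deferred_.empty()) {
            const CbcPointer<Comm::TcpAcceptor> acceptor = deferred_.front();
            deferred_.pop();
            if (acceptor.valid()) {
                typedef NullaryMemFunT<Comm::TcpAcceptor> Dialer;
                AsyncCall::Pointer call = JobCallback(5, 5, Dialer, acceptor,
                                                      Comm::TcpAcceptor::acceptOne);
                ScheduleCallHere(call);
                return; // one freed slot resumes at most one acceptor
            }
            // gone jobs are simply skipped; no removeDead() pass needed
        }
    }

A JobCallback() dialer also refuses to dial a job that dies between
scheduling and delivery, which is exactly the protection removeDead()
currently tries to provide by hand.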
doAccept() provides okToAccept() checks that are harmful in this
we-have-already-decided-to-resume-listening context, and replacements
for job protections already offered by the async calls API. If
doAccept() remains as it is, we should copy its fatal()-related stuff
into TcpAcceptor::callException() though.

> The queue becomes a std::queue

Yes, or, if one worries about efficiency in this context, this can be
optimized further by reusing (and adjusting) AsyncCallQueue instead. I
probably would not worry about efficiency in this context, but let me
know if I should detail that optimized solution.

> and the nasty
> removeDead() drops in favour of a TcpAcceptor internal Call::cancel().

Remembering the call for the sole purpose of canceling it is harmful
when such cancellations are rare: the async calls API already provides
protection from calling dead/gone jobs, and we should just use that in
this case.

Another fix, if somebody is going to work on queue optimization, is the
removal of isLimited: AFAICT, after all the changes, a "limited"
acceptor cannot have a read scheduled, and an acceptor with a read
scheduled cannot be limited.

HTH,

Alex.
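To make the resulting shape concrete, here is a hypothetical
declaration of such a cleaned-up limiter, combining the points above;
all member and method names are illustrative only:

    #include "base/CbcPointer.h"
    #include <queue>

    namespace Comm
    {

    class TcpAcceptor;

    /// sketch: dead entries are skipped when popped (see the kick()
    /// sketch earlier), so neither removeDead() nor remembering calls
    /// for Call::cancel() is needed, and isLimited disappears entirely
    class AcceptLimiter
    {
    public:
        static AcceptLimiter &Instance();

        /// postpone accepting until another connection slot frees up
        void defer(const CbcPointer<TcpAcceptor> &afd);

        /// resume the first still-valid deferred acceptor, if any
        void kick();

    private:
        std::queue< CbcPointer<TcpAcceptor> > deferred_;
    };

    } // namespace Comm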
Re: [PATCH] Comm::TcpAcceptor::doAccept fd limit handling is broken
On 17/07/2014 11:10 a.m., Alex Rousskov wrote:
> On 07/16/2014 02:38 PM, Rainer Weikusat wrote:
>> Alex Rousskov writes:
>>> On 07/16/2014 11:11 AM, Rainer Weikusat wrote:
>>>> This is broken because it re-enables readable notifications even
>>>> if no connection was accepted.
>>>
>>> In other words, it re-enables a check for new connections when
>>>
>>> * we know that we currently cannot accept any new connections,
>>> * Squid is under stress already, and
>>> * our code does not handle multiple deferrals of the same
>>>   TcpAcceptor object well.
>>
>> There's an important 4th point: The check is re-enabled when it is
>> known that it will immediately return true again because the
>> connection which was not accepted is still pending (hence, the
>> socket is still readable), more details below.
>
> Agreed!
>
>> I changed the TcpAcceptor/AcceptLimiter code to act as if only a
>> single file descriptor was available for client connections and
>> created two connections via netcat. What it should do in this case
>> is to ignore the second connection until the first one was closed.
>> Instead, it went into an endless 'defer this connection' loop, as
>> mentioned above, because the pending 2nd connection meant the
>> socket remained readable.
>
> OK, I think I now better understand what you meant by "one connection
> at the same time". And even if I do not, we seem to agree on the fix
> to solve the problem, which is even more important :-).
>
>> updated patch:
>>
>> --- mad-squid/src/comm/TcpAcceptor.cc  8 Jan 2013 17:36:44 -  1.1.1.2
>> +++ mad-squid/src/comm/TcpAcceptor.cc  16 Jul 2014 20:36:35 -
>> @@ -204,7 +204,6 @@
>>          } else {
>>              afd->acceptNext();
>>          }
>> -        SetSelect(fd, COMM_SELECT_READ, Comm::TcpAcceptor::doAccept, afd, 0);
>>
>>      } catch (const std::exception &e) {
>>          fatalf("FATAL: error while accepting new client connection: %s\n", e.what());
>> @@ -262,6 +261,8 @@
>>      debugs(5, 5, HERE << "Listener: " << conn <<
>>             " accepted new connection " << newConnDetails <<
>>             " handler Subscription: " << theCallSub);
>> +
>> +    if (!isLimited) SetSelect(conn->fd, COMM_SELECT_READ, Comm::TcpAcceptor::doAccept, this, 0);
>>      notify(flag, newConnDetails);
>>  }
>
> I am happy to commit the above (after splitting the new if-statement
> into two lines), but since this is not an emergency, I would prefer
> to wait a little for any objections or second opinions.

Looks like a reasonable temporary solution to me for the stable
releases. I have no problems with it going in if we continue to work on
the queue fixes for trunk.

Regarding that queue I think it can be cleaned up once and for all by
making doAccept() a proper Call and using the new comm/Read.h API. The
queue becomes a std::queue and the nasty removeDead() drops in favour
of a TcpAcceptor internal Call::cancel(). No more Job raw pointers!

NP that this change will not fix the stable releases due to missing API.

Amos
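For reference, the comm/Read.h style suggested here might look roughly
as follows for the normal re-arm path. This is a sketch only: it
assumes doAccept() is first converted from its static (int fd, void
*data) form into a TcpAcceptor member taking CommIoCbParams, and
readMore() is an invented helper name:

    #include "comm/Read.h"
    #include "CommCalls.h"

    // re-arm the listening socket through the comm/Read.h API instead
    // of a raw SetSelect(); conn is the acceptor's listening connection
    void
    Comm::TcpAcceptor::readMore()
    {
        typedef CommCbMemFunT<Comm::TcpAcceptor, CommIoCbParams> Dialer;
        AsyncCall::Pointer call =
            JobCallback(5, 4, Dialer, this, Comm::TcpAcceptor::doAccept);
        Comm::Read(conn, call);
    }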
Re: [PATCH] Comm::TcpAcceptor::doAccept fd limit handling is broken
On 07/16/2014 02:38 PM, Rainer Weikusat wrote:
> Alex Rousskov writes:
>> On 07/16/2014 11:11 AM, Rainer Weikusat wrote:
>>> This is broken because it re-enables readable notifications even if
>>> no connection was accepted.
>>
>> In other words, it re-enables a check for new connections when
>>
>> * we know that we currently cannot accept any new connections,
>> * Squid is under stress already, and
>> * our code does not handle multiple deferrals of the same TcpAcceptor
>>   object well.
>
> There's an important 4th point: The check is re-enabled when it is
> known that it will immediately return true again because the
> connection which was not accepted is still pending (hence, the socket
> is still readable), more details below.

Agreed!

> I changed the TcpAcceptor/AcceptLimiter code to act as if only a
> single file descriptor was available for client connections and
> created two connections via netcat. What it should do in this case is
> to ignore the second connection until the first one was closed.
> Instead, it went into an endless 'defer this connection' loop, as
> mentioned above, because the pending 2nd connection meant the socket
> remained readable.

OK, I think I now better understand what you meant by "one connection
at the same time". And even if I do not, we seem to agree on the fix to
solve the problem, which is even more important :-).

> updated patch:
>
> --- mad-squid/src/comm/TcpAcceptor.cc  8 Jan 2013 17:36:44 -  1.1.1.2
> +++ mad-squid/src/comm/TcpAcceptor.cc  16 Jul 2014 20:36:35 -
> @@ -204,7 +204,6 @@
>          } else {
>              afd->acceptNext();
>          }
> -        SetSelect(fd, COMM_SELECT_READ, Comm::TcpAcceptor::doAccept, afd, 0);
>
>      } catch (const std::exception &e) {
>          fatalf("FATAL: error while accepting new client connection: %s\n", e.what());
> @@ -262,6 +261,8 @@
>      debugs(5, 5, HERE << "Listener: " << conn <<
>             " accepted new connection " << newConnDetails <<
>             " handler Subscription: " << theCallSub);
> +
> +    if (!isLimited) SetSelect(conn->fd, COMM_SELECT_READ, Comm::TcpAcceptor::doAccept, this, 0);
>      notify(flag, newConnDetails);
>  }

I am happy to commit the above (after splitting the new if-statement
into two lines), but since this is not an emergency, I would prefer to
wait a little for any objections or second opinions.

Thank you,

Alex.
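For the record, the two-line split requested above is purely cosmetic;
the committed form of the patched statement would read:

    if (!isLimited)
        SetSelect(conn->fd, COMM_SELECT_READ, Comm::TcpAcceptor::doAccept, this, 0);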