Re: [PATCH] Comm::TcpAcceptor::doAccept fd limit handling is broken

2014-07-17 Thread Alex Rousskov
On 07/16/2014 06:46 PM, Amos Jeffries wrote:
> On 17/07/2014 11:10 a.m., Alex Rousskov wrote:
>> On 07/16/2014 02:38 PM, Rainer Weikusat wrote:
>>> Alex Rousskov  writes:
>>>> On 07/16/2014 11:11 AM, Rainer Weikusat wrote:
>>>>> This is broken because it re-enables readable notifications even if no
>>>>> connection was accepted.
>>
>>>> In other words, it re-enables a check for new connections when
>>>>
>>>> * we know that we currently cannot accept any new connections,
>>>> * Squid is under stress already, and
>>>> * our code does not handle multiple deferrals of the same TcpAcceptor
>>>> object well.
>>
>>
>>> There's an important 4th point: The check is re-enabled when it is known
>>> that it will immediately return true again because the connection which
>>> was not accepted is still pending (hence, the socket is still readable),
>>> more details below.
>>
>> Agreed!
>>
>>
>>> I changed the TcpAcceptor/AcceptLimiter code to act as if
>>> only a single file descriptor was available for client connections and
>>> created two connections via netcat. What it should do in this case is to
>>> ignore the second connection until the first one was closed. Instead, it
>>> went into an endless 'defer this connection' loop, as mentioned above,
>>> because the pending 2nd connection meant the socket remained readable.
>>
>> OK, I think I now better understand what you meant by "one connection at
>> the same time". And even if I do not, we seem to agree on the fix to
>> solve the problem, which is even more important :-).
>>
>>
>>> updated patch:
>>>
>>> --- mad-squid/src/comm/TcpAcceptor.cc   8 Jan 2013 17:36:44 -   1.1.1.2
>>> +++ mad-squid/src/comm/TcpAcceptor.cc   16 Jul 2014 20:36:35 -
>>> @@ -204,7 +204,6 @@
>>>  } else {
>>>  afd->acceptNext();
>>>  }
>>> -SetSelect(fd, COMM_SELECT_READ, Comm::TcpAcceptor::doAccept, afd, 0);
>>>  
>>>  } catch (const std::exception &e) {
>>>  fatalf("FATAL: error while accepting new client connection: %s\n", e.what());
>>> @@ -262,6 +261,8 @@
>>>  debugs(5, 5, HERE << "Listener: " << conn <<
>>> " accepted new connection " << newConnDetails <<
>>> " handler Subscription: " << theCallSub);
>>> +
>>> +if (!isLimited) SetSelect(conn->fd, COMM_SELECT_READ, Comm::TcpAcceptor::doAccept, this, 0);
>>>  notify(flag, newConnDetails);
>>>  }
>>
>>
>> I am happy to commit the above (after splitting the new if-statement
>> into two lines), but since this is not an emergency, I would prefer to
>> wait a little for any objections or second opinions.
> 
> Looks like a reasonable temporary solution to me for the stable
> releases. I have no problems with it going in if we continue to work on
> the queue fixes for trunk.


Hi Amos,

Since I cannot promise to satisfy your "work on the queue fixes for
trunk" precondition, committing this fix to trunk (or removing the
precondition) becomes your call.


> Regarding that queue I think it can be cleaned up once and for all by
> making doAccept() a proper Call and using the new comm/Read.h API.

Yes, but with some adjustments:

Calling acceptOne() or a similar method asynchronously is indeed the
right way to do this.

However, the method that AcceptLimiter will call should not use the
comm/Read API unless AcceptLimiter is willing to provide the same
arguments that a regular read callback gets. I do not think this
complication (and a slight lie that the socket was just selected for
reading) is worth it. We should just call a parameterless TcpAcceptor
method. acceptOne() may be directly usable for this purpose.
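
To illustrate, a minimal sketch (the resumeAcceptor() helper name is made
up for illustration; it assumes acceptOne() is made accessible to the
caller, and uses the existing dialer/scheduling helpers from
src/base/AsyncJobCalls.h and src/base/AsyncCall.h):

    // Sketch only: resume a deferred acceptor by scheduling a parameterless
    // async call to TcpAcceptor::acceptOne() instead of faking a read event.
    static void
    resumeAcceptor(Comm::TcpAcceptor *acceptor)
    {
        typedef NullaryMemFunT<Comm::TcpAcceptor> AcceptDialer;
        AsyncCall::Pointer call = JobCallback(5, 5, AcceptDialer,
                                              acceptor, Comm::TcpAcceptor::acceptOne);
        ScheduleCallHere(call);
    }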

doAccept() provides okToAccept() checks, which are harmful in this
we-have-already-decided-to-resume-listening context, as well as
replacements for job protections that the async calls API already
offers. If doAccept() otherwise remains as it is, though, we should
copy its fatalf()-related handling into TcpAcceptor::callException().
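
Something along these lines, presumably (an assumed override, not
existing code; AsyncJob::callException() is the virtual hook the async
call machinery uses when a job method throws):

    // Sketch: preserve doAccept()'s fatal handling when acceptOne() is
    // reached via an async call and throws.
    void
    Comm::TcpAcceptor::callException(const std::exception &e)
    {
        fatalf("FATAL: error while accepting new client connection: %s\n",
               e.what());
        // not reached: fatalf() terminates Squid
    }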


> The queue becomes a std::queue 

Yes, or, if one worries about efficiency in this context, this can be
optimized further by reusing (and adjusting) AsyncCallQueue instead. I
probably would not worry about efficiency in this context, but let me
know if I should detail that optimized solution.
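
For the record, the simple (non-optimized) std::queue variant could be
as small as the sketch below. The queued element type was elided in the
quoted message above; AsyncCall::Pointer is only one plausible choice,
and the class shape is a reduced stand-in for Comm::AcceptLimiter, not
the current code:

    #include <queue>

    // Sketch: the limiter keeps a FIFO of resume calls instead of raw
    // TcpAcceptor pointers.
    class AcceptLimiter
    {
    public:
        // remember a call that will resume a deferred acceptor
        void defer(const AsyncCall::Pointer &resumeCall) { deferred_.push(resumeCall); }
        // schedule the next queued resume call, if any (sketched further below)
        void kick();
    private:
        std::queue<AsyncCall::Pointer> deferred_; // queued resume calls
    };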


> and the nasty
> removeDead() drops in favour of a TcpAcceptor internal Call::cancel().

Remembering the call for the sole purpose of canceling it is harmful
when such cancellations are rare: The async calls API already provides
protection from calling dead/gone jobs, and we should just use that in
this case.
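
Continuing the sketch above, kick() then needs neither removeDead() nor
an explicit Call::cancel(); a scheduled call whose TcpAcceptor job has
already gone away is refused by the dialer and becomes a cheap no-op:

    // Sketch only: pop one queued resume call per freed slot and schedule it.
    void
    AcceptLimiter::kick()
    {
        if (deferred_.empty())
            return;
        AsyncCall::Pointer call = deferred_.front();
        deferred_.pop();
        ScheduleCallHere(call); // dead-job calls are simply not dialed
    }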

Another fix, if somebody is going to work on queue optimization, is the
removal of isLimited: AFAICT, after all the changes, a "limited"
acceptor cannot have a read scheduled, and an acceptor with a read
scheduled cannot be limited.


HTH,

Alex.



Re: [PATCH] Comm::TcpAcceptor::doAccept fd limit handling is broken

2014-07-16 Thread Amos Jeffries
On 17/07/2014 11:10 a.m., Alex Rousskov wrote:
> On 07/16/2014 02:38 PM, Rainer Weikusat wrote:
>> Alex Rousskov  writes:
>>> On 07/16/2014 11:11 AM, Rainer Weikusat wrote:
>>>> This is broken because it re-enables readable notifications even if no
>>>> connection was accepted.
> 
>>> In other words, it re-enables a check for new connections when
>>>
>>> * we know that we currently cannot accept any new connections,
>>> * Squid is under stress already, and
>>> * our code does not handle multiple deferrals of the same TcpAcceptor
>>> object well.
> 
> 
>> There's an important 4th point: The check is re-enabled when it is known
>> that it will immediately return true again because the connection which
>> was not accepted is still pending (hence, the socket is still readable),
>> more details below.
> 
> Agreed!
> 
> 
>> I changed the TcpAcceptor/AcceptLimiter code to act as if
>> only a single file descriptor was available for client connections and
>> created two connections via netcat. What it should do in this case is to
>> ignore the second connection until the first one was closed. Instead, it
>> went into an endless 'defer this connection' loop, as mentioned above,
>> because the pending 2nd connection meant the socket remained readable.
> 
> OK, I think I now better understand what you meant by "one connection at
> the same time". And even if I do not, we seem to agree on the fix to
> solve the problem, which is even more important :-).
> 
> 
>> updated patch:
>>
>> --- mad-squid/src/comm/TcpAcceptor.cc   8 Jan 2013 17:36:44 -   1.1.1.2
>> +++ mad-squid/src/comm/TcpAcceptor.cc   16 Jul 2014 20:36:35 -
>> @@ -204,7 +204,6 @@
>>  } else {
>>  afd->acceptNext();
>>  }
>> -SetSelect(fd, COMM_SELECT_READ, Comm::TcpAcceptor::doAccept, afd, 0);
>>  
>>  } catch (const std::exception &e) {
>>  fatalf("FATAL: error while accepting new client connection: %s\n", e.what());
>> @@ -262,6 +261,8 @@
>>  debugs(5, 5, HERE << "Listener: " << conn <<
>> " accepted new connection " << newConnDetails <<
>> " handler Subscription: " << theCallSub);
>> +
>> +if (!isLimited) SetSelect(conn->fd, COMM_SELECT_READ, Comm::TcpAcceptor::doAccept, this, 0);
>>  notify(flag, newConnDetails);
>>  }
> 
> 
> I am happy to commit the above (after splitting the new if-statement
> into two lines), but since this is not an emergency, I would prefer to
> wait a little for any objections or second opinions.

Looks like a reasonable temporary solution to me for the stable
releases. I have no problems with it going in if we continue to work on
the queue fixes for trunk.

Regarding that queue I think it can be cleaned up once and for all by
making doAccept() a proper Call and using the new comm/Read.h API. The
queue becomes a std::queue and the nasty
removeDead() drops in favour of a TcpAcceptor internal Call::cancel().
No more Job raw pointers!
NP: this change will not fix the stable releases due to the missing API.
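
Roughly, the wiring could look like this (an untested sketch: it assumes
doAccept() is turned into a job method taking CommIoCbParams and uses
the AsyncCall-based registration from comm/Read.h; exact dialer and
parameter types would need checking):

    // Sketch, inside a TcpAcceptor method with 'conn' being the listening
    // connection: arm accept-readiness with a proper AsyncCall instead of
    // the legacy SetSelect() C-style handler.
    typedef CommCbMemFunT<Comm::TcpAcceptor, CommIoCbParams> AcceptDialer;
    AsyncCall::Pointer call = JobCallback(5, 5, AcceptDialer,
                                          this, Comm::TcpAcceptor::doAccept);
    Comm::Read(conn, call);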

Amos


Re: [PATCH] Comm::TcpAcceptor::doAccept fd limit handling is broken

2014-07-16 Thread Alex Rousskov
On 07/16/2014 02:38 PM, Rainer Weikusat wrote:
> Alex Rousskov  writes:
>> On 07/16/2014 11:11 AM, Rainer Weikusat wrote:
>>> This is broken because it re-enables readable notifications even if no
>>> connection was accepted. 

>> In other words, it re-enables a check for new connections when
>>
>> * we know that we currently cannot accept any new connections,
>> * Squid is under stress already, and
>> * our code does not handle multiple deferrals of the same TcpAcceptor
>> object well.


> There's an important 4th point: The check is re-enabled when it is known
> that it will immediately return true again because the connection which
> was not accepted is still pending (hence, the socket is still readable),
> more details below.

Agreed!


> I changed the TcpAcceptor/AcceptLimiter code to act as if
> only a single file descriptor was available for client connections and
> created two connections via netcat. What it should do in this case is to
> ignore the second connection until the first one was closed. Instead, it
> went into an endless 'defer this connection' loop, as mentioned above,
> because the pending 2nd connection meant the socket remained readable.

OK, I think I now better understand what you meant by "one connection at
the same time". And even if I do not, we seem to agree on the fix to
solve the problem, which is even more important :-).


> updated patch:
> 
> --- mad-squid/src/comm/TcpAcceptor.cc 8 Jan 2013 17:36:44 -   1.1.1.2
> +++ mad-squid/src/comm/TcpAcceptor.cc 16 Jul 2014 20:36:35 -
> @@ -204,7 +204,6 @@
>  } else {
>  afd->acceptNext();
>  }
> -SetSelect(fd, COMM_SELECT_READ, Comm::TcpAcceptor::doAccept, afd, 0);
>  
>  } catch (const std::exception &e) {
>  fatalf("FATAL: error while accepting new client connection: %s\n", e.what());
> @@ -262,6 +261,8 @@
>  debugs(5, 5, HERE << "Listener: " << conn <<
> " accepted new connection " << newConnDetails <<
> " handler Subscription: " << theCallSub);
> +
> +if (!isLimited) SetSelect(conn->fd, COMM_SELECT_READ, Comm::TcpAcceptor::doAccept, this, 0);
>  notify(flag, newConnDetails);
>  }


I am happy to commit the above (after splitting the new if-statement
into two lines), but since this is not an emergency, I would prefer to
wait a little for any objections or second opinions.
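
That is, the statement added by the patch would simply become:

    if (!isLimited)
        SetSelect(conn->fd, COMM_SELECT_READ, Comm::TcpAcceptor::doAccept, this, 0);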


Thank you,

Alex.