On Tue, Jul 25, 2017 at 7:49 PM, Ruediger Pluem <rpl...@apache.org> wrote:
>
>
> On 07/25/2017 03:32 PM, Yann Ylavic wrote:
>> On Tue, Jul 25, 2017 at 9:54 AM, Ruediger Pluem <rpl...@apache.org> wrote:
>>>
>>> On 07/25/2017 01:37 AM, Yann Ylavic wrote:
>>>>
>>>> This change may be disputable...
>>>>
>>>> I wanted to:
>>>> 1. avoid blocking in any case (we can't where this is called),
>>>> 2. avoid issueing a nonblocking "normal" close() which lets the system 
>>>> handle
>>>>    the burden of possibly leaking descriptors (until its own linger is 
>>>> done).
>>>
>>> What is the real use case for 2.?
>>
>> There are three cases where abort_socket_nonblocking() can be called:
>>
>> a. close_worker_sockets(), as of ungraceful shutdown/restart, to kill sockets
>>    used by all the workers),
>>
>> b. stop_lingering_close() to stop reading definitely from the socket when the
>>    linger timeout in place expires,
>>
>> c. some push2worker() failure, we can't handle a new connection (or in write
>>    completion state) because of an internal error (a graceful is scheduled 
>> now
>>    since this commit).
>>
>> For a. I think RST is the best we can do since we are forcibly
>> terminating active connections (possibly many), which we know nothing
>> about (flush/writ-ability), and for which normal close() could block
>> for an indefinite time (either in mpm_event if the socket is kept
>> blocking, or in the system if it's set nonblocking).
>>
>> For b. the socket is already shutdown(), we lingered with the
>> appropriate timeout, it's time to RST further data (a normal close()
>> would be the same I think).
>>
>> For c. we haven't read anything from a new connection or write
>> completion is now ready, in both cases the socket is writable (a new
>> connection is unlikely acceptable with a zero window) so we could
>> close() normally afterall.
>> Still we abort the connection outside the underlying protocol (HTTP or
>> else), knowingly with pending data in or out.
>> I think a normal close() could be misinterpreted in the write case (as
>> a usual end of connection, think non-HTTP/upgraded data in async),
>> while it would be equivalent for the newly accepted connection case
>> (if we close with data already received and ack-ed, the system will
>> RST by itself).
>>
>> So possibly these are all legitimate RST cases (we do like TIME_WAITs,
>> but it seems not reachable with any of the above points probably I
>> think), but I may be missing something...
>>
>
> I am still not convinced why we couldn't leave that job of
> closing down the sockets (non blocking from application perspective) to the 
> OS.

It could be considered a leak, the application (httpd) doesn't take
care of its resources and increases the number of orphaned sockets
which the system has to handle anyway (tcp memory increases and can
reach some limits, system has to be fine tuned to support more
orphans, ...).
Ungraceful restarts can be costly in that regard, new generation may
have issues creating sockets or performing if system (TCP) resources
become short.

> The best argument might be the case where we are in write completion
> and close the connection regularly and this could be perceived on client
> side as having received the full response. OTOH when do we sent replies
> these days that are neither transfer encoded chunked or have a content-length
> header (to talk about the HTTP layer)? So this case does not seem to be 
> strong either.

trunk can already go async in mod_proxy_wstunnel for example, this is
not HTTP anymore there after the Upgrade.

> So I am personally -0.5 on that part of the commit, but I am happy to
> hear what others think.

No strong opinion to push that change either, I tend to think that
it's the right thing to do.
But 2.4.x is currently blocking in the a. case above, so it likely
happens that the child processes terminate before all active
connections are effectively closed, hence orphan sockets already exist
today I think, which is not a raised issue AFAICT.
b. and c. are more sporadic, so the system can probably handle their load.

Still the pathological case of any of [abc] above (multilpe successive
ungraceful shutdowns, many connections expiring in lingering close,
...) is something httpd should care about IMHO.


Regards,
Yann.

Reply via email to