On Mon, Aug 13, 2012 at 8:32 AM, Joe Orton <jor...@redhat.com> wrote:
> On Fri, Aug 10, 2012 at 01:31:07PM -0400, Jeff Trawick wrote:
>> We picked up that apr_socket_opt_set() from the async-dev branch with
>> r327872, though the timeout calls in there were changed subsequently.
>> I wonder if that call is stray and it doesn't get along with the
>> timeout handling on Windows because of the SO_RCVTIMEO/SO_SNDTIMEO
>> use on Windows, in lieu of non-blocking socket + poll like on Unix.
>>
>> At the time it was added, the new code was
>>
>> apr_socket_timeout_set(client_socket, 0);
>> apr_socket_opt_set(client_socket, APR_SO_NONBLOCK, 1);
>>
>> (redundant unless there was some APR glitch at the time)
>
> Hmmmm, is this right?
>
> For event, the listening socket, and hence the accepted socket, is
> always set to non-blocking in the MPM.
>
> For non-event on Unix, the listening socket, and hence the accepted
> socket, is set to non-blocking IFF there are multiple listeners.

But the underlying descriptor for the accepted socket gets set to
non-blocking soon after accept() (see below).

>
> So that line is not redundant in the non-event, single listener
> configuration on Unix... no?
>
> Regards, Joe

Background:  With the APR socket implementation on Unix, the
underlying socket descriptor is placed in non-blocking mode when the
application (httpd) sets a timeout >= 0.
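
To illustrate (a rough sketch, not the actual APR source; the real
logic is spread across APR's socket option and timeout code), the
effect on the descriptor is approximately:

#include <fcntl.h>

/* sketch only: any timeout >= 0 flips the fd to O_NONBLOCK, and the
 * timeout is then emulated with poll(); a negative timeout means a
 * plain blocking descriptor */
static int sketch_timeout_set(int fd, long timeout_usec)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    if (timeout_usec >= 0)
        flags |= O_NONBLOCK;
    else
        flags &= ~O_NONBLOCK;
    return fcntl(fd, F_SETFL, flags);
}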

For a single listener worker configuration with mod_reqtimeout
disabled and the default Timeout of 60 seconds, here's what happens
w.r.t. blocking and timeouts on the connected sockets once
core_create_conn() is entered.  This is with the bug present (i.e.,
with the needless apr_socket_opt_set()).  At the time
core_create_conn() is entered, the apr_socket_t has timeout -1 and the
only option set is APR_TCP_NODELAY.

* core_pre_connection() sets timeout to 60000000 (this makes the
underlying socket non-blocking on Unix)
* SSL needs to read, but first it calls flush, and
ap_core_output_filter() makes the APR socket non-blocking with
apr_socket_opt_set() (the bug, according to me ;) )
* also called from SSL flush, writev_nonblocking() sets timeout to 0
(that also ensures that the underlying socket is non-blocking; see
the sketch after this list)
* writev_nonblocking() restores the timeout (60000000)
* also called from SSL flush, writev_nonblocking() sets timeout to 0
and then restores the timeout (60000000)
* in some complicated flow from default_handler through SSL,
writev_nonblocking() sets timeout to 0 and then restores the timeout
* lingering_close() sets timeout to 2000000
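
For reference, the set/restore pattern in the writev_nonblocking()
steps above looks roughly like this (a sketch with approximate names;
the real code lives in httpd's core output filter):

#include <sys/uio.h>
#include "apr_network_io.h"

static apr_status_t writev_nonblocking_sketch(apr_socket_t *s,
                                              struct iovec *vec,
                                              apr_int32_t nvec)
{
    apr_status_t rv;
    apr_size_t bytes_written;
    apr_interval_time_t saved_timeout;

    apr_socket_timeout_get(s, &saved_timeout);
    apr_socket_timeout_set(s, 0);             /* non-blocking write */
    rv = apr_socket_sendv(s, vec, nvec, &bytes_written);
    apr_socket_timeout_set(s, saved_timeout); /* restore, e.g. 60000000 */
    return rv;
}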

So the underlying socket descriptor is still made non-blocking on Unix,
even before the bug is encountered, as part of the timeout
implementation.  And that call to mark the socket non-blocking with
apr_socket_opt_set() is out of sync with the rest of the code, which
sets the timeout to 0 or >0 as necessary.  (And of course it is out of
sync in a way that exposes the difference on Windows.)
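
For contrast, on Windows APR implements a positive timeout roughly
like this (a simplified sketch, not the actual APR code):

#include <winsock2.h>

/* sketch only: Windows keeps the socket *blocking* and lets the
 * kernel enforce the timeout via SO_RCVTIMEO/SO_SNDTIMEO */
static int sketch_timeout_set_win32(SOCKET sd, DWORD timeout_ms)
{
    if (setsockopt(sd, SOL_SOCKET, SO_RCVTIMEO,
                   (const char *)&timeout_ms, sizeof(timeout_ms)) != 0)
        return -1;
    return setsockopt(sd, SOL_SOCKET, SO_SNDTIMEO,
                      (const char *)&timeout_ms, sizeof(timeout_ms));
}

/* ... so a stray APR_SO_NONBLOCK (on Windows, roughly
 *        u_long one = 1;
 *        ioctlsocket(sd, FIONBIO, &one);
 * ) flips the descriptor to non-blocking behind the timeout
 * mechanism's back */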

Does that explanation work for you?

-- 
Born in Roswell... married an alien...
http://emptyhammock.com/
