On Mon, Sep 20, 2021 at 12:43 PM ste...@eissing.org <ste...@eissing.org> wrote:
>
> > Am 20.09.2021 um 12:27 schrieb Ruediger Pluem <rpl...@apache.org>:
> >
> > On 9/20/21 11:17 AM, ste...@eissing.org wrote:
> >>
> >>> Am 14.09.2021 um 13:43 schrieb Ruediger Pluem <rpl...@apache.org>:
> >>>
> >>>> --- httpd/httpd/trunk/server/mpm/event/event.c (original)
> >>>> +++ httpd/httpd/trunk/server/mpm/event/event.c Wed Jul  3 13:46:31 2019
> >>>> @@ -1153,10 +1153,11 @@ read_request:
> >>>>          else if (ap_filter_should_yield(c->output_filters)) {
> >>>>              pending = OK;
> >>>>          }
> >>>> -        if (pending == OK) {
> >>>> +        if (pending == OK || (pending == DECLINED &&
> >>>> +                              cs->pub.sense == CONN_SENSE_WANT_READ)) {
> >>>>              /* Still in WRITE_COMPLETION_STATE:
> >>>> -             * Set a write timeout for this connection, and let the
> >>>> -             * event thread poll for writeability.
> >>>> +             * Set a read/write timeout for this connection, and let the
> >>>> +             * event thread poll for read/writeability.
> >>>>              */
> >>>>              cs->queue_timestamp = apr_time_now();
> >>>>              notify_suspend(cs);
> >>>
> >>> Hm. Showing the following code lines from trunk for my next question:
> >>>
> >>>     update_reqevents_from_sense(cs, -1);
> >>>
> >>> If cs->pub.sense == CONN_SENSE_WANT_READ we subscribe to POLLIN only
> >>> on the pfd.
> >>>
> >>>     apr_thread_mutex_lock(timeout_mutex);
> >>>     TO_QUEUE_APPEND(cs->sc->wc_q, cs);
> >>>     rv = apr_pollset_add(event_pollset, &cs->pfd);
> >>>
> >>> If the read event triggers we will get back to the
> >>>
> >>>     if (cs->pub.state == CONN_STATE_WRITE_COMPLETION) {
> >>>         int pending = DECLINED;
> >>>
> >>> above. This means we try to flush the output queue if it is not empty,
> >>> and if it is empty after this we would fall through, set
> >>> cs->pub.state = CONN_STATE_CHECK_REQUEST_LINE_READABLE;, and then add
> >>> the pfd to the pollset again and poll again. This should trigger the
> >>> read event again and we would process the input. But if I see this
> >>> correctly we would need to poll twice to get the read data processed.
For CONN_STATE_WRITE_COMPLETION + CONN_SENSE_WANT_READ we indeed reach
here and will POLLIN, then once readable we come back with
CONN_STATE_WRITE_COMPLETION + CONN_SENSE_DEFAULT, so if pending != OK
we'll reach the below:

        else if (ap_run_input_pending(c) == OK) {
            cs->pub.state = CONN_STATE_READ_REQUEST_LINE;
            goto read_request;
        }

and thus ap_run_process_connection() directly.

> >>> OTOH if we do not manage to clear the output queue above we would add
> >>> the pfd to the pollset again, but this time for writing, and only once
> >>> all output has been processed would we take care of the reading.
> >>> Maybe we should take care of both?
> >>
> >> Hmm, but can it? In my understanding CONN_STATE_WRITE_COMPLETION is
> >> intended to flush the out buffers before putting in new work.
> >
> > But a single call does not need to flush all pending data from my
> > understanding, provided that no need to flush the data is found,
> > like too much in-memory data or too many EOR buckets, which would mean
> > that we have too many finished requests in the output queue.
> > Hence CONN_STATE_WRITE_COMPLETION could happen multiple times, only
> > writing pieces of the data in the output queue.

If pending == OK while we asked for CONN_SENSE_WANT_READ we indeed lose
CONN_SENSE_WANT_READ when coming back here after POLLOUT.

What about this patch (attached)? I tried to comment the
WRITE_COMPLETION state a bit too, because it's not only about writing...

Cheers;
Yann.
Index: server/mpm/event/event.c
===================================================================
--- server/mpm/event/event.c    (revision 1893073)
+++ server/mpm/event/event.c    (working copy)
@@ -999,7 +1001,7 @@ static void process_socket(apr_thread_t *thd, apr_
 {
     conn_rec *c;
     long conn_id = ID_FROM_CHILD_THREAD(my_child_num, my_thread_num);
-    int clogging = 0, from_wc_q = 0;
+    int clogging = 0;
     apr_status_t rv;
     int rc = OK;
 
@@ -1101,9 +1103,6 @@ read_request:
                 rc = OK;
             }
         }
-        else if (cs->pub.state == CONN_STATE_WRITE_COMPLETION) {
-            from_wc_q = 1;
-        }
     }
     /*
      * The process_connection hooks above should set the connection state
@@ -1146,20 +1145,28 @@ read_request:
         cs->pub.state = CONN_STATE_LINGER;
     }
 
+    /* CONN_STATE_WRITE_COMPLETION is a misnomer, this is actually the
+     * user driver polling state which can be POLLOUT but also POLLIN
+     * depending on cs->pub.sense. But even for CONN_SENSE_WANT_READ
+     * we want the outgoing pipe to be empty before POLLIN, so there
+     * will always be write completion first.
+     */
     if (cs->pub.state == CONN_STATE_WRITE_COMPLETION) {
-        int pending = DECLINED;
+        int pending, want_read = 0;
 
-        ap_update_child_status(cs->sbh, SERVER_BUSY_WRITE, NULL);
-
-        if (from_wc_q) {
-            from_wc_q = 0; /* one shot */
-            pending = ap_run_output_pending(c);
+        if (cs->pub.sense == CONN_SENSE_DEFAULT) {
+            /* If the user did not ask for READ or WRITE explicitly then
+             * we are in normal post request processing, i.e. busy write
+             * in the scoreboard.
+             */
+            ap_update_child_status(cs->sbh, SERVER_BUSY_WRITE, NULL);
         }
-        else if (ap_filter_should_yield(c->output_filters)) {
-            pending = OK;
+        else if (cs->pub.sense == CONN_SENSE_WANT_READ) {
+            want_read = 1;
         }
-        if (pending == OK || (pending == DECLINED &&
-                              cs->pub.sense == CONN_SENSE_WANT_READ)) {
+
+        pending = ap_run_output_pending(c);
+        if (pending == OK || (pending == DECLINED && want_read)) {
             /* Still in WRITE_COMPLETION_STATE:
              * Set a read/write timeout for this connection, and let the
              * event thread poll for read/writeability.
@@ -1167,7 +1174,19 @@ read_request:
             cs->queue_timestamp = apr_time_now();
             notify_suspend(cs);
 
+            /* If we are asked to poll for read but can't flush all the
+             * pending data first then we need to preserve
+             * CONN_SENSE_WANT_READ until the flush is complete, so that
+             * we finally make sure that the connection is readable
+             * (POLLIN) before processing it again.
+             */
+            if (pending == OK && want_read) {
+                cs->pub.sense = CONN_SENSE_WANT_WRITE;
+            }
             update_reqevents_from_sense(cs, -1);
+            if (pending == OK && want_read) {
+                cs->pub.sense = CONN_SENSE_WANT_READ;
+            }
+
             apr_thread_mutex_lock(timeout_mutex);
             TO_QUEUE_APPEND(cs->sc->wc_q, cs);
             rv = apr_pollset_add(event_pollset, &cs->pfd);