On 18/11/2011 11:51 a.m., Alex Rousskov wrote:
Hello,

     Closed bug 3190, open bug 3420, open bug 3417, and several
unreported problems deal with stuck requests, early responses, aborted
requests/responses and combination of those. The attached patch attempts
to solve many of these mushrooming problems, but it is a significant
change that may introduce other bugs. Please review.

The patch is against v3.1 because most sufferers are running that
version. I will port changes to trunk if approved. Technical details are
copied below from the patch preamble.

Fine. The change itself appears very simple. How much testing has it been through? When this goes in it will be enough to push out a new 3.1 release, so we have a week to get some solid production use feedback.



Before these changes, the client side used a single "closing" state to
handle two different error conditions:

   1. We stopped receiving request body because of some error.
   2. We stopped sending response because of some error.

When a "directional" error occurred, we try to keep the transaction
going in the other direction (e.g., to give ICAP the entire request or
to give HTTP client the entire response). However, because there was
just one "closing" state, the code failed to correctly detect or process
many corner cases, resulting in stuck transactions and !theConsumer
assertions/exceptions due to races between enableAutoConsumption() and
expectNoConsumption() calls.

This patch replaces the "closing" state with two direction-specific "we
stopped sending/receiving" flags.

Now, when the response sending code is done, it now checks whether the
receiving code stopped and closes the connection as needed. This is done
both when we encounter a sending error
(ClientSocketContext::initiateClose) and when we successfully sent the
entire response to the client (ClientSocketContext::keepaliveNextRequest).

Similarly, when the request body reading code is done, it now checks
whether the receiving code stopped and closes the connection as needed.
This is done both when we encounter a receiving error
(ConnStateData::noteBodyConsumerAborted) and when we successfully
receive the entire request body from the client
(ClientSocketContext::writeComplete).

TODO: This patch focuses on various error cases. We might still have
problems when there is an early HTTP response and no errors of any kind.
I marked the corresponding old code with an XXX.

I'm not sure what you are thinking here. Early response with a keep-alive should not be hitting the close / stopSending() code path unless its an error. And early response with a Connection:close leaving the clients side (because server-side pconn is irrelevant) can be closed fully just fine after the write. Do we need any clean and purge logics to empty the pipeline without errors being set on that early close case?

The edge case is likely to be ClientSocketContext::keepaliveNextRequest() where pinning closes the conn. That is one error case and could potentially lead to problems. I suspect that it should be doing stopSending("pinned server gone") before it does comm_close(). Do you agree?

For the commit. Please rename ClientSocketContext::initiateClose to closeOnSendError() or something clearer. Or remove entirely. It is only used a few times in client_side.cc


Amos

Reply via email to