Re: [PATCH] ConnOpener fixes

Amos Jeffries Wed, 30 Jan 2013 19:18:37 -0800

On 29/01/2013 6:32 a.m., Alex Rousskov wrote:

Hello,


     The attached patch fixes several ConnOpener problems by relying on
AsyncJob protections while maintaining a tighter grip on various I/O and
sleep states. This patch is identical to the last ConnOpener patch
posted in PREVIEW state. I am now done with valgrind testing. No leaks
were detected, although I doubt my tests were comprehensive enough to
trigger all possible conditions where the unpatched code leaks.

I started with Rainer Weikusat's timeout polishing patch posted a few
days ago, but all bugs are mine.


Here are some of the addressed problems:

* Connection descriptor was not closed when attempting to reconnect
after failures. We now properly close on failures, sleep with descriptor
closed, and then reopen.

* Timeout handler was not cleaned up properly in some cases, causing
memory leaks (for the handler Pointer) and possibly timeouts that were
fired (for then-active handler) after the connection was passed to the
initiator.

* Comm close handler was not cleaned up properly.

* statCounter.syscalls.sock.closes counter was not updated on FD closure.

* Waiting pending accepts were not kicked on FD closure.

* Connection timeout was enforced for each connection attempt instead of
applying to all attempts taken together.

and possibly other problems. The full extent of all side-effects of
mishandled race conditions and state conflicts is probably unknown.


TODO (outside this project scope):
   Polish comm_close() to always reset "Select" state.
   Make connect_timeout documentation in squid.conf less ambiguous.
   Move prevalent conn_ debugging to the status() method?
   Polish Comm timeout handling to always reset .timeout on callback?
   Revise eventDelete() to delete between-I/O sleep timeout?


HTH,

Alex.


One performane optimization I can see:

* if the callback_ is != NULL but has been cancelled by the receivingJob. Your code now calls sendAnswer() and schedules the alreadycancelled Call.Previously this was tested for in swanSong() to simply NULL thecallback_ pointer instead of scheduling a dead Call.NP: it would probably be a good idea to test for thecallback_->cancelled() case and debugs() report it instead of scheduling.

The result of omitting this check saves one if() test per connection byadding the full overheads of memory, scheduling, and CPU for oneAsyncCall queue operation per connection. ConnOpener operates on theserver-side where multiple client-side Jobs may result in TIMEOUT orABORTED. This cancellation case is a small but common case.



+1. Other than the above performance tweak this looks fine for commit.


Amos

Re: [PATCH] ConnOpener fixes

Reply via email to