On 19/01/2013 12:19 p.m., Rainer Weikusat wrote:
Amos Jeffries <[email protected]> writes:
On 01/17/2013 02:53 PM, Rainer Weikusat wrote:
Alex Rousskov <[email protected]> writes:
On 01/17/2013 10:56 AM, Rainer Weikusat wrote:
Then, the ::connect
method (or some code running on behalf of that) would close
temporaryFd_ before scheduling a 'delayed connect retry' which would
open a new socket. A ::timeout running then won't need to free
write_data and won't have to cancel the early_abort call.
Agreed.
If the write handler was scheduled
by the time timout runs, this happened because 'something definite' is
known about the state of the connection attempt: Either, it suceeded
or the OS returned an error. Throwing this 'real information' away in
order to save a line of code seems not right to me even if it is a
fringe case of a fringe case.
As I said, one would not be just saving lines of code. One would be most
likely introducing bugs (that is exactly what happened in the past with
this code). We have at most two "real information" pieces: a timeout and
I/O status. The first to get to ConnOpener wins.
We have the problem that we do not know whether the future scheduled
::connect was scheduled with a success (usually is) or a IN_PROGRESS
which would re-start another in-progress event cycle. The Job won't
know that datum until that scheduled Call is run.
Presently, that's not much of a problem because the retry code is
broken since it doesn't really retry the connect but just re-schedules
the ::connect method call which will call comm_connect_addr with an
unchanged state which should (I haven't tested this) either return the
previous error or (more likely) return another 'false positive'
'connection succeeded' result which should ultimatively trigger the
same assert I wrote about in my first mail to this list.
After fixing this, the ::connect code could look at the calls_.timeout_
member to determine if it is still supposed to retry or if a timeout
happened in between and retries should stop.
If the timeout already happened then doneConnecting() has already
stopped the Job and the connect() Call will not be run as Alex pointed out.
calls_.* are local pointers held by ConnOpener for the sole purpose of
calling cancel() when necessary. They are *not* unset when the event
system of Squid schedules the Call. There is a separate pointer held in
the FD system (fd_table[].*Handler) which is cleared when the Call is
scheduled by event system or unset via Comm::SetSelect. This
Comm::ConnOpener layer is not exactly a good place to be playing with
those directly (the write_data is a problem hack and needs to die ASAP,
but not yet).
NP: We have already agreed to throw away the later arriving or bogus
'connection succeeded' information if timeout() wins the Call queue race
and uses doneConnecting().
Amos