Re: client_side and comm_close

2008-04-21 Thread Amos Jeffries

Alex Rousskov wrote:

On Mon, 2008-04-21 at 15:48 +1200, Amos Jeffries wrote:

comm_close(fd) API:

1) No I/O callbacks will be dialed after comm_close is called (including
the time when comm_close is running).


Sounds good.


2) All close callbacks registered before comm_close was called will be
called asynchronously, sometime after comm_close exits.


Sounds good.


3) The code can ask Comm whether a FD associated with a close callback
has been closed. The answer will be yes after comm_close is called and
no before that. This interface needs to be added. Direct access to
fd_table[fd].flags will not work because the same FD could have been
assigned to another connection already. The code will have to use its
close callback to determine FD status.

Sounds good, BUT, direct access to fd_table pieces may need to be blocked
entirely (private:) so code is forced to go through the Comm API properly.


Yes, except if we want to avoid modifying old code that can still access
those flags directly because it gets immediate close callbacks (more on
that below).


(2) states that the higher-level close callbacks may be run at any time,
i.e. after the callback (3) refers to is run. This leaves a window open for
disaster, unless the closing callbacks are made immediate, and back we go
to recursion...


Those are the same close callbacks! There are no low-level and
high-level close callbacks here. The external code can use its stored
callback pointer to get FD status even after that close callback has
been called. There is no problem with that. The callback will not look
at fd_table in that case, it will just say yes, the fd is closed as far
as you should be concerned.

And, per recent suggestions, old code will get immediate close callbacks
so it does not need to be modified to use the close callback pointer to
ask about FD status.


Alright. I got to this branch of the thread before the other one, which 
makes things clearer. Same for the below.





4) Registering any callback after comm_close has been called is wrong.
Comm will assert if it can detect such late registrations. Not all late
registrations will be detected because the same FD can be already
assigned to a different (new) connection[*].

That non-detection seems to me to be a worry. The same problem as in (3)
occurs here. (4) can guarantee that the closing callbacks don't play nasty
re-registrations, but only if they are called immediately instead of
scheduled.


Sorry, I do not understand. The late registration of a close callback
can come from anywhere. The old code relies on fd only. There is no way
for comm to distinguish whether the caller is using a FD from a closed
connection or a new one. This problem exists in the old code as well so
there is no new danger here! This problem can only be solved when we
have comm handlers of one sort or another.


The above comm_close API is easy to implement without massive code
changes. Do you think it is the right API? If not, what should be
changed?

Apart from the worry with immediate vs delayed closing callbacks.

To reduce that worry somewhat, I think the callbacks which actually use
ERR_COMM_CLOSING for anything other than an immediate abort will need to be
replaced with two: a normal callback that checks the FD is open, and a
simple closing callback.


I am confused by the "normal callback that checks the FD is open" part.
What is that for?  Are you talking about some internal comm calls to
close the FD? I believe that handler code that currently does not ignore
ERR_COMM_CLOSING notifications will need to be moved into the close
handler code (because only the close handler will be called if we
eliminate ERR_COMM_CLOSING).


Never mind. I was saying it wrong, but meant what you have re-stated.

Amos
--
Please use Squid 2.6.STABLE19 or 3.0.STABLE4


Re: client_side and comm_close

2008-04-21 Thread Amos Jeffries

Alex Rousskov wrote:

On Mon, 2008-04-21 at 16:02 +1200, Amos Jeffries wrote:

On Sun, 2008-04-20 at 22:01 +0300, Tsantilas Christos wrote:

Maybe it can be easier:

The ease of implementation is a separate question. We still have to
agree what we are implementing, and the items below attempt to define
that. Ambiguity and lack of clear APIs is quite dangerous here (as the
related bugs illustrate!).


Just keep comm_close as is, and have AsyncCallsQueue cancel (not
execute) asyncCalls for which fd_table[fd].flags.closing
is set.

Implementing #1 below can indeed be as simple as you sketch above. We
need to make sure that the final call that closes the FD and makes fd
value available to other connections is placed last (that may already be
true and is easy to fix if not).

Not true.

IMO, the proposed API's _very_ first two lines of code in comm_close are to
register a special Comm callback to perform the fclose() call, and then to
immediately set the fd_table closed flag for the rest of the comm_close
process.


Agreed on the flag, disagreed on the call. The special/internal Comm
call (to self) should be scheduled last (after all close callbacks) and
not first because the close handler might need access to some FD-related
info. That info should be preserved until all close handlers have been
called.


With that condition at the start we can guarantee that any registrations
made during the close sequence are either non-FD-relevant or caught.


Yes, the flag is sufficient for that. The internal close-for-good call
can still be last.


I was thinking the close-for-good call would get caught as an fd 
operation on a closed fd by the Async stuff, if set after the flag.





The special Comm callback is only special in that it is not required to
check the open flag before fclose()'ing the system-level FD, which will allow
new connections to open on the FD.


It is special because it is calling an internal comm method not some
external close handler. Its profile and parameters are different.  I
would not call it a callback because of that, but the label is not
important. It is not a close callback in terms of
comm_add_close_handler.


Yet it seems to me it needs to be run as an async call after the other async 
close-handlers are run, due to the non-determinism of the async timing.



Between the initial comm_close() call and the special Comm callback, we
don't need to care if callbacks write their shutdown statements to the fd
(it's still technically open), but the closed flag prevents comm accepting
any new delayed event registrations or reads.


Exactly. Our only problem is with code that calls comm with a stale fd,
after that fd has been really closed and a new connection was opened
with the same fd. That's not a new problem and we will solve it in v3.2
using comm handlers.

I hope the above puts us on the same page about the implementation
sketch for the comm_close API, but please yell if something still seems
broken.


Okay. It looks the way I would implement it from scratch.

Amos
--
Please use Squid 2.6.STABLE19 or 3.0.STABLE4


Re: client_side and comm_close

2008-04-21 Thread Alex Rousskov
On Mon, 2008-04-21 at 23:45 +1200, Amos Jeffries wrote:
  IMO, the proposed API's _very_ first two lines of code in comm_close are to
  register a special Comm callback to perform the fclose() call, and then to
  immediately set the fd_table closed flag for the rest of the comm_close
  process.
  
  Agreed on the flag, disagreed on the call. The special/internal Comm
  call (to self) should be scheduled last (after all close callbacks) and
  not first because the close handler might need access to some FD-related
  info. That info should be preserved until all close handlers have been
  called.
  
  With that condition at the start we can guarantee that any registrations
  made during the close sequence are either non-FD-relevant or caught.
  
  Yes, the flag is sufficient for that. The internal close-for-good call
  can still be last.
 
 I was thinking the close-for-good call would get caught as an fd 
 operation on a closed fd by the Async stuff, if set after the flag.

Actually, the internal close-for-good call handler asserts that the flag
is set! This internal handler does not have the structure or the
restrictions of the public comm_* functions...

  The special Comm callback is only special in that it is not required to
  check the open flag before fclose()'ing the system-level FD, which will allow
  new connections to open on the FD.
  
  It is special because it is calling an internal comm method not some
  external close handler. Its profile and parameters are different.  I
  would not call it a callback because of that, but the label is not
  important. It is not a close callback in terms of
  comm_add_close_handler.
 
 Yet it seems to me it needs to be run as an async call after the other async 
 close-handlers are run, due to the non-determinism of the async timing.

Yes, it must be called asynchronously. The call is scheduled after all
close callbacks are scheduled so that it will be dialed last.
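
In rough code, that ordering is something like this sketch (hypothetical
names and modern C++ for brevity; the real Squid identifiers differ):

    #include <cassert>
    #include <functional>
    #include <queue>
    #include <vector>

    struct FdEntry {
        bool closing = false;                     // set synchronously by comm_close
        std::vector<std::function<void()>> closeHandlers;
    };

    FdEntry fdEntries[1024];
    std::queue<std::function<void()>> asyncQueue; // dialed later by the event loop

    void commCloseForGood(int fd) {
        assert(fdEntries[fd].closing);            // internal call: flag already set
        // ::close(fd) happens here; only now may the OS reuse the fd value
    }

    void comm_close(int fd) {
        fdEntries[fd].closing = true;             // first thing, synchronously
        for (const auto &handler : fdEntries[fd].closeHandlers)
            asyncQueue.push(handler);             // close handlers scheduled first
        asyncQueue.push([fd] { commCloseForGood(fd); }); // scheduled last
        // nothing is dialed here; comm_close returns immediately
    }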

Alex.




Re: client_side and comm_close

2008-04-20 Thread Alex Rousskov
On Sun, 2008-04-20 at 11:47 +0800, Adrian Chadd wrote:

   Yes, then you left it by introducing something which exposed the bugs.
  
  The problems this thread is discussing were known before AsyncCalls.
 
 But they weren't exhibiting as -bugs-.

They were: the assertion in bug 2309 (and many other problems I personally
worked on) was discovered in Squid3 way before AsyncCalls (v3.0 and
earlier).

   It's great that we now know about the bugs - but now you've got a codebase
   that you're trying to stabilise 
  
  This thread is not about stabilizing client-side code. It is about
  changing its key classes and the relationships among them (for many
  reasons). This will certainly destabilize the code short-term. If it
  would not, I would not need to ask others' opinion whether we should wait
  with that work!
 
 I thought the focus of 3.1 - and I'd check, but the Wiki history for the
 Roadmap stuff I put in early in the 3.1 roadmap cycle - was
 stability and performance.
...
 _I_ helped start building the 3.1 roadmap, if you remember. _I_ helped
 draft the notion that 3.1 should be about performance and stability
 fixes.

This thread is (was) dedicated to the discussion of whether _adjusting_ the
v3.1 roadmap to add client-side cleanup is a good idea. Since we have
more than one person working on the project, we should try to coordinate
significant changes.  If you think client-side cleanup is a bad idea for
v3.1, just post your arguments -- no reason to talk about AsyncCalls or
other unrelated matters (which can be discussed in other threads, of
course).

As for the original v3.1 focus that you put in early, I do not even want
to argue about that, and I am happy to assume that your original v3.1
roadmap was exactly what we should have done.

What matters, though, is the _current_ Squid3 roadmap, which we are
trying to base on specific commitments of specific folks. It is far from
perfect, but I think it is improving and reflects the current reality
reasonably well.

 If you want people to adopt Squid-3 then you should try and bring
 another stable release out with IPv6 (once all of -those- issues are
 fixed) and without too much quirkiness, and soon.

We already have a Squid3 release that is reasonably stable and
improving. I am not sure why we need to rush v3.1 out when the planned
features have not been completed yet. (And didn't you claim that IPv6
as a Squid feature is not really important? This thread is not about
IPv6 though.)

Alex.




Re: client_side and comm_close

2008-04-20 Thread Henrik Nordstrom
Sat 2008-04-19 at 21:01 -0600, Alex Rousskov wrote:

 I agree with many of your specific suggestions (and will discuss them
 separately), but I want to understand what problem you are proposing to
 solve first. For example:
 
 - Fix a few new bugs introduced by async calls.
 - Fix certain comm API problems.
 - Improve client-side code.
 - Something else.
 
 This thread was just about improving client-side code (more
 specifically, when to do it), but the responses seem to be all over
 the map, so I am trying to understand what the goal is before diving into
 specific development ideas.

I am focusing more on comm than client_side at the moment. Fixing
client_side without a well defined and reliable comm layer is not very
meaningful.

Regards
Henrik



Re: client_side and comm_close

2008-04-20 Thread Alex Rousskov
On Sun, 2008-04-20 at 09:22 +0200, Henrik Nordstrom wrote:
 Sat 2008-04-19 at 23:05 -0600, Alex Rousskov wrote:
  It is easy to implement two kinds of call scheduling, but I am not sure
  I understand the merits of doing so. What do you mean by "the event
  loops" in this context? Are you talking about eventAdd() queue? Or I/O
  event (i.e., comm) callbacks? I am guessing it is the latter.
 
 Both. The event checks (I/O, timed events etc) run from the main loop
 and are always the first AsyncCall callers for anything going on.
 
  The immediate calls from comm were not really safe (because of the
  recursive/reentrant behavior) so I am not sure what we would achieve
  here.
 
 As everything going on in Squid is initiated by the event loop you don't
 get recursion from this change..
 
 main loop - event loop - immediate async call - handler - other I/O
 actions etc but from here always async..

I agree regarding the first initiation point, but disagree that it is
sufficient to guarantee non-recursive behavior: In addition to
callbacks, we have direct function calls to comm and other APIs. The
combination of immediate callbacks and those function calls will cause
recursion as control will reenter the jobs multiple times. For
example:

event loop - immediate callback - handler - comm_* call - immediate
callback - handler - ...
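
As a toy sketch (hypothetical names; the depth guard exists only so the
example terminates, real handlers have no such cap):

    #include <iostream>

    int depth = 0;

    void comm_write_immediate();              // forward declaration

    void writeHandler() {
        if (++depth < 4)                      // guard is for this example only
            comm_write_immediate();           // the handler calls back into comm
    }

    void comm_write_immediate() {
        // with immediate dialing, the completion handler runs right here,
        // before the previous handler invocation has returned
        writeHandler();
    }

    int main() {
        comm_write_immediate();               // main -> comm -> handler -> comm -> ...
        std::cout << "handler nested to depth " << depth << "\n";
    }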

  Please clarify which of the following comm_close implementations would
  solve the problems you want to solve in this context:
  
  A) Immediately call all not-yet-dialed callbacks for the affected FD.
  All relevant I/O callbacks are called with ERR_COMM_CLOSING flag set (if
  we still have that). The closing callbacks, if any, are called last.
  
  B) Asynchronously call all not-yet-dialed callbacks for the affected FD.
  All relevant I/O callbacks are called with ERR_COMM_CLOSING flag set (if
  we still have that). The closing callbacks, if any, are called last.
  
  C) Other.
 
 Don't call them. See ERR_COMM_CLOSING discussion.
 
 The I/O calls are no longer relevant at the moment the fd has been
 semantically closed (comm_close called by the code). Only the close
 callbacks matter at that time.
 
 Note: The close callbacks as such need to be async, just not the closing
 of the fd (or at least at a semantic level).

OK, I think you picked option B then :-). Let me polish it to better
match the current level of understanding and the agreement on dropping
ERR_COMM_CLOSING:

comm_close(fd) API:

1) No I/O callbacks will be dialed after comm_close is called (including
the time when comm_close is running).

2) All close callbacks registered before comm_close was called will be
called asynchronously, sometime after comm_close exits.

3) The code can ask Comm whether a FD associated with a close callback
has been closed. The answer will be yes after comm_close is called and
no before that. This interface needs to be added. Direct access to
fd_table[fd].flags will not work because the same FD could have been
assigned to another connection already. The code will have to use its
close callback to determine FD status.

4) Registering any callback after comm_close has been called is wrong.
Comm will assert if it can detect such late registrations. Not all late
registrations will be detected because the same FD can be already
assigned to a different (new) connection[*].

The above comm_close API is easy to implement without massive code
changes. Do you think it is the right API? If not, what should be
changed?

[*] We can reliably detect #4 violations by replacing fd integers
outside of comm with a unique integer ID or, better, a handler object
containing metadata. That kind of change may be too big for v3.1 though.
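
In rough code, the handle idea is something like this sketch (all names
hypothetical):

    #include <cassert>

    struct CommHandle {
        int fd;
        unsigned generation;   // copied from the slot when the connection opens
    };

    struct FdSlot {
        unsigned generation;   // bumped each time the fd value is reused
        bool open;
    };

    FdSlot slots[1024];

    bool stillCurrent(const CommHandle &h) {
        return slots[h.fd].open && slots[h.fd].generation == h.generation;
    }

    void comm_add_close_handler(const CommHandle &h /*, handler */) {
        // a late registration through a stale handle is now always detectable,
        // even after the OS hands the same fd number to a new connection
        assert(stillCurrent(h));
    }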

Thank you,

Alex.




Re: client_side and comm_close

2008-04-20 Thread Tsantilas Christos

Maybe it can be easier:
Just keep comm_close as is, and have AsyncCallsQueue cancel (not
execute) asyncCalls for which fd_table[fd].flags.closing is set.
We only need a way to identify the comm_close handlers and the asyncCall
which will do the actual close of the socket.




Alex Rousskov wrote:

OK, I think you picked option B then :-). Let me polish it to better
match the current level of understanding and the agreement on dropping
ERR_COMM_CLOSING:

comm_close(fd) API:

1) No I/O callbacks will be dialed after comm_close is called (including
the time when comm_close is running).

2) All close callbacks registered before comm_close was called will be
called asynchronously, sometime after comm_close exits.

3) The code can ask Comm whether a FD associated with a close callback
has been closed. The answer will be yes after comm_close is called and
no before that. This interface needs to be added. Direct access to
fd_table[fd].flags will not work because the same FD could have been
assigned to another connection already. The code will have to use its
close callback to determine FD status.

4) Registering any callback after comm_close has been called is wrong.
Comm will assert if it can detect such late registrations. Not all late
registrations will be detected because the same FD can be already
assigned to a different (new) connection[*].

The above comm_close API is easy to implement without massive code
changes. Do you think it is the right API? If not, what should be
changed?

[*] We can reliably detect #4 violations by replacing fd integers
outside of comm with a unique integer ID or, better, a handler object
containing metadata. That kind of change may be too big for v3.1 though.

Thank you,

Alex.







Re: client_side and comm_close

2008-04-20 Thread Henrik Nordstrom
Sun 2008-04-20 at 22:19 +0300, Christos Tsantilas wrote:

 The important files are the CommCalls.* AsyncCalls.* and ICAP/AsyncJob.*
 But I believe that you understand it enough.

I think I understand most of the code now, but I am not entirely sure about
the design and intended use model yet..

Basic principle is simple and neat.

Regards
Henrik



Re: client_side and comm_close

2008-04-20 Thread Henrik Nordstrom
Sun 2008-04-20 at 22:56 -0600, Alex Rousskov wrote:

 - I hope we can avoid similar problems with comm_cancel_read because the
 two or three places using that call can be adjusted to cancel the read
 callback directly. We have already talked about that elsewhere.

It's not only comm_cancel_read but also commSetTimeout.

Regards
Henrik



Re: client_side and comm_close

2008-04-20 Thread Alex Rousskov
On Mon, 2008-04-21 at 15:48 +1200, Amos Jeffries wrote:
  comm_close(fd) API:
 
  1) No I/O callbacks will be dialed after comm_close is called (including
  the time when comm_close is running).
 
 Sounds good.
 
  2) All close callbacks registered before comm_close was called will be
  called asynchronously, sometime after comm_close exits.
 
 
 Sounds good.
 
  3) The code can ask Comm whether a FD associated with a close callback
  has been closed. The answer will be yes after comm_close is called and
  no before that. This interface needs to be added. Direct access to
  fd_table[fd].flags will not work because the same FD could have been
  assigned to another connection already. The code will have to use its
  close callback to determine FD status.
 
 Sounds good, BUT, direct access to fd_table pieces may need to be blocked
 entirely (private:) so code is forced to go through the Comm API properly.

Yes, except if we want to avoid modifying old code that can still access
those flags directly because it gets immediate close callbacks (more on
that below).

 (2) states that the higher-level close callbacks may be run at any time,
 i.e. after the callback (3) refers to is run. This leaves a window open for
 disaster, unless the closing callbacks are made immediate, and back we go
 to recursion...

Those are the same close callbacks! There are no low-level and
high-level close callbacks here. The external code can use its stored
callback pointer to get FD status even after that close callback has
been called. There is no problem with that. The callback will not look
at fd_table in that case, it will just say yes, the fd is closed as far
as you should be concerned.
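
In rough code, something like this sketch (hypothetical names, not the
actual CommCalls classes):

    // comm marks the stored call object when comm_close runs; the owner can
    // then query it at any time, before or after the call is dialed.
    class CloseCall {
    public:
        void noteCommClosed() { closed = true; } // set by comm inside comm_close
        bool fdClosed() const { return closed; } // never touches fd_table
    private:
        bool closed = false;
    };

    // usage in a job holding the pointer it passed to comm_add_close_handler:
    //   if (closer && closer->fdClosed())
    //       return; // the descriptor is gone as far as this job is concerned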

And, per recent suggestions, old code will get immediate close callbacks
so it does not need to be modified to use the close callback pointer to
ask about FD status.

  4) Registering any callback after comm_close has been called is wrong.
  Comm will assert if it can detect such late registrations. Not all late
  registrations will be detected because the same FD can be already
  assigned to a different (new) connection[*].
 
 That non-detection seems to me to be a worry. The same problem as in (3)
 occurs here. (4) can guarantee that the closing callbacks don't play nasty
 re-registrations, but only if they are called immediately instead of
 scheduled.

Sorry, I do not understand. The late registration of a close callback
can come from anywhere. The old code relies on fd only. There is no way
for comm to distinguish whether the caller is using a FD from a closed
connection or a new one. This problem exists in the old code as well so
there is no new danger here! This problem can only be solved when we
have comm handlers of one sort or another.

  The above comm_close API is easy to implement without massive code
  changes. Do you think it is the right API? If not, what should be
  changed?
 
 Apart from the worry with immediate vs delayed closing callbacks.
 
 To reduce that worry somewhat, I think the callbacks which actually use
 ERR_COMM_CLOSING for anything other than an immediate abort will need to be
 replaced with two: a normal callback that checks the FD is open, and a
 simple closing callback.

I am confused by the "normal callback that checks the FD is open" part.
What is that for? Are you talking about some internal comm calls to
close the FD? I believe that handler code that currently does not ignore
ERR_COMM_CLOSING notifications will need to be moved into the close
handler code (because only the close handler will be called if we
eliminate ERR_COMM_CLOSING).

Thanks,

Alex.




Re: client_side and comm_close

2008-04-20 Thread Alex Rousskov
On Mon, 2008-04-21 at 16:02 +1200, Amos Jeffries wrote:
  On Sun, 2008-04-20 at 22:01 +0300, Tsantilas Christos wrote:
  Maybe it can be easier:
 
  The ease of implementation is a separate question. We still have to
  agree what we are implementing, and the items below attempt to define
  that. Ambiguity and lack of clear APIs is quite dangerous here (as the
  related bugs illustrate!).
 
  Just keep comm_close as is, and have AsyncCallsQueue cancel (not
  execute) asyncCalls for which fd_table[fd].flags.closing
  is set.
 
  Implementing #1 below can indeed be as simple as you sketch above. We
  need to make sure that the final call that closes the FD and makes fd
  value available to other connections is placed last (that may already be
  true and is easy to fix if not).
 
 Not true.
 
 IMO, the proposed API's _very_ first two lines of code in comm_close are to
 register a special Comm callback to perform the fclose() call, and then to
 immediately set the fd_table closed flag for the rest of the comm_close
 process.

Agreed on the flag, disagreed on the call. The special/internal Comm
call (to self) should be scheduled last (after all close callbacks) and
not first because the close handler might need access to some FD-related
info. That info should be preserved until all close handlers have been
called.

 With that condition at the start we can guarantee that any registrations
 made during the close sequence are either non-FD-relevant or caught.

Yes, the flag is sufficient for that. The internal close-for-good call
can still be last.

 The special Comm callback is only special in that it is not required to
 check the open flag before fclose()'ing the system-level FD, which will allow
 new connections to open on the FD.

It is special because it is calling an internal comm method not some
external close handler. Its profile and parameters are different.  I
would not call it a callback because of that, but the label is not
important. It is not a close callback in terms of
comm_add_close_handler.

 Between the initial comm_close() call and the special Comm callback, we
 don't need to care if callbacks write their shutdown statements to the fd
 (it's still technically open), but the closed flag prevents comm accepting
 any new delayed event registrations or reads.

Exactly. Our only problem is with code that calls comm with a stale fd,
after that fd has been really closed and a new connection was opened
with the same fd. That's not a new problem and we will solve it in v3.2
using comm handlers.
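
In rough code, the guard during that window might look like this sketch
(hypothetical names):

    #include <cassert>

    struct FdFlags { bool open; bool closing; };
    FdFlags fdFlags[1024];

    void comm_read(int fd /*, buf, callback */) {
        assert(fdFlags[fd].open);
        assert(!fdFlags[fd].closing); // no new reads or delayed registrations
    }

    void comm_write(int fd /*, buf, callback */) {
        assert(fdFlags[fd].open);
        // a final shutdown statement may still be written: the fd stays
        // technically open until the scheduled internal close runs
    }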

I hope the above puts us on the same page about the implementation
sketch for the comm_close API, but please yell if something still seems
broken.

Thank you,

Alex.




Re: client_side and comm_close

2008-04-19 Thread Henrik Nordstrom
Fri 2008-04-18 at 08:25 -0600, Alex Rousskov wrote:

 The changes needed on the client-side are pretty much unrelated to async
 calls. Async calls just make the design problems more obvious and help
 in solving them. Removing async calls from comm will fix a few bugs, but
 will be a step backwards and will make ongoing code cleanup more
 difficult.

Here is a different proposal. Split async calls in two: queued and
immediate, with the event loops using immediate calls and everything
else queued. The simplest way of doing so safely would be to make async
calls immediate if not already within an async call, but adjusting the
event loops is also a possibility. This would make the code behave a lot
more like how it was before async, eliminating most races.
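
In rough code, the split might look like this sketch (hypothetical names;
the real AsyncCall machinery is richer):

    #include <functional>
    #include <queue>

    int dialing = 0;                          // >0 while inside an async call
    std::queue<std::function<void()>> pending;

    void scheduleCall(const std::function<void()> &call) {
        if (dialing > 0) {                    // inside a call already: queue it
            pending.push(call);
            return;
        }
        ++dialing;                            // called from the event loop: dial now
        call();
        --dialing;
    }

    void drainPending() {                     // run by the main event loop
        while (!pending.empty()) {
            const std::function<void()> next = pending.front();
            pending.pop();
            ++dialing;
            next();                           // calls scheduled from here get queued
            --dialing;
        }
    }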


The very few cases which may depend on comm_close being immediate should
be possible to identify by cbdataInternalLock calls, I think, as the state
object previously may disappear immediately otherwise. I can't see any
in the client_side_* code today. But there very likely are assumptions
that after comm_close has been called no further events will fire
positively on that fd (only as ERR_COMM_CLOSING).

I don't remember why comm_close was made async, but the more I think
about it I think it's most likely wrong. comm_close SHOULD invalidate
any pending comm operations on that fd, anything else just leads to
nightmares.

Related to this I also think the COMM_ERR_CLOSING callback from comm
should be dropped. Nearly all handlers looking for COMM_ERR_CLOSING do
nothing other than return immediately to abort execution, and the few
exceptions there are can be dealt with by a close callback if at all
needed:

errorSendComplete, invalidates the err object

ftpStateData::dataRead, sets a boolean that the read is done

HttpStateData::handleRequestBodyProducerAborted injects a
COMM_ERR_CLOSING

HttpStateData::readReply sets a boolean that the read is done

ServerStateData::sentRequestBody clears requestSender


the other 20 or so users of comm callbacks just return immediately on
COMM_ERR_CLOSING, aborting the call..
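
As a sketch of that pattern (hypothetical names and a placeholder
error-code value):

    const int COMM_ERR_CLOSING = -2;          // placeholder value for the sketch

    // today's dominant pattern: ~20 handlers abort on the flag
    void ioHandler(int fd, int errcode /*, data, len */) {
        if (errcode == COMM_ERR_CLOSING)
            return;                           // nothing else to do; abort the call
        // ...normal read/write completion work...
    }

    // after dropping COMM_ERR_CLOSING, the few handlers that did real work
    // on the flag move that work into a registered close handler instead:
    void closeHandler(int fd) {
        // e.g. errorSendComplete's "invalidate the err object" would go here
    }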

Regards
Henrik



Re: client_side and comm_close

2008-04-19 Thread Adrian Chadd
On Sat, Apr 19, 2008, Henrik Nordstrom wrote:

 I don't remember why comm_close was made async, but the more I think
 about it I think it's most likely wrong. comm_close SHOULD invalidate
 any pending comm operations on that fd, anything else just leads to
 nightmares.

The back-end of comm_close() may have to be async to properly support
Windows overlapped IO in the future. You have to cancel pending IOs,
which IIRC may actually fire before the cancel can be processed. It's been
a few years since I toyed with overlapped IO so I can't be sure.

Someone with more Windows Overlapped IO experience should chime in.
Also, I reckon it would be a smart idea to sit down and code up some
very basic Windows overlapped IO examples (like a tcp proxy! :) to
properly explore its behaviour before any changes to comm are made.

(That's what I'll be doing before my comm changes in the cacheboy branch,
as I really don't want to use libevent/libev's windows mode for IO.)



Adrian

-- 
- Xenion - http://www.xenion.com.au/ - VPS Hosting - Commercial Squid Support -
- $25/pm entry-level VPSes w/ capped bandwidth charges available in WA -


Re: client_side and comm_close

2008-04-19 Thread Tsantilas Christos
Hi all,
  I think we have two different problems here: the client_side code and
the comm_close in async calls.
If I am not wrong, the bug which triggered this discussion is bug 2309,
which is a squid3.0 bug where AsyncCalls does not exist. The problem here
is the client side code, and the truth is that it needs clean up. That is
well known to everyone who has tried to fix bugs or add code in this part of
squid3..

About comm_close: AsyncCalls was designed and implemented in such a
way as to require the minimum of changes to the squid3 code, and this was
limiting some things...

Henrik Nordstrom wrote:
 
 Here is a different proposal. Split async calls in two: queued and
 immediate, with the event loops using immediate calls and everything
 else queued. The simplest way of doing so safely would be to make async
 calls immediate if not already within an async call, but adjusting the
 event loops is also a possibility. This would make the code behave a lot
 more like how it was before async, eliminating most races.
 
 
 The very few cases which may depend on comm_close being immediate should
 be possible to identify by cbdataInternalLock calls, I think, as the state
 object previously may disappear immediately otherwise. I can't see any
 in the client_side_* code today. But there very likely are assumptions
 that after comm_close has been called no further events will fire
 positively on that fd (only as ERR_COMM_CLOSING).
 
 I don't remember why comm_close was made async, but the more I think

I think (though I am not sure now) there were cases where some of the
pending asyncCalls had to be executed.

 about it I think it's most likely wrong. comm_close SHOULD invalidate
 any pending comm operations on that fd, anything else just leads to
 nightmares.

To invalidate any pending comm operation for a particular fd you would have
to scan the whole AsyncCalls queue, which may be costly in a busy proxy
with thousands of asyncCalls scheduled (maybe use secondary asyncCall
indexes for each fd? see the sketch below).
If we had only AsyncJob-based asyncCalls it would be easy to not
execute (invalidate) AsyncCalls which refer to invalid
asyncJobs, but there still exist functions which are scheduled the
old way...
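
A sketch of such a secondary index (hypothetical names):

    #include <list>
    #include <map>

    struct AsyncCall {
        bool canceled = false;  // a canceled call is skipped when dialed
    };

    std::map<int, std::list<AsyncCall*>> callsByFd;  // fd -> pending calls

    void indexCall(int fd, AsyncCall *call) {
        callsByFd[fd].push_back(call);        // done whenever a call is scheduled
    }

    void cancelCallsFor(int fd) {             // cost: this fd's calls, not the queue
        for (AsyncCall *call : callsByFd[fd])
            call->canceled = true;
        callsByFd[fd].clear();
    }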

 
 Related to this I also think the COMM_ERR_CLOSING callback from comm
 should be dropped. Nearly all handlers looking for COMM_ERR_CLOSING do
 nothing other than return immediately to abort execution, and the few
 exceptions there are can be dealt with by a close callback if at all
 needed:
 
 errorSendComplete, invalidates the err object
 
 ftpStateData::dataRead, sets a boolean that the read is done
 
 HttpStateData::handleRequestBodyProducerAborted injects a
 COMM_ERR_CLOSING
 
 HttpStateData::readReply sets a boolean that the read is done
 
 ServerStateData::sentRequestBody clears requestSender
 
 
 the other 20 or so users of comm callbacks just return immediately on
 COMM_ERR_CLOSING, aborting the call..
 
 Regards
 Henrik
 
 



Re: client_side and comm_close

2008-04-19 Thread Alex Rousskov
On Sat, 2008-04-19 at 10:50 +0800, Adrian Chadd wrote:
 On Fri, Apr 18, 2008, Alex Rousskov wrote:

  If the client_side's main problem was a few coding bugs caused by async
  call changes, you may have been right. Unfortunately, our main problem
  here is the client_side design flaws. Removing async calls from comm
  will not undo those flaws.

 I know this.

  The changes needed on the client-side are pretty much unrelated to async
  calls. Async calls just make the design problems more obvious and help
  in solving them. Removing async calls from comm will fix a few bugs, but
  will be a step backwards and will make ongoing code cleanup more
  difficult.

 But now you have a better understanding of the bugs that are introduced
 with AsyncCalls. You can then plan design changes to the client-side code
 to work around these.

I am not sure how to rephrase this so that it is clear, but client-side
design changes are needed to fix client-side bugs, not to work around
AsyncCall bugs. AsyncCalls are just a helpful tool here, not the focus.
Removing AsyncCalls will not help but will hurt.

I simply do not understand why you focus on AsyncCalls in the context of
this thread when this thread discusses client-side problems that were
there before AsyncCalls and are there after AsyncCalls were added.

  Another way to look at it is to note that client_side was already pretty
  stable before async calls (i.e., Squid 3.0). Thus, we have already been
  where you are proposing to go.
 
 Yes, then you left it by introducing something which exposed the bugs.

The problems this thread is discussing were known before AsyncCalls.

 It's great that we now know about the bugs - but now you've got a codebase
 that you're trying to stabilise 

This thread is not about stabilizing client-side code. It is about
changing its key classes and the relationships among them (for many
reasons). This will certainly destabilize the code short-term. If it
would not, I would not need to ask others' opinion whether we should wait
with that work!

 and you can't guarantee you understand
 all of the possible ways it's going to break in the future.

True. In fact, nobody can guarantee he understands all of the possible
ways something is going to break in the future.

 If you back out the async related changes then you'll have a stable codebase
 again - you can then make smaller incremental changes to the codebase
 and test them to make sure you haven't broken anything, rather than spending
 possibly 6+ months finding more surprises.

I do not think we can fix the problems discussed here by doing small
incremental changes. We are not talking about bug workarounds or a few
places that need internal polishing.

 Remember, the initial idea was to use 3.1 to tidy up the codebase to stabilise
 it, not tidy up the codebase to unstabilise it. Redesigning chunks of the
 comm codebase -was- on my todo list for 3.1, but then people dumped stuff into
 3 to make it less stable again.

This thread is not about redesigning comm APIs, and I do not know much
about your todo list, but the Squid 3.1 roadmap has been built in public. If
any proposed Squid 3.1 features went against your plans, you should have
added your feature(s) to the roadmap and discussed conflicts. This
procedure does not guarantee anything (those who work together on the
project have to compromise), but it works better than complaining about
the first steps while approaching the end of the road.

And nobody is tidying up the code to make it unstable, obviously.

In summary, I do not understand how removing AsyncCalls would help fix
client-side design problems that are the subject of this thread. I do
not even understand why we are talking about AsyncCalls here.

Alex.




Re: client_side and comm_close

2008-04-19 Thread Alex Rousskov
On Sat, 2008-04-19 at 11:52 +0200, Henrik Nordstrom wrote:
 Fri 2008-04-18 at 08:25 -0600, Alex Rousskov wrote:
 
  The changes needed on the client-side are pretty much unrelated to async
  calls. Async calls just make the design problems more obvious and help
  in solving them. Removing async calls from comm will fix a few bugs, but
  will be a step backwards and will make ongoing code cleanup more
  difficult.
 
 Here is a different proposal.

I agree with many of your specific suggestions (and will discuss them
separately), but I want to understand what problem you are proposing to
solve first. For example:

- Fix a few new bugs introduced by async calls.
- Fix certain comm API problems.
- Improve client-side code.
- Something else.

This thread was just about improving client-side code (more
specifically, when to do it), but the responses seem to be all over
the map, so I am trying to understand what the goal is before diving into
specific development ideas.

Thank you,

Alex.




Re: client_side and comm_close

2008-04-19 Thread Adrian Chadd
On Sat, Apr 19, 2008, Alex Rousskov wrote:

  But now you have a better understanding of the bugs that are introduced
  with AsyncCalls. You can then plan design changes to the client-side code
  to work around these.
 
 I am not sure how to rephrase this so that it is clear, but client-side
 design changes are needed to fix client-side bugs, not to work around
 AsyncCall bugs. AsyncCalls are just a helpful tool here, not the focus.
 Removing AsyncCalls will not help but will hurt.

I know that client-side changes are required. We've all known that
the client-side codebase needs major surgery for various reasons.

 I simply do not understand why you focus on AsyncCalls in the context of
 this thread when this thread discusses client-side problems that were
 there before AsyncCalls and are there after AsyncCalls were added.

I'm not focusing on AsyncCalls. I'm focusing on the breakage that it's
introduced in code which isn't yet ready for it.

  Yes, then you left it by introducing something which exposed the bugs.
 
 The problems this thread is discussing were known before AsyncCalls.

But they weren't exhibiting as -bugs-.

  It's great that we now know about the bugs - but now you've got a codebase
  that you're trying to stabilise 
 
 This thread is not about stabilizing client-side code. It is about
 changing its key classes and the relationships among them (for many
 reasons). This will certainly destabilize the code short-term. If it
 would not, I would not need to ask others' opinion whether we should wait
 with that work!

I thought the focus of 3.1 - and I'd check, but the Wiki history for the
Roadmap stuff I put in early in the 3.1 roadmap cycle - was
stability and performance.

The recent work has broken the stability, and doing performance work on an
unstable codebase is silly.

  If you back out the async related changes then you'll have a stable codebase
  again - you can then make smaller incremental changes to the codebase
  and test them to make sure you haven't broken anything, rather than spending
  possibly 6+ months finding more surprises.
 
 I do not think we can fix the problems discussed here by doing small
 incremental changes. We are not talking about bug workarounds or a few
 places that need internal polishing.

And that's where we differed.

  Remember, the initial idea was to use 3.1 to tidy up the codebase to 
  stabilise
  it, not tidy up the codebase to unstabilise it. Redesigning chunks of the
  comm codebase -was- on my todo list for 3.1, but then people dumped stuff 
  into
  3 to make it less stable again.
 
 This thread is not about redesigning comm APIs, and I do not know much
 about your todo list, but the Squid 3.1 roadmap has been built in public. If
 any proposed Squid 3.1 features went against your plans, you should have
 added your feature(s) to the roadmap and discussed conflicts. This
 procedure does not guarantee anything (those who work together on the
 project have to compromise), but it works better than complaining about
 the first steps while approaching the end of the road.

_I_ helped start building the 3.1 roadmap, if you remember. _I_ helped
draft the notion that 3.1 should be about performance and stability fixes.
The introduction of AsyncCalls, whilst a good idea in the long run,
just shows that the current client-side codebase isn't ready for it.

 And nobody is tidying up the code to make it unstable, obviously.

But it is! And instead of backing the changes out - giving us a stable
codebase again, but with an understanding of what needs to be changed before
the AsyncCalls stuff is reintroduced - people still seem to persist along
the same path which brought all the quirky instability into Squid-3 in the
first place.

Stop thinking that I'm focusing on AsyncCalls, and start thinking that I'm
focusing on software engineering process with a lean towards stability.
If you want people to adopt Squid-3 then you should try and bring another
stable release out with IPv6 (once all of -those- issues are fixed)
 and without too much quirkiness, and soon.

-- 
- Xenion - http://www.xenion.com.au/ - VPS Hosting - Commercial Squid Support -
- $25/pm entry-level VPSes w/ capped bandwidth charges available in WA -


Re: client_side and comm_close

2008-04-19 Thread Alex Rousskov
On Sat, 2008-04-19 at 11:52 +0200, Henrik Nordstrom wrote:
 Fri 2008-04-18 at 08:25 -0600, Alex Rousskov wrote:
 
  The changes needed on the client-side are pretty much unrelated to async
  calls. Async calls just make the design problems more obvious and help
  in solving them. Removing async calls from comm will fix a few bugs, but
  will be a step backwards and will make ongoing code cleanup more
  difficult.
 
 Here is a different proposal. Split async calls in two: queued and
 immediate, with the event loops using immediate calls and everything
 else queued.

It is easy to implement two kinds of call scheduling, but I am not sure
I understand the merits of doing so. What do you mean by "the event
loops" in this context? Are you talking about eventAdd() queue? Or I/O
event (i.e., comm) callbacks? I am guessing it is the latter.

 The simplest way of doing so safely would be to make async
 calls immediate if not already within an async call, but adjusting the
 event loops is also a possibility. This would make the code behave a lot
 more like how it was before async, eliminating most races.

The immediate calls from comm were not really safe (because of the
recursive/reentrant behavior) so I am not sure what we would achieve
here. I hope answers to earlier questions would clarify the intent.
Again, I would like to understand and agree on the problem/goal before
debating specific solutions...

 The very few cases which may depend on comm_close being immediate should
 be possible to identify by cbdataInternalLock calls, I think, as the state
 object previously may disappear immediately otherwise. I can't see any
 in the client_side_* code today. But there very likely are assumptions
 that after comm_close has been called no further events will fire
 positively on that fd (only as ERR_COMM_CLOSING).

 I don't remember why comm_close was made async, but the more I think
 about it I think it's most likely wrong. comm_close SHOULD invalidate
 any pending comm operations on that fd, anything else just leads to
 nightmares.

Please clarify which of the following comm_close implementations would
solve the problems you want to solve in this context:

A) Immediately call all not-yet-dialed callbacks for the affected FD.
All relevant I/O callbacks are called with ERR_COMM_CLOSING flag set (if
we still have that). The closing callbacks, if any, are called last.

B) Asynchronously call all not-yet-dialed callbacks for the affected FD.
All relevant I/O callbacks are called with ERR_COMM_CLOSING flag set (if
we still have that). The closing callbacks, if any, are called last.

C) Other.

Implementation A would lead to reentrant job calls, which is the problem
we were trying to solve earlier, I think. Implementation B avoids
reentrant/recursive behavior. We can have both if B is the Right Thing
to do but A is what some old code required.

(I still would like to know what problem we are trying to solve, though:
is the problem you are trying to solve specific to comm_close?)

Thank you,

Alex.




Re: client_side and comm_close

2008-04-19 Thread Alex Rousskov
On Sat, 2008-04-19 at 17:24 +0300, Tsantilas Christos wrote:

 I think we have two different problems here: the client_side code and
 the comm_close in async calls.

You are correct. General async call bugs and general comm API problems
have been mentioned on this thread as well. Most of these problems are
related, of course, but it is rather important to understand which
problem the person is talking about when discussing a specific
solution :-).

 If I am not wrong, the bug which triggered this discussion is bug 2309,
 which is a squid3.0 bug where AsyncCalls does not exist.

If that is indeed the case, it is a good point!

 The problem here
 is the client side code, and the truth is that it needs clean up. That is
 well known to everyone who has tried to fix bugs or add code in this part of
 squid3..

Yes, I hope everybody who worked with the client-side code is in
agreement that it needs a clean up. I believe the question opening
this thread was when to do it (v3.1 or v3.2) not whether to do it.

It is important to note that I can hit bug2309-like assertions in trunk
as well. The exact causes may be a little different, but all seem to
revolve around comm_close and transaction abort handling (before and
after async calls).

Thank you,

Alex.




Re: client_side and comm_close

2008-04-18 Thread Amos Jeffries

Alex Rousskov wrote:

On Fri, 2008-04-18 at 15:40 +1200, Amos Jeffries wrote:

IMO, This should be queued as part of the 3.1 cleanups, but not as 
urgently as some other issues.


The priorities as I see them now are:
  - immediately fixable and urgent bugs.
  - completing the dedicated 3.1 feature list (DEVEL is overdue).
  - cleanups for fixing more intrusive or design bugs (such as this, and 
the 407 one).

  - shakedown for release cycle.

We have a few months of shakedown before anything like a good stable. 
Plenty of time for cleanup not blocking new features.


I estimate that properly cleaning up the client side would delay v3.1
stable release by 2-3 months unless we suddenly get more experts
involved. Is that delay acceptable to you? Others?


If we were in a position to start it now, maybe. But we are not, I think.

My vote is for the timeline above, leaving step 3 as optional, with the 
decision whether to even start it deferred until step 2 is fully completed.


We may later decide that it is worthwhile or vital, but I don't think 
anyone should start in on more major sweeping until we know what all the 
fallout from the currently underway set is going to be.


Amos
--
Please use Squid 2.6.STABLE19 or 3.0.STABLE4


Re: client_side and comm_close

2008-04-18 Thread Adrian Chadd
On Wed, Apr 16, 2008, Alex Rousskov wrote:

 In short, we have several related problems here: (a) client_side code is
 incapable of reliably identifying whether comm_close has been called;
 (b) ConnStateData::isOpen may not work anymore; (c) client_side code
 uses many different ways to identify whether more data should be read
 from the connection; (d) comm_close is used a lot but no longer has an
 immediate effect and some client_side code may still depend on that
 effect to be immediate; (e) client_side comm handlers descend very deep,
 making it difficult to propagate errors and comm_close status up.
 
 We should decide whether to continue patching holes here and there or
 clean up the client_side*cc connection management mess for good. Should
 we continue to patch isolated v3.0 problems and cleanup v3.1? Or is this
 a v3.2 project? Or am I exaggerating the problems since common cases
 usually work fine?

I'd suggest another option - roll back all of the async calls changes to the
comm code, stabilise the codebase without it and re-evaluate what should
occur (in smaller chunks, rather than dropping in a new comm manager)
before reintroducing it.

I think we've seen that there's still a gap in how development branches are
tested and that the best way we have to test code is to put it in production :)




Adrian

-- 
- Xenion - http://www.xenion.com.au/ - VPS Hosting - Commercial Squid Support -
- $25/pm entry-level VPSes w/ capped bandwidth charges available in WA -


Re: client_side and comm_close

2008-04-18 Thread Alex Rousskov
On Fri, 2008-04-18 at 18:34 +1200, Amos Jeffries wrote:
 Alex Rousskov wrote:
  On Fri, 2008-04-18 at 15:40 +1200, Amos Jeffries wrote:
  
  IMO, This should be queued as part of the 3.1 cleanups, but not as 
  urgently as some other issues.
 
  The priorities as I see them now are:
- immediately fixable and urgent bugs.
- completing the dedicated 3.1 feature list (DEVEL is overdue).
- cleanups for fixing more intrusive or design bugs (such as this, and 
  the 407 one).
- shakedown for release cycle.
 
  We have a few months of shakedown before anything like a good stable. 
  Plenty of time for cleanup not blocking new features.
  
  I estimate that properly cleaning up the client side would delay v3.1
  stable release by 2-3 months unless we suddenly get more experts
  involved. Is that delay acceptable to you? Others?
 
 If we were in a position to start it now, maybe. But we are not, I think.

You are probably right. FWIW, I will not be able to work on that until
eCAP is done.

 My vote is for the timeline above, leaving step 3 as optional, with the 
 decision whether to even start it deferred until step 2 is fully completed.

 We may later decide that it is worthwhile or vital, but I don't think 
 anyone should start in on more major sweeping until we know what all the 
 fallout from the currently underway set is going to be.

In the unlikely event somebody with enough client-side knowledge can
start on this now, I would probably support it (and help with the
design). Otherwise, I agree that your plan is the best option available.

Thank you,

Alex.




Re: client_side and comm_close

2008-04-18 Thread Alex Rousskov
On Fri, 2008-04-18 at 16:15 +0800, Adrian Chadd wrote:
 On Wed, Apr 16, 2008, Alex Rousskov wrote:
 
  In short, we have several related problems here: (a) client_side code is
  incapable of reliably identifying whether comm_close has been called;
  (b) ConnStateData::isOpen may not work anymore; (c) client_side code
  uses many different ways to identify whether more data should be read
  from the connection; (d) comm_close is used a lot but no longer has an
  immediate effect and some client_side code may still depend on that
  effect to be immediate; (e) client_side comm handlers descend very deep,
  making it difficult to propagate errors and comm_close status up.
  
  We should decide whether to continue patching holes here and there or
  clean up the client_side*cc connection management mess for good. Should
  we continue to patch isolated v3.0 problems and cleanup v3.1? Or is this
  a v3.2 project? Or am I exaggerating the problems since common cases
  usually work fine?
 
 I'd suggest another option - roll back all of the async calls changes to the
 comm code, stabilise the codebase without it and re-evaluate what should
 occur (in smaller chunks, rather than dropping in a new comm manager)
 before reintroducing it.

If the client_side's main problem was a few coding bugs caused by async
call changes, you may have been right. Unfortunately, our main problem
here is the client_side design flaws. Removing async calls from comm
will not undo those flaws.

The changes needed on the client-side are pretty much unrelated to async
calls. Async calls just make the design problems more obvious and help
in solving them. Removing async calls from comm will fix a few bugs, but
will be a step backwards and will make ongoing code cleanup more
difficult.


Another way to look at it is to note that client_side was already pretty
stable before async calls (i.e., Squid 3.0). Thus, we have already been
where you are proposing to go.

Alex.




Re: client_side and comm_close

2008-04-18 Thread Adrian Chadd
On Fri, Apr 18, 2008, Alex Rousskov wrote:

  I'd suggest another option - roll back all of the async calls changes to the
  comm code, stabilise the codebase without it and re-evaluate what should
  occur (in smaller chunks, rather than dropping in a new comm manager)
  before reintroducing it.
 
 If the client_side's main problem was a few coding bugs caused by async
 call changes, you may have been right. Unfortunately, our main problem
 here is the client_side design flaws. Removing async calls from comm
 will not undo those flaws.

I know this.

 The changes needed on the client-side are pretty much unrelated to async
 calls. Async calls just make the design problems more obvious and help
 in solving them. Removing async calls from comm will fix a few bugs, but
 will be a step backwards and will make ongoing code cleanup more
 difficult.

But now you have a better understanding of the bugs that are introduced
with AsyncCalls. You can then plan design changes to the client-side code
to work around these.

 Another way to look at it is to note that client_side was already pretty
 stable before async calls (i.e., Squid 3.0). Thus, we have already been
 where you are proposing to go.

Yes, then you left it by introducing something which exposed the bugs.
It's great that we now know about the bugs - but now you've got a codebase
that you're trying to stabilise and you can't guarantee you understand
all of the possible ways it's going to break in the future.
If you back out the async related changes then you'll have a stable codebase
again - you can then make smaller incremental changes to the codebase
and test them to make sure you haven't broken anything, rather than spending
possibly 6+ months finding more surprises.

Remember, the initial idea was to use 3.1 to tidy up the codebase to stabilise
it, not tidy up the codebase to unstabilise it. Redesigning chunks of the
comm codebase -was- on my todo list for 3.1, but then people dumped stuff into
3 to make it less stable again.



Adrian

-- 
- Xenion - http://www.xenion.com.au/ - VPS Hosting - Commercial Squid Support -
- $25/pm entry-level VPSes w/ capped bandwidth charges available in WA -


Re: client_side and comm_close

2008-04-17 Thread Tsantilas Christos
Hi all,
Alex Rousskov wrote:
 In reply to comment #8 for bug #2309 
 comm_read assertion failed: ccb->active == false
 http://www.squid-cache.org/bugs/show_bug.cgi?id=2309
 
 Note to the one looking into this bug: The assert happens if a 
 comm_read() is called while there already is a comm_read() pending.
 
 Yes.
 
 I would suspect the half closed client check has been wrongly 
 adopted to call comm_read where it used commSetSelect periodically
 before..
 The above may be true, but there are other conditions causing the same
 assertion in my tests, with and without the pending comm_close patch
 from Christos.
 

As far as I can understand, the problem does not exist only in squid3.0 or
only in squid3-trunk.

The main problem with the client_side code is that comm_read is done by
the ConnStateData class, the comm_writes by the ClientSocketContext
class, and comm_close (and other comm operations) by all. This makes it
very difficult to manage the comm-related problems in the client_side code.


 In short, we have several related problems here: (a) client_side code is
 incapable of reliably identifying whether comm_close has been called;
 (b) ConnStateData::isOpen may not work anymore; (c) client_side code
 uses many different ways to identify whether more data should be read
 from the connection; (d) comm_close is used a lot but no longer has an
 immediate effect and some client_side code may still depend on that
 effect to be immediate; (e) client_side comm handlers descend very deep,
 making it difficult to propagate errors and comm_close status up.
 
 We should decide whether to continue patching holes here and there or
 clean up the client_side*cc connection management mess for good. Should
 we continue to patch isolated v3.0 problems and cleanup v3.1? Or is this
 a v3.2 project? Or am I exaggerating the problems since common cases
 usually work fine?
 
 Thank you,
 
 Alex.
 
 
 



Re: client_side and comm_close

2008-04-17 Thread Alex Rousskov
On Thu, 2008-04-17 at 20:13 +0300, Tsantilas Christos wrote:
 Alex Rousskov wrote:
  In reply to comment #8 for bug #2309 
  comm_read assertion failed: ccb->active == false
  http://www.squid-cache.org/bugs/show_bug.cgi?id=2309
  
  Note to the one looking into this bug: The assert happens if a 
  comm_read() is called while there already is a comm_read() pending.
  
  Yes.
  
  I would suspect the half closed client check has been wrongly 
  adopted to call comm_read where it used commSetSelect periodically
  before..
  The above may be true, but there are other conditions causing the same
  assertion in my tests, with and without the pending comm_close patch
  from Christos.
  
 
 As far as I can understand, the problem does not exist only in squid3.0 or
 only in squid3-trunk.

 The main problem with the client_side code is that comm_read is done by
 the ConnStateData class, the comm_writes by the ClientSocketContext
 class, and comm_close (and other comm operations) by all. This makes it
 very difficult to manage the comm-related problems in the client_side code.

Yes. If we decide to fix this, there will be a single object responsible
for client connection management. Others will simply use its services as
needed. Comm cannot be such an object because it is too low-level.
Currently, the functionality is spread among many classes.
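
As a rough outline only (ConnManager is a hypothetical name, not an
actual class):

    class ConnManager {
    public:
        explicit ConnManager(int fd): fd_(fd), closing_(false) {}

        void startReading() { /* the only call site of comm_read(fd_, ...) */ }
        void write()        { /* the only call site of comm_write(fd_, ...) */ }
        void close()        { closing_ = true; /* single comm_close(fd_) site */ }

        bool isOpen() const { return !closing_; } // one authoritative answer

    private:
        int fd_;
        bool closing_;
    };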

  We should decide whether to continue patching holes here and there or
  clean up the client_side*cc connection management mess for good. Should
  we continue to patch isolated v3.0 problems and cleanup v3.1? Or is this
  a v3.2 project? Or am I exaggerating the problems since common cases
  usually work fine?

The question remains.

Thank you,

Alex.




Re: client_side and comm_close

2008-04-17 Thread Amos Jeffries

Alex Rousskov wrote:

On Thu, 2008-04-17 at 20:13 +0300, Tsantilas Christos wrote:

Alex Rousskov wrote:
In reply to comment #8 for bug #2309 
comm_read assertion failed: ccb->active == false

http://www.squid-cache.org/bugs/show_bug.cgi?id=2309

Note to the one looking into this bug: The assert happens if a 
comm_read() is called while there already is a comm_read() pending.

Yes.

I would suspect the half closed client check has been wrongly 
adopted to call comm_read where it used commSetSelect periodically

before..

The above may be true, but there are other conditions causing the same
assertion in my tests, with and without the pending comm_close patch
from Christos.


As far as I can understand, the problem does not exist only in squid3.0 or
only in squid3-trunk.

The main problem with the client_side code is that comm_read is done by
the ConnStateData class, the comm_writes by the ClientSocketContext
class, and comm_close (and other comm operations) by all. This makes it
very difficult to manage the comm-related problems in the client_side code.


Yes. If we decide to fix this, there will be a single object responsible
for client connection management. Others will simply use its services as
needed. Comm cannot be such an object because it is too low-level.
Currently, the functionality is spread among many classes.


We should decide whether to continue patching holes here and there or
clean up the client_side*cc connection management mess for good. Should
we continue to patch isolated v3.0 problems and cleanup v3.1? Or is this
a v3.2 project? Or am I exaggerating the problems since common cases
usually work fine?


The question remains.


IMO, This should be queued as part of the 3.1 cleanups, but not as 
urgently as some other issues.


The priorities as I see them now are:
 - immediately fixable and urgent bugs.
 - completing the dedicated 3.1 feature list (DEVEL is overdue).
 - cleanups for fixing more intrusive or design bugs (such as this, and 
the 407 one).

 - shakedown for release cycle.

We have a few months of shakedown before anything like a good stable. 
Plenty of time for cleanup not blocking new features.


Amos
--
Please use Squid 2.6.STABLE19 or 3.0.STABLE4


Re: client_side and comm_close

2008-04-17 Thread Alex Rousskov

On Fri, 2008-04-18 at 15:40 +1200, Amos Jeffries wrote:

 IMO, This should be queued as part of the 3.1 cleanups, but not as 
 urgently as some other issues.
 
 The priorities as I see them now are:
   - immediately fixable and urgent bugs.
   - completing the dedicated 3.1 feature list (DEVEL is overdue).
   - cleanups for fixing more intrusive or design bugs (such as this, and 
 the 407 one).
   - shakedown for release cycle.
 
 We have a few months of shakedown before anything like a good stable. 
 Plenty of time for cleanup not blocking new features.

I estimate that properly cleaning up the client side would delay v3.1
stable release by 2-3 months unless we suddenly get more experts
involved. Is that delay acceptable to you? Others?

Thank you,

Alex.