tis 2008-04-08 klockan 20:46 +0300 skrev Tsantilas Christos:
> AsyncCalls are refcounted so believed that with "ccb->callback = NULL"
> the AsyncCall which the ccb->callback point released. But yes if it is
> refcounted by an other object (for example in AsyncCalls queues,....)
> it is not released and remains active. Looks that this is the bug.

Hmm.. refcounting and = NULL to cancel/invalidate what is referenced..
have seen that before, was one of the major blockers for 3.0 release.
Not how things may be done, clearing a reference just removes the
reference, not cancels/invalidates the referenced object.

cancellation/invalidation needs to be explicit.

> Propably this callback set to NULL twice because the developer has in
> his mind to put something different here :-)

What I thougth when seeing this.. And additionally the comment header no
longer matches the code..

> This bug exits only for function based AsyncCalls (the compatible with
> the old comm_read interface) and which canceled using the
> comm_read_cancel(int fd, IOCB *callback, void *data) function.

On a closer inspection comm_read_cancel is not called here
(IdleConnList::timeout).

and additionally there is no cancellation point for timeouts, so if the
IdleConnList::read gets scheduled before the timeout it still ends up in
the same badness.

Instead the code here assumes comm_close will cleanup and cancel
everything needed.. but that is also an AsyncCall queued event these
days, doing nothing other than registering the intent and queueing the
close event..


Anyway the applied workaround makes this specific race harmless. But
it's a good exersice in understanding AsyncCall and the complexities
involved in proper cancelling.

I still think the cbdata model is generally preferrable. Much less to
keep track of. For pconn it would need one cbdata object per idle fd,
and having only that object invalidated when the fd is taken out of the
pconn queue for whatever reason.

> Some lines after inside the
>    comm_read_cancel(int fd, AsyncCall::Pointer &callback)
> the "callback->cancel("comm_read_cancel")" is the first call...

Right. How did I miss that one..

> Just one comment:
>  Maybe because in the case of
>  comm_read_cancel(int fd, AsyncCall::Pointer &callback)
> call the asyncCall canceled twice it is better to move the asyncCall
> cancelation to
>  comm_read_cancel(int fd, IOCB *callback, void *data) function

twice?

> > Is it perhaps as simple as making commio_cancel_callback call
> > ccb->callback->cancel()? Testing.. seems to work. Patch attached. Any
> > comments?

Apart from being unneeded it didn't work. It turns out my test case
isn't 100% reliable after all, only 95%.. sometimes the two events gets
scheduled in separate time slots..

> I think Alex has do excellent work with AsyncCalls. The difficult and
> tricky part is the related with the Dialers, but you need to look on
> them only once (just for curiosity) and then just use their interface
> without bother about them...

What worries me a bit is that this introduces a queue where invocation
was always immediate before. So we have a different problem space to
deal with in queued events where we before had to deal with recursive
events.. It's a subtle change and requires partly new thinking.

But when we are used to it the queue is easier to deal with than races
due to recursion. Much more predictable as objects won't go out under
the feets of the caller, just a bit different from what we are used to..

Regards
Henrik

Reply via email to