Firstly, thanks for the in-depth reply! Some thoughts interleaved below...

Rocco Caputo wrote:

On Aug 22, 2006, at 14:06, Nick Williams wrote:

Rocco Caputo wrote:

Do you have a use case where it's impossible to do something under the new behavior? I'm working under the assumption that a session can always find a way to call sig(YOUR_SIGNAL_HERE => undef) when it's ready to be destroyed.


"When it's ready to be destroyed" is the key. The new behaviour means that sessions need to track that themselves and (in effect) manage their own reference counting independent of POE's. Before, they could just rely on POE to let them know via _stop. Now, _stop is no longer called unless the session clears the sig first. This is just a catch-22. And it leaves a race condition whereby the session exists but has declared it no longer wants the signal.


Sessions already do need to manage their reference counts, at least in the sense that they won't exit until they stop watching for events. Signal events are just another kind of event, and the semantics were a bit exceptional for some good reasons that don't necessarily apply anymore.

If I understand the catch-22 correctly, it's that a session can't clear its signal watchers from _stop because those watchers prevent _stop from executing. If that's the case, I'd like to point out that it's not very useful to clear any resources from _stop to begin with. POE will automatically reclaim its resources from the session after _stop returns, so any explicit POE cleanup in _stop is an expensive no-op.


In my existing code, I'm not cleaning up POE resources - it's *my* resources I'm cleaning up in _stop. The point is that _stop no longer gets called because of the signal handlers, so I can no longer use it as a garbage-cleanup mechanism along the lines of DESTROY: by definition I will always have to know when the session is going to be destroyed (in order to remove the signal handlers), which makes _stop superfluous.
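
Something like this, where log_fh and temp_file stand in for whatever the session happens to own:

    # _stop as a DESTROY-style hook for the session's own resources
    # (log_fh and temp_file are hypothetical heap entries):
    _stop => sub {
      my $heap = $_[HEAP];
      close $heap->{log_fh}     if $heap->{log_fh};
      unlink $heap->{temp_file} if defined $heap->{temp_file};
    },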


To be fair regarding discussions on this list, Jonathan Steinert announced the intent to make sig() hold sessions alive in his 19 October 2005 message titled "Nastiness, and wrapping up signal reforms". I replied that day with:

Big change. I don't mind this; the old semantics of not holding a reference count were tied to _signal, which delivered signals without sessions explicitly asking for them. _signal is gone now, so we can tie the explicit interest of sig() into a reference count to keep the session alive.



Nobody else responded. 17 days later I replied with a public go-ahead to make the change.


Yes, I realise that there was this discussion previously. However, speaking purely for myself, I didn't understand the impact at the time, since I wasn't then cognizant of the internals of session reference counting. Now I've looked at this, and I can't see how the new implementation makes sense.


I can understand how the implementation might be confusing. The released versions since last December have flaws, especially regarding reference counting. In fact I recently committed fixes for them while portability testing some of Benjamin Smith's new tests.

My issue is that the bugzilla Jonathan was attempting to fix is trivial to address using POE's existing alias mechanism (sketched below): there's an easy point at which you know you want the session's persistence to start, and a well-defined point at which you can release it. By making persistence implicit in signals, however, there is simply no way to achieve the opposite effect (automatic garbage collection). The user (the application) must decide at which point it has no more work to do, and only then can it clear the signal; only then will POE do its garbage collection and call back into the application. This just doesn't make sense, especially when you compare signals in POE with signals in other dispatchers. Having a handler configured for a signal should not make that process persistent.
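
For illustration, a sketch of the alias approach under the pre-change semantics (the alias name and the sixty-second release point are invented for the example):

    use POE;

    POE::Session->create(
      inline_states => {
        _start => sub {
          # The explicit start of persistence: an alias holds the
          # session's reference count up.
          $_[KERNEL]->alias_set('signal_waiter');
          $_[KERNEL]->sig(HUP => 'got_hup');
          # For the sketch, release after sixty seconds.
          $_[KERNEL]->delay(finished => 60);
        },
        finished => sub {
          # The well-defined release point: drop the alias (and the
          # watcher) and normal garbage collection takes over.
          $_[KERNEL]->alias_remove('signal_waiter');
          $_[KERNEL]->sig(HUP => undef);
        },
        got_hup => sub { warn "caught SIGHUP\n" },
        _stop   => sub { warn "session reclaimed\n" },
      },
    );
    POE::Kernel->run();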


The point that persistence shouldn't be tied to signals for flexibility's sake is the start of a slippery slope. What would then stop us from asserting that some timers should not imply persistence? Input timeouts, for example. They don't contribute to a session's lifespan since they're only relevant as long as there's an I/O watcher. Why then should delay() keep sessions alive?

We're really talking semantics. A delay says, in effect (and is documented as), "call me back at time T". Time T will always arrive, by definition, so with those semantics it makes sense for the session to stay alive until at least time T. A signal, by contrast, might never happen.
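
Side by side, as fragments inside some state handler:

    # A delay will always fire, so pinning the session is reasonable:
    $_[KERNEL]->delay(tick => 5);    # alive for at least 5 more seconds
    # A signal may never arrive, so by the same reasoning it shouldn't pin:
    $_[KERNEL]->sig(USR1 => 'got_usr1');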


One solution might be to expand the Kernel's APIs for different semantic variants of each watcher. The endgame for that strategy is ugly at best.


And after reading your comments, I'll agree with you that many variants of the API aren't a good idea.

Another solution might be to drop implicit reference counting altogether. Every session would then need to explicitly hold something as long as it wanted to live. That's the "alias" idea, although it should probably be something with a stronger reference count. Then there's the converse: No session will die unless it explicitly asks to be killed. Each has a certain charm, although both have their pitfalls.
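
POE already exposes an explicit count along these lines via refcount_increment() and refcount_decrement(); roughly (the counter name is arbitrary):

    # A stronger explicit hold than an alias:
    my $sid = $_[SESSION]->ID;
    $_[KERNEL]->refcount_increment($sid, 'keep_alive');
    # ... and at the explicit release point ...
    $_[KERNEL]->refcount_decrement($sid, 'keep_alive');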

I think none of the solutions in the previous paragraph will satisfy a significant portion of POE's users, so I'd rather just not go there at this point.


Agreed.


Your last point is a good one. Where possible, POE has used Unix as a model for its behavior, and signal handlers don't contribute to a process' lifetime. The new sig() semantics are therefore incongruous with the base model. Before I trash them, though, I'd like to learn more about the other side. Are the new sig() semantics necessary? I seem to recall yes, but I don't remember the details. If Jonathan Steinert doesn't explain it, I'll need to make time to grovel through my logs (and the source) to refresh my memory.


The logs imply that the reason it was changed was to deal with a bugzilla entry about there being no simple way to set up a session to wait for UI_DESTROY. This is interesting, because given the earlier distinction between delays (the passage of time will always happen) and signals (which may never happen), we can see that UI_DESTROY is a very special signal that WILL ALWAYS happen (if requested), which is very different from the rest of the signals. The API was changed to deal with UI_DESTROY without taking into account that it's a different beast.
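
The case in the bugzilla looks roughly like this (a sketch, assuming a Tk main loop and glossing over toolkit setup):

    use Tk;   # load a UI toolkit first so POE adopts its event loop
    use POE;

    POE::Session->create(
      inline_states => {
        # Under the old semantics this session would be garbage
        # collected straight after _start, since the sig() watcher
        # held no reference count - that's the bugzilla. Under the
        # new semantics the watcher alone keeps it alive until the
        # main window is destroyed.
        _start  => sub { $_[KERNEL]->sig(UI_DESTROY => 'ui_gone') },
        ui_gone => sub { warn "main window destroyed\n" },
      },
    );
    POE::Kernel->run();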


And it's after 04.30 here (it's the only time I could make to answer this message). I'm not committing to anything until after some sleep.


Again, much thanks for taking the time to put some notes onto the mailing list for us IRC-deprived people :-).


Maybe I'm thinking about it wrong. To me, "explicit interest of sig()" does not mean that I (this session) want to stay around until that sig; it means that I'm merely putting in place a handler IN CASE of that sig. It's a really, really important distinction. The huge differentiator is that the latter behaviour - 'in-case-of' handlers - CANNOT be achieved in the new POE signal world without careful application coding that ignores any of the benefits of POE's internal garbage collection.

As a compromise, I've also implicitly proposed that maybe we should have a new function, wait_for_sig(), as well as plain sig(), so that we can make the difference in semantics explicit to users. I don't mind which way around the functions and semantics are arranged, so long as there is a way of doing this.
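
Roughly (wait_for_sig() being the proposed, hypothetical call, not an existing POE method):

    $_[KERNEL]->sig(USR1 => 'got_usr1');            # in-case-of handler;
                                                    # does not pin the session
    $_[KERNEL]->wait_for_sig(UI_DESTROY => 'gone'); # explicitly keeps the
                                                    # session alive until then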


People can interpret it either way depending on requirements and point of view. In the past I've had to explain why sig(USR1 => "event") won't keep a daemon alive.

I'm still opposed to methods for semantic variants, though. I'd rather avoid having both semantics at once if that's possible.


Fair enough.

Nick.
