Re: Semantics of proton refcounts [was Re: proton 0.10 blocker]
On 07/17/2015 04:37 PM, Rafael Schloming wrote: Are they really returned with a ref count of 0? No, indeed they are not. I missed the fact that the finalizer will run before the object is returned to the application.
Re: Semantics of proton refcounts [was Re: proton 0.10 blocker]
On 07/17/2015 08:11 PM, Rafael Schloming wrote: On Fri, Jul 17, 2015 at 10:45 AM, Gordon Sim wrote: Still digesting the explanation (thanks!) but one follow up question: On 07/17/2015 04:37 PM, Rafael Schloming wrote: it isn't actually possible to use the object when there refcount is 0. What is the purpose of the incref/decref pattern then, e.g. as used in pn_session_free()? That is designed to trigger the finalizer, right? If so it would seem that can only happen if the reference count is 0 at the point when pn_session_free() is called. (And if it was valid to call pn_session_free for a given session, then that session would have been valid for use before that, no?) The refcount should never be zero going into pn_session_free. The situation where it triggers the finalizer is when the refcount is 1 upon entering pn_session_free, but the referenced boolean is false meaning that the one remaining reference that exists is the parent -> child pointer (in this case the connection -> session pointer). The incref triggers the pn_session_incref hook which flips the reference around so that the session -> connection pointer is now the reference and the connection -> session pointer is the borrowed reference. Ah, I see, I'd missed that hook. So the pn_incref(session) call actually increments the reference of the *connection*, not the session in this case. The pn_decref(session) then reduces the refcount of the session from 1 to 0 which triggers the finalizer. Thanks! I'm slowly understanding how it works a little more.
Re: Semantics of proton refcounts [was Re: proton 0.10 blocker]
On Fri, Jul 17, 2015 at 10:45 AM, Gordon Sim wrote: > Still digesting the explanation (thanks!) but one follow up question: > > On 07/17/2015 04:37 PM, Rafael Schloming wrote: > >> it isn't actually possible to use the object when there refcount is 0. >> > > What is the purpose of the incref/decref pattern then, e.g. as used in > pn_session_free()? That is designed to trigger the finalizer, right? If so > it would seem that can only happen if the reference count is 0 at the point > when pn_session_free() is called. > > (And if it was valid to call pn_session_free for a given session, then > that session would have been valid for use before that, no?) > The refcount should never be zero going into pn_session_free. The situation where it triggers the finalizer is when the refcount is 1 upon entering pn_session_free, but the referenced boolean is false meaning that the one remaining reference that exists is the parent -> child pointer (in this case the connection -> session pointer). The incref triggers the pn_session_incref hook which flips the reference around so that the session -> connection pointer is now the reference and the connection -> session pointer is the borrowed reference. Logically this momentarily increases the refcount from 1 to 2 and then the hook decreases it back to 1 and the decref triggers the finalizer. Now if pn_session_free is called when the refcount is > 1, or if pn_ep_decref ends up posting events (events create references to endpoint objects and thereby increase the refcount), then the pn_incref/decref ends up being a noop and the session is simply marked as free but finalized when the last reference to it is decref'ed. --Rafael
Re: Semantics of proton refcounts [was Re: proton 0.10 blocker]
Still digesting the explanation (thanks!) but one follow up question: On 07/17/2015 04:37 PM, Rafael Schloming wrote: it isn't actually possible to use the object when there refcount is 0. What is the purpose of the incref/decref pattern then, e.g. as used in pn_session_free()? That is designed to trigger the finalizer, right? If so it would seem that can only happen if the reference count is 0 at the point when pn_session_free() is called. (And if it was valid to call pn_session_free for a given session, then that session would have been valid for use before that, no?)
Re: Semantics of proton refcounts [was Re: proton 0.10 blocker]
On Thu, Jul 16, 2015 at 7:46 AM, Gordon Sim wrote: > On 07/16/2015 02:40 PM, aconway wrote: > >> Can someone who understand the proton use of refcounts please add some >> doc comments to explain the semantics? Apologies if this is already >> there and I missed it, tell me to RTFM. >> > > I'm not entirely sure I understand it. However having spent a couple of > days reading and puzzling over the code, I'll try and offer some answers > where I think I can, and add some questions of my own. > > For comparison here is what "refcount" traditionally means ("tradition" >> includes CORBA, boost::intrusive_ptr, std::shared_ptr etc.) I'm not >> saying we have to make proton conform, but we should document how it >> differs to save confusion. >> >> 1. A refcount is a count of the number of references (owning pointers) >> to an object. >> > > Yes, that broadly speaking is the purpose in proton. However this was a > recent addition and it is not used uniformly inside the library itself. Not > only that but the old api, where the application (generally) owned the > pointers it created until it called pn_xxx_free, is still supported > alongside the newer mode of use, as e.g. employed in the python binding, > where the application uses only incref and decref. > > 2. Objects are created with refcuont=1, the creator owns the first >> reference. >> > > This is not always the case and was one of the points I found surprising. > Though a newly created 'object' does indeed have its ref count set to 1, > for pn_session() and pn_link(), which are the functions actually used by > the application to created new sessions or links, the implementation of > those functions includes an explicit decref, meaning the objects are > returned with a ref count of 0. > Are they really returned with a ref count of 0? I don't think proton objects can actually exist with a refcount of 0 outside a finalizer. What should actually happen is that the finalizer for the newly created session will run and cause the parent of the session or link (the connection or session) to inspect the child's state. Based on that state it may create a new reference to the child, e.g. if there is outstanding work on the wire to be done for that session or link. In the case of a new session or link this is always the case, so it will always end up creating a new reference to it, but this should never result in a child with a non zero refcount being returned or visible in any way to user code. I suspect this was done to simplify things for the newer mode of use, where > a wrapper object can always be instantiated and it simply increments the > ref count. Would be good to have that confirmed though, as to me this is > the source of confusion and complexity. > Yes, this is certainly true, but it is also to accommodate the memory model the C interface exposes. In the C interface sessions and links are owned by their parent objects, e.g. freeing the connection frees all the contained sessions. The way this is accommodated is that the parent object is what owns the reference to the child by default. What is returned from the pn_session()/pn_link() calls is a borrowed reference. (This changes when you do an incref, see below for more details.) > However this means that in the old mode of use, e.g. > pn_session()/pn_session_free(), the object may have a refcount of 0 even > though it has not been freed by the application. > As mentioned above, this should never be able to happen outside of a finalizer. Perhaps what is confusing here is that the finalizer can create a new reference to the object that is being finalized thereby causing the pn_decref() to *appear* to be a noop, however what is really happening is an important state change, the last reference is being deleted, and the finalizer has decided to create a new reference for other purposes (e.g. because there is outstanding transport work to be done with that object or because it is going into a pool) and the state of the object (or related objects/state) is being changed in key ways to reflect this. Note that this pattern is not at all unique to proton's refcount system. All GC systems that have finalizers accommodate these sorts of semantics, i.e. finalizers causing objects to become live again. This is fundamental to finalizers since they are just user code and can do whatever they want, including create new references. What is confusing about it here is more to do with how this capability is being used to interact with all the engine data structures that predate both the refcount system and the collection objects that make use of the refcount system. I believe ultimately the engine data structures should be reworked to use the various collection objects now available, at which point a lot of this will become much more centralized, self contained, and understandable and likely won't require so much magic (or at least the magic will be in one place rather than spread around like it is now). > > 3. If anot
Re: Semantics of proton refcounts [was Re: proton 0.10 blocker]
On Thu, 2015-07-16 at 15:11 +0100, Gordon Sim wrote: > On 07/16/2015 02:40 PM, aconway wrote: > > The fix mentioned above has this, which make no sense under > > traditional > > refcounting: > > > > pn_incref(endpoint); > > pn_decref(endpoint); > > Note that this is not added as part of my fix, it is already there. > [snip] > > The reference counting logic may not match the ideal, but we can't > postpone a fix for the current issue pending some nicer overall > solution. We can avoid it, by backing out the problematic previous > commit, or we can adjust that commit. > +1, I'm just proposing that we get a clear statement of the intended semantics of the *existing* proton refcounting scheme into the code as it has tripped up a couple of people so far. I for one am still not very clear on it. This is not an issue for 0.10 but for longer term code health. Cheers, Alan.
Re: Semantics of proton refcounts [was Re: proton 0.10 blocker]
On 07/16/2015 02:40 PM, aconway wrote: Can someone who understand the proton use of refcounts please add some doc comments to explain the semantics? Apologies if this is already there and I missed it, tell me to RTFM. I'm not entirely sure I understand it. However having spent a couple of days reading and puzzling over the code, I'll try and offer some answers where I think I can, and add some questions of my own. For comparison here is what "refcount" traditionally means ("tradition" includes CORBA, boost::intrusive_ptr, std::shared_ptr etc.) I'm not saying we have to make proton conform, but we should document how it differs to save confusion. 1. A refcount is a count of the number of references (owning pointers) to an object. Yes, that broadly speaking is the purpose in proton. However this was a recent addition and it is not used uniformly inside the library itself. Not only that but the old api, where the application (generally) owned the pointers it created until it called pn_xxx_free, is still supported alongside the newer mode of use, as e.g. employed in the python binding, where the application uses only incref and decref. 2. Objects are created with refcuont=1, the creator owns the first reference. This is not always the case and was one of the points I found surprising. Though a newly created 'object' does indeed have its ref count set to 1, for pn_session() and pn_link(), which are the functions actually used by the application to created new sessions or links, the implementation of those functions includes an explicit decref, meaning the objects are returned with a ref count of 0. I suspect this was done to simplify things for the newer mode of use, where a wrapper object can always be instantiated and it simply increments the ref count. Would be good to have that confirmed though, as to me this is the source of confusion and complexity. However this means that in the old mode of use, e.g. pn_session()/pn_session_free(), the object may have a refcount of 0 even though it has not been freed by the application. 3. If another owning reference is created, the refcount MUST BE incremented. This is not currently the case for all internal use which necessitates some extra logic and checks. 4. The owner of a reference MUST decrease the refcount on dropping the reference and MUST NOT use the pointer after refcount is decremented. It no longer "owns" a reference. 5. The object MAY be deleted when refcount goes to 0 or MAY be pooled for re-use but it MUST NOT be used by any code (other than the pooling system if there is one) As above, that is currently not completely true at least for sessions and links, where they may be in use while still having a 0 ref count. 6. You never examine the refcount (except for debugging) Either you own a reference, which means refcount > 0, or you don't, which means you MUST NOT touch the object, even to examine its refcount. My questions: * what are the respective roles of the ref count in pn_endpoint_t, and in the pni_head_t record associated with pn_session_t and pn_link_t? i.e. why are there two separate ref counts for the same object? * pn_link_new() and pn_session() both decrement the count (in the pni_head_t record) for the newly created objects. Why? * what exactly does the 'referenced' flag indicate? * when a finalize function is exited early due to the return value of pni_preserve_child, the reference count is incremented to prevent that object being freed by the code in pn_class_decref. Where/how is that then decremented again to retrigger the finalizer? * when should pn_class_free be used v pn_class_decref? (The former does not check whether the finalizer increases the reference, the latter does) * pn_class_free() has the precondition that the object passed in should have a refcount of 1 (which I understand) or -1. When and why would the refcount be -1?
Re: Semantics of proton refcounts [was Re: proton 0.10 blocker]
On 07/16/2015 02:40 PM, aconway wrote: The fix mentioned above has this, which make no sense under traditional refcounting: pn_incref(endpoint); pn_decref(endpoint); Note that this is not added as part of my fix, it is already there. The simple explanation is that it is a trick to get a finalize function for the object in question to be re-run (when it was previously cut short). The broader context is that the reference counting is not used everywhere internally (it was a more recent addition), causing some additional twists and turns in the logic. Traditionally, this sequence is a no-op if the caller owns the reference and is illegal if not. We need a clear statement of the semantics of such sequences: when it it legal and what does it mean? From the fix it seems to be related to the "freed" and "referenced" members, what exactly is the relationship? The 'freed' flag indicates that the pn_xxx_free() call was made. This is not always the case. There is a mode of use, e.g. in the python binding, that use inc-/dec- ref instead. I'm not as clear on the design behind the 'referenced' flag, but it is set to false when the finalizer is cut short. The original fix for PROTON-905 triggered the re-run of the finalizer whenever the reference count was 0 and a session or link was moved out of the set of modified items. From the description it was only intended to do this where the object in question had been freed and had had its finalizer cut short. Hence the addition in my proposed fix of that extra qualification. As it is, the code currently on master, pending the 0.10 release is *incompatible* with qpid-cpp. So either the original fix needs to be reverted or it needs to be corrected in some way (e.g. along the lines I have suggested). The reference counting logic may not match the ideal, but we can't postpone a fix for the current issue pending some nicer overall solution. We can avoid it, by backing out the problematic previous commit, or we can adjust that commit.