#11521: Use weak references to cache homsets
--------------------------------------------------+-------------------------
       Reporter:  jpflori                         |         Owner:  robertwb    
                 
           Type:  defect                          |        Status:  
positive_review              
       Priority:  major                           |     Milestone:  sage-5.4    
                 
      Component:  coercion                        |    Resolution:              
                 
       Keywords:  sd35                            |   Work issues:              
                 
Report Upstream:  N/A                             |     Reviewers:  Jean-Pierre 
Flori, Nils Bruin
        Authors:  Simon King                      |     Merged in:              
                 
   Dependencies:  #12969; to be merged with #715  |      Stopgaps:              
                 
--------------------------------------------------+-------------------------

Comment (by SimonKing):

 I this post, I clarify some notions and explain why the weak reference to
 a homset introduced here does ''not'' imply that the homset will
 immediately be garbage collected. Sorry, while I wrote that post, you
 clarified things for yourself, making my post redundant. Anyway, here it
 goes...

 Replying to [comment:152 nbruin]:
 > Here `H` is a map from `X` to `Y` wrt to the category `C`,
 > so it stores strong references to `X`,`Y`,`C`.

 No, it is a homset. But yes, it has strong references to X,Y,C.

 > The PRESENCE of a coercion should probably be cached with a strong
 reference to the coercion. That's the point of caching the thing!

 Yes. But that should happen in a way that domain and codomain remain
 collectable. And that is a problem; see below.

 But I think there is a lot of confusion now between coercion, homset,
 `TripleDict` post and prior to #715, and the way how coercion maps are
 cached. Let me try to straighten things a bit:

 The original purpose of `TripleDict` was to store ACTIONS (not coercions).
 When you have an action of X on Y, then you would store the action in an
 attribute of X, addressed by the key triple that is formed by Y, the
 information what operation is considered (+ or *), and the information
 whether the action is on the left or on the right.

 The original `TripleDict` did so by strong references to both keys and
 values. By #715, `TripleDict` got changed so that one has weak references
 to the keys.

 Before #11521, that was the ''only'' application of `TripleDict`. But
 here, I suggest a second application, namely: Use it to store homsets. A
 "homset" from X to Y in the category C is the parent that contains all
 morphisms from X to Y in the category C (regardless whether these
 morphisms are coercion maps or conversion maps or anything else).

 Let me emphasize again that #715, #12215 and #11521 do not change the way
 coercions are stored.

 How are coercions stored? Before #12313, the coercions from X to Y were
 stored in a strong dictionary in Y, strongly keyed with X. Hence, as long
 as Y survives, there would be a strong reference to X. But with #12313, a
 weak dictionary is introduced for that purpose. Hence, even if Y persists
 and a coercion has been cached from X to Y, X remains collectable.

 Now, back to caching `H = Hom(X,Y,category=C)` (H is a set of maps). You
 are right, having just a weak reference to H sounds like it would be
 immediately collected. If I recall correctly, my reasoning for introducing
 the weak reference to H has been as follows:

  * Let T be the cache for the Hom function. T is a `TripleDict`. and T is
 in fact available under sage.categories.homset._cache. Hence, we have a
 strong permanent reference to T.
  * T provides some buckets, and of course T has strong references to its
 buckets.
  * With #715, the buckets contain some memory addresses representing the
 keys, and then strong references to the values.

 By consequence, defining T[X,Y,C]=H means that we have a chain of strong
 references, starting with T (T is permanent), from T to its buckets, from
 the buckets to H, and from H to X, Y and C. Hence, X,Y and C will never be
 garbage collected, even though T itself only has weak references to X,Y
 and C.

 Now, only define T[X,Y,C]=weakref.ref(H) (or any other kind of weak
 reference to H). Why is H ''not'' immediately garbage collected, in usual
 applications?

 Usual application means: You will not just create H, but you will also
 create an element of H, say, phi. We have a strong reference from phi to
 H, because H is the parent of phi. If you do not store phi, then H remains
 collectable. But in a usual application, you would store phi, say, because
 it is a coerce map from X to Y.

 By now, we assume that phi is a coercion, which is cached in Y. To be
 precise, we have Y._coerce_from_hash[X] = phi.

 Prior to #12313, Y._coerce_from_hash is a usual dict. Hence, the existence
 of Y keeps X, phi and thus H alive.

 With #12313, Y._coerce_from_hash is a `MonoDict`, which only has a weak
 reference to X. However, Y._coerce_from_hash has a strong reference to its
 buckets, the buckets have a strong reference to phi, phi has a strong
 reference to its parent H, and H has a strong reference to X. Hence, we
 have the reference cycle (with -> weak and => strong references)
 {{{
 * => T -> H

    Y -> X
    Y => phi => H => X
                H => Y
                H => C
 }}}

 In conclusion, an external reference to Y will keep X alive (which is
 bad). But if there is no external reference to Y nor to X, then Y,X,H and
 C remain collectable (assuming that no `__del__` method makes the cyclic
 garbage collection impossible), which is good enough to fix a memleak.

 '''__Summary__'''

  * If H=Hom(X,Y,C) and the homset cache keeps a strong reference to H,
 then H, X, Y and C will never be collectable. That's why I introduce the
 weak reference, strange as it may look.
  * If H=Hom(X,Y,C) is created as the parent of a coercion or conversion
 map phi, which is stored in Y._coerce_from_cache, then:
     - An external strong reference to Y keeps phi, thus H, thus X and C
 alive (*).
     - An external strong reference to X does ''not'' prevent Y, H or C
 from being collected.

 In an ideal world, (*) would be improved: An external strong reference to
 Y would ''not'' be enough to keep X alive. Do you have any idea how this
 can be implemented?

-- 
Ticket URL: <http://trac.sagemath.org/sage_trac/ticket/11521#comment:159>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica, 
and MATLAB

-- 
You received this message because you are subscribed to the Google Groups 
"sage-trac" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/sage-trac?hl=en.

Reply via email to