#14711: Weak references in the coercion graph
-------------------------------------+-------------------------------------
       Reporter:  jpflori            |        Owner:  davidloeffler
           Type:  defect             |       Status:  needs_review
       Priority:  critical           |    Milestone:  sage-5.13
      Component:  number fields      |   Resolution:
       Keywords:  memleak, number    |    Merged in:
  field, QuadraticField              |    Reviewers:
        Authors:  Simon King         |  Work issues:
Report Upstream:  N/A                |       Commit:
         Branch:                     |  5168cfd15d4cdbaa3ffdbd4be0f7d783f77257c7
  u/SimonKing/ticket/14711           |     Stopgaps:
   Dependencies:                     |
-------------------------------------+-------------------------------------

Old description:

> The following quickly eats up memory:
> {{{
> sage: for D in xrange(2,2**32):
> ....:     QuadraticField(-D);
> ....:
> }}}
> (This is with 5.10.rc0)

New description:

 The following quickly eats up memory:
 {{{
 sage: for D in xrange(2,2**32):
 ....:     QuadraticField(-D);
 ....:
 }}}
 (This is with 5.10.rc0)

 __Problem analysis__

 The quadratic field is created with a coerce embedding into `CLF`. At the
 same
 time, this coerce embedding is stored in `CLF._coerce_from_hash`:
 {{{
 sage: phi = CLF.coerce_map_from(Q)
 sage: phi is Q.coerce_embedding()
 True
 sage: Q in CLF._introspect_coerce()['_coerce_from_hash']
 True
 }}}
 The "coerce_from_hash" is a `MonoDict`, hence, has only a weak reference
 to the key
 (Q, in this case). However, there still is a ''strong'' reference from
 CLF to the coerce map phi. And phi has a strong reference to its
 domain, thus, to Q. Hence, the existence of CLF prevents garbage
 collection of
 Q.

 And there is a second chain of strong references from CLF to Q: From CLF
 to
 phi to the parent of phi (i.e., a homset) to the domain Q of this homset.

 __Suggested solution__

 We can not turn the reference from CLF to phi into a weak reference,
 because
 then even a strong reference to Q would not prevent phi from garbage
 collection. Hence, we need to break the above mentioned reference chains
 in
 two points. In the attached branch, maps generally keep a strong reference
 to
 the codomain (this is important in composite maps and actions), but those
 used
 ''in the coercion system'' (and ''only'' there!!) will only have a weak
 reference to the domain, and they set the cdef `._parent` attribute to
 `None`
 (hence, we also override `.parent()`, so that it reconstructs the homset
 if
 the weak reference to the domain is still valid).

 To preserve the `domain()/codomain()` interface, I have removed the method
 `domain()` and have replaced it by a cdef public attribute that will
 either
 hold a weak reference (which returns the domain when called, hence, the
 interface does not change) or a `ConstantFunction` (which should actually
 be
 faster to call than a method). Since accessing a cdef attribute is still
 faster, the cdef attribute `_codomain` is kept (since this will always be
 a
 strong reference), but `_domain` has been removed.

 This "weakening of references" is done for the coercions found by
 `discover_coerce_map_from()` stored into `_coerce_from_hash`. So, this
 mainly
 happens for things done with `_coerce_map_from_()` and with composite
 maps. Similarly for `_convert_from_hash`.

 Weakening is ''not'' used on the maps that are explicitly registered by
 `.register_embedding()` and `.register_coercion()`. This is in order to
 preserve the connectivity of the coercion graph. The `register_*` methods
 are only used on selected maps, that are of particular importance for the
 backtrack search in `discover_coerce_map_from()`. These ''strong''
 registrations do not propagate: Compositions of strongly registered
 coercions found by `discover_coerce_map_from()` will be weakened.

 Since weakened maps should not be used outside of the coercion system, its
 string representation shows a warning to replace them by a copy. The
 attached
 branch implements copying of maps in some additional cases.

 `SchemeMorphism` can not inherit from `Morphism`, because of a bug with
 multiple inheritance of a Python class from Cython extension classes. But
 once
 this bug is fixed, we surely want to make `SchemeMorphism` inherit from
 `Morphism`. This transition is prepared here.

 In any case, the commit messages should give a concise description of what
 has
 been done.

 '''__Still TODO__'''

 Let the string representation of weakened maps point the user to the need
 of
 creating a copy.

 '''__TODO in future tickets__'''

 - Provide a documentation of the use of weak references in coercion, and
 of
   different ways of registering coercions, with their different impacts on
   garbage collecion.
 - Provide a version of `.register_coercion()` that weakens the coercion
   map. It would hence have the same effect as returning a map by
   `._coerce_map_from_()`, but of course `._coerce_map_from()` could not
 easily
   be changed in an interactive session.
 - provide copying for ''all'' kinds of maps.

 '''__Effects on the overall functioning of Sage__'''

 It is conceivable that some parts of Sage still suppose implicitly that
 stuff
 cached with `UniqueRepresentation` is ''permanently'' cached, even though
 the
 seemingly permanent cache was not more than a consequence of a memory leak
 in
 the coercion system. With the attached branch, garbage collection of
 parent
 structures will much more often become possible. Hence, code that relied
 on a
 fake-permanent cache would now need to create the same parent repeatedly.

 I (Simon) have tested how many additional parent creations occur with the
 attached branch when running `sage -t --all`. The findings are summarised
 in
 comment:107: The number of additional parent creations increased by not
 more
 than 1% for all but two parent classes (both related with tableaux). I
 also
 found that the time to run the tests did not significantly increase.

 Jean-Pierre has occasionally stated that some of his computations have
 been
 infeasible with the memory leak in the above example. I hope that his
 computations will now succeed.

--

Comment (by SimonKing):

 Replying to [comment:116 nthiery]:
 > Could you post a quick summary (say in the ticket description/title)
 > of what the current patch does?

 Done. OK, the summary is actually not quick. Sorry.

--
Ticket URL: <http://trac.sagemath.org/ticket/14711#comment:117>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica, 
and MATLAB

-- 
You received this message because you are subscribed to the Google Groups 
"sage-trac" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sage-trac.
For more options, visit https://groups.google.com/groups/opt_out.

Reply via email to