#14711: Memleak when creating QuadraticField
-------------------------------------+-------------------------------------
Reporter: jpflori | Owner: davidloeffler
Type: defect | Status: needs_review
Priority: critical | Milestone: sage-5.13
Component: number fields | Resolution:
Keywords: memleak, number | Merged in:
field, QuadraticField | Reviewers:
Authors: Simon King | Work issues: Fix elliptic curves
Report Upstream: N/A | code
Branch: | Commit:
u/SimonKing/ticket/14711 | 05fb569cb132a8c89713021f1f4b25cd2dd7cb1c
Dependencies: | Stopgaps:
-------------------------------------+-------------------------------------
Comment (by SimonKing):
Replying to [comment:96 nbruin]:
> I agree that there is a place for such strong connections, but I have
severe reservations about declaring it [`.register_coercion()`] is the
only way or even the default way to inform the system
> about coercions.
Well, I have mentioned `._coerce_map_from_(...)` in several previous
posts, and if you look at my thematic tutorial on categories and coercion,
you'll find that I consider ''this'' the default. And it only yields
''weak'' caching.
> I have severe reservations about declaring that this code "will never be
memory efficient" in sage.
I think that we want ''some'' particularly important coercions to be tied
to the lifetime of the codomain, and thus we use `.register_coercion()`,
and we want other coercions to be tied to the minimum of the lifetimes of
domain and codomain, and thus we use `._coerce_map_from_()`. I don't think
we have a problem here.
> Consider:
> {{{
> Qx.<x>=QQ[]
> K.<a>=NumberField(x^4-2)
> L.<b>=NumberField(x^2-2,embedding=a^2)
> }}}
> This fits perfectly in the "unchanging universe" model. Also note that
the coercion system does not need to let L keep K alive, since the
construction parameters,
> which get kept alive for the life of L by `CachedRepresentation` or
something analogous, refer to K already.
It isn't `CachedRepresentation`, but this doesn't matter.
> Now consider
> {{{
> M.<b>=NumberField(x^2-2)
> }}}
> In the "unchanging universe" (and in sage as well) we have that M is
distinct from L. However, I think it's unrealistic to expect that all
"embeddings" etc. can be
> specified at construction time.
Again, nobody has claimed that everything needs to be declared at
construction
time. There are some particularly important coercions registered at
construction time, namely '''the''' coerce embedding (if it exists then it
is
unique) and those installed by `.register_coercion()`. Everything else is
dynamical, based on `_coerce_map_from_()`.
In my description from comment:84, note that the digraph is not totally
static. It has static parts (corresponding to coerce embeddings and
coercions
fixed by `.register_coercion()`) and dynamic shortcuts (corresponding to
`_coerce_map_from_`).
> So I think, even though it's not possible currently in sage, that one
should allow for
> {{{
> m1=Hom(M,K)([a^2])
> m2=Hom(M,K)([-a^2])
> M.register_embedding(m1)
> }}}
I don't know if this is reasonable, but at least it is against what people
originally wanted with the coerce embedding. If you declare the coerce
embedding phi from a number field `K` to, say, `CC`, then you consider `K`
as a
subfield of `CC`. If you provide another number field `L`, which is
isomorphic
to `K`, with a different embedding psi into `CC`, then adding an element
of
`K` to an element of `L` is done by embedding both into `CC` and then
adding
complex numbers.
Since we think of K and L as different subfields of `CC` and not as
abstract
fields, we ''must'' consider K and L as different objects, and so the
different
embedding must play a role in the cache key for K and L. This is why they
have
to be provided at construction time.
It would be a totally different way of thinking if you tried to do the
same
with `CC.register_coercion(phi/psi)` or with
`CC._coerce_map_from_(K/L)`. Namely, in both cases, you would not be able
to
add elements of K and L, because neither K nor L would know about the
embedding. And in fact you would consider K and L as abstract fields, and
you
would in fact want `K is L` (at least if you fancy unique parents, which I
do...). And then the axioms for coercion would strike: There can be at
most
one coercion from `K` (i.e., `L`) to `CC`. Hence, you could ''not''
simultaneously declare different embeddings of `K` into `CC` as coercions.
Since it pretty much seems to me that number theorists want to comfortable
compute with different isomorphic subfields of `CC`, it would thus simply
not
feasible to restrict oneself to `.register_coercion` and
`_coerce_map_from_`:
One ''needs'' coerce embeddings, and one ''needs'' that they are part of
the
defining data of a number field.
> Note that the choice of `m1` or `m2` here leads to different relations
between `M` and `K` and hence different universes.
> In other words, our concept of "globally unique" is not powerful enough
to capture the
> full identity of objects, which would include the coercion relations
with
> objects that haven't been "discovered" yet.
I would state it differently. In order to define `K` (a subfield of CC),
there
is no way around providing the embedding during creation. "Discovering" a
coercion relation seems the wrong approach here.
And speaking about memory: The embedding of K into CC is stored as an
attribute of K, not of CC. Hence, K keeps CC alive, but CC does not
prevent K
from garbage collection. So, I really don't understand where you see a
problem.
> In practice, we can usually work around that by
> for instance changing the names of generators and hence create
artificially differently labelled objects but that's already not a
possibility for creating
> different copies of `ZZ^n`, since there are no generator names to choose
there.
Well, if one has obvious distinguishing data, such as an embedding, then
there
is nothing artificial when using them.
> I think one has to accept the reality here: what we have is a collection
of objects whose relations do change in time.
No. I don't see anything dynamic in your "embedded numberfield" examples.
A
subfield is a subfield is a subfield.
> That's not the only thing coercion does. It may also find "common
covering
> structures", which may lead to construction of new parents. Those
definitely
> don't deserve to get nailed into memory. Yet, the code that creates
these
> parents will look (to the coercion system) as a "user", so it would be
using
> these strong-referencing coercion registration routines.
What you seem to mention here is the pushout construction. It mainly
relies on
"construction functors". I don't even know if it takes the coerce
embeddings
into account at all.
Anyway, the new parents created by pushouts indeed play the same role as
parents
created by the user. Let's try to be more concrete. Let P and Q be
parents, you want to add an element p of P to an element q of Q, and the
pushout
construction finds a parent R such that both P and Q coerce into R,
allowing
to perform the addition in R, resulting in `r=R(p)+R(q)`.
Now, it could be that `R.register_coercion(P)` and
`R.register_coercion(Q)`
are both executed in `R.__init__` (but see the remark below). In the
current
code (also in my branch), this would imply a strong reference chain from R
to
both P and Q. Hence, even if you did `del p,q,P,Q`, P and Q could not be
garbage collected.
But I don't think we should see this problematic, for several reasons:
- Pushout constructions don't arise particularly often. Normally, either P
coerces into Q or Q coerces into P, or both embed into the same parent
anyway, and I have mentioned above: With a coerce embedding, the
existance
of R would not prevent P and Q from garbage collection (plus, it has
nothing
to do with pushout anyway...)
- Is `register_coercion` really used so often? I think `_coerce_map_from_`
is
more commonly used, and then the existence of R would ''not'' prevent P
and
Q from garbage collection.
> I think it may well be a feature, not a bug, that one at some point can
just
> be left with the shortcuts and that the intermediates have fallen out.
How would you guarantee that you kept in mind enough shortcuts to not
change
connectivity by throwing away intermediates?
> The
> natural way of staying close to "discovering a permanent universe" is by
> never throwing away anything that has been discovered, and I think we
agree
> that's a "memory leak".
No, we disagree.
It is a memory leak if a connected component (not taking into account
shortcuts) of the coercion graph can not be garbage collected, even though
there is no external strong reference (which may be a coerce embedding) to
any
vertex of the connected component.
Note that this was exactly the problem with the example from the ticket
description! As I have pointed out in comment:18, it was ''not'' the case
that
the problem lay in `_coerce_from_list_`, because this was empty. In
particular, it was ''not'' the case that `.register_coercion()` was to
blame.
Instead, the memory leak came from short-cuts, i.e., from the stuff
stored in `_coerce_from_hash`, which can also be seen in
[attachment:chain.png].
Hence, the quadratic field and CC did belong to different connected
components
of the coercion graph, but the shortcut kept Q alive.
> > And there surely is a means available to add a coercion that doesn't
tie the
> > lifespan of two parents too closely: Implement
`P._coerce_map_from_(Q)`,
> > which can return a map or simply "True" (in the latter case,
conversion is
> > used to coerce Q into P). The result is cached in
`P._coerce_from_hash`, but
> > not in `P._coerce_from_list`.
> You mean: implement
> `P._install_coerce_map_from_(m)`, which does:
> `_coerce_from_hash[Domain(m)]=m`. I think it is quite important to be
able
> to manipulate the coercion graph without having to modify library code.
This might be a good idea. So, we would have `.register_coercion()` for
permanent pathways, and `_install_coerce_map_from()` (with a similar
semantics, i.e., you can either just provide the parent or a whole
morphism)
for impermanent shortcuts.
--
Ticket URL: <http://trac.sagemath.org/ticket/14711#comment:99>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica,
and MATLAB
--
You received this message because you are subscribed to the Google Groups
"sage-trac" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sage-trac.
For more options, visit https://groups.google.com/groups/opt_out.