Re: [sympy] GSoC 15: Improving Assumptions

Aaron Meurer Sat, 21 Mar 2015 11:55:35 -0700

On Sat, Mar 21, 2015 at 6:18 AM, Joachim Durchholz <[email protected]> wrote:
> Am 21.03.2015 um 00:23 schrieb Aaron Meurer:
>>
>> The problem is, you don't want your assumptions to "leak" to other,
>> unrelated functions that happen to use the same symbol names. There
>> have been several suggestions on how to fix this. Matthew Rocklin
>> introduced an assuming() context manager, which makes it easy to
>> assume a fact and unassume it when you don't need it. Making something
>> that uses function scopes has also been suggested (see issue 4983).
>> The final suggestion is to effectively keep the old assumptions, i.e.,
>> allow symbols to have assumptions defined directly on them (more on
>> this later).
>
>
> I have another suggestion: Turn assumption contexts into first-class objects
> (and give them a better name).
> Have a global assumption context, let the context manager manage another
> one, or add contexts into expressions. IOW let the user decide what contexts
> he needs or wants.
>
> Do not have any kind of fallback between these context objects; if there's a
> global_assumptions object, make assuming() not pull it in by default, but
> allow writing
>   with assuming(global_assumption):
> so people can pull in their global assumptions (and they can also keep
> different sets of assumptions around, e.g. to test different hypotheses).
>
>> - The new assumptions and old assumptions differ. I noticed you put
>> this as your item e. But really this *has* to be the first thing that
>> is fixed. If one assumptions system uses "real" to mean one thing and
>> another uses it to mean another (as is currently the case; in the old
>> assumptions oo is real and in the new it is not), then changing code
>> from is_real to Q.real is not a simple find and replace.
>
>
> That's a *really* important thing.
> If we want to merge the two, and if it's an ongoing effort, then this must
> be heavily guarded with unit tests.
>
>> The handler system is broken in other ways too. Suppose we want to do
>> the reverse fact, "if x*y is zero, then x is zero or y is zero". How
>> to even begin writing this is not clear.
>
>
> That's inference engine (i.e. SAT solver) stuff.
> You can't infer True vom A | !A on a per-symbol basis, you need a less local
> view.
> Also, an inference engine can do strategy. E.g. in an expression like
> x*(hugely_complicated_thing), and you find that x is zero, you don't need to
> infer much about hugely_complicated_thing except that it will not diverge
> towards infinity.
>
> I still see a place for per-symbol assumptions and rules. They are the basic
> facts that the inference engine needs to work with.
>
>> I don't want to get too sidetracked from the history, but the
>> system I proposed in 2508 handles this fact, and it handles it now
>> matter how you pose it to ask, because all the logic is done
>> symbolically, and things are solved with the SAT solver.
>
>
> I see we're coming to similar conclusions then :-)
>
>> - The new assumptions are slow. As I said above, don't discount the
>> old assumptions. The are quite fast, and considering how they are used
>> in the core, it's somewhat impressive. The new assumptions, as they
>> are now, couldn't be used as a drop in replacement for the old
>> assumptions for performance reasons alone.
>
>
> I suppose this won't change quickly. Old assumptions are ahead by a few
> years of refinement.
> To catch up, the solver would have to get smart about strategy quickly,
> which means it should be easy to hack on.
>
>> - Find and fix all inconsistancies between the old and new
>> assumptions. A lot of what I suggested there was already fixed at
>> https://github.com/sympy/sympy/pull/8293, but I made a more extensive
>> list of suggested changes at https://github.com/sympy/sympy/pull/8134.
>> In particular, the nature of infinities and real and positive facts
>> has not been ironed out (and indeed, I'm not entirely convinced which
>> way it should be changed to). Most importantly of all, document
>> everything rigorously, so that there is a definitive definition of
>> each assumption.
>
>
> Maybe we should make the new assumptions pick up the primitive facts from
> the old assumption, instead of trying to recreate facts in the new
> assumptions.


This is part of the "make new assumptions read old assumptions" step I
outlined. Actually, the two pull requests I mentioned only make the
new assumptions read the old assumptions on Symbol. The next step
would be to make them read the old assumptions on every expression.

> Trouble is: old assumption have their facts and rules as executable code in
> handlers. Solvers in new assumptions need them as proposition objects that
> they can recombine.

Yes this is true. Long term, I would like to see as many facts as
possible written down as in 2508, so that they can be used with the
inference system.

But before we can do that, we have to figure out how to make that
approach faster.

>
> So my approach would be to look at the old-assumptions code, identify
> structural preconditions and ensuing inferences, recode these as rules for
> new assumptions, and replace the old-assumption handler code with a series
> of calls to the "apply THIS rule to THAT Basic AFTER checking that it is
> valid" function.
> Nice thing with that is that the old-assumption code is pretty well-covered
> by tests, so most mistakes in the recoding will be caught.

Only if the translated code is actually used by the assumptions
system. Although the assumptions tests themselves could be translated
to the new assumptions syntax.

But another advantage of a full inference system is that it catches
contradictions.  So mistakes can often be caught that way. Another
advantage is that facts written down declaritively tend to be terse
(see 
https://github.com/sympy/sympy/pull/2508/files#diff-e0f95401c86d07d90903363122990de0R287,
each oneline fact would be several lines in the old assumptions). So
mistakes are harder just for that. I'm also thinking of syntactic
sugar that can be used to make those facts even easier to read.

>
>> - Make the new assumptions call the old assumptions.
>
>
> I think the transitional structure would have to be this:
>
> SAT solver (new assumptions)  handler (old assumptions)
>                         |         |
>                         v         v
>                      elementary rules
>
> Elementary rules consisting of a structural pattern ("this rule applies to
> Add nodes with all operands being integral"), possibly a predicate ("number
> of known-to-be-odd operands is even"), and a resulting predicate ("Add node
> is known to be even") and/or structural transformation ("move the even
> operands to the head of the list" - okay, that's sill, you need that for
> sine/cosine transforms and such, and maybe transformation is beyond the
> scope of assumptions anyway).
>
> I do not think the new assumptions should call the old assumptions. The new
> assumptions work by manipulating data structures, not by executing code.

They do execute code. See sympy.assumptions.handlers.

The final goal is to not have multiple assumptions systems, at least
from a user perspective. We will probably have multiple inference
layers, but these should be hidden from the user. To do this, we
either need to remove one of the existing systems (already tried and
failed), or merge the two, so that one calls the other and they just
become different layers in the inference stack.

>
>> - Remove assumptions from the core as much as possible. If we can
>> remove ManagedProperties (the core metaclass that holds the basic old
>> assumptions logic) that would be awesome.
>
>
> ManagedProperties isn't just for assumptions, it's integrated with the
> metaclass system (which is there for the Singleton stuff).
> I.e. Singleton and assumptions stuff get mixed in the metaclass hierarchy,
> and separating that from each other would be really great.

OK, I didn't know that.

>
>> - Use a tiered handlers system. I'm currently of the opinion that this
>> is the way to go. The more powerful a system is, the slower it is.
>> 2508 is great, but slow. The old assumptions, facts.py system is fast,
>> but limited in power. I think we should have a system where it asks
>> one by one, from fastest to most powerful, until it gets an answer
>> (optionally stopping early if getting a None is not that big of an
>> issue, e.g., in the core it should do this).
>
>
> That would require naming each tier, and telling users what tiers exist and
> in what circumstances they get activated.
> Otherwise, users will be constantly surprised about rules and when they'll
> get applied and when not, and what to do to get a specific rule applied.

Yes, this idea is still in infancy. We should think about what the
different tiers should be, how they should be named, and when they
should be used. The main focus is performance, i.e., calling faster
things first, and potentially not calling slower things. So before we
think too hard about it we would also need to benchmark things.  This
is why I put it in my longer term list, as other things would need to
settle down a bit first before doing this. Or maybe we can find a way
to make the most powerful system really fast and this will become
completely unnecessary.

>
> Another way to get users a say would be to allow them to specify an
> assumption *as something to verify*. This would force the solver into
> chasing those rules that lead to the specified assumption, and trigger those
> rules that the user want to see triggered.

Sadly, the facts are all tied to one another pretty closely. If you
were to make a graph where the nodes are the assumptions (real,
positive, even, etc.) and there are edges if two assumptions are
related by a fact (like positive -> real), the graph would have a very
large connected component containing all the standard assumptions
(there are about twenty of them). Also, a good SAT solver should be
pretty good at ignoring useless facts.

And every edge has to point in both directions. You can't just say
positive -> real and I know something about real, so I don't need to
care about positive. The reason is the contrapositive rule, !real ->
!positive. If what I know about real ends up being that real is False,
then I can deduce something about positive.

Aaron Meurer

>
> Hope I didn't ramble too much :-)
>
> Regards,
> Jo
>
> --
> You received this message because you are subscribed to the Google Groups
> "sympy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/sympy.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/sympy/550D536D.8020806%40durchholz.org.
>
> For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sympy.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sympy/CAKgW%3D6KxfiNE3JXRJ6_uFx4C0emMrVed6OOqVSGW_DGs-5SZ3w%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Re: [sympy] GSoC 15: Improving Assumptions

Reply via email to