Re: [sympy] more unified (and usable) solve output

Oscar Benjamin Sat, 05 Mar 2022 11:49:33 -0800

On Fri, 4 Mar 2022 at 22:02, Aaron Meurer <[email protected]> wrote:
>
> It would really be a shame to give up the name solve(). Maybe we can
> put this on a list of big changes that we can do in 2.0? Although it
> would be good to have a working replacement ready before we do any
> renaming.

Let's just work on a replacement first and then we can consider what
to do with legacy solve later. Remember there was a previous attempt
to do exactly this and resulted in linsolve, nonlinsolve and solveset.
Unfortunately none of those can replace solve and there are failings
in their APIs that mean we still just want to make a new solve
function that isn't compatible with any of the existing ones.

> On Fri, Mar 4, 2022 at 8:27 AM Oscar Benjamin
> <[email protected]> wrote:
> >
> > On Fri, 4 Mar 2022 at 14:57, Chris Smith <[email protected]> wrote:
> > >
> > > We could just make a wrapper named `solved = lambda f,*s,**k: solve(f, 
> > > *s, **k, dict=True)`
> >
> > If there is to be a new API then there are a *lot* of other things
> > that I would want to change about it compared to solve so it
> > definitely wouldn't be just a thin wrapper around solve. It would be
> > nice if nonlinsolve could be the new API but it doesn't have good
> > return types either.
>
> It might be helpful to enumerate some of those things. Here are a few
> I've noticed
>
> - In general, the output of solve() should not depend on the specific
> mathematical type of the input. This means for instance, linear and
> nonlinear equations should be handled in the exact same way. Right
> now, even dict=True can't save you:
>
> >>> solve([x**2 - 2, x > 1], x)
> Eq(x, sqrt(2))
> >>> solve([x**2 - 2, x > 1], x, dict=True)
> Eq(x, sqrt(2))
>
> One can imagine wanting to use x > 1 just to limit the solutions
> (which can't be done just with the old assumptions), but this has
> caused solve() to treat it as a completely different type of equation.

Most people who use an inequality with solve do not expect the current
behaviour. Indeed they usually want to filter a finite set of
solutions. However if inequalities can be included in the inputs then
there needs to be a way of representing unsolved inequalities in the
output format. In fact representing unsolved inequalities is needed
for other cases such as a solving a quadratic with symbolic
coefficients over the reals where the symbolic condition b^2 - 4ac > 0
needs to be satisfied.

> - One issue I see is that it's not always clear what solve()
> will/should do in some cases, especially undetermined cases. It seems
> like "solving" and "variable elimination" are too often mixed
> together. It would be better if they were completely separated, since
> it's not clear to me why you would want one when solve can't produce
> the other. We should very carefully define what it means to "solve" an
> equation (note this is not so trivial once you include systems,
> multiple variables, infinite solutions, and inequalities into the
> mix).

The first point that needs to be understood is that the output of
solve is for use in *substitutions*. The design decision that you want
something that can be used for substitutions places significant
constraints on how mathematically correct its output can be and what
kinds of problems can be solved. This is what solve is for though.
Mathematica has separate functions solve and reduce where solve
returns substitution rules and reduce simplifies a system of boolean
statements about integer/real/complex symbols. The output of reduce is
not directly usable for substitution but it can be always formally
correct. The design goal for solve is to be as correct as possible
while still returning an explicit list of substitution rules.

Understanding that the output is for substitutions we can see why:

1. The list of dicts output from solve(..., dict=True) is the best
output format from all of the existing solve functions because each
solution can be used directly with subs:

In [7]: equation = x**2 + x + 1

In [8]: sol1, sol2 = solve(equation, x, dict=True)

In [9]: print(sol1, sol2)
{x: -1/2 - sqrt(3)*I/2} {x: -1/2 + sqrt(3)*I/2}

In [10]: equation.subs(sol1).expand()
Out[10]: 0

2. The new solvers all failed to have usable output:

The output of solveset is completely unstructured so attempting to use
it for any further processing will fail very often:

In [18]: print(solveset(sin(x)-1, x))
ImageSet(Lambda(_n, 2*_n*pi + pi/2), Integers)

In [19]: print(solveset(x**2 - y, x, Reals))
Intersection({-sqrt(y), sqrt(y)}, Reals)

It's very unclear how to write some code that can extract substitution
rules from the Set object returned by solveset.

The linsolve function is a bit better in that the return type is
consistently a FiniteSet containing a single tuple. You can use it for
substitutions like this:

In [21]: (sol,) = linsolve(eq, [x, y])

In [22]: sol
Out[22]: (1/3, 1/3)

In [23]: rep = dict(zip([x, y], sol))

In [24]: rep
Out[24]: {x: 1/3, y: 1/3}

In [25]: eq[0].subs(rep)
Out[25]: 0

This is still an awkward format though because the FiniteSet isn't
indexable so you have to extract with (sol,) = or with list(sol)[0].
You also need to wrap it up with dict(zip(...)) before you can do any
substitution.

The FiniteSet does seem particularly pointless when it's guaranteed to
contain only a single tuple (why not just return the tuple?).
Supposedly the reason for doing this is to have a "consistent format"
across different solvers. However it isn't consistent because solveset
returns proper sets like ImageSet for infinite sets but linsolve still
returns a FiniteSet in that case:

In [26]: linsolve([x+y], [x, y])
Out[26]: {(-y, y)}

That's not a proper set and it can't be used properly with
intersections or other set operations. At the same time it's not a
directly usable output like a list of dicts so it's the worst of both
worlds: an awkward format that is also not mathematically correct.

With nonlinsolve at least returning a FiniteSet makes sense because
there can be a non-unique but still finite solution set:

In [27]: nonlinsolve([x**2 - 1, y], [x, y])
Out[27]: {(-1, 0), (1, 0)}

However solveset then does something really bizarre for infinite solutions:

In [28]: nonlinsolve([sin(x), y], [x, y])
Out[28]: {({2⋅n⋅π + π │ n ∊ ℤ}, 0), ({2⋅n⋅π │ n ∊ ℤ}, 0)}

Here you have a set of tuples but where an expression (0) is given for
y in each tuple we have a set ({2⋅n⋅π + π │ n ∊ ℤ}) as the
"expression" for x. Again this is not a mathematically correct set and
at the same time this is not a convenient format for further
processing and it is not clear how to use this to do substitutions.

> One idea is to make things programmatic with some sort of Solution
> object that provides different things like solution parameters in
> consistent attributes. This would also allow moving some of the more
> complex keyword-arguments from solve() itself into methods of the
> Solution object.

I could imagine something like having a solve function that just
returns a list of dicts but then a function like solve_full that
returns a more formally correct representation of a set of solutions.
The basic goal of what solveset tries to achieve is to represent the
different cases but the problem is that it is unstructured and does
not easily resolve into explicit expressions for the output. More
generally it should always be possible to represent the set of
solutions as a union of sets where each set is described more or less
like a Python list-comprehension and also is included only
conditionally:

S = {f(x1, x2, ...) for (x1, x2, ...) in (S1, S2, ...) where P(x1, x2,
...)} if condition

Examples of what this could look like:

  >>> solve_full([x**2 - 1], [x])
  [({x: 1}, {}, True, True),
   ({x: -1}, {}, True, True)]

  >>> solve_full([sin(x)], [x])
  [({x: n*p}, {n: Integers}, True, True)]

  >>> solve_full([a*x**2 + b*x + c], [x], {x:Reals}])
  [({x: (-b + sqrt(b**2-4*a*c))/(2*a)}, {}, True, (b**2-4*a*c > 0) & (a>0)),
   ({x: (-b - sqrt(b**2-4*a*c))/(2*a)}, {}, True, (b**2-4*a*c > 0) & (a>0))]

  >>> solve_full([x + y, x>2], [x, y])
  [({x: t, y: -t}, {t: Reals}, t>2, True)]

  >>> solve_full([a*x - b], [x])
  [({x: b/a}, {}, True, Ne(a, 0))]

Of course the representation above could be wrapped in a  solution
object of some kind. It's also possible to inject much of the above
into the expressions themselves using ExprCondPair or new types of
symbols for the parameters n and t above.

Then the solve function can be a wrapper around this that just distils
it down to the list-of-dicts substitution rules with the understanding
that solve_full is needed to see full solution information such as the
conditional existence of solutions above.

A key question is whether solve is allowed to introduce new symbols.
At the moment solve doesn't do that but that makes it impossible to
correctly represent the solution set of basic equations like sin(x)=0
in an explicit form. If new symbols are to be returned then there has
to be some way to know what set those symbols are supposed to
represent (e.g. n is an integer) and to get those symbols somehow in
the output from the solving function.

Another question is how to handle solutions that exist for "most"
values of a parameter e.g. a solution for x that is valid provided a
does not equal 0. I think it's right that solve should return the
"generic" solution, however it should return a sufficient condition
describing that genericity like `Ne(a,0)`. None of the current solve
functions can do this. Also sometimes the system is generically
inconsistent e.g.:

In [31]: solve([x+y+a, x+y+1], [x, y])
Out[31]: []

Here there are no solutions unless a=1. There should be some way to
know that solve has assumed that a does not equal 1 and likewise
someway to tell solve that you want it to get the solution when a is
equal to 1 e.g. this doesn't work with current solve:

In [32]: solve([x+y+a, x+y+1, a-1], [x, y])
Out[32]: []

It might make more sense to have an assumptions=Eq(a, 1) argument
where various inequalities or other kinds of statements can be
assumed.

Another basic issue with solve is that it always tries to use roots to
get explicit radical expressions but RootOf is better when there are
no symbolic coefficients and in fact it is often better to have an
implicit description for the roots of a polynomial anyway e.g.:

  >>> solve_full([exp(5*a*x) - exp(a*x) + 1], [x])
  [({x: (log(t) + I*n*pi)/a}, {n: Integer, t: RootSet(t**5 - t + 1,
t)}, True, True)]

With implicit descriptions of roots we can handle polynomials of
arbitrary degree without the Abel-Ruffini limit. It's also much more
efficient in the case of large degree polynomials just to express the
solution in terms of a parameter t representing any of the roots. Of
course there should be some way to convert to an explicit list of the
roots using e.g. RootOf but a RootSet object could be correct even for
polynomials with symbolic coefficients and unknown total degree (e.g.
if the leading coefficient is not known to be nonzero as in a*t**5 - t
+ 1 or something like t**n - 1).

> - The API should also work uniformly for systems as well as single
> equations. Perhaps the cleanest way would be for solve() to always
> treat a single equation as a system of one equation. That way the
> return type is always uniform (the downside is the common case might
> involve an extra layer of indirection, which could be annoying).

The list of dicts form works just as well for a system as for a single
equation if you remember that the goal is substitution.

> There's also this old issue "solve is a mess" which discusses a lot of
> these things. https://github.com/sympy/sympy/issues/6659

Also:
https://github.com/sympy/sympy/issues/16861

--
Oscar

-- 
You received this message because you are subscribed to the Google Groups 
"sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sympy/CAHVvXxTTq8CL_d9YQJk838eqhY0bz8Z3xVX-peO2oi28zFWfXA%40mail.gmail.com.

Re: [sympy] more unified (and usable) solve output

Reply via email to