On 05.09.2012 17:48, Aaron Meurer wrote:
On Wed, Sep 5, 2012 at 4:37 AM, Juha Jeronen <[email protected]> wrote:
Hence, let's allow re-ordering. If we reorder the given syms list
independently in each subexpression, and collect by ["b","c"] (now in
automatically determined, appropriate order for each subexpression) we
obtain
a*(b*(c*(d + 1) + 1) + 1) + d*(c*(b*(a + 1) + 1) + 1)
which gets rid of the extra "c". The first parenthetical expression gets
collected by ["b","c"] and the second by ["c","b"].
Now, it seems to me that this feature would be useful... so I extended my
recursive_collect() accordingly. The new one has a "reorder" flag which
affects the operation when the syms are given. When reordering is enabled,
it basically runs the automatic syms generation, and filters the list to
include only those syms that were given by the user (while preserving the
ordering from the automatically generated list).
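
The filtering step described above can be sketched in plain Python. This is only an illustration, not the actual recursive_collect() code: the helper name filter_syms and the two "automatic" orderings are assumptions made up for this example.

```python
def filter_syms(auto_syms, user_syms):
    """Keep only the user-requested syms, in the order of the automatic list."""
    wanted = set(user_syms)
    return [s for s in auto_syms if s in wanted]


# Hypothetical automatic orderings for the two parenthetical subexpressions
# in the example above (assumed here, not computed by any real generator).
first = filter_syms(["a", "b", "c", "d"], ["b", "c"])
second = filter_syms(["d", "c", "b", "a"], ["b", "c"])

print(first)   # ['b', 'c']
print(second)  # ['c', 'b']
```

The user's list only decides which syms take part; the per-subexpression automatic list decides their order.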
What would be the sane default for this option? Intuitively, at least I
would expect the non-recursive version to do *exactly as it is told* (i.e.
no reordering unless explicitly requested), whereas in the recursive
version, it would make more sense to have reordering enabled by default in
order for collect() to "do what I mean" in cases like the above.
Is this fine?
It seems OK to me.
Ok.
So, summarizing again:
collect(expr, syms=None, **kwargs)

kwargs:
- deep=True (default; API break!) or deep=False (top level only, like
  old collect())
- method="dfs" (old rcollect()) or method="bfs" (recursive_collect();
  suggesting this as default).
- reorder=bool; whether to allow reordering of given syms
  per-subexpression to optimize collection. When deep=False, default is
  False, and when deep=True, default is True (mnemonic: default has the
  same state as "deep"). When syms=None (automatic syms), this flag has no
  effect.

For syms, a list can be given as before. The default is the special
value None, which means "use automatic syms".

rcollect(expr, *vars) just calls collect(expr, syms, deep=True,
method="dfs").
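
A minimal sketch of how the proposed defaults could be resolved, assuming reorder=None is used internally to mean "not given by the user". The function name resolve_collect_options is hypothetical and this is not SymPy code; it only illustrates the rule that reorder follows deep.

```python
def resolve_collect_options(syms=None, deep=True, method="bfs", reorder=None):
    """Fill in the proposed defaults for the collect() keyword arguments."""
    if method not in ("dfs", "bfs"):
        raise ValueError("method must be 'dfs' or 'bfs'")
    if reorder is None:
        # Proposed mnemonic: the default for reorder has the same state as deep.
        reorder = deep
    # With syms=None the syms would be generated automatically, so the
    # reorder flag would simply have no effect; it is left untouched here.
    return {"syms": syms, "deep": deep, "method": method, "reorder": reorder}


print(resolve_collect_options()["reorder"])            # True
print(resolve_collect_options(deep=False)["reorder"])  # False
```

An explicit reorder=True or reorder=False from the caller always wins over the deep-dependent default.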
I think this functionality should be included because it makes efficient
recursive collection possible w.r.t. the given syms only. Something like
this is needed if the user wants to collect (somewhat optimally) only in
variables, for example, while ignoring constants.
Moving on to the other topic, about the number atoms, is collect() intended
to ignore numbers? To me it seems it does:
import sympy as sy
sy.collect( sy.sympify("2*a + 2*b + 2*c"), sy.sympify("2") )
=> 2*a + 2*b + 2*c
Probably. One issue with this is that numbers automatically
distribute, so if you want to force them out, you have to use a hack
(this is what factor() does).
Ah, I see.
Also, this test
assert collect(-x/8 + x*y, -x) == x*(y - S(1)/8)
in simplify/tests/test_simplify.py, which produces the same result as
collecting w.r.t. just "x" - and indeed, according to the assert, is
expected to do so - seems to suggest that collect() is meant to ignore
numbers.
NumberSymbols, OTOH, seem to be handled the same way as generic Symbols:
sy.collect( sy.sympify("pi*a + pi*b + 2*pi*c"), sy.sympify("pi") )
=> pi*(a + b + 2*c)
I'm asking because my automatic syms generator also currently ignores
numbers. It would be possible to handle numbers, too, but that would require
complicating the ordering a bit (symbols first, numbers last). If it's not
needed, I won't add the extra functionality (and the extra bugs to go with
it).
There are already good functions for pulling out numbers from an
expression, like factor_terms, gcd_terms, as_content_primitive (I'm
not sure what the best one to use is at the moment). So if anything
collect() should just call one of those as a pre- or post-processor.
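
For illustration, a small example of the automatic distribution and of pulling the number back out with factor_terms. It only uses public SymPy calls; which of the functions named above would be the best pre- or post-processor for collect() is left open here.

```python
from sympy import symbols, factor_terms

a, b, c = symbols("a b c")

# A numeric coefficient is distributed over the sum automatically.
expr = 2*(a + b + c)
print(expr)    # 2*a + 2*b + 2*c

# factor_terms pulls the common numeric factor back out.
pulled = factor_terms(expr)
print(pulled)  # 2*(a + b + c)
```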
Ok. Thanks for the tip. I'll see what those can do.
But I think this part can wait until later - it's probably better to
first just integrate the current version, which ignores numbers
(consistently with how collect() currently works), and then see about
improving both.
Finally, reading the latest source, I noticed some things about collect():
- it descends into a top-level Mul even when not recursive. Is this
behaviour desired? (I think it's ok like this. It's just a bit surprising -
maybe needs to be mentioned in the docstring.)
This is probably an artifact of what I was talking about. Recursive
should really be the default, because it's what most people want. If
collect() were truly non-recursive by default, it wouldn't do what most
people want it to do.
Ok.
- expr is sympified, but only after some processing (conditional on the "if
evaluate:") is already run on it. This processing assumes that expr is
already an object tree; hence collect() errors out on string input if
evaluate=True. Doing the sympification a few lines earlier would fix this
and should have no negative side effects. If that's ok, I'll change this.
It's OK, though note that in general, it's better to just pass SymPy
objects rather than strings to functions.
Ok.
In programs/scripts, I agree. I consider it mainly a convenience feature
for interactive use :)
-J