Response in line
On Wed, Nov 26, 2014 at 3:19 PM, James Crist <[email protected]> wrote:
> All,
>
> In my spare time, I've been working on implementing a fast pattern matcher
> that accounts for Associative and Commutative symbols. It's going to be a
> while before I'm ready to release the code (it needs some serious cleanup),
> but as of now it is "partly functional". Some notation:
>
> T = set of known patterns. A trie like data structure is used to
> preprocess these patterns to aid in fast matching.
> s = expression to be matched
> x, y = variables
> a, b, c = constants
>
Awesome, thanks for doing this.
> Currently the matcher can determine what patterns in T match s, in the
> presence of associative and commutative symbols, as long as the pattern is
> "linear". By linear, I mean no repeated *variable* symbols (`x + y` is
> acceptable, `x + x` is not). If no associative and commutative symbols are
> present, it can also determine a match substitution to make `s` equivalent
> with the pattern.
>
> Some design questions before I continue work:
>
> - Is there any use for just determining what patterns match, without
> generating a match substitution? Determining if a match exists is faster
> than determining a specific substitution. However, in the presence of
> nonlinear patterns a substitution needs to be found to verify the match is
> valid. This decision determines how tightly coupled the two algorithms will
> be.
>
I can think of use cases but they're secondary. One might ask questions
like "is this expression an element of known class of special functions?".
> - Is there need for nonlinear patterns? I plan to account for them, but
> they make the algorithm a bit more complicated. Nonlinear, AC pattern
> matching is NP complete. Linear AC pattern matches can be found in
> polynomial time.
>
I haven't thought much about this. Is X*X.T non-linear?
>
> - In a general case, there can be many substitutions that match a specific
> pattern mod AC. For example, given `t = x*y`, and `s = a*b*c`, this can
> match with {x: a*b, y: c}, or {x:a, y: b*c}, etc... This is easy to account
> for, and can be done lazily. Api-wise, are there cases where generating
> more than one matching substitution are needed?
>
I believe so. Sometimes we have a condition that can't be expressed in
terms of patterns (e.g. pulling in assumptions) and so we fall back to
python iteration and finding the first in the sequence that matches some
python condition.
Another major use case is selecting the "best" match. In this case we
might also want to control the order in which matches come out of the
sequence so that matches that we think are best come first.
- The algorithm lazily determines matches. Currently preference is given to
> matching variables before constants, but it could go the other way around.
> i.e. if T = {a + b, x + b}, and s = a + b, then the matcher will find `x +
> b` before `a + b`. I'd rather not make this configurable, as it complicates
> things. The question is, do we want the most general match first, or the
> most specific?
>
In my ideal world I might be able to decide the order in which we traverse
the tree of possible matches. I don't know exactly how your algorithm
works but, in some algorithms, we obtain a set of partial matches at each
step and then have to decide which branch to pursue first. I've found that
accepting a user-provided objective function is useful. It allows the user
to specify a greedy search strategy.
More generally I've found that controlling the traversal midstream is of
high-value. This might have been in my particular use case though.
> - At the moment, the code is completely removed from dependence on sympy,
> and can function on its own. As such, I'm planning on making it a separate
> library that can be used by SymPy as a dependency. I understand we were
> against that in the past, but this may be changing? I'd rather not maintain
> two separate codebases.
>
Hooray!
> - There's a `commutative` assumption in sympy, but there doesn't seem to
> be an `associative` one. How can I check if a function/operator is
> associative? Does this information need to be kept elsewhere?
>
Aaron mentioned AssocOp. FWIW I'm not sure that any Sets or MatrixExprs
use this, despite being at times associative. Perhaps they should. If one
wanted a hack they could just keep a set of known associative types
somewhere.
> - Jim
>
> --
> You received this message because you are subscribed to the Google Groups
> "sympy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/sympy.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/sympy/42d9511d-e963-416f-9d4e-3e0435461422%40googlegroups.com
> <https://groups.google.com/d/msgid/sympy/42d9511d-e963-416f-9d4e-3e0435461422%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
>
--
You received this message because you are subscribed to the Google Groups
"sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sympy.
To view this discussion on the web visit
https://groups.google.com/d/msgid/sympy/CAJ8oX-Gf%3DQc3EURr4trzKoXnbSkc8HNf6Y02Pxd7LcTEeo_NLg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.