Am 27.11.2014 um 00:19 schrieb James Crist:
All,

In my spare time, I've been working on implementing a fast pattern matcher
that accounts for Associative and Commutative symbols. It's going to be a
while before I'm ready to release the code (it needs some serious cleanup),
but as of now it is "partly functional". Some notation:

T = set of known patterns. A trie like data structure is used to preprocess
these patterns to aid in fast matching.
s = expression to be matched
x, y = variables
a, b, c = constants

Currently the matcher can determine what patterns in T match s, in the
presence of associative and commutative symbols, as long as the pattern is
"linear". By linear, I mean no repeated *variable* symbols (`x + y` is
acceptable, `x + x` is not).

That's an important restriction.
It's probably also what makes the pattern matcher fast enough to be useful :-)

- Is there any use for just determining what patterns match, without
generating a match substitution? Determining if a match exists is faster
than determining a specific substitution. However, in the presence of
nonlinear patterns a substitution needs to be found to verify the match is
valid. This decision determines how tightly coupled the two algorithms will
be.

See the next remark.

- Is there need for nonlinear patterns? I plan to account for them, but
they make the algorithm a bit more complicated. Nonlinear, AC pattern
matching is NP complete. Linear AC pattern matches can be found in
polynomial time.

This depends on what you're aiming for.

Is it a basic building block for other algorithms to use? Then favor predictable performance over features. Something that has unpredictable performance isn't very attractive to an algorithm author.

Are you aiming for something that can solve as many situations as possible? Then aim for more features. I'm somewhat sceptical whether just pattern matching will be the right approach, but I suspect you aren't after this one anyway :-)

- In a general case, there can be many substitutions that match a specific
pattern mod AC. For example, given `t = x*y`, and `s = a*b*c`, this can
match with {x: a*b, y: c}, or {x:a, y: b*c}, etc... This is easy to account
for, and can be done lazily. Api-wise, are there cases where generating
more than one matching substitution are needed?

I'm not sure that such cases will actually come up, but if they do (and I see a significant probability that they will), it would be a pity if your code would have to be rewritten from scratch.

So I'd recommend doing that, or at least keeping the path to this kind of stuff open if possible.

- The algorithm lazily determines matches. Currently preference is given to
matching variables before constants, but it could go the other way around.
i.e. if T = {a + b, x + b}, and s = a + b, then the matcher will find `x +
b` before `a + b`. I'd rather not make this configurable, as it complicates
things. The question is, do we want the most general match first, or the
most specific?

It's almost a given that anybody who writes an algorithm on top of the pattern matcher will find that the first match found isn't suitable but some other matching order would fit their needs *exactly*.

You could tackle the complication, or just complete your work and assume somebody else will find a less intrusive way to make it configurable. For that, your code would have to be simple and easily understood - which is something that would make the code more useful in the long term anyway.

- At the moment, the code is completely removed from dependence on sympy,
and can function on its own. As such, I'm planning on making it a separate
library that can be used by SymPy as a dependency. I understand we were
against that in the past, but this may be changing? I'd rather not maintain
two separate codebases.

Heh. I'm seeing SymPy spin-offs of that kind happening lately.
I suspect SymPy could profit from rephrasing its "avoid external dependencies" policy, but that's a different discussion, and possibly one that can't be done before some more investigation has been done.

Regards,
Jo

--
You received this message because you are subscribed to the Google Groups 
"sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sympy.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sympy/547770CF.2050602%40durchholz.org.
For more options, visit https://groups.google.com/d/optout.

Reply via email to