On Mar 6, 2006, at 1:05 PM, Harry Chesley wrote:
This is more of an algorithm question than a language question, but
any insights would be much appreciated.
The problem is to input a series of programs and find previous
occurrences of the same algorithm.
The programs consist of a set of input parameters (a, b, c, ...),
and a set of side-effect-free functions (f, g, h, ...). Since the
functions are side-effect-free, they can be reordered without
changing the algorithm ("f(a), g(b)" is the same as "g(b), f(a)").
Subsequent calls of the same function with the same parameters have
no effect ("f(a), f(a)" is the same as "f(a)"); in fact, you can
assume duplicates have been removed in earlier processing.
But here's the thing that makes it hard (at least for me): two
programs are considered the same if they can be made to match by
rearranging the order of the input parameters. I.e., "f(a), g(b)"
is the same as "f(b), g(a)". Although parameters can be reordered,
they cannot be substituted ("f(a), g(b)" is _not_ the same as "f
(a), g(a)").
Example: "f(a), g(b), h(a, b)" is the same as "f(b), g(a), h(b, a)"
but _not_ the same as "f(a), g(b), h(b, a)".
I need a way to compare the input programs, and preferably to order
them.
In Haskell terms, given the programs are represented by a type
Prog, I want Prog to be a member of class Ord, letting me use tools
like Data.Map to look up information about previous instances.
One thing you could try is to develop a "canonical representation",
ie, an exemplar from the equivalence class which can be calculated
from any member of that class. You could define a lexicographical
order for variables and define the canonical representation such that
the first occurrence of each variable occurs in lexicographical
order. Then you define an ordering based on the canonical
representation. If your representation of programs is simple enough,
you can probably just use the derived Ord instance and just make sure
to always use the canonical representation.
I can do a brute-force compare by trying all the parameter
permutations, but that only gives me Eq, not Ord, and seems
terribly inelegant as well.
It's hard to be more specific without more details about the language
and the problem. Your comments make it sound like you are dealing
with an imperative language, but it's hard to tell. In some cases,
language analysis is easier if you do a dataflow analysis first and
then do your manipulations on the resulting graphs; you might try
taking that tack.
Rob Dockins
Speak softly and drive a Sherman tank.
Laugh hard; it's a long way to the bank.
-- TMBG
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe