Re: [Caml-list] How to simplify an arithmetic expression ?

2011-10-02 Thread Gabriel Scherer
In my experience, the OCaml code doing recursive call and pattern
matching is a relatively bad way to reason about such rewrite systems.
Your questions are extremely pertinent, and relatively difficult to
answer in general.

For a start, I think your code indeed repeats useless traversals. This
can be seen syntactically by the nesting of two normalForm calls, such
as:

  | (e, Constant b) - normalForm (Plus (Constant b, normalForm e))

You reduce e to a normal form, then repeat the reduction on some
expression containing e. The outer call will surely re-traverse (the
normal form of) e, which is useless here.

One approach I like for such simplifications is the normalization by
evaluation approach. The idea is to define a different representation
of normal forms of your system as semantic values; I mean a
representation that has a meaning in itself and not just what's left
after this arbitrary transformation; in your case, that could be
multivariate polynomials (defined as an independent datatype). Then
you express your normalization algorithm as an evaluation of your
expression into semantic values; you can reify them back into the
expression datatype, and if you did everything right you get normal
forms (in particular, normalizing a reified value will return exactly
this value). The main difficulty is to understand what are the normal
forms you're looking for; then the code is relatively easy and can be
made efficient.

I'm afraid my explanation may be a bit too abstract and high-level. Do
not hesitate to ask for more concrete details.

On Sun, Oct 2, 2011 at 1:51 PM, Diego Olivier Fernandez Pons
dofp.oc...@gmail.com wrote:
     OCaml list,
 It's easy to encapsulate a couple of arithmetic simplifications into a
 function that applies them bottom up to an expression represented as a tree
 let rec simplify = function
     | Plus (e1, e2) -
         match (simplify e1, simplify e2) with
              | (Constant 0, _) - e2
 With a couple well known tricks like pushing constants to the left side and
 so on...
 How can I however guarantee that
     1. My final expression reaches a kind of minimal normal form
     2. My set of simplifications is optimal in the sense it doesn't traverse
 subtrees without need
 Here is my current simplifier and I have no way of telling if it really
 simplifies the expressions as much as possible and if it does useless passes
 or not
 type expr =
     | Constant of float
     | Plus of expr * expr
     | Minus of expr * expr
     | Times of expr * expr
     | Variable of string
 let rec normalForm = function
     | Minus (e1, e2) - normalForm (Plus (normalForm e1, Times (Constant
 (-1.0), normalForm e2)))
     | Plus (e1, e2) -
         begin
         match (normalForm e1, normalForm e2) with
             | (Constant a, Constant b) - Constant (a +. b)
             | (Constant 0.0, e) - normalForm e
             | (e, Constant b) - normalForm (Plus (Constant b, normalForm
 e))
             | (Constant a, Plus (Constant b, e)) - Plus (Constant (a +. b),
 normalForm e)
             | (Plus (Constant a, e1), Plus (Constant b, e2)) - Plus
 (Constant (a +. b), normalForm (Plus (normalForm e1, normalForm e2)))
             | (Variable a, Variable b) when a = b - Times (Constant 2.0,
 Variable a)
             | (Times (Constant n, Variable b), Variable a) when a = b -
 Times (Constant (n +. 1.0), Variable a)
             | (Variable a, Times (Constant n, Variable b)) when a = b -
 Times (Constant (n +. 1.0), Variable a)
             | (Times (Constant n, Variable a), Times (Constant m, Variable
 b)) when a = b - Times (Constant (n +. m), Variable a)
             | other - Plus (e1, e2)
         end
     | Times (e1, e2) -
         begin
         match (normalForm e1, normalForm e2) with
             | (Constant a, Constant b) - Constant (a *. b)
             | (Constant 0.0, e) - Constant 0.0
             | (Constant 1.0, e) - e
             | (e, Constant b) - normalForm (Times (Constant b, normalForm
 e))
             | (Constant a, Times (Constant b, e)) - Times (Constant (a *.
 b), e)
             | other - Times (e1, e2)
          end
     | x - x
 let (++) = fun x y - Plus (x, y)
 let ( ** ) = fun x y - Times (x, y)
 let ( --) = fun x y - Minus (x, y)
 let f = function fl - Constant fl
 let x = Variable x
 let h = fun i - f i ** x -- f i ** f i ** x ++ (f 3.0 ++ f i)
 let e = List.fold_left (fun t i - Plus (t, h i)) (f 0.0) [1.0; 2.0; 3.0;
 4.0; 5.0]
 normalForm e

         Diego Olivier



-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] How to simplify an arithmetic expression ?

2011-10-02 Thread Ernesto Posse
 On Sun, Oct 2, 2011 at 1:51 PM, Diego Olivier Fernandez Pons
 dofp.oc...@gmail.com wrote:
     OCaml list,
 It's easy to encapsulate a couple of arithmetic simplifications into a
 function that applies them bottom up to an expression represented as a tree
 let rec simplify = function
     | Plus (e1, e2) -
         match (simplify e1, simplify e2) with
              | (Constant 0, _) - e2
 With a couple well known tricks like pushing constants to the left side and
 so on...
 How can I however guarantee that
     1. My final expression reaches a kind of minimal normal form
     2. My set of simplifications is optimal in the sense it doesn't traverse
 subtrees without need
 Here is my current simplifier and I have no way of telling if it really
 simplifies the expressions as much as possible and if it does useless passes
 or not
 type expr =
     | Constant of float
     | Plus of expr * expr
     | Minus of expr * expr
     | Times of expr * expr
     | Variable of string
 let rec normalForm = function
     | Minus (e1, e2) - normalForm (Plus (normalForm e1, Times (Constant
 (-1.0), normalForm e2)))
     | Plus (e1, e2) -
         begin
         match (normalForm e1, normalForm e2) with
             | (Constant a, Constant b) - Constant (a +. b)
             | (Constant 0.0, e) - normalForm e
             | (e, Constant b) - normalForm (Plus (Constant b, normalForm
 e))
             | (Constant a, Plus (Constant b, e)) - Plus (Constant (a +. b),
 normalForm e)
             | (Plus (Constant a, e1), Plus (Constant b, e2)) - Plus
 (Constant (a +. b), normalForm (Plus (normalForm e1, normalForm e2)))
             | (Variable a, Variable b) when a = b - Times (Constant 2.0,
 Variable a)
             | (Times (Constant n, Variable b), Variable a) when a = b -
 Times (Constant (n +. 1.0), Variable a)
             | (Variable a, Times (Constant n, Variable b)) when a = b -
 Times (Constant (n +. 1.0), Variable a)
             | (Times (Constant n, Variable a), Times (Constant m, Variable
 b)) when a = b - Times (Constant (n +. m), Variable a)
             | other - Plus (e1, e2)
         end
     | Times (e1, e2) -
         begin
         match (normalForm e1, normalForm e2) with
             | (Constant a, Constant b) - Constant (a *. b)
             | (Constant 0.0, e) - Constant 0.0
             | (Constant 1.0, e) - e
             | (e, Constant b) - normalForm (Times (Constant b, normalForm
 e))
             | (Constant a, Times (Constant b, e)) - Times (Constant (a *.
 b), e)
             | other - Times (e1, e2)
          end
     | x - x
 let (++) = fun x y - Plus (x, y)
 let ( ** ) = fun x y - Times (x, y)
 let ( --) = fun x y - Minus (x, y)
 let f = function fl - Constant fl
 let x = Variable x
 let h = fun i - f i ** x -- f i ** f i ** x ++ (f 3.0 ++ f i)
 let e = List.fold_left (fun t i - Plus (t, h i)) (f 0.0) [1.0; 2.0; 3.0;
 4.0; 5.0]
 normalForm e

         Diego Olivier



On Sun, Oct 2, 2011 at 10:08 AM, Gabriel Scherer
gabriel.sche...@gmail.com wrote:
 In my experience, the OCaml code doing recursive call and pattern
 matching is a relatively bad way to reason about such rewrite systems.

Why? In general, reduction in (pure) functional languages is
rewriting. Diego's function does not seem to have side-effects, so why
would this be a bad way?

As I understand it, this style is very much one of the reasons why ML
was designed this way. If we are talking about optimization, then yes,
there may be better ways of doing this, but if we are talking about
correctness, readability, and reasoning, then I don't see why this
style would be considered bad.

 Your questions are extremely pertinent, and relatively difficult to
 answer in general.

 For a start, I think your code indeed repeats useless traversals. This
 can be seen syntactically by the nesting of two normalForm calls, such
 as:

  | (e, Constant b) - normalForm (Plus (Constant b, normalForm e))

 You reduce e to a normal form, then repeat the reduction on some
 expression containing e. The outer call will surely re-traverse (the
 normal form of) e, which is useless here.

Another problem might be that the rewriting rules do not seem to be
exhaustive although I haven't checked very carefully.


 One approach I like for such simplifications is the normalization by
 evaluation approach. The idea is to define a different representation
 of normal forms of your system as semantic values; I mean a
 representation that has a meaning in itself and not just what's left
 after this arbitrary transformation; in your case, that could be
 multivariate polynomials (defined as an independent datatype). Then
 you express your normalization algorithm as an evaluation of your
 expression into semantic values; you can reify them back into the
 expression datatype, and if you did everything right you get normal
 forms (in particular, normalizing a reified value will return exactly
 this value). The main difficulty is to understand what are the normal
 forms 

Re: [Caml-list] How to simplify an arithmetic expression ?

2011-10-02 Thread Xavier Leroy
On Sun, Oct 2, 2011 at 10:08 AM, Gabriel Scherer wrote:
 One approach I like for such simplifications is the normalization by
 evaluation approach.

NBE is neat, but I'm skeptical that it will work out of the box here:
if you apply NBE to a standard evaluator for arithmetic expressions,
it's not going to take advantage of associativity and distributivity
the way Diego wants.

On 10/02/2011 05:09 PM, Ernesto Posse wrote:
 In general, whenever you have an algebraic
 structure with normal forms, normal forms can be obtained by
 equational reasoning: using the algebra's laws as rewriting rules. 

Yes, writing down a system of equations is the first thing to do.
But, to obtain a normalization procedure, you need to orient those
rules and complete them (in the sense of Knuth-Bendix completion) with
extra rules to derive a confluent, terminating rewriting system.

Here is a good, down-to-earth introduction to Knuth-Bendix completion:

A.J.J. Dick, An Introduction to Knuth-Bendix Completion
http://comjnl.oxfordjournals.org/content/34/1/2.full.pdf

And here is a solid textbook on rewriting systems:

Franz Baader and Tobias Nipkow. Term Rewriting and All That.
http://www4.in.tum.de/~nipkow/TRaAT/

Hope this helps,

- Xavier Leroy

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] How to simplify an arithmetic expression ?

2011-10-02 Thread Gabriel Scherer
 Below is a quick tentative implementation of NbE, on a slightly
 restricted expression type (I removed the not-so-interesting Minus
 nodes).

Sorry, I forgot to give a small example of what the implementation
does. Really the obvious thing, but it may not be so obvious just
looking at the code.
Here is an example of (X*2+Y)*(3+(Y*2+Z)) being 'rewritten' to
(1*Y)*Z+((2*Y)*Y+(3*Y+((2*X)*Z+((4*X)*Y+6*X.
In a more readable way, it computes that (2X+Y)*(3+2Y+Z) equals
YZ+2Y²+3Y+2XZ+4XY+6X.
(The ordering of the result monomials is arbitrary, as implemented by
Pervasives.compare, but a more human-natural order could be
implemented.)

  # test1;;
  - : expr = Plus (Times (Variable X, Constant 2.), Variable Y)
  # test2;;
  - : expr =
  Plus (Constant 3., Plus (Times (Variable Y, Constant 2.), Variable Z))
  # show (eval test1);;
  - : (Poly.key * float) list = [([Y], 1.); ([X], 2.)]
  # show (eval test2);;
  - : (Poly.key * float) list = [([Z], 1.); ([Y], 2.); ([], 3.)]
  # show (eval (Times(test1,test2)));;
  - : (Poly.key * float) list =
  [([Y; Z], 1.); ([Y; Y], 2.); ([Y], 3.); ([X; Z], 2.);
   ([X; Y], 4.); ([X], 6.)]
  # reify (eval (Times(test1,test2)));;
  - : expr =
  Plus (Times (Times (Constant 1., Variable Y), Variable Z),
   Plus (Times (Times (Constant 2., Variable Y), Variable Y),
Plus (Times (Constant 3., Variable Y),
 Plus (Times (Times (Constant 2., Variable X), Variable Z),
  Plus (Times (Times (Constant 4., Variable X), Variable Y),
   Times (Constant 6., Variable X))


On Sun, Oct 2, 2011 at 6:48 PM, Gabriel Scherer
gabriel.sche...@gmail.com wrote:
 On Sun, Oct 2, 2011 at 6:32 PM, Xavier Leroy xavier.le...@inria.fr wrote:
 NBE is neat, but I'm skeptical that it will work out of the box here:
 if you apply NBE to a standard evaluator for arithmetic expressions,
 it's not going to take advantage of associativity and distributivity
 the way Diego wants.

 My idea was to use a semantic domain which is a quotient over those
 associativity and distributivity laws. If you choose a canonical
 representation of multivariate polynomials (sums of product of some
 variables and a coefficient) and compute on them, you get
 associativity and distributivity for free. But indeed, the rewriting
 that happens implicitly may not implement the exact same rules Diego
 had in mind. In particular, canonical polynomial representations may
 be much bigger than the input expression, due to applying
 distributivity systematically.

 Not all rewrite systems are suitable for NbE. Most reasonable semantic
 domains probably induce very strong rewrite rules, or none at all. For
 the middle ground, finding a suitable semantic domain is probably just
 as hard as completing the rewrite system as you suggest.

 On 10/02/2011 05:09 PM, Ernesto Posse wrote:
 If we are talking about optimization, then yes,
 there may be better ways of doing this, but if we are talking about
 correctness, readability, and reasoning, then I don't see why this
 style would be considered bad.

 Optimization is important here. By calling the deep-recursive
 transformation twice in a case, you get an exponential algorithm which
 can be so slow and memory-hungry that impracticality borders
 incorrectness.


 On 10/02/2011 05:09 PM, Ernesto Posse wrote:
 So in principle at least, shouldn't Diego's problem be solvable this way,
 without the need for a special semantic domain for normal forms? When
 would the normalization by evaluation approach be preferable? Can you
 show a small example?

 Yes, implementing the rewrite system directly is possible and probably
 a more precise way to get a result (in particular if you already know
 the rewrite rules you wish to have, but not the semantic domain their
 normal forms correspond to). I'm not sure it's simpler.

 Below is a quick tentative implementation of NbE, on a slightly
 restricted expression type (I removed the not-so-interesting Minus
 nodes).
 You can normalize an expression `e` with `reify (eval e)`.
 `show (eval e)` is a representation whose toplevel printing is more
 redable, which helps testing.

  type var = string
  type expr =
      | Constant of float
      | Plus of expr * expr
      | Times of expr * expr
      | Variable of var

  (* multivariate polynomials: maps from multiset of variables to coefficients
     2*X²*Y + 3*X + 1 = {[X,X,Y]↦2, [X]↦3, ∅↦1}
  *)
  module MultiVar = struct
    (* multisets naively implemented as sorted lists *)
    type t = var list
    let compare = Pervasives.compare
  end
  module Poly = Map.Make(MultiVar)
  type value = float Poly.t

  let sort vars = List.sort String.compare vars

  let constant x = Poly.singleton [] x
  let variable v = Poly.singleton [v] 1.

  (* BatOption.default *)
  let default d = function
    | None - d
    | Some x - x

  let plus p1 p2 =
    let add_opt _vars c1 c2 =
      Some (default 0. c1 +. default 0. c2) in
    Poly.merge add_opt p1 p2

  let times p1 p2 = (* naive implementation *)
    let