Here is my solution that solves the problem optimally. Observe that
given a sequence we can do one of two things: (1) represent it as a
repetition or (2) represent it as a concatenation of repetitions. If
we had two functions to calculate all the ways of doing (1) and (2) we
could solve the problem as follows:

(defn represent [s] (apply min-key cost (concat (reps s) (concats
s))))

That is, we pick out the minimum cost way of representing s as a
repetition or as a concatenation of repetitions, where reps returns
all repetitions and concats returns all concatenations.

The cost function to count the number of symbols is:

(defn cost [repr] (apply + (map #(count (% 1)) repr)))

Of course you can use another cost function.

The function reps can be defined as follows:

(defn reps [s]
  (map vector (filter #(= s (apply concat (repeat (first %) (second
%))))
                      (map #(vector (/ (count s) %) (take % s)) (range 1 (+ 1 
(count
s)))))))

For example, reps on ababab gives: [3 ab] and [1 ababab]. Note that
this algorithm is very naive: it tries all repetitions [n (take n s)]
and eliminates invalid ones. The code may look involved but the
algorithm is rather simplistic.

The function concats is as follows:

(defn concats [s]
  (map #(apply concat (map represent (split-at % s))) (range 1 (-
(count s) 1))))

It works like this. We split s at all possible boundaries: ababab => a
babab, ab abab, aba bab, abab ab, ababa b. Then we recursively
calculate the optimal representation of each of the pairs, and
concatenate the representations. This gives us all ways to represent s
as a concatenation of optimal representations.

That is all it takes :)

(represent '(a b b a b b a)) => ([1 (a)] [2 (b b a)])

Unfortunately the algorithm is exponential time. We can fix this by
memoizing represent:

(memoize represent)

Now the algorithm is polynomial time. It is still rather slow for
large sequences (but there is a lot of room for optimization), and
gives stack overflows. You can fix this by eliminating the recursion
and filling the memoization table iteratively.

Jules

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Reply via email to