Hi, I'm experimenting with various implementations of finite automata. I think the following is interesting, but gives rise to a question, which might actually be silly. (Forgive me if it is.)
First, I can define a state as a recursive data type: data FaState l = FaState {label :: l, acceptQ :: Bool, trans :: [[FaState l]]} where l is the type of an arbitrary label. (This will become important a bit later on.) The trans is the adjacency list for state automaton: trans!!i is a list of all states reachable via the symbol i from the current state. (I use a list rather than an array because I typically work with binary alphabets, so the use of Arrays doesn't seem justified.) I make trans!!i a list so that I can represent nondeterminism. Then a finite automaton is merely a list of starting FaStates: type FinAu l = [FaState l] So, for example, an automaton equivalent to the regular expression 0*(0(0+1)*)* could be defined as: dom18 :: FinAu Int dom18 = [one,two] where one = FaState 1 True [[two],[]] two = FaState 2 True [[one],[one]] So, with this machinery in place we can define most of the classical constructions quite easily. For example, to check whether a given string is accepted by an automaton: finAuAcceptStringQ :: FinAu l -> [Int] -> Bool finAuAcceptStringQ startStates string = or [faStateAcceptStringQ s string | s <- startStates] where faStateAcceptStringQ (FaState _ acceptQ _) [] = acceptQ faStateAcceptStringQ (FaState _ _ trans) (a:as) | null followerStates = False | otherwise = or [ faStateAcceptStringQ s as | s <- followerStates ] where followerStates = if a > (length trans) - 1 then [] else trans!!a and the classical nfa-to-dfa power-set (determinization) construction may be implemented quite nicely with: finAuDeterminize :: (Ord l) => FinAu l -> FinAu [l] finAuDeterminize startStates = [f startStates] where f :: (Ord l) => [FaState l] -> FaState [l] f states = FaState label' acceptQ' trans' where label' = sort (map label states) acceptQ' = or (map acceptQ states) trans' = map (singleton.f.concat) (transpose (map trans states)) singleton a = [a] Now, we get to the heart of my question. This definition is quite lovely, recursive and lazy, but it doesn't actually capture the recurrent structure of the resulting automata. That is, if we were to compute: finAuAcceptStringQ (finAuDeterminize dom18) somestring then finAuDeterminize would keep spinning out, generating more and more states, which would actually be equivalent (and, have identical state labels). So an idea is that we can assume the state labels induce an equivalence relation on our states and define something like: finAuConsolidate :: (Eq l) => FinAu l -> FinAu l finAuConsolidate startStates = [ newStates!!i | l <- map label startStates, i <- elemIndices l (map fst assocs) ] where assocs = finAuStateLabelAssoc startStates newStates = map (f.snd) assocs f oldState@(FaState l acceptQ trans) = FaState l acceptQ (map (map (g.label)) trans) g label = case elemIndex label (map fst assocs) of Just i -> newStates!!i Nothing -> error "odd error in finAuConsolidate" provided we have defined: finAuStateLabelAssoc :: (Eq l) => FinAu l -> [(l,FaState l)] finAuStateLabelAssoc startStates = foldl f [] startStates where f knownStatesAssoc state@(FaState label _ trans) | isJust (lookup label knownStatesAssoc) = knownStatesAssoc | otherwise = foldl f ((label,state):knownStatesAssoc) followers where followers = concat trans So this is the heart of my question: is this the best way to proceed? Will the list lookups (!!) of finAuConsolidate be used only once in the construction or will they be perpetually used in something like: finAuAcceptStringQ ((finAuConsolidate.finAuDeterminize) dom18) somestring? That is, how much of a difference would it make if I were to use an Array instead? Thanks, Carl _______________________________________________ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell