Steffen Mazanek wrote: > apfelmus wrote >> The key point of the dynamic programming algorithm is indeed to memoize >> the results gs i j for all pairs of i and j. In other words, the insight >> that yields a fast algorithm is that for solving the subproblems gs i j >> (of which there are n^2), solution to smaller subproblems of the same >> form are sufficient. Tabulation is a must. > > I underst and this, however I thought it would be possible to use the > automatic collapsing of the termgraphs in some way.
Well, some things have to be left to the programmer :), especially the choice of trading space for time. Note that there are very systematic and natural ways to derive dynamic programming algorithms in functional languages. In a sense, much of the work of R. Bird centers this topic. The book "Algebra of Programming" http://web.comlab.ox.ac.uk/oucl/research/pdt/ap/pubs.html#Bird-deMoor96:Algebra is one of the cornerstones. The systematic derivation of dynamic programming algorithms has been rediscovered in a more direct but less general fashion http://bibiserv.techfak.uni-bielefeld.de/adp/ > Of course, you can still choose how to represent the table. There's a >> nice higher order way to do that >> >> tabulate :: (Int -> Int -> a) -> (Int -> Int -> a) >> >> gs = tabulate gs' >> where >> gs' 1 j = ... uses gs x y for some x y ... >> gs' i j = ... ditto ... > > > Thank you for this explanation. Your approach is not very concise either > but it does not pollute the algorithm so much. Oh? It's concise enough for me :) The nice thing about an explicit 'tabulate' is that you can separate the table and the entry calculations completely. > That would be strange. I mean, no gs i j may have more than two >> elements, namely S or A. The other key point of the CYK algorithm is >> that the sets gs i j are indeed sets and may only contain as many >> elements as there are nonterminals. > > > .... You are right, of course. I have tried a nub before the list > comprehension however this is evaluated too late. Yes, the nub has to eliminate storing a nonterminal "for every k". But this can be done in advance by noting that we're only interested in whether there exists at least one k [nt | ..., not $ null [k | k<-[1..i-1], a `elem` gs k j, b `elem` gs (i-k) (j+k)]] and don't want to emit a nonterminal for which k [nt | ..., k <- [1..i-1], a `elem` gs k j, b `elem` gs (i-k) (j+k)] > I should really use sets, however, I would miss the list comprehension > syntactic sugar sooo much. Is there something similar for real Data.Set? Not that I knew of, but you can always use fromList [nt | ...] Note that the Data.Set can be thought of as being part of the tabulation. In effect, you really deal with a 3-dimensional truth table gs i j nt = substring starting at j of length i can be derived by nonterminal nt Here's an implementation of the tabulation with Data.Set tabulate :: (Int -> Int -> NT -> Bool) -> (Int -> Int -> NT -> Bool) tabulate gs' = \i j nt -> nt `member` table ! (i,j) where table = array bnds [(ij, mkSet $ gs' i j) | ij@(i,j) <- range bnds] mkSet chi = fromList [nt | nt <- nts, chi nt] And here's the memoized function gs = tabulate gs' where gs' 1 j nt = any [True | Left t <- productions nt, w !! (j-1) == t] gs' i j nt = any [True | Right (a,b) <- productions nt, gs k j a, gs (i-k) (j+k) b] It assumes a function (dependent on the grammer at hand) productions :: NT -> [Either T (NT, NT)] that returns the terminal and nonterminal productions for a given nonterminal. Regards, apfelmus _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe