Re: [Haskell-cafe] Memoizing in parallel

Pierre-Etienne Meunier Sun, 08 Nov 2009 06:21:12 -0800

In fact, I could simply write a fibonacci function like this :


fibonacci n=
  fst $ fib n Map.empty
    where
      fib 0 m=(0,m)
      fib 1 m=(1,m)
      fib n m=
        case Map.lookup n m of
          Just x->(x,m)
          Nothing->
            let (n',m')=memoMap (n-2) m
                (n'',m'')=memoMap (n-1) m'
            in
             (n'+n'',Map.insert n (n'+n'') m'')

And I could do it even with the implementation of maps in Data.IntMap.I've tried your solution, and looked at your source code; There is asmall performance difference between your solution and this one :since you rely on laziness, and my structure is really sparse (aO(\sqrt(n)) is used in the best cases), it performs slightly betterthan an array, but this is still too slow : the tree does notrebalance itself, contrarily to what the trees in Data.IntMap do.

But my question is mainly a question about Haskell's parallelism :imagine I'm solving the problem of attributing m pairs of skis to nskiers, while trying to minimize the sum of the differences betweenthe size of the skiers and the size of their skis. With an array, I'dwrite :


import GHC.Arr
import Control.Parallel

n=1000 --skiers
m=1500 --skis

skis=listArray (1,m) [150..]
skiers=listArray (1,n) [139..]
costs=listArray ((1,1), (n,m)) $ map cost $ range ((1,1),(n,m))

cost (i,j)
  | j<i              = 1/0
  | i==1                = abs $ skis!j-skiers!i
  | otherwise   = let a=costs!(i,j-1)
                                 b=(abs $ skis!j-skiers!i)+costs!(i-1,j-1)
                           in
                                (a`par`b)`seq`(min a b)

main=print $ costs!(n,m)

I'd first like to know if the difference in execution time that I canmeasure when profiling really comes from thunks in the array beingexecuted in parallel, or from my imagination (the difference is quitesmall with these numbers of skis and skiers).Now imagine the same problem with at most 1% of the items used. Ifyour computer does not have infinite memory (or you don't want tospend 90% allocating an array in GHC's runtime), how do you writethat, with the memoization table shared, of course ?


Pierre-Etienne

El 08-nov-09, a las 14:08, Luke Palmer escribió:

On Sun, Nov 8, 2009 at 2:51 AM, Pierre-Etienne Meunier
<[email protected]> wrote:

Hi,
I'm designing an algorithm that uses dynamic programming. I'vewritten itwith an array, and it works, but it is still very slow and needsway too
much memory.
Then I realized that the array was very sparse (at most aO(\sqrt(n)) of itssize is actually used). Now I want to rewrite it with a Data.Map,but sinceI do not know a priori what the keys are, I need a mutable refsomewhere.


I don't know how you drew that conclusion.

In fact, no mutable ref is necessary.  Your keys are (or can be mapped
to) integers, since you used an array.  A solution is to use a trie of
integers.   You could, for example, store the values at the nodes of
an infinite tree that looks like:

             0
     1               2
 3       4       5       6
7 8    9  10   11 12   13  14
            ...

There are various implementations of this around.  For a quick
solution, though, you can try the data-memocombinators package:

import qualified Data.MemoCombinators as Memo

let f = Memo.integral go
  where
  go = ... f ...

See how that performs.  It has asymptotically better space performance
for sparse usage, but the devil can be in the constant factors.

Luke


_______________________________________________
Haskell-Cafe mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Memoizing in parallel

Reply via email to