Am Montag 19 April 2010 01:03:14 schrieb Arnoldo Muller: > Hello all: > > I want to generate some hamming distance statistics about a set of > strings. As explained in another e-mail in this list, I used the > following code to call the > functions: > (exampl holds the list of strings of size w) > filter (\x -> x /= 0) $ map (uncurry hammingX) [(xs, ys) | xs <- exampl, > ys <- exampl] > > I have two hamming functions: > -- hamming distance for variable length strings > hamming :: String -> String -> Int > hamming x y = hamming' x y 0 > where > hamming' [] _ !c = c > hamming' _ [] !c = c > hamming' (x:xs) (y:ys) !c > | x == y = hamming' xs ys c > | otherwise = hamming' xs ys (c + 1) > > -- function posted in this mailing list > hamming2 :: String -> String -> Int > hamming2 xs ys = length (filter not (zipWith (==) xs ys)) > > I am executing these functions millions of times and the bottleneck of > my program is in them as explained by running in profiling mode with > +RTS -K400M -p -RTS > > The costlier function is the hamming distance > COST CENTRE MODULE %time %alloc > > hamming Distances 66.6 41.9 > > It says that it is performing 41% of the allocations. In the case of > hamming2 the allocations go as far as 52%.
Allocations are cheap, so that's not necessarily a problem. More important is, what's the maximum residency and how much is copied during GC? Are you compiling with -O2 ? > I could understand that > there are allocations in "hamming2" because we are creating pairs, but > in the case of "hamming" there should be no allocation. Why not? I don't know how GHC counts allocations, but everytime you go from (x:xs) to xs, you need a new pointer to the tail. If that counts as allocation, hamming must allocate a lot, too. > > How can I execute my hamming functions without allocating memory? > > Best regards, > > Arnoldo Muller _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe