Alberto Ruiz <[EMAIL PROTECTED]> writes: > My experience is that Haskell allocation time is very fast and usually > negligible in most non trivial matrix computations. > > A good thing about > > sum v = constant 1 (dim v) <.> v > > is that a constant vector is efficiently created internally (not from > an intermediate Haskell list), and the inner product will be computed > by the possibly optimized blas version available in your machine. > > You can also write simple definitions like the next one for the > average of the rows of a matrix as a simple vector-matrix product: > > mean m = w <> m > where w = constant k r > r = rows m > k = 1 / fromIntegral r > > I prefer high level "index free" matrix operations to C-like > code. Definitions are clearer, have less bugs, and are also more > efficient if you use optimized numerical libs. > > In any case, many algorithms can be solved by the recursive approach > described by Tomas.
After reading don's blog, I tried to make a test with both methods. The code is very simple, as following module Main where import System import Numeric.LinearAlgebra vsum1 :: Vector Double -> Double vsum1 v = constant 1 (dim v) <.> v vsum2 :: Vector Double -> Double vsum2 v = go (d - 1) 0 where d = dim v go :: Int -> Double -> Double go 0 s = s + (v @> 0) go !j !s = go (j - 1) (s + (v @> j)) mean :: Vector Double -> Double mean v = vsum v / fromIntegral (dim v) where vsum = vsum1 main :: IO () main = do fn:nrow:ncol:_ <- getArgs print . mean . flatten =<< fromFile fn (read nrow, read ncol) Compile it with "-O2 -optc-O2", and run with a data set of length 5000000. The results are with "vsum1": 80,077,984 bytes allocated in the heap 2,208 bytes copied during GC (scavenged) 64 bytes copied during GC (not scavenged) 40,894,464 bytes maximum residency (2 sample(s)) %GC time 0.0% (0.0% elapsed) Alloc rate 35,235,448 bytes per MUT second ./vsum1 huge 5000000 1 2.25s user 0.09s system 99% cpu 2.348 total This is reasonable, exactly two copies of vector with size of 40MB. with "vsum2": 560,743,120 bytes allocated in the heap 19,160 bytes copied during GC (scavenged) 15,920 bytes copied during GC (not scavenged) 40,919,040 bytes maximum residency (2 sample(s)) %GC time 0.3% (0.3% elapsed) Alloc rate 222,110,261 bytes per MUT second ./mean2 huge 5000000 1 2.53s user 0.06s system 99% cpu 2.598 total This is strange. A lot of extra usage of heap? Probably because '@>' is not efficient? So it looks like the inner-product approach wins with a fairly margin. -- c/* __o/* <\ * (__ */\ < _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe