Dear all,

I have implemented a small module to generate random items with a given
probability distribution using the alias approach [1] and unfortunately
compared to similar implementations in C++ or Java it is about 10 times
slower. I have to confess that I am still in the early stages of writing
Haskell programs so there is hopefully something I can do about it and I
hope some helpful soul on this list can give me a hint. 

I have attached my implementation and a small testing routine which runs
in about 5 seconds on my machine (when compiled with -O2) whereas my
Java-Implementation finishes in about 0.48 seconds. Profiling indicates
that the time is mostly spend in System.Random.random and
System.Random.randomR so I wonder if these are slow or what else might
cause the relative slowness of my implementation.

Many thanks for your responses,
Thoralf

PS: I would also appreciate any feedback about the module from a design
perspective. I bet I miss lots of good Haskell idioms.

----------------
[1] The alias methods moves probability mass of a non-uniform
probability distribution around to create a uniform distribution. Lets
say you have three items "a", "b", and "c" with distribution [0.2, 0.1,
0.7] then a uniform distribution would assign 1/3 to each so "a" and "b"
need to be "filled up" with exactly the same probability mass that "c"
has too much. Then 2 uniform random numbers are generated. The first one
to pick an item and the second one to pick either the item itself if the
value is in the original part or the alias otherwise. A much better
explanation can be found on the web somewhere. Anyway it should not
matter with regards to the performance problem I have.
module Main where

import Random
import System.Random
import Data.Array
import Data.List


main = do g <- getStdGen
          let k = ["a", "b", "c"]
              n = 1000000 :: Int
              t = setup k [0.2, 0.1, 0.7] :: (Array Int String, Array Int String, Array Int Double)
          print $ foldl' (\a b -> if b == "b" then a + 1 else a) 0 (take n $ randomList t g) / fromIntegral n

module Random (setup, next, randomList) where

import System.Random hiding (next)
import Data.Array.IArray


-- Given a list of items and a list of their propabilities generate a tripel
-- consisting of the values vector, the alias vector and the relative propabilities
-- vector which is used in applications of next, etc.
setup :: (Ord a, IArray a2 a, IArray a1 e, Num a) => [e] -> [a] -> (a1 Int e, a1 Int e, a2 Int a)
setup xs ps = 
    let n  = length ps
        xv = listArray (0, n - 1) xs
        rv = listArray (0, n - 1) [fromIntegral n * p | p <- ps]
        (low, high) = splitAt 1 rv (indices rv) [] []
        (a, r) = calcAlias xv xv rv high low
    in
        (xv, a, r)
    where
        -- Return a pair of lists, the first consisting of elements lower than
        -- given threshold and the second with elements greater than threshold,
        -- equal elements are ignored.
        splitAt t v []     l h = (l, h)
        splitAt t v (i:is) l h = case v!i of
            x | x < t     -> splitAt t v is (i:l) h
              | x > t     -> splitAt t v is l (i:h)
              | otherwise -> splitAt t v is l h


-- Given an list of highs and a list of lows, calculate the alias vector and the relative
-- propabilities vector.
calcAlias :: (Ord e, Num e, IArray a e, Ix i, IArray a2 e1, IArray a1 e1) => a2 i e1 -> a1 i e1 -> a i e -> [i] -> [i] -> (a1 i e1, a i e)
calcAlias xv av rv []        _      = (av, rv)
calcAlias xv av rv _         []     = (av, rv)
calcAlias xv av rv hi@(h:hs) (l:ls) =
    let av' = av//[(l, xv!h)]
        rv' = rv//[(h, rv!h + rv!l - 1)]
    in
        if rv'!h >= 1
            then calcAlias xv av' rv' hi ls
            else calcAlias xv av' rv' hs (h:ls)
        

-- Generate a random item according to the given propability distribution as specified
-- by the given tripel which is the result of applying "setup" to a list of items and
-- a list of propabilities.
next :: (IArray a2 e1, IArray a e1, Ord e, IArray a1 e, RandomGen t, Random e) => (a Int e1, a2 Int e1, a1 Int e) -> t -> (e1, t)
next (xs, as, rs) g =
    let n = length $ indices xs
        (x1, g1) = randomR (0, n - 1) g
        (x2, g2) = random g1
        r = rs!x1
    in
        if x2 <= r 
            then (xs!x1, g2) 
            else (as!x1, g2)


-- Generate a infinite list of random items according to the specified propability
-- distribution as given by the triple that results from applying setup to a pair
-- of a list of items and a list of propabilities (see "setup" for details).
randomList :: (Random e, RandomGen t1, IArray a2 e, Ord e, IArray a t, IArray a1 t) => (a Int t, a1 Int t, a2 Int e) -> t1 -> [t]
randomList t g = 
    let (n, g') = next t g 
    in 
        n:randomList t g'

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply via email to