Re: Laziness (was: [Haskell-cafe] Performance problem with random numbers)
On Sun, Oct 14, 2007 at 11:54:54PM +0200, ntupel wrote: On Sat, 2007-10-13 at 09:56 -0400, Brandon S. Allbery KF8NH wrote: Now you need to start forcing things; given laziness, things tend to only get forced when in IO, which leads to time being accounted to the routine where the forcing happened. If random / randomR are invoked with large unevaluated thunks, their forcing will generally be attributed to them, not to functions within the thunks. (Yes, this means profiling lazy programs is a bit of a black art.) After more testing I finally realized how right you are. It appears that my problem is not related to random/randomR but only to laziness. I came up with a test that doesn't use random numbers at all and still needs about 2.5 seconds to complete (it is really just meaningless computations): Here's a modified version of your code that prints out a real result, by using sum rather than seq to force the computation: module Main where main :: IO () main = do let n = 100 :: Int print $ sum (take n $ test 1 [1,2..]) test :: Int - [Int] - [Int] test t g = let (n, g') = next t g in n:test t g' next :: Int - [Int] - (Int, [Int]) next x (y:ys) = let n = func y in if n = 0.5 then (x, ys) else (0, ys) where func x = fromIntegral x / (10 ^ len x) where len 0 = 0 len n = 1 + len (n `div` 10) On my computer this takes 4 seconds to run. I can speed it up by an order of magnitude by writing code that is friendlier to the compiler: module Main where main :: IO () main = do let n = 100 :: Int print $ sum (take n $ test 1 [1,2..]) test :: Int - [Int] - [Int] test t g = map f g where f :: Int - Int f y = if func y = 0.5 then t else 0 func :: Int - Double func x = fromIntegral x / mypow x mypow 0 = 1 mypow n = 10*(mypow (n `div` 10)) Switching to map and simplifying the structure gained me 30% or so, but the big improvement came from the elimination of the use of (^) by writing mypow (ill-named). I have no idea if this example will help your actual code, but it illustrates that at least in this example, it's pretty easy to gain an order of magnitude in speed. (That func is a weird function, by the way.) Incidentally, implementing the same program in C, I get: #include stdio.h int test(int, int); double func(int); int mypow(int); int mypow(int n) { double result = 1; while (n0) { result *= 10; n /= 10; } return result; } double func(int x) { return x / (double) mypow(x); } int test(int t, int y) { if (func(y) = 0.5) { return t; } else { return 0; } } int main() { int i; int sum = 0; for (i=0;i100;i++) { sum += test(1,i); } printf(sum is %d\n, sum); return 0; } Which runs more than 10 times faster than my Haskell version, so there's obviously still a lot of room for optimization. :( Incidentally, a version written in C that uses pow for the 10^(len n) runs in only half the time of my haskell version (five time the time of the C version I give)--confirming that pow is indeed a very expensive operation (as I already knew) and that if you call the pow function it *ought* to dominate your timing. But we've also still clearly got some seriously painful loop overhead. :( -- David Roundy Department of Physics Oregon State University ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Laziness (was: [Haskell-cafe] Performance problem with random numbers)
On Mon, 2007-10-15 at 10:48 -0400, David Roundy wrote: I have no idea if this example will help your actual code, but it illustrates that at least in this example, it's pretty easy to gain an order of magnitude in speed. (That func is a weird function, by the way.) Thanks for your reply David, Unfortunately my original problem was that System.Random.{random, randomR} is used instead of all these weird test functions that I came up with during experimentation. And I can't force anything inside StdGen so I see no way of speeding up the original program sans replacing the random number generator itself. When I did that I became about 4 times faster than with System.Random but still an order of magnitude slower than for instance by using the Java implementation (and I can confirm that (^) is *very* expensive in this context). Many thanks again, Thoralf ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Laziness (was: [Haskell-cafe] Performance problem with random numbers)
On Sat, 2007-10-13 at 09:56 -0400, Brandon S. Allbery KF8NH wrote: Now you need to start forcing things; given laziness, things tend to only get forced when in IO, which leads to time being accounted to the routine where the forcing happened. If random / randomR are invoked with large unevaluated thunks, their forcing will generally be attributed to them, not to functions within the thunks. (Yes, this means profiling lazy programs is a bit of a black art.) After more testing I finally realized how right you are. It appears that my problem is not related to random/randomR but only to laziness. I came up with a test that doesn't use random numbers at all and still needs about 2.5 seconds to complete (it is really just meaningless computations): module Main where import Data.List main :: IO () main = do let n = 100 :: Int print $ foldl' (\x y - seq y x) 0 (take n $ test 1 [1,2..]) test :: Int - [Int] - [Int] test t g = let (n, g') = next t g in n:test t g' next :: Int - [Int] - (Int, [Int]) next x (y:ys) = let n = func y in if n = 0.5 then (x, ys) else (0, ys) where func x = fromIntegral x / (10 ^ len x) where len 0 = 0 len n = 1 + len (n `div` 10) Now my problem still is, that I don't know how to speed things up. I tried putting seq and $! at various places with no apparent improvement. Maybe I need to find a different data structure for my random module and lazy lists are simply not working well enough here? Thanks, Thoralf ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Laziness (was: [Haskell-cafe] Performance problem with random numbers)
On Oct 14, 2007, at 17:54 , ntupel wrote: Now my problem still is, that I don't know how to speed things up. I tried putting seq and $! at various places with no apparent improvement. Maybe I need to find a different data structure for my random module and lazy lists are simply not working well enough here? Unfortunately I'm not so good at that myself. Even more unfortunately, my understanding is that randomly using seq and/or $! not only usually doesn't help, but can actually make things slower; and to do it right, you need to refer to the simplified Core Haskell code generated by GHC. And understanding *that* requires rather more familiarity with Core than I have. :/ -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] [EMAIL PROTECTED] system administrator [openafs,heimdal,too many hats] [EMAIL PROTECTED] electrical and computer engineering, carnegie mellon universityKF8NH ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Laziness (was: [Haskell-cafe] Performance problem with random numbers)
On Sun, 2007-10-14 at 18:14 -0400, Brandon S. Allbery KF8NH wrote: On Oct 14, 2007, at 17:54 , ntupel wrote: Now my problem still is, that I don't know how to speed things up. I tried putting seq and $! at various places with no apparent improvement. Maybe I need to find a different data structure for my random module and lazy lists are simply not working well enough here? Unfortunately I'm not so good at that myself. Even more unfortunately, my understanding is that randomly using seq and/or $! not only usually doesn't help, but can actually make things slower; and to do it right, you need to refer to the simplified Core Haskell code generated by GHC. And understanding *that* requires rather more familiarity with Core than I have. :/ A lot of times just unfolding a few evaluations by hand (perhaps mentally) will point out issues readily and readily suggest there solution. After a while you will know what kinds of things are problematic and not write such code to begin with. Unfortunately, this is not something widely and well understood and is not part of almost any of the available educational material for Haskell. Programming in a lazy language is more different than programming in an eager one than almost any resource states. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe