Re: Running out of memory in a simple monad
Simon Marlow [EMAIL PROTECTED] writes: Also, GHCi retains CAFs in the same way as Hugs, the difference is that GHCi can be configured to throw away the results after evaluation (:set +r). If I set this flag, does GHCi discard CAFs during evaluation or at the end of evaluation? Or, to put it another way, do classic examples like module Main(main,primes) where main = print primes primes = ... leak space? -- Alastair ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
RE: Running out of memory in a simple monad
Simon Marlow [EMAIL PROTECTED] writes: Also, GHCi retains CAFs in the same way as Hugs, the difference is that GHCi can be configured to throw away the results after evaluation (:set +r). If I set this flag, does GHCi discard CAFs during evaluation or at the end of evaluation? Or, to put it another way, do classic examples like module Main(main,primes) where main = print primes primes = ... leak space? Actually CAF reverting only applies to CAFs in compiled code at the moment, and it happens after evaluation, not during it. The primes example will leak in GHCi (not in plain GHC, though). I guess the right thing to do is to revert them as soon as they're found to be unreferenced, but that's hard - the compiler's symbol table refers to all the top-level bindings, so it would probably have to use weak pointers. The CAF reverting feature was added mainly so that you can recover from doing hGetContents on stdout. Cheers, Simon ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
RE: Running out of memory in a simple monad
Alastair Reid writes: The workaround is simple enough: add a dummy argument to the CAF (so that it is not a CAF any more): main _ = loop 5 and then specify the extra argument when invoking it: main () (This is a pretty standard optimisation technique: we're trading time to recompute a result for the space taken to store the result. Coming from other languages where actions (i.e., monadic computations) are not first class values, this is a bit surprising but, from a Haskell perspective, it is completely uniform.) Careful: this isn't guaranteed to turn a CAF into a function. In particular, GHC will optimise away this trick when optimisation is turned on (and perhaps even when it isn't). The point is that adding dummy arguments isn't really a technique that should be relied upon. You might well argue that there ought to be a way to control the operational behaviour of the program w.r.t. CAFs (and in fact sharing in general), and I'd be inclined to agree. Also, GHCi retains CAFs in the same way as Hugs, the difference is that GHCi can be configured to throw away the results after evaluation (:set +r). Cheers, Simon ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Re: Running out of memory in a simple monad
David Bergman [EMAIL PROTECTED] writes: Note: In an unoptimized scenario, such as with Hugs, you do indeed run out of memory in your loop (after some 4 iterations) not having the recursion in the last call. Even loops not constructing cons cells do, such as loop 0 = return () loop n = do { print n; loop (n-1) } -- print n loop (n-1) main = loop 5 (you see Hugs die after some 25000 iterations...) I'm afraid your diagnosis is way off base here. The problem is nothing to do with a 'last call optimization' or with the do syntactic sugar and can be worked around not by changing how you _write_your code (which would obviously be a large burden) but by how you _call_ your code (a much smaller burden). The problem is to do with garbage collection of Constant Applicable Forms (or CAFs). CAFs are definitions like: nil = [] one = 1 :: Int primes = ... main = loop 5 GHC and Hugs differ in how they treat CAFs. Hugs treats CAFs as a special case and never garbage collects a CAF - the only way to discard its value is to unload the module that defines it. GHC treats CAFs the same as normal definitions and garbage collects them when they can no longer contribute to future execution. The difference between these behaviours can be seen when the CAF grows very large as in your example. The workaround is simple enough: add a dummy argument to the CAF (so that it is not a CAF any more): main _ = loop 5 and then specify the extra argument when invoking it: main () (This is a pretty standard optimisation technique: we're trading time to recompute a result for the space taken to store the result. Coming from other languages where actions (i.e., monadic computations) are not first class values, this is a bit surprising but, from a Haskell perspective, it is completely uniform.) Note that this workaround is necessary with GHC too if you have a large CAF which does not die. For example, if you wanted to benchmark your code, you might want to run it 10 times using: main1 = loop 5 main = sequence_ (replicate 10 main1) Now main1 will not be garbage collected until the last time it is executed. The solution is the same as for Hugs: add a dummy argument. -- Alastair Reid [EMAIL PROTECTED] Reid Consulting (UK) Limited http://www.reid-consulting-uk.ltd.uk/alastair/ ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
RE: Running out of memory in a simple monad
You are right, After writing that e-mail I looked at a lot of cases in Hugs, and also encountered this CAF problem. And, as I pointed out elsewhere, the last call optimisation is not very interesting in the lazy evaluation scenario... One problem, though, is that I would like not to get rid of the CAF, since I (presumably wrongly) assume that CAFs are implemented more efficiently in Hugs than normal definitions. Am I right in this assumption? Thanks, David -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Alastair Reid Sent: Monday, December 16, 2002 9:18 AM To: David Bergman Cc: 'Richard Uhtenwoldt'; [EMAIL PROTECTED] Subject: Re: Running out of memory in a simple monad David Bergman [EMAIL PROTECTED] writes: Note: In an unoptimized scenario, such as with Hugs, you do indeed run out of memory in your loop (after some 4 iterations) not having the recursion in the last call. Even loops not constructing cons cells do, such as loop 0 = return () loop n = do { print n; loop (n-1) } -- print n loop (n-1) main = loop 5 (you see Hugs die after some 25000 iterations...) I'm afraid your diagnosis is way off base here. The problem is nothing to do with a 'last call optimization' or with the do syntactic sugar and can be worked around not by changing how you _write_your code (which would obviously be a large burden) but by how you _call_ your code (a much smaller burden). The problem is to do with garbage collection of Constant Applicable Forms (or CAFs). CAFs are definitions like: nil = [] one = 1 :: Int primes = ... main = loop 5 GHC and Hugs differ in how they treat CAFs. Hugs treats CAFs as a special case and never garbage collects a CAF - the only way to discard its value is to unload the module that defines it. GHC treats CAFs the same as normal definitions and garbage collects them when they can no longer contribute to future execution. The difference between these behaviours can be seen when the CAF grows very large as in your example. The workaround is simple enough: add a dummy argument to the CAF (so that it is not a CAF any more): main _ = loop 5 and then specify the extra argument when invoking it: main () (This is a pretty standard optimisation technique: we're trading time to recompute a result for the space taken to store the result. Coming from other languages where actions (i.e., monadic computations) are not first class values, this is a bit surprising but, from a Haskell perspective, it is completely uniform.) Note that this workaround is necessary with GHC too if you have a large CAF which does not die. For example, if you wanted to benchmark your code, you might want to run it 10 times using: main1 = loop 5 main = sequence_ (replicate 10 main1) Now main1 will not be garbage collected until the last time it is executed. The solution is the same as for Hugs: add a dummy argument. -- Alastair Reid [EMAIL PROTECTED] Reid Consulting (UK) Limited http://www.reid-consulting-uk.ltd.uk/alastair/ ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Re: Running out of memory in a simple monad
David Bergman [EMAIL PROTECTED] writes: One problem, though, is that I would like not to get rid of the CAF, since I (presumably wrongly) assume that CAFs are implemented more efficiently in Hugs than normal definitions. Am I right in this assumption? There isn't much to choose between them in raw efficiency. A rough initial analysis is: CAFs are looked up in an array of all definitions - access is small constant time. non-CAFs are looked up in an environment which is potentially linear in size of environment but environments are very small and, perhaps more significantly, has been optimized so that it is usually small constant time. Adding a CAF to an environment would add one 'cell' (8 bytes) to heap consumption but, again, optimizations of environment building should mean that environments get built rarely and are shared effectively. [note that these are per-CAF costs] Much more significant than these raw costs are the issues of how many times an expression is evaluated and how much space it takes to store these results between evaluations. I'd expect typical costs to these to be 10-1000 times larger than the raw costs given above. I think that many of these questions can't readily be answered by existing techniques and that some are outside the scope of what we normally ask optimizers to do because they boil down to space-time tradeoffs and solving these requires a lot of contextual information about the application, the input data, desired response times, capability of target machine(s), acceptable impact on other applications running on same machine, etc. Anyone out there wanting a topic for their PhD? -- Alastair Reid [EMAIL PROTECTED] Reid Consulting (UK) Limited http://www.reid-consulting-uk.ltd.uk/alastair/ ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
RE: Running out of memory in a simple monad
So, Should I imply that the IO monad is pretty damned useless in Hugs then, since the loop does not run in constant space there? There are a lot of algorithms that cannot be run in constant space (due to either recursion depth or structure generation), even in the most optimized setting. This does not make them useless. /David -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Richard Uhtenwoldt Sent: Saturday, November 30, 2002 11:53 PM To: David Bergman Cc: [EMAIL PROTECTED] Subject: RE: Running out of memory in a simple monad David Bergman writes: It is easy to get the feeling that the recursive call in recurse = do f x recurse is the last one, but this is just an illusion of the iterative layout of do. Sometimes the monads lure us into old iterative thought patterns... Taking away the syntactic sugar, we end up with recurse = f x recurse This reveals the fact that there can be no last call optimization, because the last call is . What do you mean by this? Do you mean that that an implementation cannot execute arbitrarily many iterations/recursions of that last loop/definition in constant space? If that's what you mean, you are wrong. GHC does that. The IO monad would be pretty damned useless for actual work if implementations did not do that! Even if we replace the () with a (:) as the main operator/last call of the loop, the result can execute in constant space because of optimizations. E.g., the program loop n=n:loop (n+1) main = print (loop 0) executes in constant space when compile by GHC 5.02. Details: Specifically, I let it run till it printed 2,500,000 at which time top reported the RSS to be 1,500 with no increase having occurred in a long time. top's manpage says that RSS is the total amount of physical memory used by the task, in kilobytes. The statement about () is true when the () is in the IO monad. I did not check to see what happens in a user-defined monad. ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
RE: Running out of memory in a simple monad
David Bergman writes: Should I imply that the IO monad is pretty damned useless in Hugs then, since the loop does not run in constant space there? my statement was too broad. allow me to amend it. some are using Haskell for systems programming, as a better C than C. some, including me, would like to see more of that, with Haskell or another pure functional language with an IO monad taking systems programmers away from the C and C++ communities. Hugs is completely useless for *that*. for an example of Haskell as a better C than C, see Chak's Gtk+ bindings. to use them you must write your whole GUI in the IO monad in a style where the basic data structures and control structures closely resemble what you would write in C. see http://www.cse.unsw.edu.au/~chak/haskell/gtk/BoolEd.html and note how most of the functions are in the IO monad. many Haskellers have a negative opinion of such heavy use of the IO monad, but in systems programming you need more control over when (relative to other interactions) your program performs an interaction with a file, network or UI resource than is available in Haskell without the IO monad. -- Richard Uhtenwoldt It's a mammal thing; you wouldn't understand. ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Carried away in the monads? was: RE: Running out of memory in a simple monad
Richard wrote: some are using Haskell for systems programming, as a better C than C. some, including me, would like to see more of that, with Haskell or another pure functional language with an IO monad taking systems programmers away from the C and C++ communities. That is good, I would probably call myself a C++ developer primarily, but Haskell + the IO monad is a much better choice (even better than Perl...) in most situations. Hugs is completely useless for *that*. Yes, it is, although it is great for testing the validity of one's Haskell programs. for an example of Haskell as a better C than C, see Chak's Gtk+ bindings. to use them you must write your whole GUI in the IO monad in a style where the basic data structures and control structures closely resemble what you would write in C. I have seen it, and it is a bit imperative in its style. many Haskellers have a negative opinion of such heavy use of the IO monad, but in systems programming you need more control over when (relative to other interactions) your program performs an interaction with a file, network or UI resource than is available in Haskell without the IO monad. I must confess that I belong to that theoretical group, although I am a systems programmer (i.e., developing real systems for real customers, paying real money). In my opinion, the use of categorical monads in a programming language is a brilliant intellectual achievement, combining the pure and real in programming, at least on paper... This is where the Haskellers divide: one group (me included) consider the hidden state in monads as falling back to imperative thought patterns. In some cases, as in IO, we need to have a state, obviously, but it would be beneficial if we could divide the logic into state independent (defined outside the monads) and state dependent code (defined within the monads; such as IO). It is a bit like some C++ developers I meet, who consider themselves to develop in C++ just because they use const or (reference): functional programming is about choice of language and choice of programming style. /David ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
RE: Running out of memory in a simple monad
David Bergman writes: It is easy to get the feeling that the recursive call in recurse = do f x recurse is the last one, but this is just an illusion of the iterative layout of do. Sometimes the monads lure us into old iterative thought patterns... Taking away the syntactic sugar, we end up with recurse = f x recurse This reveals the fact that there can be no last call optimization, because the last call is . What do you mean by this? Do you mean that that an implementation cannot execute arbitrarily many iterations/recursions of that last loop/definition in constant space? If that's what you mean, you are wrong. GHC does that. The IO monad would be pretty damned useless for actual work if implementations did not do that! Even if we replace the () with a (:) as the main operator/last call of the loop, the result can execute in constant space because of optimizations. E.g., the program loop n=n:loop (n+1) main = print (loop 0) executes in constant space when compile by GHC 5.02. Details: Specifically, I let it run till it printed 2,500,000 at which time top reported the RSS to be 1,500 with no increase having occurred in a long time. top's manpage says that RSS is the total amount of physical memory used by the task, in kilobytes. The statement about () is true when the () is in the IO monad. I did not check to see what happens in a user-defined monad. ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
RE: Running out of memory in a simple monad
Richard, I assumed that the compiler had, through strictness analysis, used a call stack in the compiled code, instead of the usual call frames in the heap (equivalent to hey, I thought this was the Standard ML mail group?) ;-) Actually, you are right in that the last call optimization is not vital in most call-by-need scenarios, since both (the implicit) call stack and data structures are often consumed as generated, and the GC can reclaim the thunks. Note: In an unoptimized scenario, such as with Hugs, you do indeed run out of memory in your loop (after some 4 iterations) not having the recursion in the last call. Even loops not constructing cons cells do, such as loop 0 = return () loop n = do { print n; loop (n-1) } -- print n loop (n-1) main = loop 5 (you see Hugs die after some 25000 iterations...) Sorry about the over-simplification, but I wanted people to not forget that the do is just syntactic sugar... Thanks, David -Original Message- From: Richard Uhtenwoldt Sent: Saturday, November 30, 2002 11:53 PM To: David Bergman Cc: [EMAIL PROTECTED] Subject: RE: Running out of memory in a simple monad David Bergman writes: It is easy to get the feeling that the recursive call in recurse = do f x recurse is the last one, but this is just an illusion of the iterative layout of do. Sometimes the monads lure us into old iterative thought patterns... Taking away the syntactic sugar, we end up with recurse = f x recurse This reveals the fact that there can be no last call optimization, because the last call is . What do you mean by this? Do you mean that that an implementation cannot execute arbitrarily many iterations/recursions of that last loop/definition in constant space? If that's what you mean, you are wrong. GHC does that. The IO monad would be pretty damned useless for actual work if implementations did not do that! Even if we replace the () with a (:) as the main operator/last call of the loop, the result can execute in constant space because of optimizations. E.g., the program loop n=n:loop (n+1) main = print (loop 0) executes in constant space when compile by GHC 5.02. Details: Specifically, I let it run till it printed 2,500,000 at which time top reported the RSS to be 1,500 with no increase having occurred in a long time. top's manpage says that RSS is the total amount of physical memory used by the task, in kilobytes. The statement about () is true when the () is in the IO monad. I did not check to see what happens in a user-defined monad. ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Re: Running out of memory in a simple monad
Hello, I hope I understand what's going on; if not please someone correct me. I have problems with monads and memory. I have a monad through which I thread output. If I do the concatenation of the output-strings in one way Hugs runs out of memory, but if I do it in another way everything works. I can't see why the first way doesn't work but the second is OK. I woudl appreciate if someone could tell me what I am doing wrong. Here is the non-working monad: -} The problem is not directly connected to monads; what is the problem: [] ++ x = x (h:t) ++ x = h : (t++x), i.e. time complexity of ++ is proportional to the length of first list. first way: putCharM c = M $ \o - ((), o ++ [c]) -- Is this stupid in some way? this takes list (looong) of everything produced before this putCharM and concatenates c as last member; this takes time linear in the length of the list, summing over all putCharMs it is quadratic (and of course, due to laziness a lot of memory is consumed; seq does not help, as it only evaluates first cell of the list so that it sees it is not empty; deepSeq would solve this, but the time consumption would still stay long). the second way: M f = k = M $ let (x, o) = f M f2 = k x (x', o') = f2 in (x', o ++ o') this is done reverse way (because = is bracketed to the right); we concatenate output of f (short) with output of f2 (the rest of computation, i.e. looong); but the time is proportional to the length of first list, so it is constant in our case; summing it over all putCharMs, we get linear time and we are happy :-) If you want to do it the first way, define putCharM c = M $ \o - ((), c : o) and then reverse the list in runM. Zdenek Dvorak _ Add photos to your messages with MSN 8. Get 2 months FREE*. http://join.msn.com/?page=features/featuredemail ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
RE: Running out of memory in a simple monad
Just to add to what Zdenek wrote: The linear complexity of string concatenation in a naïve implementation (not having access to an extra-language end-of-list in the diff list sense...) make the total complexity O(n^2), since the number of conses generated is thus sum [1 .. n] which, obviously, is (1+n)*n/2. In the case of the n=5 in the example we get sum [1 .. 5] = 1250025000. So, well over one billion conses. This is why ++ is right associative :-) This time complexity cannot make Hugs crash, though, except for a defect GC, having problems tidying up after each round of ++. The space complexity, which reduces to maximum execution stack space (considering a proper GC) in the example, is what kills Hugs. The problem is that the string concatenation is not the last call, so there is no room for last call optimization. If you want to mimic the complexity of the example while calculating the number of conses required, try evaluating the isomorphic expression last $ scanl1 (+) $ take 5 (repeat 1) It might crash in Hugs, running out of execution stack space, for the same reason as the original example. It is the last that holds up the stack space, by the way. I hope this was helpful. Regarding do: It is easy to get the feeling that the recursive call in recurse = do f x recurse is the last one, but this is just an illusion of the iterative layout of do. Sometimes the monads lure us into old iterative thought patterns... Taking away the syntactic sugar, we end up with recurse = f x recurse This reveals the fact that there can be no last call optimization, because the last call is . Regards, David ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell