Re: [Haskell-cafe] Optimization of recursion
On 28-Sep-2004, John Goerzen [EMAIL PROTECTED] wrote:
> As I'm investigating Haskell, it's occurred to me that most of the Haskell
> tutorials out there have omitted something that was quite prominent in the
> OCaml material I had read: making functions properly tail-recursive. The
> OCaml compiler was able to optimize tail-recursive functions such that
> they could be compiled using a loop. This would achieve two main benefits:
> performance due to not needing to allocate/deallocate stack frames, and
> the ability to work on very large lists since each recursive call did not
> consume extra stack space. So I have several questions about Haskell:
>
> 1. Do Haskell interpreters/compilers perform similar optimizations?

Yes. However, laziness has a major effect on space usage and on the
applicability and importance of tail call optimization.

> 4. Is a reference available documenting which Prelude/standard functions
> are properly tail-recursive (and thus suitable for very large lists) and
> which are not?

No. But the source code is available...

If analyzing the performance and space usage of your programs is important,
then Haskell may not be the best choice of language.

--
Fergus J. Henderson      | I have always known that the pursuit
Galois Connections, Inc. | of excellence is a lethal habit
Phone: +1 503 626 6616   |    -- the last words of T. S. Garp.

___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Optimization of recursion
Fergus Henderson [EMAIL PROTECTED] held forth:
> On 28-Sep-2004, John Goerzen [EMAIL PROTECTED] wrote:
> > As I'm investigating Haskell, it's occurred to me that most of the
> > Haskell tutorials out there have omitted something that was quite
> > prominent in the OCaml material I had read: making functions properly
> > tail-recursive. The OCaml compiler was able to optimize tail-recursive
> > functions such that they could be compiled using a loop. This would
> > achieve two main benefits: performance due to not needing to
> > allocate/deallocate stack frames, and the ability to work on very large
> > lists since each recursive call did not consume extra stack space.
>
> snip
>
> If analyzing the performance and space usage of your programs is
> important, then Haskell may not be the best choice of language.

I disagree. If performance and space usage are sufficiently important,
Haskell may not be best. But it's really not that much easier to analyze
performance or space usage under strict languages. In either case, the
golden rule is: profile. Reading the source code will tell you very little.

Jon Cast

p.s. Of course, I know that reading the source code tells you very little
about strict languages and nothing about lazy ones. But in both cases it is
overwhelmingly dominated by profiling.
[Haskell-cafe] Optimization of recursion
Hello,

As I'm investigating Haskell, it's occurred to me that most of the Haskell
tutorials out there have omitted something that was quite prominent in the
OCaml material I had read: making functions properly tail-recursive. The
OCaml compiler was able to optimize tail-recursive functions such that they
could be compiled using a loop. This would achieve two main benefits:
performance due to not needing to allocate/deallocate stack frames, and the
ability to work on very large lists since each recursive call did not
consume extra stack space.

So I have several questions about Haskell:

1. Do Haskell interpreters/compilers perform similar optimizations?

2. If the answer to 1 is No, then am I correct in saying that my ability to
   handle very large lists in recursive functions is limited by the maximum
   stack size on my system?

3. Does it make a difference for optimization purposes whether a particular
   recursive function is tail-recursive?

4. Is a reference available documenting which Prelude/standard functions
   are properly tail-recursive (and thus suitable for very large lists) and
   which are not? (OCaml system docs generally specify when a function is
   not tail-recursive... for instance, one of foldr/foldl is and the other
   isn't.)

5. Are there any examples/patterns anywhere that illustrate standard
   tail-recursion practice in Haskell?

Thanks!
Re: [Haskell-cafe] Optimization of recursion
John Goerzen [EMAIL PROTECTED] writes:
> The OCaml compiler was able to optimize tail-recursive functions such
> that they could be compiled using a loop. This would achieve two main
> benefits: performance due to not needing to allocate/deallocate stack
> frames, and the ability to work on very large lists since each recursive
> call did not consume extra stack space. So I have several questions about
> Haskell:
>
> 1. Do Haskell interpreters/compilers perform similar optimizations?

Yes.

> 3. Does it make a difference for optimization purposes whether a
> particular recursive function is tail-recursive?

Similarly to OCaml. Except that a non-tail-recursive function doesn't
necessarily use a large amount of stack, because of laziness. For example,
map doesn't process the whole list at once; it immediately returns a cons
cell which, when its tail is evaluated, causes map to proceed further.

> 4. Is a reference available documenting which Prelude/standard functions
> are properly tail-recursive (and thus suitable for very large lists) and
> which are not?

I don't think so. It should be safe to assume that functions have
time/space properties like their sources in the Haskell report, but I'm not
sure if there are exceptions. But as I said, non-tail recursion is not
necessarily a problem. Most functions are suitable for large lists even if
they are not tail-recursive; when they use O(N) memory, it's on the heap
rather than on the stack, and it's unavoidable.

> (OCaml system docs generally specify when a function is not
> tail-recursive... for instance, one of foldr/foldl is and the other
> isn't.)

This is a tricky issue. In OCaml and similar languages foldl runs in
constant stack space, assuming the function given uses constant stack
space, while foldr must traverse the whole list before it begins any work,
and it uses O(N) stack space.

In Haskell foldr is efficient if the given function is lazy in its second
argument, or if it enters its second argument in *its* tail position, and
O(N) if it does something after evaluating its second argument. And it's
mixed if the function behaves differently at different times, depending on
its first argument for example.

OTOH foldl is tail-recursive, but the loop only builds a thunk of size
O(N), which then uses O(N) of stack if the folded function does some
computation after entering its first argument, or better otherwise. *But*
if the compiler is able to inline it and statically determine that the
folded function is always strict (GHC usually does this but other
implementations don't), it can transform the code accordingly so that it
doesn't use O(N) stack.

Some implementations also provide foldl', which is a strict variant of
foldl: it artificially makes the folded function strict (by using seq) and
has this use of seq inlined, so it will always be as efficient as the
folded function permits, but for weird functions it may evaluate something
that foldl would not evaluate at all. I wish foldl' were in standard
Haskell, because it's usually the same as foldl but doesn't rely on
sophisticated compiler optimizations to be efficient.

--
 __("<    Marcin Kowalczyk
 \__/     [EMAIL PROTECTED]
  ^^      http://qrnik.knm.org.pl/~qrczak/
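[Editorial sketch: the foldl' behaviour Marcin describes can be illustrated
with a hand-rolled strict fold. foldlStrict below is my illustrative name,
not a standard function; Data.List.foldl' is the library version, and the
real implementation may differ in details.]

```haskell
import Data.List (foldl')

-- A hand-rolled strict left fold: force the new accumulator with seq
-- before the tail call, so no chain of (+) thunks builds up.
foldlStrict :: (a -> b -> a) -> a -> [b] -> a
foldlStrict _ z []     = z
foldlStrict f z (x:xs) = let z' = f z x
                         in z' `seq` foldlStrict f z' xs

main :: IO ()
main = do
  -- Plain foldl would build the thunk ( ... ((0+1)+2) + ... ) first;
  -- both strict versions keep the accumulator evaluated as they go.
  print (foldl'      (+) 0 [1..100000 :: Integer])
  print (foldlStrict (+) 0 [1..100000 :: Integer])
```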
Re: [Haskell-cafe] Optimization of recursion
hello,

John Goerzen wrote:
> As I'm investigating Haskell, it's occurred to me that most of the
> Haskell tutorials out there have omitted something that was quite
> prominent in the OCaml material I had read: making functions properly
> tail-recursive.
>
> snip
>
> 1. Do Haskell interpreters/compilers perform similar optimizations?

yes they do. most functional language compilers do this sort of thing.

> 2. If the answer to 1 is No, then am I correct in saying that my ability
> to handle very large lists in recursive functions is limited by the
> maximum stack size on my system?

if the answer were no, perhaps that would be the case, but with laziness
things get weird. my experience is that it is somehow a lot easier for me
to run out of space when writing ML programs (this could be because i
probably think in Haskell). the reason is that in ML sometimes things get
evaluated when in Haskell you would simply allocate a closure on the heap.
OTOH there are a few classical cases in Haskell where you run out of space
even if you have a tail-recursive function, e.g.:

  sum n []     = n
  sum n (y:ys) = sum (n + y) ys

the reason for this is that the n+y expression never gets evaluated until
the very end, so you create many suspensions in memory.

> 3. Does it make a difference for optimization purposes whether a
> particular recursive function is tail-recursive?

yes, in a tail-recursive call you can reuse your stack frame, so the call
becomes simply a jump (i.e. you get a loop).

> 4. Is a reference available documenting which Prelude/standard functions
> are properly tail-recursive (and thus suitable for very large lists) and
> which are not? (OCaml system docs generally specify when a function is
> not tail-recursive... for instance, one of foldr/foldl is and the other
> isn't.)

i am not sure if there is such a thing, but one can often guess. for the
record, foldl is the tail-recursive one (it captures the pattern of
traversing a list with an accumulator) while foldr is not.

> 5. Are there any examples/patterns anywhere that illustrate standard
> tail-recursion practice in Haskell?

i'd imagine these would be much the same as in o'caml. a common functional
programming pattern when you get tail recursion is to rewrite a function
using an accumulator (i.e. a bit of local state).

hope this helps
-iavor
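[Editorial sketch: the accumulator rewrite mentioned above, shown for list
length. The names lenDirect/lenAcc are illustrative, not standard; the seq
forces the accumulator so it doesn't pile up as suspensions, as the message
warns.]

```haskell
-- Direct recursion: the pending (1 +) work is not in tail position,
-- so each call leaves a stack frame behind.
lenDirect :: [a] -> Int
lenDirect []     = 0
lenDirect (_:xs) = 1 + lenDirect xs

-- Accumulator version: the recursive call is in tail position, and seq
-- keeps the running count evaluated instead of building n+1 thunks.
lenAcc :: [a] -> Int
lenAcc = go 0
  where
    go n []     = n
    go n (_:xs) = let n' = n + 1 in n' `seq` go n' xs

main :: IO ()
main = print (lenDirect [1..1000 :: Int], lenAcc [1..1000 :: Int])
```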
Re: [Haskell-cafe] Optimization of recursion
On Tuesday 28 September 2004 01:54 am, Iavor S. Diatchki wrote:
> OTOH there are a few classical cases in haskell where you run out of
> space even if you have a tail recursive function: e.g.
>
>   sum n []     = n
>   sum n (y:ys) = sum (n + y) ys
>
> the reason for this is that the n+y expression never gets evaluated until
> the very end, so you create many suspensions in memory.

Eep. So that seems like being stuck between a rock and a hard place.
(Thanks for mentioning it, btw.) If I instead wrote:

  sum []     = 0
  sum (x:xs) = x + sum(xs)

then I have the same problem. What is the proper way to solve this little
problem then? Would foldl suffer from the same issue?

> > 5. Are there any examples/patterns anywhere that illustrate standard
> > tail-recursion practice in Haskell?
>
> i'd imagine these would be much the same as in o'caml. a common
> functional programming pattern when you get tail recursion is to rewrite
> a function using an accumulator (i.e. a bit of local state).

Yup, I have done just that in OCaml. I find it more tasteful to hide the
accumulator from the caller. To write the sum, I might do this (forgive me
if this is not valid Haskell, I'm still new to this):

  sum x = let sum2 n []     = n in
          let sum2 n (y:ys) = sum (n + y) ys in
          sum2 0 x

Or, in OCaml:

  let sum x =
    let rec sum2 n y =
      match y with
        [] -> n
      | x::xs -> sum2 (n + x) xs
    in sum2 0 x;;
Re: [Haskell-cafe] Optimization of recursion
John Goerzen [EMAIL PROTECTED] writes:
> If I instead wrote:
>
>   sum []     = 0
>   sum (x:xs) = x + sum(xs)
>
> then I have the same problem. What is the proper way to solve this little
> problem then?

  sum n []     = n
  sum n (x:xs) = (sum $! n + x) xs

It's unfortunate that it requires $! or seq, but it's the only *reliable*
way (not relying on optimizing compilers). That's why I would prefer to
have foldl' in the standard. It's too easy to forget about it. It should be
encouraged in these accumulator loop patterns.

> Yup, I have done just that in OCaml. I find it more tasteful to hide the
> accumulator from the caller. To write the sum, I might do this (forgive
> me if this is not valid Haskell, I'm still new to this):
>
>   sum x = let sum2 n []     = n in
>           let sum2 n (y:ys) = sum (n + y) ys in
>           sum2 0 x

The first in and the second let should be removed.

For my language Kogut I invented a loop syntax which would look like this
if it were available in Haskell (loop and again are keywords):

  sum l = loop (l, 0) of
            (x:xs, n) -> again (xs, x + n)
            ([],   n) -> n

or maybe like this (unfortunately it would be inconsistent with case, which
uses a single expression only):

  sum l = loop l 0 of
            (x:xs) n -> again xs (x + n)
            []     n -> n

This suffers from the same laziness problem as foldl; $! or seq should be
used.

--
 __("<    Marcin Kowalczyk
 \__/     [EMAIL PROTECTED]
  ^^      http://qrnik.knm.org.pl/~qrczak/
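[Editorial sketch: the hypothetical loop/again form corresponds to an
ordinary local worker function in plain Haskell. go is my illustrative name
for the worker; $! forces the accumulator on each step as the message
recommends, and the accumulator stays hidden from the caller as John
wanted.]

```haskell
-- sum with a hidden, strictly-evaluated accumulator: the recursive call
-- is in tail position, and $! evaluates n + x before the call is made.
sum' :: [Integer] -> Integer
sum' l = go l 0
  where
    go []     n = n
    go (x:xs) n = go xs $! n + x

main :: IO ()
main = print (sum' [1..1000000])
```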
Re: [Haskell-cafe] Optimization of recursion
On Tuesday 28 Sep 2004 8:17 pm, John Goerzen wrote:
> On Tuesday 28 September 2004 01:54 am, Iavor S. Diatchki wrote:
> > OTOH there are a few classical cases in haskell where you run out of
> > space even if you have a tail recursive function: e.g.
> >
> >   sum n []     = n
> >   sum n (y:ys) = sum (n + y) ys
> >
> > the reason for this is that the n+y expression never gets evaluated
> > until the very end, so you create many suspensions in memory.
>
> Eep. So that seems like being stuck between a rock and a hard place.
> (Thanks for mentioning it, btw)

You can fix this kind of problem with seq:

  sum n []     = n
  sum n (y:ys) = let n' = n + y in n' `seq` sum n' ys

But AFAIK this should be unnecessary in this case with ghc (and others
probably), because sum is strict in its first argument and the strictness
analyser should detect this and avoid pointless laziness. You probably need
to compile with -O to get this to happen, though.

But in cases where the function isn't strict (or you're not sure, or you're
not sure about the compiler's ability to infer strictness), you can get rid
of unwanted laziness for sure using seq. Using $! is an alternative.

Regards
--
Adrian Hey
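[Editorial sketch: the seq-based fix above as a complete runnable program.
sumSeq is my illustrative name; the seq guarantees the accumulator is
evaluated on every step, whether or not the compiler's strictness analyser
would have inferred it, e.g. when compiling without -O.]

```haskell
-- Tail-recursive sum whose accumulator is forced with seq before each
-- recursive call, so no O(N) chain of (+) suspensions is built.
sumSeq :: Integer -> [Integer] -> Integer
sumSeq n []     = n
sumSeq n (y:ys) = let n' = n + y in n' `seq` sumSeq n' ys

main :: IO ()
main = print (sumSeq 0 [1..1000000])
```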