Re: [Haskell-cafe] Optimization of recursion

2004-10-04 Thread Fergus Henderson
On 28-Sep-2004, John Goerzen [EMAIL PROTECTED] wrote:
 As I'm investigating Haskell, it's occurred to me that most of the
 Haskell tutorials out there have omitted something that was quite
 prominent in the OCaml material I had read: making functions properly
 tail-recursive.
 
 The OCaml compiler was able to optimize tail-recursive functions such
 that they could be compiled using a loop.  This would achieve two main
 benefits: performance due to not needing to allocate/deallocate stack
 frames, and the ability to work on very large lists since each recursive
 call did not consume extra stack space.
 
 So I have several questions about Haskell:
 
 1. Do Haskell interpreters/compilers perform similar optimizations?

Yes.  However, laziness has a major effect on space usage and on
the applicability and importance of tail call optimization.

 4. Is a reference available documenting which Prelude/standard functions
 are properly tail-recursive (and thus suitable for very large lists) and
 which are not?

No.  But the source code is available...

If analyzing the performance and space usage of your programs is important,
then Haskell may not be the best choice of language.

-- 
Fergus J. Henderson |  I have always known that the pursuit
Galois Connections, Inc.|  of excellence is a lethal habit
Phone: +1 503 626 6616  | -- the last words of T. S. Garp.


Re: [Haskell-cafe] Optimization of recursion

2004-10-04 Thread Jon Cast
Fergus Henderson [EMAIL PROTECTED] held forth:
 On 28-Sep-2004, John Goerzen [EMAIL PROTECTED] wrote:
  As I'm investigating Haskell, it's occurred to me that most of the
  Haskell tutorials out there have omitted something that was quite
  prominent in the OCaml material I had read: making functions properly
  tail-recursive.
  
  The OCaml compiler was able to optimize tail-recursive functions such
  that they could be compiled using a loop.  This would achieve two main
  benefits: performance due to not needing to allocate/deallocate stack
  frames, and the ability to work on very large lists since each recursive
  call did not consume extra stack space.

snip

 If analyzing the performance and space usage of your programs is
 important, then Haskell may not be the best choice of language.

I disagree.  If performance and space usage are sufficiently important,
Haskell may not be best.  But it's really not that much easier to
analyze performance or space usage under strict languages.  In either
case, the golden rule is: profile.  Reading the source code will tell
you very little.

Jon Cast

p.s.  Of course, I know that reading the source code tells you very
little about strict languages and nothing about lazy ones.  But in both
cases it is overwhelmingly dominated by profiling.


[Haskell-cafe] Optimization of recursion

2004-09-28 Thread John Goerzen
Hello,

As I'm investigating Haskell, it's occurred to me that most of the
Haskell tutorials out there have omitted something that was quite
prominent in the OCaml material I had read: making functions properly
tail-recursive.

The OCaml compiler was able to optimize tail-recursive functions such
that they could be compiled using a loop.  This would achieve two main
benefits: performance due to not needing to allocate/deallocate stack
frames, and the ability to work on very large lists since each recursive
call did not consume extra stack space.

So I have several questions about Haskell:

1. Do Haskell interpreters/compilers perform similar optimizations?

2. If the answer to 1 is No, then am I correct in saying that my ability
to handle very large lists in recursive functions is limited by the
maximum stack size on my system?

3. Does it make a difference for optimization purposes whether a
particular recursive function is tail-recursive?

4. Is a reference available documenting which Prelude/standard functions
are properly tail-recursive (and thus suitable for very large lists) and
which are not?  (OCaml system docs generally specify when a function is
not tail-recursive... for instance, one of foldr/foldl is and the other
isn't.)

5. Are there any examples/patterns anywhere that illustrate standard
tail-recursion practice in Haskell?

Thanks!




Re: [Haskell-cafe] Optimization of recursion

2004-09-28 Thread Marcin 'Qrczak' Kowalczyk
John Goerzen [EMAIL PROTECTED] writes:

 The OCaml compiler was able to optimize tail-recursive functions such
 that they could be compiled using a loop.  This would achieve two main
 benefits: performance due to not needing to allocate/deallocate stack
 frames, and the ability to work on very large lists since each recursive
 call did not consume extra stack space.

 So I have several questions about Haskell:

 1. Do Haskell interpreters/compilers perform similar optimizations?

Yes.

 3. Does it make a difference for optimization purposes whether a
 particular recursive function is tail-recursive?

Similarly to OCaml. Except that a non-tail-recursive function doesn't
necessarily use a large amount of stack, because of laziness. For
example map doesn't process the whole list at once; it immediately
returns a cons cell which, when its tail is evaluated, causes map
to proceed further.
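For instance (my illustration, not from the original message), map works
even on an infinite list, because it returns a cons cell before recursing
and the consumer decides how far to force it:

    take 3 (map (+ 1) [1 ..])   -- [2,3,4], in constant stack and heap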

 4. Is a reference available documenting which Prelude/standard functions
 are properly tail-recursive (and thus suitable for very large lists) and
 which are not?

I don't think so. It should be safe to assume that functions have
time/space properties like their sources in the Haskell report,
but I'm not sure if there are exceptions.

But as I said, non-tail recursion is not necessarily a problem. Most
functions are suitable for large lists even if they are not tail
recursive; when they use O(N) memory, it's on the heap rather than
on the stack and it's unavoidable.

 (OCaml system docs generally specify when a function is not
 tail-recursive... for instance, one of foldr/foldl is and the other
 isn't.)

This is a tricky issue. In OCaml and similar languages foldl runs in
constant stack space, assuming the function given uses a constant
stack space, while foldr must traverse the whole list before it begins
any work and it uses O(N) stack space.

In Haskell foldr is efficient if the given function is lazy in its
second argument, or if it enters its second argument in *its* tail
position, and O(N) if it does something after evaluation of its second
argument. And it's mixed if the function behaves differently at
different times, depending on its first argument for example.
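Concretely (my examples, not Marcin's): with (:), which is lazy in its
second argument, foldr streams and runs in constant stack; with (+),
which needs both arguments, it must recurse to the end of the list first:

    take 3 (foldr (:) [] [1 ..])         -- fine, even on an infinite list
    foldr (+) 0 [1 .. 1000000 :: Int]    -- uses O(N) stack before any addition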

OTOH foldl is tail-recursive, so the loop itself runs in constant stack,
but it builds up a thunk of size O(N); evaluating that thunk then uses
O(N) stack if the folded function does some computation after entering
its first argument, and less otherwise.
*But* if the compiler is able to inline it and statically determine
that the folded function is always strict (GHC usually does this but
other implementations don't), it can transform the code accordingly
such that it doesn't use O(N) stack.

Some implementations also provide foldl', which is a strict variant of
foldl: it artificially makes the folded function strict (by using seq)
and has this seq use inlined, so it will always be as efficient as
the folded function permits, but for weird functions it may evaluate
something that foldl would not evaluate at all. I wish foldl' was in
standard Haskell, because it's usually the same as foldl but doesn't
rely on sophisticated compiler optimizations to be efficient.
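For reference, a sketch of foldl' (my transcription; at the time it was
available from Data.List in some implementations rather than from the
Report):

    foldl'            :: (a -> b -> a) -> a -> [b] -> a
    foldl' f z []     = z
    foldl' f z (x:xs) = let z' = f z x in z' `seq` foldl' f z' xs

    -- foldl' (+) 0 [1 .. 1000000] runs in constant stack,
    -- while foldl (+) 0 [1 .. 1000000] may overflow it.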

-- 
   __("<     Marcin Kowalczyk
   \__/   [EMAIL PROTECTED]
^^ http://qrnik.knm.org.pl/~qrczak/


Re: [Haskell-cafe] Optimization of recursion

2004-09-28 Thread Iavor S. Diatchki
hello,
John Goerzen wrote:
Hello,
As I'm investigating Haskell, it's occurred to me that most of the
Haskell tutorials out there have omitted something that was quite
prominent in the OCaml material I had read: making functions properly
tail-recursive.
The OCaml compiler was able to optimize tail-recursive functions such
that they could be compiled using a loop.  This would achieve two main
benefits: performance due to not needing to allocate/deallocate stack
frames, and the ability to work on very large lists since each recursive
call did not consume extra stack space.
So I have several questions about Haskell:
1. Do Haskell interpreters/compilers perform similar optimizations?
 

yes they do.  most functional language compilers do this sort of thing.

2. If the answer to 1 is No, then am I correct in saying that my ability
to handle very large lists in recursive functions is limited by the
maximum stack size on my system?
 

if the answer was no, perhaps that would be the case, but with laziness
things get weird.
my experience is that it is somehow a lot easier for me to run out of
space when writing ML programs (this could be because i probably think
in Haskell).  the reason is that in ML sometimes things get evaluated
eagerly, where in haskell you would simply allocate a closure on the heap.
OTOH there are a few classical cases in haskell where you run out of
space even if you have a tail recursive function, e.g.:

sum n [] = n
sum n (y:ys) = sum (n + y) ys

the reason for this is that the n+y expression never gets evaluated until
the very end, so you create many suspensions in memory.
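to make the suspensions visible (my illustration, not from the original
mail), this is roughly how the accumulator grows:

sum 0 [1,2,3]
  = sum (0 + 1) [2,3]
  = sum ((0 + 1) + 2) [3]
  = sum (((0 + 1) + 2) + 3) []
  = ((0 + 1) + 2) + 3     -- only now is the chain of (+) thunks evaluated

for a long list that unevaluated chain is what exhausts memory.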

3. Does it make a difference for optimization purposes whether a
particular recursive function is tail-recursive?
 

yes, in a tail recursive call you can reuse your stack frame, so the
call becomes simply a jump (i.e. you get a loop).
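a small sketch of mine (not from iavor's mail): a tail recursive length
with an accumulator, which a compiler can run as a loop because the
recursive call is the last thing the function does.  (as noted above, in
haskell you also want to force the accumulator, e.g. with seq, so the
(acc + 1) thunks don't pile up.)

len :: Int -> [a] -> Int
len acc []     = acc
len acc (_:xs) = let acc' = acc + 1 in acc' `seq` len acc' xs

-- len 0 [1 .. 1000000]  ==  1000000, in constant stack space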

4. Is a reference available documenting which Prelude/standard functions
are properly tail-recursive (and thus suitable for very large lists) and
which are not?  (OCaml system docs generally specify when a function is
not tail-recursive... for instance, one of foldr/foldl is and the other
isn't.)
 

i am not sure if there is such a thing, but one can often guess.
for the record, foldl is the tail recursive one (it captures the pattern
of traversing a list with an accumulator) while foldr is not.
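for reference (my transcription of the usual definitions, essentially as
in the haskell report):

foldl :: (a -> b -> a) -> a -> [b] -> a
foldl f z []     = z
foldl f z (x:xs) = foldl f (f z x) xs      -- the recursive call is a tail call

foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f z []     = z
foldr f z (x:xs) = f x (foldr f z xs)      -- work (applying f) remains after the call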

5. Are there any examples/patterns anywhere that illustrate standard
tail-recursion practice in Haskell?
 

i'd imagine these would be much the same as in o'caml.
a common functional programming pattern to get tail recursion is
to rewrite a function using an accumulator (i.e. a bit of local state).

hope this helps
-iavor





Re: [Haskell-cafe] Optimization of recursion

2004-09-28 Thread John Goerzen
On Tuesday 28 September 2004 01:54 am, Iavor S. Diatchki wrote:
 OTOH there are a few classical cases in haskell where you run out of
 space even if you have a tail
 recursive function: e.g.
 sum n [] = n
 sum n (y:ys) = sum (n + y) ys
 the reason for this that the n+y expression never gets evaluated
 until the very end,
 so you create many suspensions in memory.

Eep.  So that seems like being stuck between a rock and a hard place.  
(Thanks for mentioning it, btw)

If I instead wrote:

sum [] = 0
sum (x:xs) = x + sum(xs)

then I have the same problem.

What is the proper way to solve this little problem then?

Would foldl suffer from the same issue?

 5. Are there any examples/patterns anywhere that illustrate standard
 tail-recursion practice in Haskell?

 i'd imagine these would be much the same as in o'caml.
 a common functional programming pattern when you get tail recursion
 is to rewrite a function using an accumulator (i.e. a bit of local
 state).

Yup, I have done just that in OCaml.  I find it more tasteful to hide 
the accumulator from the caller.  To write the sum, I might do this 
(forgive me if this is not valid Haskell, I'm still new to this)

sum x =
let sum2 n [] = n in
let sum2 n (y:ys) = sum (n + y) ys in
sum2 0 x

Or, in OCaml:

let sum x =
  let rec sum2 n y = match y with
 [] -> n
  |  x::xs -> sum2 (n + x) xs in
  sum2 0 x;;



Re: [Haskell-cafe] Optimization of recursion

2004-09-28 Thread Marcin 'Qrczak' Kowalczyk
John Goerzen [EMAIL PROTECTED] writes:

 If I instead wrote:

 sum [] = 0
 sum (x:xs) = x + sum(xs)

 then I have the same problem.

 What is the proper way to solve this little problem then?

sum n [] = n
sum n (x:xs) = (sum $! n + x) xs

It's unfortunate that it requires $! or seq, but it's the only
*reliable* way (not relying on optimizing compilers).

That's why I would prefer to have foldl' in the standard. It's too
easy to forget about it. It should be encouraged in these accumulator
loop patterns.

 Yup, I have done just that in OCaml.  I find it more tasteful to hide 
 the accumulator from the caller.  To write the sum, I might do this 
 (forgive me if this is not valid Haskell, I'm still new to this)

 sum x =
 let sum2 n [] = n in
 let sum2 n (y:ys) = sum (n + y) ys in
 sum2 0 x

The first in and second let should be removed.
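Spelled out, the corrected definition would be something like this (my
sketch of the suggested fix; note the inner recursive call should also be
to sum2 rather than sum):

sum x =
    let sum2 n []     = n
        sum2 n (y:ys) = sum2 (n + y) ys
    in  sum2 0 x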

For my language Kogut I invented a loop syntax which would look like
this if it was available in Haskell (loop and again are keywords):

sum l = loop (l, 0) of
   (x:xs, n) -> again (xs, x + n)
   ([],   n) -> n

or maybe like this (unfortunately it would be inconsistent with case
which uses a single expression only):

sum l = loop l 0 of
   (x:xs) n -> again xs (x + n)
   [] n -> n

This suffers from the same laziness problem as foldl; $! or seq should
be used.

-- 
   __("<     Marcin Kowalczyk
   \__/   [EMAIL PROTECTED]
^^ http://qrnik.knm.org.pl/~qrczak/


Re: [Haskell-cafe] Optimization of recursion

2004-09-28 Thread Adrian Hey
On Tuesday 28 Sep 2004 8:17 pm, John Goerzen wrote:
 On Tuesday 28 September 2004 01:54 am, Iavor S. Diatchki wrote:
  OTOH there are a few classical cases in haskell where you run out of
  space even if you have a tail
  recursive function: e.g.
  sum n [] = n
  sum n (y:ys) = sum (n + y) ys
  the reason for this that the n+y expression never gets evaluated
  until the very end,
  so you create many suspensions in memory.

 Eep.  So that seems like being stuck between a rock and a hard place.
 (Thanks for mentioning it, btw)

You can fix this kind of problem with seq
sum n [] = n
sum n (y:ys) = let n' = n + y in n' `seq` sum n' ys

But AFAIK this should be unnecessary in this case with ghc (and others
probably) because sum is strict in its first argument and the strictness
analyser should detect this and avoid pointless laziness. You probably
need to compile with -O to get this to happen though.

But in cases where the function isn't strict (or you're not sure,
or you're not sure about the compiler's ability to infer strictness)
you can get rid of unwanted laziness for sure using seq.
Using $! is an alternative.

Regards
--
Adrian Hey 
