Re: Running out of memory in a simple monad

2002-12-18 Thread Alastair Reid

Simon Marlow [EMAIL PROTECTED] writes:
 Also, GHCi retains CAFs in the same way as Hugs, the difference is
 that GHCi can be configured to throw away the results after
 evaluation (:set +r).

If I set this flag, does GHCi discard CAFs during evaluation or at the
end of evaluation?  Or, to put it another way, do classic examples like

  module Main(main,primes) where

  main = print primes
  primes = ...

leak space?

--
Alastair

___
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell



RE: Running out of memory in a simple monad

2002-12-18 Thread Simon Marlow
 Simon Marlow [EMAIL PROTECTED] writes:
  Also, GHCi retains CAFs in the same way as Hugs, the difference is
  that GHCi can be configured to throw away the results after
  evaluation (:set +r).
 
 If I set this flag, does GHCi discard CAFs during evaluation or at the
 end of evaluation?  Or, to put it another way, do classic 
 examples like
 
   module Main(main,primes) where
 
   main = print primes
   primes = ...
 
 leak space?

Actually CAF reverting only applies to CAFs in compiled code at the
moment, and it happens after evaluation, not during it.  The primes
example will leak in GHCi (not in plain GHC, though).

I guess the right thing to do is to revert them as soon as they're found
to be unreferenced, but that's hard - the compiler's symbol table refers
to all the top-level bindings, so it would probably have to use weak
pointers.

The CAF reverting feature was added mainly so that you can recover from
doing hGetContents on stdout.

Cheers,
Simon
___
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell



RE: Running out of memory in a simple monad

2002-12-17 Thread Simon Marlow
Alastair Reid writes:

 The workaround is simple enough: add a dummy argument to the CAF (so
 that it is not a CAF any more):
 
main _ = loop 5
 
 and then specify the extra argument when invoking it:
 
main ()
 
 (This is a pretty standard optimisation technique: we're trading time
 to recompute a result for the space taken to store the result.  Coming
 from other languages where actions (i.e., monadic computations) are
 not first class values, this is a bit surprising but, from a Haskell
 perspective, it is completely uniform.)

Careful: this isn't guaranteed to turn a CAF into a function.  In
particular, GHC will optimise away this trick when optimisation is
turned on (and perhaps even when it isn't).  The point is that adding
dummy arguments isn't really a technique that should be relied upon.

You might well argue that there ought to be a way to control the
operational behaviour of the program w.r.t. CAFs (and in fact sharing in
general), and I'd be inclined to agree.

Also, GHCi retains CAFs in the same way as Hugs, the difference is that
GHCi can be configured to throw away the results after evaluation (:set
+r).

Cheers,
Simon
___
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell



Re: Running out of memory in a simple monad

2002-12-16 Thread Alastair Reid

David Bergman [EMAIL PROTECTED] writes:
 Note: In an unoptimized scenario, such as
 with Hugs, you do indeed run out of memory in your loop (after
 some 4 iterations) not having the recursion in the last
 call. Even loops not constructing cons cells do, such as

   loop 0 = return () 
   loop n = do { print n; loop (n-1) } -- print n  loop (n-1) 
   main = loop 5

 (you see Hugs die after some 25000 iterations...)

I'm afraid your diagnosis is way off base here.

The problem is nothing to do with a 'last call optimization' or with
the do syntactic sugar and can be worked around not by changing how
you _write_your code (which would obviously be a large burden) but by
how you _call_ your code (a much smaller burden).

The problem is to do with garbage collection of Constant Applicable
Forms (or CAFs).  CAFs are definitions like:

  nil = []
  one = 1 :: Int
  primes = ...
  main = loop 5

GHC and Hugs differ in how they treat CAFs.  Hugs treats CAFs as a
special case and never garbage collects a CAF - the only way to
discard its value is to unload the module that defines it.  GHC treats
CAFs the same as normal definitions and garbage collects them when
they can no longer contribute to future execution.

The difference between these behaviours can be seen when the CAF grows
very large as in your example.

The workaround is simple enough: add a dummy argument to the CAF (so
that it is not a CAF any more):

   main _ = loop 5

and then specify the extra argument when invoking it:

   main ()

(This is a pretty standard optimisation technique: we're trading time
to recompute a result for the space taken to store the result.  Coming
from other languages where actions (i.e., monadic computations) are
not first class values, this is a bit surprising but, from a Haskell
perspective, it is completely uniform.)


Note that this workaround is necessary with GHC too if you have a
large CAF which does not die.  For example, if you wanted to benchmark
your code, you might want to run it 10 times using:

 main1 = loop 5
 main = sequence_ (replicate 10 main1)

Now main1 will not be garbage collected until the last time it is
executed.  The solution is the same as for Hugs: add a dummy argument.

--
Alastair Reid [EMAIL PROTECTED]  
Reid Consulting (UK) Limited  http://www.reid-consulting-uk.ltd.uk/alastair/
___
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell



RE: Running out of memory in a simple monad

2002-12-16 Thread David Bergman
You are right,

After writing that e-mail I looked at a lot of cases in Hugs, and also
encountered this CAF problem. And, as I pointed out elsewhere, the last
call optimisation is not very interesting in the lazy evaluation
scenario...

One problem, though, is that I would like not to get rid of the CAF,
since I (presumably wrongly) assume that CAFs are implemented more
efficiently in Hugs than normal definitions. Am I right in this
assumption?

Thanks,

David

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED]] On Behalf Of Alastair Reid
 Sent: Monday, December 16, 2002 9:18 AM
 To: David Bergman
 Cc: 'Richard Uhtenwoldt'; [EMAIL PROTECTED]
 Subject: Re: Running out of memory in a simple monad
 
 
 
 David Bergman [EMAIL PROTECTED] writes:
  Note: In an unoptimized scenario, such as
  with Hugs, you do indeed run out of memory in your loop 
 (after some 
  4 iterations) not having the recursion in the last call. Even 
  loops not constructing cons cells do, such as
 
  loop 0 = return () 
loop n = do { print n; loop (n-1) } -- print n  loop (n-1) 
main = loop 5
 
  (you see Hugs die after some 25000 iterations...)
 
 I'm afraid your diagnosis is way off base here.
 
 The problem is nothing to do with a 'last call optimization' 
 or with the do syntactic sugar and can be worked around not 
 by changing how you _write_your code (which would obviously 
 be a large burden) but by how you _call_ your code (a much 
 smaller burden).
 
 The problem is to do with garbage collection of Constant 
 Applicable Forms (or CAFs).  CAFs are definitions like:
 
   nil = []
   one = 1 :: Int
   primes = ...
   main = loop 5
 
 GHC and Hugs differ in how they treat CAFs.  Hugs treats CAFs 
 as a special case and never garbage collects a CAF - the only 
 way to discard its value is to unload the module that defines 
 it.  GHC treats CAFs the same as normal definitions and 
 garbage collects them when they can no longer contribute to 
 future execution.
 
 The difference between these behaviours can be seen when the 
 CAF grows very large as in your example.
 
 The workaround is simple enough: add a dummy argument to the 
 CAF (so that it is not a CAF any more):
 
main _ = loop 5
 
 and then specify the extra argument when invoking it:
 
main ()
 
 (This is a pretty standard optimisation technique: we're 
 trading time to recompute a result for the space taken to 
 store the result.  Coming from other languages where actions 
 (i.e., monadic computations) are not first class values, this 
 is a bit surprising but, from a Haskell perspective, it is 
 completely uniform.)
 
 
 Note that this workaround is necessary with GHC too if you 
 have a large CAF which does not die.  For example, if you 
 wanted to benchmark your code, you might want to run it 10 
 times using:
 
  main1 = loop 5
  main = sequence_ (replicate 10 main1)
 
 Now main1 will not be garbage collected until the last time 
 it is executed.  The solution is the same as for Hugs: add a 
 dummy argument.
 
 --
 Alastair Reid [EMAIL PROTECTED]  
 Reid Consulting (UK) Limited  
 http://www.reid-consulting-uk.ltd.uk/alastair/
 
 ___
 Haskell mailing list
 [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
 

___
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell



Re: Running out of memory in a simple monad

2002-12-16 Thread Alastair Reid

David Bergman [EMAIL PROTECTED] writes:
 One problem, though, is that I would like not to get rid of the CAF,
 since I (presumably wrongly) assume that CAFs are implemented more
 efficiently in Hugs than normal definitions. Am I right in this
 assumption?

There isn't much to choose between them in raw efficiency.  A rough
initial analysis is:
  
  CAFs are looked up in an array of all definitions - access is small
  constant time.
  
  non-CAFs are looked up in an environment which is potentially linear
  in size of environment but environments are very small and, perhaps
  more significantly, has been optimized so that it is usually small
  constant time.
  
  Adding a CAF to an environment would add one 'cell' (8 bytes) to heap
  consumption but, again, optimizations of environment building should
  mean that environments get built rarely and are shared effectively.

  [note that these are per-CAF costs]
  
Much more significant than these raw costs are the issues of how many
times an expression is evaluated and how much space it takes to store
these results between evaluations.  I'd expect typical costs to these
to be 10-1000 times larger than the raw costs given above.  I think
that many of these questions can't readily be answered by existing
techniques and that some are outside the scope of what we normally ask
optimizers to do because they boil down to space-time tradeoffs and
solving these requires a lot of contextual information about the
application, the input data, desired response times, capability of
target machine(s), acceptable impact on other applications running on
same machine, etc.  Anyone out there wanting a topic for their PhD?

--
Alastair Reid [EMAIL PROTECTED]  
Reid Consulting (UK) Limited  http://www.reid-consulting-uk.ltd.uk/alastair/


___
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell



RE: Running out of memory in a simple monad

2002-12-01 Thread David Bergman
So,

Should I imply that the IO monad is pretty damned useless in Hugs
then, since the loop does not run in constant space there?

There are a lot of algorithms that cannot be run in constant space (due
to either recursion depth or structure generation), even in the most
optimized setting. This does not make them useless.

/David

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On
Behalf Of Richard Uhtenwoldt
Sent: Saturday, November 30, 2002 11:53 PM
To: David Bergman
Cc: [EMAIL PROTECTED]
Subject: RE: Running out of memory in a simple monad

David Bergman writes:

It is easy to get the feeling that the recursive call in

   recurse = do
   f x
   recurse

is the last one, but this is just an illusion of the iterative layout
of
do. Sometimes the monads lure us into old iterative thought
patterns...

Taking away the syntactic sugar, we end up with

   recurse = f x  recurse

This reveals the fact that there can be no last call optimization,
because the last call is .

What do you mean by this?  Do you mean that that an implementation
cannot execute arbitrarily many iterations/recursions of that last
loop/definition in constant space?

If that's what you mean, you are wrong.  GHC does that.  The IO monad
would be pretty damned useless for actual work if implementations did
not do that!

Even if we replace the () with a (:) as the main operator/last
call of the loop, the result can execute in constant space because
of optimizations.

E.g., the program

  loop n=n:loop (n+1)
  main = print (loop 0)

executes in constant space when compile by GHC 5.02.


Details: 

Specifically, I let it run till it printed 2,500,000 at which time top
reported the RSS to be 1,500 with no increase having occurred in a
long time.  top's manpage says that RSS is the total amount of
physical memory used by the task, in kilobytes.

The statement about () is true when the () is in the IO monad.  I
did not check to see what happens in a user-defined monad.
___
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell

___
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell



RE: Running out of memory in a simple monad

2002-12-01 Thread Richard Uhtenwoldt
David Bergman writes:

Should I imply that the IO monad is pretty damned useless in Hugs
then, since the loop does not run in constant space there?

my statement was too broad.  allow me to amend it.

some are using Haskell for systems programming, as a better C
than C.  some, including me, would like to see more of that,
with Haskell or another pure functional language with an IO monad
taking systems programmers away from the C and C++ communities.

Hugs is completely useless for *that*.


for an example of Haskell as a better C than C, see Chak's Gtk+
bindings.  to use them you must write your whole GUI in the IO monad
in a style where the basic data structures and control structures
closely resemble what you would write in C.

see
http://www.cse.unsw.edu.au/~chak/haskell/gtk/BoolEd.html
and note how most of the functions are in the IO monad.

many Haskellers have a negative opinion of such heavy use of the
IO monad, but in systems programming you need more control over
when (relative to other interactions) your program performs an
interaction with a file, network or UI resource than is available
in Haskell without the IO monad.

--
Richard Uhtenwoldt
It's a mammal thing; you wouldn't understand.
___
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell



Carried away in the monads? was: RE: Running out of memory in a simple monad

2002-12-01 Thread David Bergman
Richard wrote:

 some are using Haskell for systems programming, as a better C
 than C.  some, including me, would like to see more of that,
 with Haskell or another pure functional language with an IO monad
 taking systems programmers away from the C and C++ communities.

That is good, I would probably call myself a C++ developer primarily,
but Haskell + the IO monad is a much better choice (even better than
Perl...) in most situations.

 Hugs is completely useless for *that*.

Yes, it is, although it is great for testing the validity of one's
Haskell programs.

 for an example of Haskell as a better C than C, see Chak's Gtk+
 bindings.  to use them you must write your whole GUI in the IO monad
 in a style where the basic data structures and control structures
 closely resemble what you would write in C.

I have seen it, and it is a bit imperative in its style.

 many Haskellers have a negative opinion of such heavy use of the
 IO monad, but in systems programming you need more control over
 when (relative to other interactions) your program performs an
 interaction with a file, network or UI resource than is available
 in Haskell without the IO monad.

I must confess that I belong to that theoretical group, although I am
a systems programmer (i.e., developing real systems for real customers,
paying real money).

In my opinion, the use of categorical monads in a programming language
is a brilliant intellectual achievement, combining the pure and real
in programming, at least on paper... This is where the Haskellers
divide: one group (me included) consider the hidden state in monads as
falling back to imperative thought patterns. In some cases, as in IO, we
need to have a state, obviously, but it would be beneficial if we could
divide the logic into state independent (defined outside the monads) and
state dependent code (defined within the monads; such as IO).

It is a bit like some C++ developers I meet, who consider themselves
to develop in C++ just because they use const or  (reference):
functional programming is about choice of language and choice of
programming style.

/David

___
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell



RE: Running out of memory in a simple monad

2002-11-30 Thread Richard Uhtenwoldt
David Bergman writes:

It is easy to get the feeling that the recursive call in

   recurse = do
   f x
   recurse

is the last one, but this is just an illusion of the iterative layout of
do. Sometimes the monads lure us into old iterative thought
patterns...

Taking away the syntactic sugar, we end up with

   recurse = f x  recurse

This reveals the fact that there can be no last call optimization,
because the last call is .

What do you mean by this?  Do you mean that that an implementation
cannot execute arbitrarily many iterations/recursions of that last
loop/definition in constant space?

If that's what you mean, you are wrong.  GHC does that.  The IO monad
would be pretty damned useless for actual work if implementations did
not do that!

Even if we replace the () with a (:) as the main operator/last
call of the loop, the result can execute in constant space because
of optimizations.

E.g., the program

  loop n=n:loop (n+1)
  main = print (loop 0)

executes in constant space when compile by GHC 5.02.


Details: 

Specifically, I let it run till it printed 2,500,000 at which time top
reported the RSS to be 1,500 with no increase having occurred in a
long time.  top's manpage says that RSS is the total amount of
physical memory used by the task, in kilobytes.

The statement about () is true when the () is in the IO monad.  I
did not check to see what happens in a user-defined monad.
___
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell



RE: Running out of memory in a simple monad

2002-11-30 Thread David Bergman
Richard,

I assumed that the compiler had, through strictness analysis, used a
call stack in the compiled code, instead of the usual call frames in the
heap (equivalent to hey, I thought this was the Standard ML mail
group?) ;-)

Actually, you are right in that the last call optimization is not
vital in most call-by-need scenarios, since both (the implicit) call
stack and data structures are often consumed as generated, and the GC
can reclaim the thunks. Note: In an unoptimized scenario, such as with
Hugs, you do indeed run out of memory in your loop (after some 4
iterations) not having the recursion in the last call. Even loops not
constructing cons cells do, such as

loop 0 = return ()
loop n = do { print n; loop (n-1) } -- print n  loop (n-1)
main = loop 5

(you see Hugs die after some 25000 iterations...)

Sorry about the over-simplification, but I wanted people to not forget
that the do is just syntactic sugar...

Thanks,

David

-Original Message-
From: Richard Uhtenwoldt
Sent: Saturday, November 30, 2002 11:53 PM
To: David Bergman
Cc: [EMAIL PROTECTED]
Subject: RE: Running out of memory in a simple monad

David Bergman writes:

It is easy to get the feeling that the recursive call in

   recurse = do
   f x
   recurse

is the last one, but this is just an illusion of the iterative layout
of
do. Sometimes the monads lure us into old iterative thought
patterns...

Taking away the syntactic sugar, we end up with

   recurse = f x  recurse

This reveals the fact that there can be no last call optimization,
because the last call is .

What do you mean by this?  Do you mean that that an implementation
cannot execute arbitrarily many iterations/recursions of that last
loop/definition in constant space?

If that's what you mean, you are wrong.  GHC does that.  The IO monad
would be pretty damned useless for actual work if implementations did
not do that!

Even if we replace the () with a (:) as the main operator/last
call of the loop, the result can execute in constant space because
of optimizations.

E.g., the program

  loop n=n:loop (n+1)
  main = print (loop 0)

executes in constant space when compile by GHC 5.02.


Details: 

Specifically, I let it run till it printed 2,500,000 at which time top
reported the RSS to be 1,500 with no increase having occurred in a
long time.  top's manpage says that RSS is the total amount of
physical memory used by the task, in kilobytes.

The statement about () is true when the () is in the IO monad.  I
did not check to see what happens in a user-defined monad.
___
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell

___
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell



Re: Running out of memory in a simple monad

2002-11-29 Thread Zdenek Dvorak
Hello,

I hope I understand what's going on; if not please someone correct me.


I have problems with monads and memory. I have a monad through which
I thread output. If I do the concatenation of the output-strings in
one way Hugs runs out of memory, but if I do it in another way
everything works. I can't see why the first way doesn't work but the
second is OK. I woudl appreciate if someone could tell me what I am
doing wrong.
  Here is the non-working monad:   -}


The problem is not directly connected to monads; what is the problem:

[] ++ x = x
(h:t) ++ x = h : (t++x),

i.e. time complexity of ++ is proportional to the length of first list.

first way:


putCharM c  = M $ \o  - ((), o ++ [c]) -- Is this stupid in some way?


this takes list (looong) of everything produced before this putCharM and
concatenates c as last member; this takes time linear in the length of
the list, summing over all putCharMs it is quadratic (and of course,
due to laziness a lot of memory is consumed; seq does not help, as it only
evaluates first cell of the list so that it sees it is not empty; deepSeq
would solve this, but the time consumption would still stay long).

the second way:


  M f = k  = M $
let (x, o)   = f
M f2 = k x
(x', o') = f2
in (x', o ++ o')


this is done reverse way (because = is bracketed to the right); we
concatenate output of f (short) with output of f2 (the rest of computation,
i.e. looong); but the time is proportional to the length of first list,
so it is constant in our case; summing it over all putCharMs, we get linear
time and we are happy :-)

If you want to do it the first way, define

putCharM c  = M $ \o  - ((), c : o)

and then reverse the list in runM.

Zdenek Dvorak

_
Add photos to your messages with MSN 8. Get 2 months FREE*. 
http://join.msn.com/?page=features/featuredemail

___
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell


RE: Running out of memory in a simple monad

2002-11-29 Thread David Bergman
Just to add to what Zdenek wrote:

The linear complexity of string concatenation in a naïve implementation
(not having access to an extra-language end-of-list in the diff list
sense...) make the total complexity O(n^2), since the number of conses
generated is thus

sum [1 .. n]

which, obviously, is (1+n)*n/2. In the case of the n=5 in the
example we get sum [1 .. 5] = 1250025000. So, well over one
billion conses. This is why ++ is right associative :-)

This time complexity cannot make Hugs crash, though, except for a defect
GC, having problems tidying up after each round of ++.

The space complexity, which reduces to maximum execution stack space
(considering a proper GC) in the example, is what kills Hugs. The
problem is that the string concatenation is not the last call, so there
is no room for last call optimization.

If you want to mimic the complexity of the example while calculating the
number of conses required, try evaluating the isomorphic expression

last $ scanl1 (+) $ take 5 (repeat 1)

It might crash in Hugs, running out of execution stack space, for the
same reason as the original example. It is the last that holds up the
stack space, by the way.

I hope this was helpful.

Regarding do:

It is easy to get the feeling that the recursive call in

recurse = do
f x
recurse

is the last one, but this is just an illusion of the iterative layout of
do. Sometimes the monads lure us into old iterative thought
patterns...

Taking away the syntactic sugar, we end up with

recurse = f x  recurse

This reveals the fact that there can be no last call optimization,
because the last call is .

Regards,

David

___
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell