Re: [Haskell-cafe] Space leak - help needed

2008-03-14 Thread Justin Bailey
On Thu, Mar 13, 2008 at 4:50 PM, Krzysztof Kościuszkiewicz
[EMAIL PROTECTED] wrote:
  Retainers are thunks or objects on stack that keep references to
  live objects. All retainers of an object are called the object's
  retainer set.  Now when one makes a profiling run, say with ./jobname
  +RTS -p -hr, the graph references retainer sets from jobname.prof. My
  understanding is that it is the total size of all objects retained by
  retainer sets being plotted, correct?

Yes, all retainer sets are being profiled. However, you can FILTER the
retainer sets profiled to those containing certain cost-centres. This
is a key point because it allows you to divide-and-conquer when
tracking down a retainer leak. That is, if you filter to a certain
cost-centre and the retainer graph is flat, you know that cost-centre
is not involved. For example, if you have a cost-centre annotation
like {-# SCC leaky #-} in your code, you can filter the retainer set
like this:

  Leaky.exe +RTS -hr -hCleaky -RTS

Review the documentation for other options.
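
A minimal sketch of what such an annotated function might look like. The name leakySum is made up for illustration and is not from the original program. It assumes the code is built with profiling enabled (for example ghc -prof -rtsopts on a recent GHC) and called from main; compiled without optimisation, the lazy foldl should build a chain of (+) thunks attributed to the "leaky" cost centre:

```haskell
-- Hypothetical example: a lazy left fold wrapped in a named cost centre.
-- Filtering with  +RTS -hr -hCleaky -RTS  restricts the retainer graph to
-- closures whose cost-centre stack contains "leaky".
leakySum :: [Int] -> Int
leakySum xs = {-# SCC "leaky" #-} foldl (+) 0 xs
```

If the filtered graph stays flat while the unfiltered one grows, this function is probably not where the memory is being held.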


  About decoding the sets from jobname.prof - for example in

   SET 2 = {MAIN.SYSTEM}
   SET 16 = {Main.CAF, MAIN.SYSTEM}
   SET 18 = {MAIN.SYSTEM, Main.many1,Main.list,Main.expr,Main.CAF}

  {...} means it's a set, and ccN,...,cc0 is the retainer cost centre
  (ccN) and hierarchy of parent cost centres up to the top level (cc0)?

  My understanding is that SET 18 above refers to objects that are
  retained by exactly two specified cost centres, right?


The docs say

  An object B retains object A if (i) B is a retainer object and (ii)
object A can be reached by recursively following pointers starting
from object B, but not meeting any other retainer objects on the way.
Each live object is retained by one or more retainer objects,
collectively called its retainer set ...

That says to me that SET18 above is the set of all objects which are
retained by those two call stacks, and only those call stacks. The
individual .. items aren't call stacks but I think they refer to
where the retaining object (B in the paragraph) was itself retained,
so they are like call stacks. My intuition is very fuzzy here.
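
To make the quoted definition concrete, here is a toy model of it. This is only a sketch: the heap is a list of made-up object numbers, and retainers are plain objects here, whereas GHC reports them as cost-centre stacks. All names (Heap, retainedBy, retainerSet, exampleHeap) are assumptions for illustration.

```haskell
type Obj = Int

-- Toy heap: each object paired with the objects it points to.
type Heap = [(Obj, [Obj])]

children :: Heap -> Obj -> [Obj]
children heap o = maybe [] id (lookup o heap)

-- Objects retained by retainer r: everything reachable from r by following
-- pointers, without passing through any retainer object on the way.
retainedBy :: Heap -> [Obj] -> Obj -> [Obj]
retainedBy heap retainers r = go [] (children heap r)
  where
    go seen [] = seen
    go seen (o : os)
      | o `elem` seen      = go seen os
      | o `elem` retainers = go seen os   -- stop at retainer objects
      | otherwise          = go (o : seen) (children heap o ++ os)

-- The retainer set of an object: all retainers that retain it.
retainerSet :: Heap -> [Obj] -> Obj -> [Obj]
retainerSet heap retainers o =
  [r | r <- retainers, o `elem` retainedBy heap retainers r]

-- Objects 1 and 2 are retainers; 1 also points at 2.
exampleHeap :: Heap
exampleHeap = [(1, [3, 2]), (2, [3, 5]), (3, [4]), (4, []), (5, [6]), (6, [])]
```

With retainers [1, 2], retainerSet exampleHeap [1, 2] 4 gives [1, 2] (object 4 is reachable from both), while object 6 gets [2] only, because the only path from 1 to 6 passes through retainer 2.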

  Finally, what is the MAIN.SYSTEM retainer?

I think that is everything else - any object created in the runtime
system that is not directly attributable to something being profiled.
Maybe it is objects from libraries that were not compiled with
profiling? I imagine objects created by the GHC primitives would fall
in this category too.

Since someone else found your space leak, does the retainer profiling
advice point to it? I'd like to know if it is actually accurate or
not! I've only applied it in some very limited situations.

Justin
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Space leak - help needed

2008-03-13 Thread Bertram Felgenhauer
Krzysztof Kościuszkiewicz wrote:
 I have tried both Poly.StateLazy and Poly.State and they work quite well
 - at least the space leak is eliminated. Now evaluation of the parser
 state blows the stack...
 
 The code is at http://hpaste.org/6310

Apparently, stUpdate is too lazy. I'd define

stUpdate' :: (s -> s) -> Parser s t ()
stUpdate' f = stUpdate f >> stGet >>= (`seq` return ())

and try using stUpdate' instead of stUpdate in incCount.
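
To see why reading the state back and applying seq helps, here is a self-contained sketch that uses a hand-rolled state monad as a stand-in for polyparse's Parser. The type St, its instances and countTo are assumptions for illustration; only the names stUpdate and stGet mirror the ones above.

```haskell
import Control.Monad (ap, liftM)

-- Stand-in for a stateful parser: a plain state monad (not polyparse's type).
newtype St s a = St { runSt :: s -> (a, s) }

instance Functor (St s) where
  fmap = liftM

instance Applicative (St s) where
  pure a = St (\s -> (a, s))
  (<*>)  = ap

instance Monad (St s) where
  St m >>= k = St (\s -> case m s of (a, s') -> runSt (k a) s')

stGet :: St s s
stGet = St (\s -> (s, s))

-- Lazy update: stores the unevaluated application (f s) as the new state.
stUpdate :: (s -> s) -> St s ()
stUpdate f = St (\s -> ((), f s))

-- Strict variant: read the state back and force it to WHNF before continuing.
stUpdate' :: (s -> s) -> St s ()
stUpdate' f = stUpdate f >> stGet >>= (`seq` return ())

-- Count n items; with stUpdate' the counter is forced at every step.
countTo :: Int -> Int
countTo n = snd (runSt (mapM_ (\_ -> stUpdate' (+ 1)) [1 .. n]) 0)
```

With plain stUpdate the counter would become a chain of n pending (+ 1) applications that is only evaluated at the very end, which is the kind of thunk build-up that can exhaust the stack.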

HTH,

Bertram


Re: [Haskell-cafe] Space leak - help needed

2008-03-13 Thread Krzysztof Kościuszkiewicz
On Thu, Mar 13, 2008 at 05:52:05PM +0100, Bertram Felgenhauer wrote:

  ... Now evaluation of the parser state blows the stack...
  
  The code is at http://hpaste.org/6310
 
 Apparently, stUpdate is too lazy. I'd define
 
 stUpdate' :: (s -> s) -> Parser s t ()
 stUpdate' f = stUpdate f >> stGet >>= (`seq` return ())
 
 and try using stUpdate' instead of stUpdate in incCount.

Yes, that solves the stack issue. Thanks!
-- 
Krzysztof Kościuszkiewicz
Skype: dr.vee,  Gadu: 111851,  Jabber: [EMAIL PROTECTED]
Simplicity is the ultimate sophistication -- Leonardo da Vinci


Re: [Haskell-cafe] Space leak - help needed

2008-03-13 Thread Krzysztof Kościuszkiewicz
On Wed, Mar 12, 2008 at 12:34:38PM -0700, Justin Bailey wrote:

 The stack blows up when a bunch of unevaluated thunks build up, and
 you try to evaluate them. One way to determine where those thunks are
 getting built is to use GHC's retainer profiling. Retainer sets will
 show you the call stack that is holding on to memory. That can give
 you a clue where these thunks are being created. To get finer-grained
 results, annotate your code with {-# SCC ... #-} pragmas. Then you
 can filter the retainer profile by those annotations. That will help
 you determine where in a given function the thunks are being created.
 
 If you need help with profiling basics, feel free to ask.

I'm not entirely sure if I understand retainer profiling correctly... So
please clarify if you spot any obvious blunders.

Retainers are thunks or objects on stack that keep references to
live objects. All retainers of an object are called the object's
retainer set.  Now when one makes a profiling run, say with ./jobname
+RTS -p -hr, the graph references retainer sets from jobname.prof. My
understanding is that it is the total size of all objects retained by
retainer sets being plotted, correct?

About decoding the sets from jobname.prof - for example in

 SET 2 = {MAIN.SYSTEM}
 SET 16 = {Main.CAF, MAIN.SYSTEM}
 SET 18 = {MAIN.SYSTEM, Main.many1,Main.list,Main.expr,Main.CAF}

{...} means it's a set, and ccN,...,cc0 is the retainer cost centre
(ccN) and hierarchy of parent cost centres up to the top level (cc0)?

My understanding is that SET 18 above refers to objects that are
retained by exactly two specified cost centres, right?

Finally, what is the MAIN.SYSTEM retainer?

Thanks in advance,
-- 
Krzysztof Kościuszkiewicz
Skype: dr.vee,  Gadu: 111851,  Jabber: [EMAIL PROTECTED]
Simplicity is the ultimate sophistication -- Leonardo da Vinci


Re: [Haskell-cafe] Space leak - help needed

2008-03-12 Thread Krzysztof Kościuszkiewicz
On Mon, Mar 03, 2008 at 05:20:09AM +0100, Bertram Felgenhauer wrote:

  Another story from an (almost) happy Haskell user that finds himself
  overwhelmed by laziness/space leaks.
  
  I'm trying to parse a large file (600MB) with a single S-expression
  like structure. With the help of ByteStrings I'm down to 4min processing
  time in constant space. However, when I try to wrap the parse results
  in a data structure, the heap blows up - even though I never actually
  inspect the structure being built! This bugs me, so I come here looking
  for answers.
 
 The polyparse library (http://www.cs.york.ac.uk/fp/polyparse/)
 offers some lazy parsers, maybe one of those fits your needs.
 Text.ParserCombinators.Poly.StateLazy is the obvious candidate.

I have tried both Poly.StateLazy and Poly.State and they work quite well
- at least the space leak is eliminated. Now evaluation of the parser
state blows the stack...

The code is at http://hpaste.org/6310

Thanks in advance,
-- 
Krzysztof Kościuszkiewicz
Skype: dr.vee,  Gadu: 111851,  Jabber: [EMAIL PROTECTED]
Simplicity is the ultimate sophistication -- Leonardo da Vinci


Re: [Haskell-cafe] Space leak - help needed

2008-03-12 Thread Justin Bailey
On Wed, Mar 12, 2008 at 12:12 PM, Krzysztof Kościuszkiewicz
[EMAIL PROTECTED] wrote:
  I have tried both Poly.StateLazy and Poly.State and they work quite well
  - at least the space leak is eliminated. Now evaluation of the parser
  state blows the stack...

  The code is at http://hpaste.org/6310

  Thanks in advance,

The stack blows up when a bunch of unevaluated thunks build up, and
you try to evaluate them. One way to determine where those thunks are
getting built is to use GHC's retainer profiling. Retainer sets will
show you the call stack that is holding on to memory. That can give
you a clue where these thunks are being created. To get finer-grained
results, annotate your code with {-# SCC ... #-} pragmas. Then you
can filter the retainer profile by those annotations. That will help
you determine where in a given function the thunks are being created.
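
As a rough illustration of annotating inside a single function, the sketch below puts two cost centres on two subexpressions. The function average and the cost-centre names avg_total and avg_count are invented for this example; the point is only that each subexpression can be filtered separately (for instance with -hCavg_total).

```haskell
import Data.List (foldl')

-- Two subexpressions of one function, each under its own cost centre.
average :: [Double] -> Double
average xs = total / count
  where
    -- Lazy fold: may accumulate (+) thunks when compiled without optimisation.
    total = {-# SCC "avg_total" #-} foldl (+) 0 xs
    -- Strict fold: the running count is forced at every step.
    count = {-# SCC "avg_count" #-} foldl' (\n _ -> n + 1) 0 xs
```

If only the profile filtered on avg_total grows, the lazy fold is the likely source of the thunks.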

If you need help with profiling basics, feel free to ask.

Justin


Re: [Haskell-cafe] Space leak - help needed

2008-03-02 Thread Luke Palmer
On Mon, Mar 3, 2008 at 2:23 AM, Krzysztof Kościuszkiewicz
[EMAIL PROTECTED] wrote:
 Dear Haskellers,

  Another story from an (almost) happy Haskell user that finds himself
  overwhelmed by laziness/space leaks.

  I'm trying to parse a large file (600MB) with a single S-expression
  like structure. With the help of ByteStrings I'm down to 4min processing
  time in constant space. However, when I try to wrap the parse results
  in a data structure, the heap blows up - even though I never actually
  inspect the structure being built! This bugs me, so I come here looking
  for answers.

Well, I haven't read this through, but superficially, it looks like
you're expecting the data structure to be constructed lazily.  But...

   -- Syntax of expressions
   data Exp = Sym !B.ByteString | List ![Exp]
   deriving (Eq, Show)

It is declared as strict, so it's not going to be constructed lazily...
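
A small sketch of what the strictness annotations change. The types LazyExp and StrictExp and the helper survives are made up for illustration, and String stands in for B.ByteString to keep the example dependency-free. A strict field is evaluated to weak head normal form whenever the constructor itself is evaluated, so an undefined field blows up immediately instead of sitting there as an untouched thunk:

```haskell
import Control.Exception (SomeException, catch, evaluate)

-- Same shape as Exp above, once with lazy and once with strict fields.
data LazyExp   = LSym String   | LList [LazyExp]
data StrictExp = SSym !String  | SList ![StrictExp]

-- True if the value can be evaluated to WHNF without throwing.
survives :: a -> IO Bool
survives x = (evaluate x >> return True) `catch` handler
  where
    handler :: SomeException -> IO Bool
    handler _ = return False
```

Here survives (LList undefined) yields True, while survives (SList undefined) yields False. Note that ![Exp] only forces the list to its first constructor: survives (SList (SSym "a" : undefined)) still yields True, so the bang does not by itself force the whole list.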

Luke


Re: [Haskell-cafe] Space leak - help needed

2008-03-02 Thread Bertram Felgenhauer
Krzysztof Kościuszkiewicz wrote:
 Another story from an (almost) happy Haskell user that finds himself
 overwhelmed by laziness/space leaks.
 
 I'm trying to parse a large file (600MB) with a single S-expression
 like structure. With the help of ByteStrings I'm down to 4min processing
 time in constant space. However, when I try to wrap the parse results
 in a data structure, the heap blows up - even though I never actually
 inspect the structure being built! This bugs me, so I come here looking
 for answers.

Note that Parsec has to parse the whole file before it can decide
whether to return a result (Right _) or an error (Left _). GHC would
have to be quite smart to eliminate the creation of the expression
tree entirely.
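
The difference can be sketched with a toy token parser. This is neither Parsec's nor polyparse's API; parseTok, parseStrict and parseLazy are hypothetical names. An Either-returning parser has to look at every token before it can commit to a constructor, while a lazy one can hand out results as it goes, at the price of reporting errors late (here as an exception):

```haskell
import Data.Char (isDigit)

parseTok :: String -> Either String Int
parseTok t
  | not (null t) && all isDigit t = Right (read t)
  | otherwise                     = Left ("bad token: " ++ t)

-- All-or-nothing: the outermost constructor is only known once every
-- token has been checked, so the whole result is built first.
parseStrict :: [String] -> Either String [Int]
parseStrict = traverse parseTok

-- Lazy: each element is produced on demand; a bad token only raises an
-- error when that element is actually inspected.
parseLazy :: [String] -> [Int]
parseLazy = map (either error id . parseTok)
```

For example, take 3 (parseLazy (cycle ["1", "2"])) returns [1, 2, 1] even though the input is infinite, whereas parseStrict on the same input would never return.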

The polyparse library (http://www.cs.york.ac.uk/fp/polyparse/)
offers some lazy parsers, maybe one of those fits your needs.
Text.ParserCombinators.Poly.StateLazy is the obvious candidate.

HTH,

Bertram