Re: [Haskell-cafe] Space leak - help needed

2008-03-14 Thread Justin Bailey
On Thu, Mar 13, 2008 at 4:50 PM, Krzysztof Kościuszkiewicz
[EMAIL PROTECTED] wrote:
  Retainers are thunks or objects on stack that keep references to
  live objects. All retainers of an object are called the object's
  retainer set.  Now when one makes a profiling run, say with ./jobname
  +RTS -p -hr, the graph refernces retainer sets from jobname.prof. My
  understanding is that it is the total size of all objects retained by
  retainer sets being plotted, correct?

Yes, all retainer sets are being profiled. However, you can FILTER the
retainer sets profiled to those containing certain cost-centres. This
is a key point because it allows you to divide-and-conquer when
tracking down a retainer leak. That is, if you filter to a certain
cost-centre and the retainer graph is flat, you know that cost-centre
is not involved. For example, if you have a cost-centre annotation
like {-# SCC leaky #-} in your code, you can filter the retainer set
like this:

  Leaky.exe +RTS -hr -hCleaky -RTS

Review the documentation for other options.


  About decoding the sets from jobname.prof - for example in

   SET 2 = {MAIN.SYSTEM}
   SET 16 = {Main.CAF, MAIN.SYSTEM}
   SET 18 = {MAIN.SYSTEM, Main.many1,Main.list,Main.expr,Main.CAF}

  {...} means it's a set, and ccN,...,cc0 is the retainer cost centre
  (ccN) and hierarchy of parent cost centres up to the top level (cc0)?

  My understanding is that SET 18 above refers to objects that are
  retained by exactly two specified cost centres, right?


The docs say

  An object B retains object A if (i) B is a retainer object and (ii)
object A can be reached by recursively following pointers starting
from object B, but not meeting any other retainer objects on the way.
Each live object is retained by one or more retainer objects,
collectively called its retainer set ...

That says to me that SET18 above is the set of all objects which are
retained by those two call stacks, and only those call stacks. The
individual .. items aren't call stacks but I think they refer to
where the retaining object (B in the paragraph) was itself retained,
so they are like call stacks. My intuition is very fuzzy here.

  Finally, what is the MAIN.SYSTEM retainer?

I think that is everything else - any object created in the runtime
system that is not directly attributable to something being profiled.
Maybe it is objects from libraries that were not compiled with
profiling? I imagine objects created by the GHC primitives would fall
in this category too.

Since someone else found your space leak, does the retainer profiling
advice point to it? I'd like to know if it is actually accurate or
not! I've only applied it in some very limited situations.

Justin
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Space leak - help needed

2008-03-13 Thread Bertram Felgenhauer
Krzysztof Kościuszkiewicz wrote:
 I have tried both Poly.StateLazy and Poly.State and they work quite well
 - at least the space leak is eliminated. Now evaluation of the parser
 state blows the stack...
 
 The code is at http://hpaste.org/6310

Apparently, stUpdate is too lazy. I'd define

stUpdate' :: (s - s) - Parser s t ()
stUpdate' f = stUpdate f  stGet = (`seq` return ())

and try using stUpdate' instead of stUpdate in incCount.

HTH,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Space leak - help needed

2008-03-13 Thread Krzysztof Kościuszkiewicz
On Thu, Mar 13, 2008 at 05:52:05PM +0100, Bertram Felgenhauer wrote:

  ... Now evaluation of the parser state blows the stack...
  
  The code is at http://hpaste.org/6310
 
 Apparently, stUpdate is too lazy. I'd define
 
 stUpdate' :: (s - s) - Parser s t ()
 stUpdate' f = stUpdate f  stGet = (`seq` return ())
 
 and try using stUpdate' instead of stUpdate in incCount.

Yes, that solves the stack issue. Thanks!
-- 
Krzysztof Kościuszkiewicz
Skype: dr.vee,  Gadu: 111851,  Jabber: [EMAIL PROTECTED]
Simplicity is the ultimate sophistication -- Leonardo da Vinci
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Space leak - help needed

2008-03-13 Thread Krzysztof Kościuszkiewicz
On Wed, Mar 12, 2008 at 12:34:38PM -0700, Justin Bailey wrote:

 The stack blows up when a bunch of unevaluated thunks build up, and
 you try to evaluate them. One way to determine where those thunks are
 getting built is to use GHCs retainer profiling. Retainer sets will
 show you the call stack that is holding on to memory. That can give
 you a clue where these thunks are being created. To get finer-grained
 results, annotate your code with {#- SCC ... #-} pragmas. Then you
 can filter the retainer profile by those annotations. That will help
 you determine where in a given function the thunks are being created.
 
 If you need help with profiling basics, feel free to ask.

I'm not entirely sure if I understand retainer profiling correctly... So
please clarify if you spot any obvious blunders.

Retainers are thunks or objects on stack that keep references to
live objects. All retainers of an object are called the object's
retainer set.  Now when one makes a profiling run, say with ./jobname
+RTS -p -hr, the graph refernces retainer sets from jobname.prof. My
understanding is that it is the total size of all objects retained by
retainer sets being plotted, correct?

About decoding the sets from jobname.prof - for example in

 SET 2 = {MAIN.SYSTEM}
 SET 16 = {Main.CAF, MAIN.SYSTEM}
 SET 18 = {MAIN.SYSTEM, Main.many1,Main.list,Main.expr,Main.CAF}

{...} means it's a set, and ccN,...,cc0 is the retainer cost centre
(ccN) and hierarchy of parent cost centres up to the top level (cc0)?

My understanding is that SET 18 above refers to objects that are
retained by exactly two specified cost centres, right?

Finally, what is the MAIN.SYSTEM retainer?

Thanks in advance,
-- 
Krzysztof Kościuszkiewicz
Skype: dr.vee,  Gadu: 111851,  Jabber: [EMAIL PROTECTED]
Simplicity is the ultimate sophistication -- Leonardo da Vinci
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Space leak - help needed

2008-03-12 Thread Krzysztof Kościuszkiewicz
On Mon, Mar 03, 2008 at 05:20:09AM +0100, Bertram Felgenhauer wrote:

  Another story from an (almost) happy Haskell user that finds himself
  overwhelmed by laziness/space leaks.
  
  I'm trying to parse a large file (600MB) with a single S-expression
  like structure. With the help of ByteStrings I'm down to 4min processing
  time in constant space. However, when I try to wrap the parse results
  in a data structure, the heap blows up - even though I never actually
  inspect the structure being built! This bugs me, so I come here looking
  for answers.
 
 The polyparse library (http://www.cs.york.ac.uk/fp/polyparse/)
 offers some lazy parsers, maybe one of those fits your needs.
 Text.ParserCombinators.Poly.StateLazy is the obvious candidate.

I have tried both Poly.StateLazy and Poly.State and they work quite well
- at least the space leak is eliminated. Now evaluation of the parser
state blows the stack...

The code is at http://hpaste.org/6310

Thanks in advance,
-- 
Krzysztof Kościuszkiewicz
Skype: dr.vee,  Gadu: 111851,  Jabber: [EMAIL PROTECTED]
Simplicity is the ultimate sophistication -- Leonardo da Vinci
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Space leak - help needed

2008-03-12 Thread Justin Bailey
On Wed, Mar 12, 2008 at 12:12 PM, Krzysztof Kościuszkiewicz
[EMAIL PROTECTED] wrote:
  I have tried both Poly.StateLazy and Poly.State and they work quite well
  - at least the space leak is eliminated. Now evaluation of the parser
  state blows the stack...

  The code is at http://hpaste.org/6310

  Thanks in advance,

The stack blows up when a bunch of unevaluated thunks build up, and
you try to evaluate them. One way to determine where those thunks are
getting built is to use GHCs retainer profiling. Retainer sets will
show you the call stack that is holding on to memory. That can give
you a clue where these thunks are being created. To get finer-grained
results, annotate your code with {#- SCC ... #-} pragmas. Then you
can filter the retainer profile by those annotations. That will help
you determine where in a given function the thunks are being created.

If you need help with profiling basics, feel free to ask.

Justin
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Space leak - help needed

2008-03-02 Thread Krzysztof Kościuszkiewicz
Dear Haskellers,

Another story from an (almost) happy Haskell user that finds himself
overwhelmed by laziness/space leaks.

I'm trying to parse a large file (600MB) with a single S-expression
like structure. With the help of ByteStrings I'm down to 4min processing
time in constant space. However, when I try to wrap the parse results
in a data structure, the heap blows up - even though I never actually
inspect the structure being built! This bugs me, so I come here looking
for answers.

Parser follows:

 module Main where
 
 import qualified Data.ByteString.Lazy.Char8 as B
 import Text.ParserCombinators.Parsec
 import Text.ParserCombinators.Parsec.Pos
 import System.Environment
 import System.Exit
 import qualified Data.Map as M
 import Lexer
 
 type XdlParser a = GenParser Token XdlState a
 
 -- Parser state
 type XdlState = Counts
 
 type Counts = M.Map Count Integer
 
 data Count = ListCount | SymbolCount
 deriving (Eq, Ord, Show)
 
 emptyXdlState = M.empty
 
 incCount :: Count - (Counts - Counts)
 incCount c = M.insertWith' (+) c 1
 
 -- handling tokens
 myToken  :: (Token - Maybe a) - XdlParser a
 myToken test  = token showTok posTok testTok
 where
 showTok = show
 posTok  = const (initialPos )
 testTok = test
 
 -- Syntax of expressions
 data Exp = Sym !B.ByteString | List ![Exp]
 deriving (Eq, Show)
 
 expr =  list | symbol
 
 rparen = myToken $ \t - case t of
 RParen  - Just ()
 other   - Nothing
 
 lparen = myToken $ \t - case t of
 LParen  - Just ()
 other   - Nothing
 
 name = myToken $ \t - case t of
 Name n - Just n
 other  - Nothing
 
 list = do
 updateState $ incCount ListCount
 lparen
 xs - many1 expr
 rparen
 return ()
 --  return $! (List xs)
 
 symbol = do
 updateState $ incCount SymbolCount
 name  return ()
 --  Sym `fmap` name
 
 -- Top level parser
 top :: XdlParser XdlState
 top =  do
 l - many1 list
 eof
 getState
 
 main = do
 args - getArgs
 case args of
 [fname] - do
 text - B.readFile fname
 let result = runParser top emptyXdlState fname (tokenize text)
 putStrLn $ either show show result
 _ - putStrLn usage: parse filename  exitFailure

And the Lexer:

 module Lexer (Token(..), tokenize) where
 
 import qualified Data.ByteString.Lazy.Char8 as B
 import Control.Monad
 import Data.Char
 import Data.List
 import System.Environment
 import System.Exit
 
 data Token = LParen
| RParen
| Name B.ByteString
 deriving (Ord, Eq, Show)
 
 type Input = B.ByteString
 
 -- Processor returns Nothing if it can't process the Input
 type Processor = Input - Maybe ([Token], Input)
 
 -- Tokenize ends the list when all processors return Nothing
 tokenize :: Input - [Token]
 tokenize  = concat . unfoldr processAll
 where
 processors= [doSpaces, doComment, doParens, doName]
 processAll   :: Processor
 processAll bs = if B.null bs 
 then Nothing
 else foldr mminus Nothing $ map ($ bs) processors
 mminus a@(Just _) _ = a
 mminus Nothingb = b
 
 doSpaces:: Processor
 doSpaces bs =
 if B.null sp
 then Nothing
 else Just ([], nsp)
 where
 (sp, nsp) = B.span isSpace bs
 
 doComment:: Processor
 doComment bs =
 if B.pack #  `B.isPrefixOf` bs
 then Just ([], B.dropWhile (/= '\n') bs)
 else Nothing
 
 doParens :: Processor
 doParens bs  = case B.head bs of
 '(' - Just ([LParen], B.tail bs)
 ')' - Just ([RParen], B.tail bs)
 _   - Nothing
 
 doName   :: Processor
 doName  bs   =
 if B.null nsp
 then Nothing
 else Just ([Name nsp], sp)
 where
 (nsp, sp) = B.span (not . isRest) bs
 isRest c = isSpace c || c == ')' || c == '('

Regards,
-- 
Krzysztof Kościuszkiewicz
Skype: dr.vee,  Gadu: 111851,  Jabber: [EMAIL PROTECTED]
Simplicity is the ultimate sophistication -- Leonardo da Vinci
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Space leak - help needed

2008-03-02 Thread Luke Palmer
On Mon, Mar 3, 2008 at 2:23 AM, Krzysztof Kościuszkiewicz
[EMAIL PROTECTED] wrote:
 Dear Haskellers,

  Another story from an (almost) happy Haskell user that finds himself
  overwhelmed by laziness/space leaks.

  I'm trying to parse a large file (600MB) with a single S-expression
  like structure. With the help of ByteStrings I'm down to 4min processing
  time in constant space. However, when I try to wrap the parse results
  in a data structure, the heap blows up - even though I never actually
  inspect the structure being built! This bugs me, so I come here looking
  for answers.

Well, I haven't read this through, but superficially, it looks like
you're expecting the data structure to be constructed lazily.  But...

   -- Syntax of expressions
   data Exp = Sym !B.ByteString | List ![Exp]
   deriving (Eq, Show)

It is declared as strict, so it's not going to be constructed lazily...

Luke
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Space leak - help needed

2008-03-02 Thread Bertram Felgenhauer
Krzysztof Kościuszkiewicz wrote:
 Another story from an (almost) happy Haskell user that finds himself
 overwhelmed by laziness/space leaks.
 
 I'm trying to parse a large file (600MB) with a single S-expression
 like structure. With the help of ByteStrings I'm down to 4min processing
 time in constant space. However, when I try to wrap the parse results
 in a data structure, the heap blows up - even though I never actually
 inspect the structure being built! This bugs me, so I come here looking
 for answers.

Note that Parsec has to parse the whole file before it can decide
whether to return a result (Left _) or an error (Right _). ghc would
have to be quite smart to eliminate the creation of the expression
tree entirely.

The polyparse library (http://www.cs.york.ac.uk/fp/polyparse/)
offers some lazy parsers, maybe one of those fits your needs.
Text.ParserCombinators.Poly.StateLazy is the obvious candidate.

HTH,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Space Leak Help

2007-02-04 Thread Dominic Steinitz
On Saturday 03 February 2007 19:56, Pepe Iborra wrote:
 pad :: [Word8] - [Word8]
 pad xs = pad' xs 0

 pad' (x:xs) l = x : pad' xs (succ l)
 pad' [] l = [0x80] ++ ps ++ lb
     where
        pl = (64-(l+9)) `mod` 64
        ps = replicate pl 0x00
        lb = i2osp 8 (8*l)
Pepe,

Thanks but this gives me a different problem

[EMAIL PROTECTED]:~/sha1 ./allInOne 101 +RTS -hc -RTS
[2845392438,1191608682,3124634993,2018558572,2630932637]
[2224569924,473682542,3131984545,4182845925,3846598897]
Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize' to increase it.

Someone suggested

pad :: Num a = [a] - [a]
pad = pad' 0
  where pad' !l [] = [0x80] ++ ps ++ lb
  where pl = (64-(l+9)) `mod` 64
ps = replicate pl 0x00
lb = i2osp 8 (8*l)
pad' !l (x:xs) = x : pad' (l+1) xs

but that didn't compile

*Main :r
[2 of 2] Compiling Main ( allInOne.hs, interpreted )

allInOne.hs:83:14: Parse error in pattern
Failed, modules loaded: Codec.Utils.

Dominic.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Space Leak Help

2007-02-04 Thread Stefan O'Rear
On Sun, Feb 04, 2007 at 08:20:23AM +, Dominic Steinitz wrote:
 Someone suggested
 
 pad :: Num a = [a] - [a]
 pad = pad' 0
   where pad' !l [] = [0x80] ++ ps ++ lb
   where pl = (64-(l+9)) `mod` 64
 ps = replicate pl 0x00
 lb = i2osp 8 (8*l)
 pad' !l (x:xs) = x : pad' (l+1) xs
 
 but that didn't compile
 
 *Main :r
 [2 of 2] Compiling Main ( allInOne.hs, interpreted )
 
 allInOne.hs:83:14: Parse error in pattern
 Failed, modules loaded: Codec.Utils.
 
 Dominic.

The '!' is a GHC extension, enabled using the flag '-fbang-patterns'.

Equivalently, you can use Haskell 98's seq :

pad :: Num a = [a] - [a]
pad = pad' 0
  where pad' l [] | l `seq` False = undefined
pad' l [] = [0x80] ++ ps ++ lb
  where pl = (64-(l+9)) `mod` 64
ps = replicate pl 0x00
lb = i2osp 8 (8*l)
pad' l (x:xs) = x : pad' (l+1) xs

The first alternative never succeeds, but to see that the compiler
must force the evaluation of 'l'.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Space Leak Help

2007-02-04 Thread Dominic Steinitz
On Saturday 03 February 2007 19:42, [EMAIL PROTECTED] wrote:
   I have re-written SHA1 so that is more idiomatically haskell and it is
   easy to see how it implements the specification. The only problem is I
   now have a space leak. I can see where the leak is but I'm less sure
   what to do about getting rid of it.
  
   Here's the offending function:
  
   pad :: [Word8] - [Word8]
   pad xs =
  xs ++ [0x80] ++ ps ++ lb
  where
 l = length xs
 pl = (64-(l+9)) `mod` 64
 ps = replicate pl 0x00
 lb = i2osp 8 (8*l)

 I would try something along the following lines (untested):

 \begin{spec}
 catWithLen xs f = xs ++ f (length xs)
 \end{spec}

 \begin{code}
 catWithLen :: [a] - (Int - [a]) - [a]
 catWithLen xs f = h 0 xs
   where
 h k [] = f k
 h k (x : xs) = case succ k of-- forcing evaluation
  k' - x : h k' xs
 \end{code}

 \begin{code}
 pad :: [Word8] - [Word8]
 pad xs = catWithLen xs f
   where
 f l = 0x80 : ps lb
   where
  -- we know that |l = length xs|
  pl = (64-(l+9)) `mod` 64
  ps = funPow pl (0x00 :)
  lb = i2osp 8 (8*l)
 \end{code}

 If you are lucky, then the replicate and the (++lb) in the original code
 might be fused by the compiler as an instance of foldr-build
 or something related --- check the optimised core code.

Wolfram,

Thanks but this gives a different problem:

[EMAIL PROTECTED]:~/sha1 ./allInOne 101 +RTS -hc -RTS
[2845392438,1191608682,3124634993,2018558572,2630932637]
[2224569924,473682542,3131984545,4182845925,3846598897]
Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize' to increase it.

Dominic.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Space Leak Help

2007-02-04 Thread Stefan O'Rear
On Sun, Feb 04, 2007 at 08:30:44AM +, Dominic Steinitz wrote:
 On Saturday 03 February 2007 19:42, [EMAIL PROTECTED] wrote:
  I would try something along the following lines (untested):
 
  \begin{spec}
  catWithLen xs f = xs ++ f (length xs)
  \end{spec}
 
  \begin{code}
  catWithLen :: [a] - (Int - [a]) - [a]
  catWithLen xs f = h 0 xs
where
  h k [] = f k
  h k (x : xs) = case succ k of-- forcing evaluation
   k' - x : h k' xs

Nice try.  k', as a variable binding, is irrefutable.

  \end{code}
 
  \begin{code}
  pad :: [Word8] - [Word8]
  pad xs = catWithLen xs f
where
  f l = 0x80 : ps lb
where
   -- we know that |l = length xs|
   pl = (64-(l+9)) `mod` 64
   ps = funPow pl (0x00 :)
   lb = i2osp 8 (8*l)
  \end{code}

 Thanks but this gives a different problem:
 
 [EMAIL PROTECTED]:~/sha1 ./allInOne 101 +RTS -hc -RTS
 [2845392438,1191608682,3124634993,2018558572,2630932637]
 [2224569924,473682542,3131984545,4182845925,3846598897]
 Stack space overflow: current size 8388608 bytes.
 Use `+RTS -Ksize' to increase it.

expected result of the excessive laziness above.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Space Leak Help

2007-02-04 Thread Dominic Steinitz
On Sunday 04 February 2007 08:28, Stefan O'Rear wrote:
 On Sun, Feb 04, 2007 at 08:20:23AM +, Dominic Steinitz wrote:
  Someone suggested
 
  pad :: Num a = [a] - [a]
  pad = pad' 0
where pad' !l [] = [0x80] ++ ps ++ lb
where pl = (64-(l+9)) `mod` 64
  ps = replicate pl 0x00
  lb = i2osp 8 (8*l)
  pad' !l (x:xs) = x : pad' (l+1) xs
 
  but that didn't compile
 
  *Main :r
  [2 of 2] Compiling Main ( allInOne.hs, interpreted )
 
  allInOne.hs:83:14: Parse error in pattern
  Failed, modules loaded: Codec.Utils.
 
  Dominic.

 The '!' is a GHC extension, enabled using the flag '-fbang-patterns'.

The test program runs to completion but still has a space leak consuming over 
25m.

 Equivalently, you can use Haskell 98's seq :

 pad :: Num a = [a] - [a]
 pad = pad' 0
   where pad' l [] | l `seq` False = undefined
 pad' l [] = [0x80] ++ ps ++ lb
   where pl = (64-(l+9)) `mod` 64
 ps = replicate pl 0x00
 lb = i2osp 8 (8*l)
 pad' l (x:xs) = x : pad' (l+1) xs

 The first alternative never succeeds, but to see that the compiler
 must force the evaluation of 'l'.

[EMAIL PROTECTED]:~/sha1 ./allInOne 101 +RTS -hc -RTS
[2845392438,1191608682,3124634993,2018558572,2630932637]
[2224569924,473682542,3131984545,4182845925,3846598897]
Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize' to increase it.

Dominic.

PS I appreciate all the help I'm getting.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Space Leak Help

2007-02-04 Thread Stefan O'Rear
On Sun, Feb 04, 2007 at 09:45:12AM +, Dominic Steinitz wrote:
  pad :: Num a = [a] - [a]
  pad = pad' 0
where pad' l [] | l `seq` False = undefined

Stupid typo, that should be:

  where pad' l _ | l `seq` False = undefined

  pad' l [] = [0x80] ++ ps ++ lb
where pl = (64-(l+9)) `mod` 64
  ps = replicate pl 0x00
  lb = i2osp 8 (8*l)
  pad' l (x:xs) = x : pad' (l+1) xs
 
  The first alternative never succeeds, but to see that the compiler
  must force the evaluation of 'l'.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Space Leak Help

2007-02-04 Thread kahl
  
   \begin{code}
   catWithLen :: [a] - (Int - [a]) - [a]
   catWithLen xs f = h 0 xs
 where
   h k [] = f k
   h k (x : xs) = case succ k of-- forcing evaluation
k' - x : h k' xs
   \end{code}
  
  
  Thanks but this gives a different problem:
  
  [EMAIL PROTECTED]:~/sha1 ./allInOne 101 +RTS -hc -RTS
  [2845392438,1191608682,3124634993,2018558572,2630932637]
  [2224569924,473682542,3131984545,4182845925,3846598897]
  Stack space overflow: current size 8388608 bytes.
  Use `+RTS -Ksize' to increase it.


Does it still do that if you youse seq instead of case?


\begin{code}
catWithLen :: [a] - (Int - [a]) - [a]
catWithLen xs f = h 0 xs
  where
h k [] = f k
h k (x : xs) =  let k' = succ k
in k' `seq` x : h k' xs
\end{code}


Wolfram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Space Leak Help

2007-02-03 Thread Dominic Steinitz
I have re-written SHA1 so that is more idiomatically haskell and it is easy to 
see how it implements the specification. The only problem is I now have a 
space leak. I can see where the leak is but I'm less sure what to do about 
getting rid of it.

Here's the offending function:

pad :: [Word8] - [Word8]
pad xs =
   xs ++ [0x80] ++ ps ++ lb
   where
  l = length xs
  pl = (64-(l+9)) `mod` 64
  ps = replicate pl 0x00
  lb = i2osp 8 (8*l)

I've thought about zipping the xs with [1..] which will give me a length as I 
go. Is this the right way to go are there better techniques for dealing with 
this?

I've attached the full source below.

Dominic.

module Main(main) where

import Data.Char
import Data.Bits
import Data.List
import Data.Word
import System
import Codec.Utils

type Rotation = Int

rotL :: Rotation - Word32 - Word32
rotL s a = shiftL a s .|. shiftL a (s-32)

instance Num [Word32] where
   a + b = zipWith (+) a b

f n x y z 
   | (0 = n)   (n = 19) = (x .. y) .|. ((complement x) .. z)
   | (20 = n)  (n = 39) = x `xor` y `xor` z
   | (40 = n)  (n = 59) = (x .. y) .|. (x .. z) .|. (y .. z)
   | (60 = n)  (n = 79) = x `xor` y `xor` z
   | otherwise = error invalid index for f

k n
   | (0 = n)   (n = 19) = 0x5a827999
   | (20 = n)  (n = 39) = 0x6ed9eba1
   | (40 = n)  (n = 59) = 0x8f1bbcdc
   | (60 = n)  (n = 79) = 0xca62c1d6
   | otherwise = error invalid index for k

-- Word120 - Word512 - Word120 
oneBlock ss xs = (as!!80):(bs!!80):(cs!!80):(ds!!80):(es!!80):[]
   where
  ws = xs ++ map (rotL 1) (zipWith4 xxxor wm3s wm8s wm14s wm16s)
 where 
xxxor a b c d = a `xor` b `xor` c `xor` d
wm3s  = drop (16-3)  ws
wm8s  = drop (16-8)  ws
wm14s = drop (16-14) ws
wm16s = drop (16-16) ws
  as = (ss!!0):ts
  bs = (ss!!1):as
  cs = (ss!!2):(map (rotL 30) bs)
  ds = (ss!!3):cs 
  es = (ss!!4):ds
  ts = (map (rotL 5) as) + (zipWith4 f [0..] bs cs ds) + es + (map k 
[0..]) + ws

ss :: [Word32]
ss = [0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476, 0xc3d2e1f0]

pad :: [Word8] - [Word8]
pad xs =
   xs ++ [0x80] ++ ps ++ lb
   where
  l = length xs
  pl = (64-(l+9)) `mod` 64
  ps = replicate pl 0x00
  lb = i2osp 8 (8*l)

blockWord8sIn512 :: [Word8] - [[Word8]]
blockWord8sIn512 =
   unfoldr g
   where
  g [] = Nothing
  g xs = Just (splitAt 64 xs)

getWord32s :: [Word8] - [Word32]
getWord32s s = 
   map f [0..15]
   where 
  f i = foldl (+) 0 $ map (\n - toEnum (fromEnum (s!!(i*4+n))) `shiftL` 
(fromIntegral (8 * (3-n [0..3]

blockWord32sIn512 :: [Word8] - [[Word32]]
blockWord32sIn512 = (map getWord32s) . blockWord8sIn512 . pad

-- Word120 - Word512 - Word120
hashOnce ss a = ss + oneBlock ss a

hash = foldl' hashOnce ss . blockWord32sIn512

convert :: String - [Word8]
convert = map (fromIntegral . ord)

short :: [Word8]
short = convert abc

message :: [Word8]
message = convert abcdbcdecdefdefgefghfghighijhijkijkljklmklmnlmnomnopnopq

performance n =
   (convert . take n . repeat) 'a'

test n = mapM_ (putStrLn . show . hash) [short, message, performance n]

main =
   do progName - getProgName
  args - getArgs
  if length args /= 1
 then putStrLn (Usage:  ++ progName ++  testSize)
 else test (read (args!!0))



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Space Leak Help

2007-02-03 Thread kahl
  
  I have re-written SHA1 so that is more idiomatically haskell and it is easy 
  to 
  see how it implements the specification. The only problem is I now have a 
  space leak. I can see where the leak is but I'm less sure what to do about 
  getting rid of it.
  
  Here's the offending function:
  
  pad :: [Word8] - [Word8]
  pad xs =
 xs ++ [0x80] ++ ps ++ lb
 where
l = length xs
pl = (64-(l+9)) `mod` 64
ps = replicate pl 0x00
lb = i2osp 8 (8*l)


I would try something along the following lines (untested):

\begin{spec}
catWithLen xs f = xs ++ f (length xs)
\end{spec}

\begin{code}
catWithLen :: [a] - (Int - [a]) - [a]
catWithLen xs f = h 0 xs
  where
h k [] = f k
h k (x : xs) = case succ k of-- forcing evaluation
 k' - x : h k' xs
\end{code}

\begin{code}
pad :: [Word8] - [Word8]
pad xs = catWithLen xs f
  where
f l = 0x80 : ps lb
  where
 -- we know that |l = length xs|
 pl = (64-(l+9)) `mod` 64
 ps = funPow pl (0x00 :)
 lb = i2osp 8 (8*l)
\end{code}

If you are lucky, then the replicate and the (++lb) in the original code
might be fused by the compiler as an instance of foldr-build
or something related --- check the optimised core code. 

In my variant I changed this to rely on efficient function powering
instead:

\begin{spec}
funPow k f = foldr (.) id $ replicate k f
\end{spec}

\begin{code}
funPow :: Int - (a - a) - (a - a)
funPow n f = case compare n 0 of
LT - error (funPow: negative argument:  ++ show n)
EQ - id
GT - pow n f
  where
pow m g = if m  1
  then let (m',r) = divMod m 2
   g' = g . g
   in if r == 0
  then pow m' g'
  else pow m' g' . g
  else g
\end{code}

(You will probably also consider using Data.Bits
 for (`mod` 64), (8*), and (`divMod` 2).
)


Wolfram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Space Leak Help

2007-02-03 Thread Pepe Iborra

hi Dominic

Explicit recursion works just fine for me and keeps things simple:

pad :: [Word8] - [Word8]
pad xs = pad' xs 0

pad' (x:xs) l = x : pad' xs (succ l)
pad' [] l = [0x80] ++ ps ++ lb
   where
  pl = (64-(l+9)) `mod` 64
  ps = replicate pl 0x00
  lb = i2osp 8 (8*l)


at the cost of (very slightly) hiding data flow.
Seems exactly what you were trying to avoid?

Cheers
pepe


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe