Re: [Haskell-cafe] Space leak - help needed
On Thu, Mar 13, 2008 at 4:50 PM, Krzysztof Kościuszkiewicz [EMAIL PROTECTED] wrote: Retainers are thunks or objects on stack that keep references to live objects. All retainers of an object are called the object's retainer set. Now when one makes a profiling run, say with ./jobname +RTS -p -hr, the graph refernces retainer sets from jobname.prof. My understanding is that it is the total size of all objects retained by retainer sets being plotted, correct? Yes, all retainer sets are being profiled. However, you can FILTER the retainer sets profiled to those containing certain cost-centres. This is a key point because it allows you to divide-and-conquer when tracking down a retainer leak. That is, if you filter to a certain cost-centre and the retainer graph is flat, you know that cost-centre is not involved. For example, if you have a cost-centre annotation like {-# SCC leaky #-} in your code, you can filter the retainer set like this: Leaky.exe +RTS -hr -hCleaky -RTS Review the documentation for other options. About decoding the sets from jobname.prof - for example in SET 2 = {MAIN.SYSTEM} SET 16 = {Main.CAF, MAIN.SYSTEM} SET 18 = {MAIN.SYSTEM, Main.many1,Main.list,Main.expr,Main.CAF} {...} means it's a set, and ccN,...,cc0 is the retainer cost centre (ccN) and hierarchy of parent cost centres up to the top level (cc0)? My understanding is that SET 18 above refers to objects that are retained by exactly two specified cost centres, right? The docs say An object B retains object A if (i) B is a retainer object and (ii) object A can be reached by recursively following pointers starting from object B, but not meeting any other retainer objects on the way. Each live object is retained by one or more retainer objects, collectively called its retainer set ... That says to me that SET18 above is the set of all objects which are retained by those two call stacks, and only those call stacks. The individual .. items aren't call stacks but I think they refer to where the retaining object (B in the paragraph) was itself retained, so they are like call stacks. My intuition is very fuzzy here. Finally, what is the MAIN.SYSTEM retainer? I think that is everything else - any object created in the runtime system that is not directly attributable to something being profiled. Maybe it is objects from libraries that were not compiled with profiling? I imagine objects created by the GHC primitives would fall in this category too. Since someone else found your space leak, does the retainer profiling advice point to it? I'd like to know if it is actually accurate or not! I've only applied it in some very limited situations. Justin ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Space leak - help needed
Krzysztof Kościuszkiewicz wrote: I have tried both Poly.StateLazy and Poly.State and they work quite well - at least the space leak is eliminated. Now evaluation of the parser state blows the stack... The code is at http://hpaste.org/6310 Apparently, stUpdate is too lazy. I'd define stUpdate' :: (s - s) - Parser s t () stUpdate' f = stUpdate f stGet = (`seq` return ()) and try using stUpdate' instead of stUpdate in incCount. HTH, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Space leak - help needed
On Thu, Mar 13, 2008 at 05:52:05PM +0100, Bertram Felgenhauer wrote: ... Now evaluation of the parser state blows the stack... The code is at http://hpaste.org/6310 Apparently, stUpdate is too lazy. I'd define stUpdate' :: (s - s) - Parser s t () stUpdate' f = stUpdate f stGet = (`seq` return ()) and try using stUpdate' instead of stUpdate in incCount. Yes, that solves the stack issue. Thanks! -- Krzysztof Kościuszkiewicz Skype: dr.vee, Gadu: 111851, Jabber: [EMAIL PROTECTED] Simplicity is the ultimate sophistication -- Leonardo da Vinci ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Space leak - help needed
On Wed, Mar 12, 2008 at 12:34:38PM -0700, Justin Bailey wrote: The stack blows up when a bunch of unevaluated thunks build up, and you try to evaluate them. One way to determine where those thunks are getting built is to use GHCs retainer profiling. Retainer sets will show you the call stack that is holding on to memory. That can give you a clue where these thunks are being created. To get finer-grained results, annotate your code with {#- SCC ... #-} pragmas. Then you can filter the retainer profile by those annotations. That will help you determine where in a given function the thunks are being created. If you need help with profiling basics, feel free to ask. I'm not entirely sure if I understand retainer profiling correctly... So please clarify if you spot any obvious blunders. Retainers are thunks or objects on stack that keep references to live objects. All retainers of an object are called the object's retainer set. Now when one makes a profiling run, say with ./jobname +RTS -p -hr, the graph refernces retainer sets from jobname.prof. My understanding is that it is the total size of all objects retained by retainer sets being plotted, correct? About decoding the sets from jobname.prof - for example in SET 2 = {MAIN.SYSTEM} SET 16 = {Main.CAF, MAIN.SYSTEM} SET 18 = {MAIN.SYSTEM, Main.many1,Main.list,Main.expr,Main.CAF} {...} means it's a set, and ccN,...,cc0 is the retainer cost centre (ccN) and hierarchy of parent cost centres up to the top level (cc0)? My understanding is that SET 18 above refers to objects that are retained by exactly two specified cost centres, right? Finally, what is the MAIN.SYSTEM retainer? Thanks in advance, -- Krzysztof Kościuszkiewicz Skype: dr.vee, Gadu: 111851, Jabber: [EMAIL PROTECTED] Simplicity is the ultimate sophistication -- Leonardo da Vinci ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Space leak - help needed
On Mon, Mar 03, 2008 at 05:20:09AM +0100, Bertram Felgenhauer wrote: Another story from an (almost) happy Haskell user that finds himself overwhelmed by laziness/space leaks. I'm trying to parse a large file (600MB) with a single S-expression like structure. With the help of ByteStrings I'm down to 4min processing time in constant space. However, when I try to wrap the parse results in a data structure, the heap blows up - even though I never actually inspect the structure being built! This bugs me, so I come here looking for answers. The polyparse library (http://www.cs.york.ac.uk/fp/polyparse/) offers some lazy parsers, maybe one of those fits your needs. Text.ParserCombinators.Poly.StateLazy is the obvious candidate. I have tried both Poly.StateLazy and Poly.State and they work quite well - at least the space leak is eliminated. Now evaluation of the parser state blows the stack... The code is at http://hpaste.org/6310 Thanks in advance, -- Krzysztof Kościuszkiewicz Skype: dr.vee, Gadu: 111851, Jabber: [EMAIL PROTECTED] Simplicity is the ultimate sophistication -- Leonardo da Vinci ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Space leak - help needed
On Wed, Mar 12, 2008 at 12:12 PM, Krzysztof Kościuszkiewicz [EMAIL PROTECTED] wrote: I have tried both Poly.StateLazy and Poly.State and they work quite well - at least the space leak is eliminated. Now evaluation of the parser state blows the stack... The code is at http://hpaste.org/6310 Thanks in advance, The stack blows up when a bunch of unevaluated thunks build up, and you try to evaluate them. One way to determine where those thunks are getting built is to use GHCs retainer profiling. Retainer sets will show you the call stack that is holding on to memory. That can give you a clue where these thunks are being created. To get finer-grained results, annotate your code with {#- SCC ... #-} pragmas. Then you can filter the retainer profile by those annotations. That will help you determine where in a given function the thunks are being created. If you need help with profiling basics, feel free to ask. Justin ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Space leak - help needed
Dear Haskellers, Another story from an (almost) happy Haskell user that finds himself overwhelmed by laziness/space leaks. I'm trying to parse a large file (600MB) with a single S-expression like structure. With the help of ByteStrings I'm down to 4min processing time in constant space. However, when I try to wrap the parse results in a data structure, the heap blows up - even though I never actually inspect the structure being built! This bugs me, so I come here looking for answers. Parser follows: module Main where import qualified Data.ByteString.Lazy.Char8 as B import Text.ParserCombinators.Parsec import Text.ParserCombinators.Parsec.Pos import System.Environment import System.Exit import qualified Data.Map as M import Lexer type XdlParser a = GenParser Token XdlState a -- Parser state type XdlState = Counts type Counts = M.Map Count Integer data Count = ListCount | SymbolCount deriving (Eq, Ord, Show) emptyXdlState = M.empty incCount :: Count - (Counts - Counts) incCount c = M.insertWith' (+) c 1 -- handling tokens myToken :: (Token - Maybe a) - XdlParser a myToken test = token showTok posTok testTok where showTok = show posTok = const (initialPos ) testTok = test -- Syntax of expressions data Exp = Sym !B.ByteString | List ![Exp] deriving (Eq, Show) expr = list | symbol rparen = myToken $ \t - case t of RParen - Just () other - Nothing lparen = myToken $ \t - case t of LParen - Just () other - Nothing name = myToken $ \t - case t of Name n - Just n other - Nothing list = do updateState $ incCount ListCount lparen xs - many1 expr rparen return () -- return $! (List xs) symbol = do updateState $ incCount SymbolCount name return () -- Sym `fmap` name -- Top level parser top :: XdlParser XdlState top = do l - many1 list eof getState main = do args - getArgs case args of [fname] - do text - B.readFile fname let result = runParser top emptyXdlState fname (tokenize text) putStrLn $ either show show result _ - putStrLn usage: parse filename exitFailure And the Lexer: module Lexer (Token(..), tokenize) where import qualified Data.ByteString.Lazy.Char8 as B import Control.Monad import Data.Char import Data.List import System.Environment import System.Exit data Token = LParen | RParen | Name B.ByteString deriving (Ord, Eq, Show) type Input = B.ByteString -- Processor returns Nothing if it can't process the Input type Processor = Input - Maybe ([Token], Input) -- Tokenize ends the list when all processors return Nothing tokenize :: Input - [Token] tokenize = concat . unfoldr processAll where processors= [doSpaces, doComment, doParens, doName] processAll :: Processor processAll bs = if B.null bs then Nothing else foldr mminus Nothing $ map ($ bs) processors mminus a@(Just _) _ = a mminus Nothingb = b doSpaces:: Processor doSpaces bs = if B.null sp then Nothing else Just ([], nsp) where (sp, nsp) = B.span isSpace bs doComment:: Processor doComment bs = if B.pack # `B.isPrefixOf` bs then Just ([], B.dropWhile (/= '\n') bs) else Nothing doParens :: Processor doParens bs = case B.head bs of '(' - Just ([LParen], B.tail bs) ')' - Just ([RParen], B.tail bs) _ - Nothing doName :: Processor doName bs = if B.null nsp then Nothing else Just ([Name nsp], sp) where (nsp, sp) = B.span (not . isRest) bs isRest c = isSpace c || c == ')' || c == '(' Regards, -- Krzysztof Kościuszkiewicz Skype: dr.vee, Gadu: 111851, Jabber: [EMAIL PROTECTED] Simplicity is the ultimate sophistication -- Leonardo da Vinci ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Space leak - help needed
On Mon, Mar 3, 2008 at 2:23 AM, Krzysztof Kościuszkiewicz [EMAIL PROTECTED] wrote: Dear Haskellers, Another story from an (almost) happy Haskell user that finds himself overwhelmed by laziness/space leaks. I'm trying to parse a large file (600MB) with a single S-expression like structure. With the help of ByteStrings I'm down to 4min processing time in constant space. However, when I try to wrap the parse results in a data structure, the heap blows up - even though I never actually inspect the structure being built! This bugs me, so I come here looking for answers. Well, I haven't read this through, but superficially, it looks like you're expecting the data structure to be constructed lazily. But... -- Syntax of expressions data Exp = Sym !B.ByteString | List ![Exp] deriving (Eq, Show) It is declared as strict, so it's not going to be constructed lazily... Luke ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Space leak - help needed
Krzysztof Kościuszkiewicz wrote: Another story from an (almost) happy Haskell user that finds himself overwhelmed by laziness/space leaks. I'm trying to parse a large file (600MB) with a single S-expression like structure. With the help of ByteStrings I'm down to 4min processing time in constant space. However, when I try to wrap the parse results in a data structure, the heap blows up - even though I never actually inspect the structure being built! This bugs me, so I come here looking for answers. Note that Parsec has to parse the whole file before it can decide whether to return a result (Left _) or an error (Right _). ghc would have to be quite smart to eliminate the creation of the expression tree entirely. The polyparse library (http://www.cs.york.ac.uk/fp/polyparse/) offers some lazy parsers, maybe one of those fits your needs. Text.ParserCombinators.Poly.StateLazy is the obvious candidate. HTH, Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Space Leak Help
On Saturday 03 February 2007 19:56, Pepe Iborra wrote: pad :: [Word8] - [Word8] pad xs = pad' xs 0 pad' (x:xs) l = x : pad' xs (succ l) pad' [] l = [0x80] ++ ps ++ lb where pl = (64-(l+9)) `mod` 64 ps = replicate pl 0x00 lb = i2osp 8 (8*l) Pepe, Thanks but this gives me a different problem [EMAIL PROTECTED]:~/sha1 ./allInOne 101 +RTS -hc -RTS [2845392438,1191608682,3124634993,2018558572,2630932637] [2224569924,473682542,3131984545,4182845925,3846598897] Stack space overflow: current size 8388608 bytes. Use `+RTS -Ksize' to increase it. Someone suggested pad :: Num a = [a] - [a] pad = pad' 0 where pad' !l [] = [0x80] ++ ps ++ lb where pl = (64-(l+9)) `mod` 64 ps = replicate pl 0x00 lb = i2osp 8 (8*l) pad' !l (x:xs) = x : pad' (l+1) xs but that didn't compile *Main :r [2 of 2] Compiling Main ( allInOne.hs, interpreted ) allInOne.hs:83:14: Parse error in pattern Failed, modules loaded: Codec.Utils. Dominic. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Space Leak Help
On Sun, Feb 04, 2007 at 08:20:23AM +, Dominic Steinitz wrote: Someone suggested pad :: Num a = [a] - [a] pad = pad' 0 where pad' !l [] = [0x80] ++ ps ++ lb where pl = (64-(l+9)) `mod` 64 ps = replicate pl 0x00 lb = i2osp 8 (8*l) pad' !l (x:xs) = x : pad' (l+1) xs but that didn't compile *Main :r [2 of 2] Compiling Main ( allInOne.hs, interpreted ) allInOne.hs:83:14: Parse error in pattern Failed, modules loaded: Codec.Utils. Dominic. The '!' is a GHC extension, enabled using the flag '-fbang-patterns'. Equivalently, you can use Haskell 98's seq : pad :: Num a = [a] - [a] pad = pad' 0 where pad' l [] | l `seq` False = undefined pad' l [] = [0x80] ++ ps ++ lb where pl = (64-(l+9)) `mod` 64 ps = replicate pl 0x00 lb = i2osp 8 (8*l) pad' l (x:xs) = x : pad' (l+1) xs The first alternative never succeeds, but to see that the compiler must force the evaluation of 'l'. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Space Leak Help
On Saturday 03 February 2007 19:42, [EMAIL PROTECTED] wrote: I have re-written SHA1 so that is more idiomatically haskell and it is easy to see how it implements the specification. The only problem is I now have a space leak. I can see where the leak is but I'm less sure what to do about getting rid of it. Here's the offending function: pad :: [Word8] - [Word8] pad xs = xs ++ [0x80] ++ ps ++ lb where l = length xs pl = (64-(l+9)) `mod` 64 ps = replicate pl 0x00 lb = i2osp 8 (8*l) I would try something along the following lines (untested): \begin{spec} catWithLen xs f = xs ++ f (length xs) \end{spec} \begin{code} catWithLen :: [a] - (Int - [a]) - [a] catWithLen xs f = h 0 xs where h k [] = f k h k (x : xs) = case succ k of-- forcing evaluation k' - x : h k' xs \end{code} \begin{code} pad :: [Word8] - [Word8] pad xs = catWithLen xs f where f l = 0x80 : ps lb where -- we know that |l = length xs| pl = (64-(l+9)) `mod` 64 ps = funPow pl (0x00 :) lb = i2osp 8 (8*l) \end{code} If you are lucky, then the replicate and the (++lb) in the original code might be fused by the compiler as an instance of foldr-build or something related --- check the optimised core code. Wolfram, Thanks but this gives a different problem: [EMAIL PROTECTED]:~/sha1 ./allInOne 101 +RTS -hc -RTS [2845392438,1191608682,3124634993,2018558572,2630932637] [2224569924,473682542,3131984545,4182845925,3846598897] Stack space overflow: current size 8388608 bytes. Use `+RTS -Ksize' to increase it. Dominic. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Space Leak Help
On Sun, Feb 04, 2007 at 08:30:44AM +, Dominic Steinitz wrote: On Saturday 03 February 2007 19:42, [EMAIL PROTECTED] wrote: I would try something along the following lines (untested): \begin{spec} catWithLen xs f = xs ++ f (length xs) \end{spec} \begin{code} catWithLen :: [a] - (Int - [a]) - [a] catWithLen xs f = h 0 xs where h k [] = f k h k (x : xs) = case succ k of-- forcing evaluation k' - x : h k' xs Nice try. k', as a variable binding, is irrefutable. \end{code} \begin{code} pad :: [Word8] - [Word8] pad xs = catWithLen xs f where f l = 0x80 : ps lb where -- we know that |l = length xs| pl = (64-(l+9)) `mod` 64 ps = funPow pl (0x00 :) lb = i2osp 8 (8*l) \end{code} Thanks but this gives a different problem: [EMAIL PROTECTED]:~/sha1 ./allInOne 101 +RTS -hc -RTS [2845392438,1191608682,3124634993,2018558572,2630932637] [2224569924,473682542,3131984545,4182845925,3846598897] Stack space overflow: current size 8388608 bytes. Use `+RTS -Ksize' to increase it. expected result of the excessive laziness above. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Space Leak Help
On Sunday 04 February 2007 08:28, Stefan O'Rear wrote: On Sun, Feb 04, 2007 at 08:20:23AM +, Dominic Steinitz wrote: Someone suggested pad :: Num a = [a] - [a] pad = pad' 0 where pad' !l [] = [0x80] ++ ps ++ lb where pl = (64-(l+9)) `mod` 64 ps = replicate pl 0x00 lb = i2osp 8 (8*l) pad' !l (x:xs) = x : pad' (l+1) xs but that didn't compile *Main :r [2 of 2] Compiling Main ( allInOne.hs, interpreted ) allInOne.hs:83:14: Parse error in pattern Failed, modules loaded: Codec.Utils. Dominic. The '!' is a GHC extension, enabled using the flag '-fbang-patterns'. The test program runs to completion but still has a space leak consuming over 25m. Equivalently, you can use Haskell 98's seq : pad :: Num a = [a] - [a] pad = pad' 0 where pad' l [] | l `seq` False = undefined pad' l [] = [0x80] ++ ps ++ lb where pl = (64-(l+9)) `mod` 64 ps = replicate pl 0x00 lb = i2osp 8 (8*l) pad' l (x:xs) = x : pad' (l+1) xs The first alternative never succeeds, but to see that the compiler must force the evaluation of 'l'. [EMAIL PROTECTED]:~/sha1 ./allInOne 101 +RTS -hc -RTS [2845392438,1191608682,3124634993,2018558572,2630932637] [2224569924,473682542,3131984545,4182845925,3846598897] Stack space overflow: current size 8388608 bytes. Use `+RTS -Ksize' to increase it. Dominic. PS I appreciate all the help I'm getting. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Space Leak Help
On Sun, Feb 04, 2007 at 09:45:12AM +, Dominic Steinitz wrote: pad :: Num a = [a] - [a] pad = pad' 0 where pad' l [] | l `seq` False = undefined Stupid typo, that should be: where pad' l _ | l `seq` False = undefined pad' l [] = [0x80] ++ ps ++ lb where pl = (64-(l+9)) `mod` 64 ps = replicate pl 0x00 lb = i2osp 8 (8*l) pad' l (x:xs) = x : pad' (l+1) xs The first alternative never succeeds, but to see that the compiler must force the evaluation of 'l'. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Space Leak Help
\begin{code} catWithLen :: [a] - (Int - [a]) - [a] catWithLen xs f = h 0 xs where h k [] = f k h k (x : xs) = case succ k of-- forcing evaluation k' - x : h k' xs \end{code} Thanks but this gives a different problem: [EMAIL PROTECTED]:~/sha1 ./allInOne 101 +RTS -hc -RTS [2845392438,1191608682,3124634993,2018558572,2630932637] [2224569924,473682542,3131984545,4182845925,3846598897] Stack space overflow: current size 8388608 bytes. Use `+RTS -Ksize' to increase it. Does it still do that if you youse seq instead of case? \begin{code} catWithLen :: [a] - (Int - [a]) - [a] catWithLen xs f = h 0 xs where h k [] = f k h k (x : xs) = let k' = succ k in k' `seq` x : h k' xs \end{code} Wolfram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Space Leak Help
I have re-written SHA1 so that is more idiomatically haskell and it is easy to see how it implements the specification. The only problem is I now have a space leak. I can see where the leak is but I'm less sure what to do about getting rid of it. Here's the offending function: pad :: [Word8] - [Word8] pad xs = xs ++ [0x80] ++ ps ++ lb where l = length xs pl = (64-(l+9)) `mod` 64 ps = replicate pl 0x00 lb = i2osp 8 (8*l) I've thought about zipping the xs with [1..] which will give me a length as I go. Is this the right way to go are there better techniques for dealing with this? I've attached the full source below. Dominic. module Main(main) where import Data.Char import Data.Bits import Data.List import Data.Word import System import Codec.Utils type Rotation = Int rotL :: Rotation - Word32 - Word32 rotL s a = shiftL a s .|. shiftL a (s-32) instance Num [Word32] where a + b = zipWith (+) a b f n x y z | (0 = n) (n = 19) = (x .. y) .|. ((complement x) .. z) | (20 = n) (n = 39) = x `xor` y `xor` z | (40 = n) (n = 59) = (x .. y) .|. (x .. z) .|. (y .. z) | (60 = n) (n = 79) = x `xor` y `xor` z | otherwise = error invalid index for f k n | (0 = n) (n = 19) = 0x5a827999 | (20 = n) (n = 39) = 0x6ed9eba1 | (40 = n) (n = 59) = 0x8f1bbcdc | (60 = n) (n = 79) = 0xca62c1d6 | otherwise = error invalid index for k -- Word120 - Word512 - Word120 oneBlock ss xs = (as!!80):(bs!!80):(cs!!80):(ds!!80):(es!!80):[] where ws = xs ++ map (rotL 1) (zipWith4 xxxor wm3s wm8s wm14s wm16s) where xxxor a b c d = a `xor` b `xor` c `xor` d wm3s = drop (16-3) ws wm8s = drop (16-8) ws wm14s = drop (16-14) ws wm16s = drop (16-16) ws as = (ss!!0):ts bs = (ss!!1):as cs = (ss!!2):(map (rotL 30) bs) ds = (ss!!3):cs es = (ss!!4):ds ts = (map (rotL 5) as) + (zipWith4 f [0..] bs cs ds) + es + (map k [0..]) + ws ss :: [Word32] ss = [0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476, 0xc3d2e1f0] pad :: [Word8] - [Word8] pad xs = xs ++ [0x80] ++ ps ++ lb where l = length xs pl = (64-(l+9)) `mod` 64 ps = replicate pl 0x00 lb = i2osp 8 (8*l) blockWord8sIn512 :: [Word8] - [[Word8]] blockWord8sIn512 = unfoldr g where g [] = Nothing g xs = Just (splitAt 64 xs) getWord32s :: [Word8] - [Word32] getWord32s s = map f [0..15] where f i = foldl (+) 0 $ map (\n - toEnum (fromEnum (s!!(i*4+n))) `shiftL` (fromIntegral (8 * (3-n [0..3] blockWord32sIn512 :: [Word8] - [[Word32]] blockWord32sIn512 = (map getWord32s) . blockWord8sIn512 . pad -- Word120 - Word512 - Word120 hashOnce ss a = ss + oneBlock ss a hash = foldl' hashOnce ss . blockWord32sIn512 convert :: String - [Word8] convert = map (fromIntegral . ord) short :: [Word8] short = convert abc message :: [Word8] message = convert abcdbcdecdefdefgefghfghighijhijkijkljklmklmnlmnomnopnopq performance n = (convert . take n . repeat) 'a' test n = mapM_ (putStrLn . show . hash) [short, message, performance n] main = do progName - getProgName args - getArgs if length args /= 1 then putStrLn (Usage: ++ progName ++ testSize) else test (read (args!!0)) ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Space Leak Help
I have re-written SHA1 so that is more idiomatically haskell and it is easy to see how it implements the specification. The only problem is I now have a space leak. I can see where the leak is but I'm less sure what to do about getting rid of it. Here's the offending function: pad :: [Word8] - [Word8] pad xs = xs ++ [0x80] ++ ps ++ lb where l = length xs pl = (64-(l+9)) `mod` 64 ps = replicate pl 0x00 lb = i2osp 8 (8*l) I would try something along the following lines (untested): \begin{spec} catWithLen xs f = xs ++ f (length xs) \end{spec} \begin{code} catWithLen :: [a] - (Int - [a]) - [a] catWithLen xs f = h 0 xs where h k [] = f k h k (x : xs) = case succ k of-- forcing evaluation k' - x : h k' xs \end{code} \begin{code} pad :: [Word8] - [Word8] pad xs = catWithLen xs f where f l = 0x80 : ps lb where -- we know that |l = length xs| pl = (64-(l+9)) `mod` 64 ps = funPow pl (0x00 :) lb = i2osp 8 (8*l) \end{code} If you are lucky, then the replicate and the (++lb) in the original code might be fused by the compiler as an instance of foldr-build or something related --- check the optimised core code. In my variant I changed this to rely on efficient function powering instead: \begin{spec} funPow k f = foldr (.) id $ replicate k f \end{spec} \begin{code} funPow :: Int - (a - a) - (a - a) funPow n f = case compare n 0 of LT - error (funPow: negative argument: ++ show n) EQ - id GT - pow n f where pow m g = if m 1 then let (m',r) = divMod m 2 g' = g . g in if r == 0 then pow m' g' else pow m' g' . g else g \end{code} (You will probably also consider using Data.Bits for (`mod` 64), (8*), and (`divMod` 2). ) Wolfram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Space Leak Help
hi Dominic Explicit recursion works just fine for me and keeps things simple: pad :: [Word8] - [Word8] pad xs = pad' xs 0 pad' (x:xs) l = x : pad' xs (succ l) pad' [] l = [0x80] ++ ps ++ lb where pl = (64-(l+9)) `mod` 64 ps = replicate pl 0x00 lb = i2osp 8 (8*l) at the cost of (very slightly) hiding data flow. Seems exactly what you were trying to avoid? Cheers pepe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe