Re: Lazy streams and unsafeInterleaveIO
Glynn Clements wrote: Jyrinx wrote: [...] and the inability to handle exceptions (the actual exception won't occur until after e.g. getContents has returned). But how does this differ from strict I/O? I mean, say there's a disk error in the middle of some big file I want to crunch. Under traditional I/O, I open the file and proceed to read each piece of data, process it, and continue to the next one, reading the raw data only as I need it. When I hit the error, an exception will be thrown in the middle of the operation. In lazy I/O, I might use getContents to get all the characters lazily; the getContents call will read each piece of data as it's needed in the operation - in other words, the data is read as the program uses it, just like with traditional I/O. And when the error occurs, the operation will be unceremoniously interrupted, again the same as by strict I/O. In mean, if an exception is thrown because of a file error, I can't hope to catch it in the data-crunching part of the program anyway ... No, but with strict I/O, you are bound to be within the IO monad when the exception is thrown, so you *can* catch it. If you are just going to allow all exceptions to be fatal, and don't need any control over I/O ordering, you may as well just use lazy I/O. However, if you are writing real software as opposed to just toy programs, you have to handle exceptions; e.g. a web browser which died every time that a server refused a connection wouldn't be of much use. Sure - and that's why I don't do *everything* in purely-functional-land. I suppose what I'm going for is separation of concerns: Anything with any business catching exceptions should be in the IO monad; calculations, transformations, etc., which depend on such a continuous stream of data couldn't deal with the exception if I wanted them to, but the IO code that invokes them can. In the Web browser example, I imagine (and this is off the top of my head) that a major functional part of the program would be a function that takes a bunch of HTML (presumably passed as a lazy stream from a server, achieved in the IO monad), processes it, and renders a bunch of graphical data for the screen (of which IO code could control the display). If a connection is refused, IO code catches the error before it can pass the stream to the rendering function; if a connection is cut off or something, the rendering code can't deal with that, and the exception gets caught back in I/O-land. Luke Maurer [EMAIL PROTECTED] ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
Space explosion
I'm tinkering with a little program I wrote a while back to generate probability distributions and averages for rolling various kinds and quanitities of dice and applying various rules (for instance, roll four six-sided dice, drop the lowest, and add the other three). I'm trying to optimize the thing, and I've run into a space leak. I've found that, if I tell it to roll four 20-sided dice and add, it finishes in under a second and heap usage peaks at under 2MB (not sterling, but not catastrophic); but if I make it five 20-sided dice, it runs out of stack space after thrashing for a while (and getting partway through the final output); for six dice, it thrashes into oblivion. (Running WinXP with 256MB memory.) What puzzles me is that, according to GHC's heap-usage-by-call-center graph, the hog isn't the code that generates every single outcome of rolling the dice (and we're talking about four 20-sided dice here ...), whose heap usage is constant (go GHC! :-) ), but the one that turns the list of outcomes (for instance, [2,3,3,4,4,4 ... 7,7,7,7,7,7 ... 11,11,12] for adding two 6-sided dice) into a probability distribution (like [(2,1),(3,2),(4,3) ... (7,6) ... (11,2),(12,1)]). It is this: -- Specs contains the options for the simulation; rollBounds simply finds the lowest and highest possible rolls -- type RollValue = Int -- type RollValueDist = Array RollValue Int probDist :: Specs - [RollValue] - RollValueDist probDist specs rolls = accumArray (+) 0 (rollBounds specs) [(r, 1) | r - rolls] It should be doing an in-place array update as it evaluates the comprehension, but apparently something's being kept in memory longer than it has an excuse for. I tried another, more hand-coded version, to get rid of the mid-calculation comprehension and some possible overhead: probDist specs = foldl update orig where orig = array bounds [(r, 0) | r - range bounds] update dist r = dist // [(r, 1 + dist!r)] bounds = rollBounds specs But this one was even worse. I've sprinkled $! around quite liberally, to no avail; also, I'm compiling with -O, and I've tried various RTS options with no radical change in behavior. What's going on here? The array constructed has to be big, but not *that* big (77 Ints for four 20-sided) ... Is there an inherent inefficiency with GHC's arrays when used like this? I can't come up with a better algorithm - making a zeroed array with an element for each outcome, then adding 1 to the element for each outcome as it is read in seems straightforward enough. (Would a FiniteMap be any better? I'm not sure how ...) Thanks for any help ... I'm starting to get the hang of Haskell, but some of these semantic subtleties are driving me nutty ... Luke Maurer [EMAIL PROTECTED] ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
Lazy streams and unsafeInterleaveIO
As an experiment for a bigger project, I cooked up a simple program: It asks for integers interactively, and after each input, it spits out the running total. The wrinkle is that the function for calculating the total should be a non-monadic stream function (that is, type [Integer] - [Integer] so that runningTotals [1,2,3,4,5] == [1,3,6,10,15]). The task is then to write a function to return a stream of integers, grabbing them from IO-land lazily (a la getContents). My first attempts had it not displaying a running total until all input (terminated by an input of 0) had finished, at which point it spit out all the totals (i.e. it wasn't an interactive program anymore). I poked around in the docs and on the Web for a while, and found out about unsafeInterleaveIO, which solved the problem neatly (after I modified runningTotals to be less eager, as it was reading ahead by an extra integer each time). I ended up with the attached code (for GHC 5.04.2). My question is this: Is there a more elegant (i.e. non-unsafe) way to do this? I vaguely recall from the Hudak book (which I unfortunately don't have convenient at the moment) that he used a channel for something like this (the interactive graphics stuff), but IIRC his system would be overkill for my application (including the bigger project). It doesn't seem like it should need any black magic, and concurrency (which channels need, right?) doesn't appear worth the hassle. Really, my desire comes down to a simple, safe, single-threaded way to write a function to generate a lazy stream. Is there such? Luke Maurer [EMAIL PROTECTED] -- running-total -- Haskell program that takes integers as input, outputting a running total -- after each input -- Demonstrates use of lazy streams module Main where import IO import System.IO.Unsafe import Monad runningTotals :: [Integer] - [Integer] runningTotals [] = [] runningTotals (x:xs) = rt' 0 (x:xs) where rt' tot (x:xs) = (tot+x) `seq` (tot+x):(rt' (tot+x) xs) rt' _ [] = [] -- Note that runningTotals does what appears to be a stateful calculation when -- numbers are read one at a time; however, lazy streams allow this to be a -- pure function. Haskell is cool. inputNumbers :: IO [Integer] inputNumbers = do x - putStr ? readLn if x == 0 then return [] else do xs - (unsafeInterleaveIO inputNumbers) return (x:xs) main = do numbers - inputNumbers mapM_ (putStrLn . (flip shows) ) (runningTotals numbers)
Re: Lazy streams and unsafeInterleaveIO
Remi Turk wrote: On Sun, Dec 22, 2002 at 04:00:45AM -0800, Jyrinx wrote: As an experiment for a bigger project, I cooked up a simple program: It asks for integers interactively, and after each input, it spits out the running total. The wrinkle is that the function for calculating the total should be a non-monadic stream function (that is, type [Integer] - [Integer] so that runningTotals [1,2,3,4,5] == [1,3,6,10,15]). The task is then to write a function to return a stream of integers, grabbing them from IO-land lazily (a la getContents). Hi, what about module Main where main= getContents = mapM_ print . scanl1 (+) . map read . lines Ooh, neat! :-) (I love these one-liners - Haskell is absurdly concise :-D ) Hrm ... wasn't aware of the scanl1 thingie; looks like I reinvented the wheel a little ... (Come to think of it, is there any sort of handy quick-reference card for all these combinators? Seems like I and other novices could stand to save some typing ...) One sticking point, though (and this is relevant to the bigger project): I'd like to print a prompt somehow before each input, which I'm not sure is possible if I just slurp up everything from getContents ... I've thought of using interact somehow, but I'm not sure where I'd start with that one ... (Out of curiosity: How is the compiler deciding on a type for the input? (That is, how does it know we want integers? Is it just a default?) Looks to me like all it can infer is that it's of classes Read, Show, and Num ... that doesn't much narrow things down ...) BTW, I already found a major problem with the code I attached earlier, using unsafeInterleaveIO: Run in GHCi (as I had done), it works fine; but compiled by GHC and run as an executable, it waits for input and *then* displays the prompt after the user hits Enter ... not very helpful. I didn't think it would do that, since (putStr ? readLn) seemed pretty explicit as to order of evaluation, but I guess that's what I get for breaking referential transparency ... Luke Maurer [EMAIL PROTECTED] ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Lazy streams and unsafeInterleaveIO
Hal Daume III wrote: BTW, I already found a major problem with the code I attached earlier, using unsafeInterleaveIO: Run in GHCi (as I had done), it works fine; but compiled by GHC and run as an executable, it waits for input and *then* displays the prompt after the user hits Enter ... not very helpful. I didn't think it would do that, since (putStr ? readLn) seemed pretty explicit as to order of evaluation, but I guess that's what I get for breaking referential transparency ... You probably want to set the buffering otherwise. [...] Ah, you're right ... I changed (putStr ? readLn) to (putStr ? hFlush stdout readLn) and it worked. So is this lazy-stream-via-unsafeInterleaveIO not so nasty, then, so long as a few precautions (not reading too far into the stream, accounting for buffering, etc.) are taken? I like the idiom Hudak uses (passing a stream of I/O results to the purely functional part of the program), so if it's kosher enough I'd like to get hacking elsewhere ... Luke Maurer [EMAIL PROTECTED] ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: ANNOUNCE: GHC version 5.04.2 released
We are pleased to announce a new patchlevel release of the Glasgow Haskell Compiler (GHC), version 5.04.2. This is a bugfix-only release. For all the changes since 5.02.3, see the release notes: http://www.haskell.org/ghc/docs/latest/users_guide/release-5-04.html Looks like that link's broken; should it be this? http://www.haskell.org/ghc/docs/latest/html/users_guide/release-5-04.html (Seems like the html is missing.) Jyrinx [EMAIL PROTECTED] ___ Glasgow-haskell-users mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: ANNOUNCE: GHC version 5.04.2 released
We are pleased to announce a new patchlevel release of the Glasgow Haskell Compiler (GHC), version 5.04.2. This is a bugfix-only release. For all the changes since 5.02.3, see the release notes: http://www.haskell.org/ghc/docs/latest/users_guide/release-5-04.html Looks like that link's broken; should it be this? http://www.haskell.org/ghc/docs/latest/html/users_guide/release-5-04.html (Seems like the html is missing.) Jyrinx [EMAIL PROTECTED] ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Re: Isn't this tail recursive?
Aha! Gotcha. Thanks for the explanation. I suppose that, in general, for tail recursion to work right, the accumulator has to be evaluated strictly (as is how my code was fixed)? Jyrinx [EMAIL PROTECTED] On Tue, 2002-03-12 at 09:34, Hal Daume III wrote: Here's the basic idea. Suppose we have the function: sum [] acc = acc sum (x:xs) acc = sum xs (acc+x) This is tail recursive, but not strict in the accumulator argument. What this means is that the computation will be performed lazily, so sum [4,5,8,10,14,20] 0 will go like this: sum [4,5,8,10,14,20] 0 = sum [5,8,10,14,20] (0+4) = sum [8,10,14,20] ((0+4)+5) = sum [10,14,20] (((0+4)+5)+8) = sum [14,20] 0+4)+5)+8)+10) = sum [20] (0+4)+5)+8)+10)+14) = sum [] ((0+4)+5)+8)+10)+14)+20) = ((0+4)+5)+8)+10)+14)+20) this computation in the accumulator argument won't be evaluated until you try to print it or something, which will reduce it and perform the computation. this means that for a list of length n, the the sum computation will grow in size O(n). what you need is to make sure that the computation is done strictly and that is done using seq or $!, as in: sum2 [] acc = acc sum2 (x:xs) acc = sum2 xs $! (acc+x) this means that acc+x will be computed at each step, so the accumulator will hold only the integer (or whatever type) and not the thunk (the computation). the type of $! is the same as $: $! :: (a - b) - a - b the sematics of $! are: f $! a = f a but the difference is that $! causes a to be reduced completely, so it won't build a huge thunk. at least that's my understanding; i'm willing to be corrected :) - Hal -- Hal Daume III Computer science is no more about computers| [EMAIL PROTECTED] than astronomy is about telescopes. -Dijkstra | www.isi.edu/~hdaume On 11 Mar 2002, Jyrinx wrote: Normal - countAll' cs 0 nl (nw + newWord) (nc + 1) White - countAll' cs 1 nl nw (nc + 1) Newline - countAll' cs 1 (nl + 1) nw (nc + 1) make this something like ... Normal - nw' `seq` nc' `seq` countAll' cs 0 nl nw' nc' White - nc' `seq` countAll' cs 1 nl nw nc' Newline- nl' `seq` nc` `seq` countAll' cs 1 nl' nw nc' where nw' = nw + newWord nc' = nc + 1 nl' = nl + 1 Cool! That did the trick ... (runs on very little memory *and* time now ... very cool) I've read through the other responses (thanks all!), and I'm still not exactly sure what's going on ... I'm relatively new to Haskell, and my understanding of laziness is hardly rigorous; in general, how should I know where I need to use seq, and what I need to use it on? Is there a paper I should read? (I've got Hudak's book, but it does everything lazily IIRC) Jyrinx [EMAIL PROTECTED] ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Re: Isn't this tail recursive?
Normal - countAll' cs 0 nl (nw + newWord) (nc + 1) White - countAll' cs 1 nl nw (nc + 1) Newline - countAll' cs 1 (nl + 1) nw (nc + 1) make this something like ... Normal - nw' `seq` nc' `seq` countAll' cs 0 nl nw' nc' White - nc' `seq` countAll' cs 1 nl nw nc' Newline- nl' `seq` nc` `seq` countAll' cs 1 nl' nw nc' where nw' = nw + newWord nc' = nc + 1 nl' = nl + 1 Cool! That did the trick ... (runs on very little memory *and* time now ... very cool) I've read through the other responses (thanks all!), and I'm still not exactly sure what's going on ... I'm relatively new to Haskell, and my understanding of laziness is hardly rigorous; in general, how should I know where I need to use seq, and what I need to use it on? Is there a paper I should read? (I've got Hudak's book, but it does everything lazily IIRC) Jyrinx [EMAIL PROTECTED] ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Isn't this tail recursive?
For practice, I'm playing with reimplementing the solution to the word count problem on the Great Computer Language Shootout (www.bagley.org/~doug/shootout). My current solution looks tail recursive to me: --- snip --- -- wc-luke.hs -- Reimplimentation of the Haskell word count program for the Great -- Computer Language Shootout -- Luke Maurer -- [EMAIL PROTECTED] module Main where import IO data CharKind = Normal | White | Newline charKind :: Char - CharKind charKind c = case c of '\n' - Newline ' ' - White '\t' - White _- Normal countAll :: String - (Int, Int, Int) countAll str = countAll' str 0 0 0 0 where countAll' [] _ nl nw nc = (nl, nw, nc) countAll' (c:cs) newWord nl nw nc = case charKind c of -- The following should all be tail calls ... right? Normal - countAll' cs 0 nl (nw + newWord) (nc + 1) White - countAll' cs 1 nl nw (nc + 1) Newline - countAll' cs 1 (nl + 1) nw (nc + 1) main = do -- We need a 4K buffer, as per the rules hSetBuffering stdin (BlockBuffering (Just 4096)) file - getContents let (l, w, c) = countAll file putStrLn ((show l) ++ ++ (show w) ++ ++ (show c)) --- snip --- In the case expression at the end of countAll, each of the values looks to me like a recursive tail call - I should think (hope?) that it would be optimized by GHC into a goto statement (a la Scheme). Instead, my program eats up memory (I've got 256 MB) until the RTS whines about a stack overflow. Am I wrong about the tail call? Is there some optimization I should be aware of (I'm compiling with -O2)? Is this a flaw in GHC? (BTW, as a beginner, I'd be glad to hear general commentary on my code ...) Thanks! Jyrinx jyrinx_list at mindspring dot com ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell