Re: Lazy streams and unsafeInterleaveIO

2002-12-25 Thread Jyrinx


Glynn Clements wrote:


Jyrinx wrote:

 

[...] and
the inability to handle exceptions (the actual exception won't occur
until after e.g. getContents has returned).
 

But how does this differ from strict I/O? I mean, say there's a disk 
error in the middle of some big file I want to crunch. Under traditional 
I/O, I open the file and proceed to read each piece of data, process it, 
and continue to the next one, reading the raw data only as I need it. 
When I hit the error, an exception will be thrown in the middle of the 
operation. In lazy I/O, I might use getContents to get all the 
characters lazily; the getContents call will read each piece of data as 
it's needed in the operation - in other words, the data is read as the 
program uses it, just like with traditional I/O. And when the error 
occurs, the operation will be unceremoniously interrupted, again the 
same as by strict I/O. In mean, if an exception is thrown because of a 
file error, I can't hope to catch it in the data-crunching part of the 
program anyway ...
   


No, but with strict I/O, you are bound to be within the IO monad
when the exception is thrown, so you *can* catch it.

If you are just going to allow all exceptions to be fatal, and don't
need any control over I/O ordering, you may as well just use lazy I/O. 
However, if you are writing real software as opposed to just toy
programs, you have to handle exceptions; e.g. a web browser which died
every time that a server refused a connection wouldn't be of much use.

Sure - and that's why I don't do *everything* in purely-functional-land. 
I suppose what I'm going for is separation of concerns: Anything with 
any business catching exceptions should be in the IO monad; 
calculations, transformations, etc., which depend on such a continuous 
stream of data couldn't deal with the exception if I wanted them to, but 
the IO code that invokes them can. In the Web browser example, I imagine 
(and this is off the top of my head) that a major functional part of the 
program would be a function that takes a bunch of HTML (presumably 
passed as a lazy stream from a server, achieved in the IO monad), 
processes it, and renders a bunch of graphical data for the screen (of 
which IO code could control the display). If a connection is refused, IO 
code catches the error before it can pass the stream to the rendering 
function; if a connection is cut off or something, the rendering code 
can't deal with that, and the exception gets caught back in I/O-land.

Luke Maurer
[EMAIL PROTECTED]

___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Space explosion

2002-12-25 Thread Jyrinx
I'm tinkering with a little program I wrote a while back to generate 
probability distributions and averages for rolling various kinds and 
quanitities of dice and applying various rules (for instance, roll four 
six-sided dice, drop the lowest, and add the other three). I'm trying to 
optimize the thing, and I've run into a space leak. I've found that, if 
I tell it to roll four 20-sided dice and add, it finishes in under a 
second and heap usage peaks at under 2MB (not sterling, but not 
catastrophic); but if I make it five 20-sided dice, it runs out of stack 
space after thrashing for a while (and getting partway through the final 
output); for six dice, it thrashes into oblivion. (Running WinXP with 
256MB memory.)

What puzzles me is that, according to GHC's heap-usage-by-call-center 
graph, the hog isn't the code that generates every single outcome of 
rolling the dice (and we're talking about four 20-sided dice here ...), 
whose heap usage is constant (go GHC! :-) ), but the one that turns the 
list of outcomes (for instance, [2,3,3,4,4,4 ... 7,7,7,7,7,7 ... 
11,11,12] for adding two 6-sided dice) into a probability distribution 
(like [(2,1),(3,2),(4,3) ... (7,6) ... (11,2),(12,1)]). It is this:

 -- Specs contains the options for the simulation; rollBounds simply 
finds the lowest and highest possible rolls
 -- type RollValue = Int
 -- type RollValueDist = Array RollValue Int
 probDist :: Specs - [RollValue] - RollValueDist
 probDist specs rolls =
accumArray (+) 0 (rollBounds specs) [(r, 1) | r - rolls]

It should be doing an in-place array update as it evaluates the 
comprehension, but apparently something's being kept in memory longer 
than it has an excuse for. I tried another, more hand-coded version, to 
get rid of the mid-calculation comprehension and some possible overhead:

 probDist specs =
 foldl update orig
 where   orig = array bounds [(r, 0) | r - range bounds]
 update dist r = dist // [(r, 1 + dist!r)]
 bounds = rollBounds specs

But this one was even worse. I've sprinkled $! around quite liberally, 
to no avail; also, I'm compiling with -O, and I've tried various RTS 
options with no radical change in behavior.

What's going on here? The array constructed has to be big, but not 
*that* big (77 Ints for four 20-sided) ... Is there an inherent 
inefficiency with GHC's arrays when used like this? I can't come up with 
a better algorithm - making a zeroed array with an element for each 
outcome, then adding 1 to the element for each outcome as it is read in 
seems straightforward enough. (Would a FiniteMap be any better? I'm not 
sure how ...) Thanks for any help ... I'm starting to get the hang of 
Haskell, but some of these semantic subtleties are driving me nutty ...

Luke Maurer
[EMAIL PROTECTED]

___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Lazy streams and unsafeInterleaveIO

2002-12-22 Thread Jyrinx
As an experiment for a bigger project, I cooked up a simple program: It 
asks for integers interactively, and after each input, it spits out the 
running total. The wrinkle is that the function for calculating the 
total should be a non-monadic stream function (that is, type [Integer] 
- [Integer] so that runningTotals [1,2,3,4,5] == [1,3,6,10,15]). The 
task is then to write a function to return a stream of integers, 
grabbing them from IO-land lazily (a la getContents).

My first attempts had it not displaying a running total until all input 
(terminated by an input of 0) had finished, at which point it spit out 
all the totals (i.e. it wasn't an interactive program anymore). I poked 
around in the docs and on the Web for a while, and found out about 
unsafeInterleaveIO, which solved the problem neatly (after I modified 
runningTotals to be less eager, as it was reading ahead by an extra 
integer each time). I ended up with the attached code (for GHC 5.04.2).

My question is this: Is there a more elegant (i.e. non-unsafe) way to 
do this? I vaguely recall from the Hudak book (which I unfortunately 
don't have convenient at the moment) that he used a channel for 
something like this (the interactive graphics stuff), but IIRC his 
system would be overkill for my application (including the bigger 
project). It doesn't seem like it should need any black magic, and 
concurrency (which channels need, right?) doesn't appear worth the 
hassle. Really, my desire comes down to a simple, safe, single-threaded 
way to write a function to generate a lazy stream. Is there such?

Luke Maurer
[EMAIL PROTECTED]
-- running-total
-- Haskell program that takes integers as input, outputting a running total
--  after each input
-- Demonstrates use of lazy streams

module Main where

import IO
import System.IO.Unsafe
import Monad

runningTotals :: [Integer] - [Integer]
runningTotals [] = []
runningTotals (x:xs) = rt' 0 (x:xs)
where   rt' tot (x:xs) = (tot+x) `seq` (tot+x):(rt' (tot+x) xs)
rt' _   [] = []

-- Note that runningTotals does what appears to be a stateful calculation when
-- numbers are read one at a time; however, lazy streams allow this to be a
-- pure function. Haskell is cool.

inputNumbers :: IO [Integer]
inputNumbers = do
x - putStr ?   readLn
if x == 0 then return [] else do
xs - (unsafeInterleaveIO inputNumbers)
return (x:xs)

main = do
numbers - inputNumbers
mapM_ (putStrLn . (flip shows) ) (runningTotals numbers)



Re: Lazy streams and unsafeInterleaveIO

2002-12-22 Thread Jyrinx
Remi Turk wrote:


On Sun, Dec 22, 2002 at 04:00:45AM -0800, Jyrinx wrote:
 

As an experiment for a bigger project, I cooked up a simple program: It 
asks for integers interactively, and after each input, it spits out the 
running total. The wrinkle is that the function for calculating the 
total should be a non-monadic stream function (that is, type [Integer] 
- [Integer] so that runningTotals [1,2,3,4,5] == [1,3,6,10,15]). The 
task is then to write a function to return a stream of integers, 
grabbing them from IO-land lazily (a la getContents).
   


Hi,
what about

module Main where

main= getContents = mapM_ print . scanl1 (+) . map read . lines
 

Ooh, neat! :-) (I love these one-liners - Haskell is absurdly concise 
:-D ) Hrm ... wasn't aware of the scanl1 thingie; looks like I 
reinvented the wheel a little ... (Come to think of it, is there any 
sort of handy quick-reference card for all these combinators? Seems like 
I and other novices could stand to save some typing ...)

One sticking point, though (and this is relevant to the bigger project): 
I'd like to print a prompt somehow before each input, which I'm not sure 
is possible if I just slurp up everything from getContents ... I've 
thought of using interact somehow, but I'm not sure where I'd start with 
that one ...

(Out of curiosity: How is the compiler deciding on a type for the input? 
(That is, how does it know we want integers? Is it just a default?) 
Looks to me like all it can infer is that it's of classes Read, Show, 
and Num ... that doesn't much narrow things down ...)

BTW, I already found a major problem with the code I attached earlier, 
using unsafeInterleaveIO: Run in GHCi (as I had done), it works fine; 
but compiled by GHC and run as an executable, it waits for input and 
*then* displays the prompt after the user hits Enter ... not very 
helpful. I didn't think it would do that, since (putStr ?   readLn) 
seemed pretty explicit as to order of evaluation, but I guess that's 
what I get for breaking referential transparency ...

Luke Maurer
[EMAIL PROTECTED]

___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Lazy streams and unsafeInterleaveIO

2002-12-22 Thread Jyrinx


Hal Daume III wrote:


BTW, I already found a major problem with the code I attached earlier, 
using unsafeInterleaveIO: Run in GHCi (as I had done), it works fine; 
but compiled by GHC and run as an executable, it waits for input and 
*then* displays the prompt after the user hits Enter ... not very 
helpful. I didn't think it would do that, since (putStr ?   readLn) 
seemed pretty explicit as to order of evaluation, but I guess that's 
what I get for breaking referential transparency ...
   


You probably want to set the buffering otherwise. [...]
 

Ah, you're right ... I changed (putStr ?   readLn) to (putStr ?  
 hFlush stdout  readLn) and it worked.

So is this lazy-stream-via-unsafeInterleaveIO not so nasty, then, so 
long as a few precautions (not reading too far into the stream, 
accounting for buffering, etc.) are taken? I like the idiom Hudak uses 
(passing a stream of I/O results to the purely functional part of the 
program), so if it's kosher enough I'd like to get hacking elsewhere ...

Luke Maurer
[EMAIL PROTECTED]

___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: ANNOUNCE: GHC version 5.04.2 released

2002-12-04 Thread Jyrinx


We are pleased to announce a new patchlevel release of the Glasgow
Haskell Compiler (GHC), version 5.04.2.  This is a bugfix-only
release.  For all the changes since 5.02.3, see the release notes:

  http://www.haskell.org/ghc/docs/latest/users_guide/release-5-04.html
 

Looks like that link's broken; should it be this?

http://www.haskell.org/ghc/docs/latest/html/users_guide/release-5-04.html

(Seems like the html is missing.)

Jyrinx
[EMAIL PROTECTED]

___
Glasgow-haskell-users mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users



Re: ANNOUNCE: GHC version 5.04.2 released

2002-12-04 Thread Jyrinx


We are pleased to announce a new patchlevel release of the Glasgow
Haskell Compiler (GHC), version 5.04.2.  This is a bugfix-only
release.  For all the changes since 5.02.3, see the release notes:

  http://www.haskell.org/ghc/docs/latest/users_guide/release-5-04.html
 

Looks like that link's broken; should it be this?

http://www.haskell.org/ghc/docs/latest/html/users_guide/release-5-04.html

(Seems like the html is missing.)

Jyrinx
[EMAIL PROTECTED]

___
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell



Re: Isn't this tail recursive?

2002-03-12 Thread Jyrinx

Aha! Gotcha. Thanks for the explanation.

I suppose that, in general, for tail recursion to work right, the
accumulator has to be evaluated strictly (as is how my code was fixed)?

Jyrinx
[EMAIL PROTECTED]

On Tue, 2002-03-12 at 09:34, Hal Daume III wrote:
 Here's the basic idea.  Suppose we have the function:
 
  sum [] acc = acc
  sum (x:xs) acc = sum xs (acc+x)
 
 This is tail recursive, but not strict in the accumulator argument.  What
 this means is that the computation will be performed lazily, so sum
 [4,5,8,10,14,20] 0 will go like this:
 
  sum [4,5,8,10,14,20] 0 = 
  sum [5,8,10,14,20] (0+4) =
  sum [8,10,14,20] ((0+4)+5) =
  sum [10,14,20] (((0+4)+5)+8) =
  sum [14,20] 0+4)+5)+8)+10) =
  sum [20] (0+4)+5)+8)+10)+14) =
  sum [] ((0+4)+5)+8)+10)+14)+20) =
  ((0+4)+5)+8)+10)+14)+20)
 
 this computation in the accumulator argument won't be evaluated until you
 try to print it or something, which will reduce it and perform the
 computation.  this means that for a list of length n, the the sum
 computation will grow in size O(n).  what you need is to make sure that
 the computation is done strictly and that is done using seq or $!, as in:
 
  sum2 [] acc = acc
  sum2 (x:xs) acc = sum2 xs $! (acc+x)
 
 this means that acc+x will be computed at each step, so the accumulator
 will hold only the integer (or whatever type) and not the thunk (the
 computation).
 
 the type of $! is the same as $:
 
  $! :: (a - b) - a - b
 
 the sematics of $! are:
 
  f $! a = f a
 
 but the difference is that $! causes a to be reduced completely, so it
 won't build a huge thunk.
 
 at least that's my understanding; i'm willing to be corrected :)
 
  - Hal
 
 --
 Hal Daume III
 
  Computer science is no more about computers| [EMAIL PROTECTED]
   than astronomy is about telescopes. -Dijkstra | www.isi.edu/~hdaume
 
 On 11 Mar 2002, Jyrinx wrote:
 
Normal  - countAll' cs 0 nl (nw + newWord) (nc + 1)
White   - countAll' cs 1 nl nw (nc + 1)
Newline - countAll' cs 1 (nl + 1) nw (nc + 1)
   
   
   make this something like
   
   ...
   
 Normal - nw' `seq` nc' `seq` countAll' cs 0 nl nw' nc'
 White  - nc' `seq`   countAll' cs 1 nl nw  nc'
 Newline- nl' `seq` nc` `seq` countAll' cs 1 nl' nw nc'
   where nw' = nw + newWord
 nc' = nc + 1
 nl' = nl + 1
  
  Cool! That did the trick ... (runs on very little memory *and* time now
  ... very cool) I've read through the other responses (thanks all!), and
  I'm still not exactly sure what's going on ... I'm relatively new to
  Haskell, and my understanding of laziness is hardly rigorous; in
  general, how should I know where I need to use seq, and what I need to
  use it on? Is there a paper I should read? (I've got Hudak's book, but
  it does everything lazily IIRC)
  
  Jyrinx
  [EMAIL PROTECTED]
  
 
 ___
 Haskell mailing list
 [EMAIL PROTECTED]
 http://www.haskell.org/mailman/listinfo/haskell


___
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell



Re: Isn't this tail recursive?

2002-03-11 Thread Jyrinx

  Normal  - countAll' cs 0 nl (nw + newWord) (nc + 1)
  White   - countAll' cs 1 nl nw (nc + 1)
  Newline - countAll' cs 1 (nl + 1) nw (nc + 1)
 
 
 make this something like
 
 ...
 
   Normal - nw' `seq` nc' `seq` countAll' cs 0 nl nw' nc'
   White  - nc' `seq`   countAll' cs 1 nl nw  nc'
   Newline- nl' `seq` nc` `seq` countAll' cs 1 nl' nw nc'
 where nw' = nw + newWord
   nc' = nc + 1
   nl' = nl + 1

Cool! That did the trick ... (runs on very little memory *and* time now
... very cool) I've read through the other responses (thanks all!), and
I'm still not exactly sure what's going on ... I'm relatively new to
Haskell, and my understanding of laziness is hardly rigorous; in
general, how should I know where I need to use seq, and what I need to
use it on? Is there a paper I should read? (I've got Hudak's book, but
it does everything lazily IIRC)

Jyrinx
[EMAIL PROTECTED]

___
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell



Isn't this tail recursive?

2002-03-10 Thread Jyrinx

For practice, I'm playing with reimplementing the solution to the word
count problem on the Great Computer Language Shootout
(www.bagley.org/~doug/shootout). My current solution looks tail
recursive to me:

--- snip ---

-- wc-luke.hs
-- Reimplimentation of the Haskell word count program for the Great
--  Computer Language Shootout
-- Luke Maurer
-- [EMAIL PROTECTED]

module Main where

import IO

data CharKind = Normal | White | Newline

charKind :: Char - CharKind
charKind c =
case c of
'\n' - Newline
' '  - White
'\t' - White
_- Normal

countAll :: String - (Int, Int, Int)
countAll str =
countAll' str 0 0 0 0
where countAll' [] _ nl nw nc = (nl, nw, nc)
  countAll' (c:cs) newWord nl nw nc =
case charKind c of
-- The following should all be tail calls ... right?
Normal  - countAll' cs 0 nl (nw + newWord) (nc + 1)
White   - countAll' cs 1 nl nw (nc + 1)
Newline - countAll' cs 1 (nl + 1) nw (nc + 1)

main = do
-- We need a 4K buffer, as per the rules
hSetBuffering stdin (BlockBuffering (Just 4096))

file - getContents
let (l, w, c) = countAll file

putStrLn ((show l) ++   ++ (show w) ++   ++ (show c))

--- snip ---

In the case expression at the end of countAll, each of the values looks
to me like a recursive tail call - I should think (hope?) that it would
be optimized by GHC into a goto statement (a la Scheme). Instead, my
program eats up memory (I've got 256 MB) until the RTS whines about a
stack overflow.

Am I wrong about the tail call? Is there some optimization I should be
aware of (I'm compiling with -O2)? Is this a flaw in GHC?

(BTW, as a beginner, I'd be glad to hear general commentary on my code
...)

Thanks!

Jyrinx
jyrinx_list at mindspring dot com


___
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell