We demonstrate how lazy IO breaks referential transparency. A pure function of the type Int->Int->Int gives different integers depending on the order of evaluation of its arguments. Our Haskell98 code uses nothing but the standard input. We conclude that extolling the purity of Haskell and advertising lazy IO is inconsistent.
Henning Thielemann wrote on Haskell-Cafe: > Say I have a chain of functions: read a file, ... write that to a file, > all of these functions are written in a lazy way, which is currently > considered good style Lazy IO should not be considered good style. One of the common definitions of purity is that pure expressions should evaluate to the same results regardless of evaluation order, or that equals can be substituted for equals. If an expression of the type Int evaluates to 1, we should be able to replace every occurrence of the expression with 1 without changing the results and other observables. The expression (read s) where s is the result of the (hGetContents h1) action has the pure type, e.g., Int. Yet its evaluation does more than just producing an integer: it may also close a file associated with the handle h1. Closing the file has an observable effect: the file descriptor, which is a scarce resource, is freed. Some OS lock the open file, or prevent operations such as renaming and deletion on the open file. A file system whose file is open cannot be unmounted. Suppose I use an Haskell application such as an editor to process data from a removable drive. I mount the drive, tell the application the file name. The (still running) application finished with the file and displayed the result. And yet I can't unplug the removable drive, because it turns out that the final result was produced without needing to read all the data from the file, and the file remains unclosed. Some people say: the programmer should have used seq. That is the admission of guilt! An expression (read s)::Int that evaluates to 1 is equal to 1. So, one should be able substitute (read s) with 1. If the result of evaluating 1 turns out not needed for the final outcome, then not evaluating (read s) should not affect the outcome. And yet it does. One uses seq to force evaluation of an expression even if the result may be unused. Thus the expression of a pure type has *side-effect*. The advice about using seq reminds me of advice given to C programmers that they should not free a malloc'ed pointer twice, dereference a zero pointer and write past the boundary of an array. If lazy IO is considered good style given the proper use of seq, then C should be considered safe given the proper use of pointers and arrays. Side affects like closing a file are observable in the external world. A program may observe these effects, in an IO monad. One can argue there are no guarantees at all about what happens in the IO monad. Can side-effects of lazy IO be observable in _pure_ Haskell code, in functions with pure type? Yes, they can. The examples are depressingly easy to write, once one realizes that reading does have side effects, POSIX provides for file descriptor aliasing, and sharing becomes observable with side effects. Here is a simple Haskell98 code. > {- Haskell98! -} > > module Main where > > import System.IO > import System.Posix.IO (fdToHandle, stdInput) > > -- f1 and f2 are both pure functions, with the pure type. > -- Both compute the result of the subtraction e1 - e2. > -- The only difference between them is the sequence of > -- evaluating their arguments, e1 `seq` e2 vs. e2 `seq` e1 > -- For really pure functions, that difference should not be observable > > f1, f2:: Int -> Int -> Int > > f1 e1 e2 = e1 `seq` e2 `seq` e1 - e2 > f2 e1 e2 = e2 `seq` e1 `seq` e1 - e2 > > read_int s = read . head . words $ s > > main = do > let h1 = stdin > h2 <- fdToHandle stdInput > s1 <- hGetContents h1 > s2 <- hGetContents h2 > print $ f1 (read_int s1) (read_int s2) > -- print $ f2 (read_int s1) (read_int s2) One can compile it with GHC (I used 6.8.2, with and without -O2) and run like this ~> /tmp/a.out 1 2 -1 That is, we run the program and type 1 and 2 as the inputs. The program prints out the result, -1. If we comment out the line "print $ f1 (read_int s1) (read_int s2)" and uncomment out the last line the transcript looks like this ~> /tmp/a.out 1 2 1 Running the code with Hugs exhibits the same behavior. Thus a pure function (-) of the pure type gives different results depending on the order of evaluating its arguments. Is 1 = -1? _______________________________________________ Haskell mailing list Haskell@haskell.org http://www.haskell.org/mailman/listinfo/haskell