RE: lazy file reading in H98

Simon Peyton-Jones Tue, 05 Sep 2000 01:03:56 -0700
Folks,

Finalising the Haskell 98 report is (still) on my to-do list.  I plan to
do it after ICFP, but I need a clear week, which is why I've been
procrastinating.  And comments do keep coming in occasionally.  Manuel's
is a case in point.


I'd be interested to hear people's opinions on the lazy file-reading
question.
I'm not prepared to add new functions to Haskell 98, but I think
the clarification of (1) or (2) below would be useful.  (2) is nice
but it makes *all* file reading more expensive, perhaps significantly
so (e.g. making a complete copy of the file).  So I am personally inclined
to go for (1) and require Haskell programmers to do the consequent file-name
changing themselves.   
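
For illustration only, here is a minimal sketch of what that file-name
changing might look like under option (1). The helper name `writeViaRename`
and the ".tmp" suffix are hypothetical, and `renameFile` comes from
`System.Directory` (the `directory` package), which is not part of the
Haskell 98 libraries. The new contents go to a different file first, so the
lazily read original is never written to while it may still be semi-closed.

```haskell
import System.Directory (renameFile)

-- Hypothetical helper: write the new contents under a temporary name,
-- then rename the temporary file over the original.
writeViaRename :: FilePath -> String -> IO ()
writeViaRename fname contents = do
  let tmp = fname ++ ".tmp"   -- assumed naming scheme, not from the report
  writeFile tmp contents      -- forces 'contents' while 'fname' is untouched
  renameFile tmp fname        -- replace the original in one step
```

On Unix-like systems the rename leaves any still-open handle on the old
file readable. Other platforms may refuse to rename over an open file, so
treat this as a sketch rather than a portable recipe.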

I'm sending this message to the haskell-cafe!

Simon

| -----Original Message-----
| From: Manuel M. T. Chakravarty [mailto:[EMAIL PROTECTED]]
| Sent: 05 September 2000 02:10
| To: [EMAIL PROTECTED]
| Subject: lazy file reading in H98
| 
| 
| In an assignment in my class, we came across a lack of
| specification of the behaviour of `Prelude.readFile' and
| `IO.hGetContents' and, IMHO, also a lack of functionality.  As
| both operations read a file lazily, subsequent writes to the
| same file are potentially disastrous.  In this assignment,
| the file was used to make a Haskell data structure
| persistent over multiple runs of the program, i.e.,
| 
|   readFile fname >>= return . read
| 
| at the start of the program and
| 
|   writeFile fname . show
| 
| at the end of the program.  For certain inputs, where the
| data structure stored in the file was only partially used,
| the file was overwritten before it was fully read.
| 
| H98 doesn't really specify what happens in this situation.
| I think there are two ways to solve this:
| 
| (1) At least, the definition should say that the behaviour
|     is undefined if a program ever writes to a file that it
|     has read with `readFile' or `hGetContents' before.
| 
| (2) Alternatively, it could demand more sophistication from
|     the implementation and require that upon opening of a
|     file for writing that is currently semi-closed, the
|     implementation has to make sure that the contents of the
|     semi-closed file is not corrupted before it is fully
|     read.[1]
| 
| In the case that solution (1) is chosen, I think, we should
| also have something like `strictReadFile' (and
| `hStrictGetContents') which reads the whole file before
| proceeding to the next IO action.  Otherwise, in situations
| like in the mentioned assignment, you have to resort to
| reading the file character by character, which seems very
| awkward.
| 
| So, overall, I think solution (2) is more elegant.
| 
| Cheers,
| Manuel
| 
| [1] On Unix-like (POSIX?) systems, unlinking the file and
|     then opening the writable file would be sufficient.  On
|     certain legacy OSes, the implementation would have to
|     read the rest of the file into memory before creating
|     a new file under the same name.
|
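
The `strictReadFile' that Manuel asks for can be approximated in user code.
The following is a minimal sketch, not a proposed library definition: it
assumes the modern `System.IO` module name (Haskell 98 calls the module
`IO`) and forces the whole string with `length` before closing the handle.

```haskell
import System.IO

-- Sketch of a strict file read: the entire file is read into memory
-- before the action returns, so a later write cannot clobber unread data.
strictReadFile :: FilePath -> IO String
strictReadFile fname = do
  h <- openFile fname ReadMode
  s <- hGetContents h          -- lazy read; h is now semi-closed
  length s `seq` hClose h      -- force every character, then close h
  return s
```

With this in place, the persistence pattern from the assignment becomes
`strictReadFile fname >>= return . read` at the start of the program and
`writeFile fname . show` at the end, at the cost of holding the whole file
in memory.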