Folks,
Finalising the Haskell 98 report is (still) on my to-do list. I plan to
do it after ICFP, but I need a clear week, which is why I've been
procrastinating.
And comments do keep coming in occasionally. Manuel's is a case in point.
I'd be interested to hear people's opinions about the lazy-file read
question.
I'm not prepared to add new functions to Haskell 98, but I think
the clarification of (1) or (2) below would be useful. (2) is nice
but it makes *all* file reading more expensive, perhaps significantly
so (e.g. making a complete copy of the file). So I am personally inclined
to go for (1) and require Haskell programmers to do the consequent file-name
changing themselves.
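
For illustration, here is a minimal sketch of what doing the file-name
changing by hand might look like. The helper name `writeViaRename' and the
".new" suffix are made up for this example, and it uses `renameFile' from
the Directory library (System.Directory in current GHC). It assumes a
POSIX-like file system, where renaming over a file that is still being read
lazily leaves the old contents readable; other systems may refuse the rename.

```haskell
import System.Directory (renameFile)

-- Hypothetical helper, not part of Haskell 98: write the new contents
-- under a scratch name, then rename the scratch file over the original.
-- The original file is never opened for writing, so a lazy read of it
-- that is still in progress is not truncated (POSIX-like systems assumed).
writeViaRename :: FilePath -> String -> IO ()
writeViaRename fname contents = do
  let tmp = fname ++ ".new"   -- scratch name is an arbitrary choice
  writeFile tmp contents
  renameFile tmp fname
```

With this, the end of Manuel's program would read
`writeViaRename fname . show' instead of `writeFile fname . show'.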
I'm sending this message to the haskell-cafe!
Simon
| -----Original Message-----
| From: Manuel M. T. Chakravarty [mailto:[EMAIL PROTECTED]]
| Sent: 05 September 2000 02:10
| To: [EMAIL PROTECTED]
| Subject: lazy file reading in H98
|
|
| In an assignment in my class, we came across a lack of
| specification of the behaviour of `Prelude.readFile' and
| `IO.hGetContents' and IMHO also a lack of functionality. As
| both operations read a file lazily, subsequent writes to the
| same file are potentially disastrous. In this assignment,
| the file was used to make a Haskell data structure
| persistent over multiple runs of the program - i.e.,
|
| readFile fname >>= return . read
|
| at the start of the program and
|
| writeFile fname . show
|
| at the end of the program. For certain inputs, where the
| data structure stored in the file was only partially used,
| the file was overwritten before it was fully read.
|
| H98 doesn't really specify what happens in this situation.
| I think there are two ways to solve that:
|
| (1) At least, the definition should say that the behaviour
| is undefined if a program ever writes to a file that it
| has read with `readFile' or `hGetContents' before.
|
| (2) Alternatively, it could demand more sophistication from
| the implementation and require that upon opening of a
| file for writing that is currently semi-closed, the
| implementation has to make sure that the contents of the
| semi-closed file is not corrupted before it is fully
| read.[1]
|
| In the case that solution (1) is chosen, I think we should
| also have something like `strictReadFile' (and
| `hStrictGetContents') which reads the whole file before
| proceeding to the next IO action. Otherwise, in situations
| like the assignment mentioned above, you have to resort to
| reading the file character by character, which seems very
| awkward.
|
| So, overall, I think solution (2) is more elegant.
|
| Cheers,
| Manuel
|
| [1] On Unix-like (POSIX?) systems, unlinking the file and
| then opening the writable file would be sufficient. On
| certain legacy OSes, the implementation would have to
| read the rest of the file into memory before creating
| a new file under the same name.
|
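
A rough sketch of the `strictReadFile' that Manuel asks for, using only
`readFile' and `seq'. This is one common idiom, not a definition from the
report: taking the length forces the whole string, so the file has been read
to the end before the next IO action runs. It relies on the implementation
closing the semi-closed handle once the entire contents have been read.

```haskell
-- Sketch only: `strictReadFile' is the name proposed in the message above;
-- it is not a Haskell 98 function.
strictReadFile :: FilePath -> IO String
strictReadFile fname = do
  s <- readFile fname
  -- Forcing the length walks the whole list, which pulls in the entire
  -- file before `return' hands the string back.
  length s `seq` return s
```

The read in the assignment then becomes `strictReadFile fname >>= return . read',
at the cost of holding the whole file in memory at once.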