Re: [Haskell-cafe] getContents and lazy evaluation
Hi

On 9/6/06, David Roundy <[EMAIL PROTECTED]> wrote:
> Fortunately, the undefined behavior in this case is unrelated to the
> lazy IO. On windows, the removal of the file will fail, while on posix
> systems there won't be any failure at all. The same behavior would
> show up if you opened the file for non-lazy reading, and tried to read
> part of the file, then delete it, then read the rest.

This is not strictly speaking true. If all the handles opened to the file in question are in FILE_SHARE_DELETE sharing mode, the file can be marked for deletion and is removed when the last handle to it is closed. It can also be moved and renamed.

But it is true that removal might fail because of an open handle, and it is true that it will fail as currently implemented for ghc (and probably for other compilers as well).

> The "undefinedness" in this example, isn't in the haskell language,
> but in the filesystem semantics, and that's not something we want the
> language specifying (since it's something over which it has no
> control).

Happily this isn't a lazy IO issue, it's just a file IO issue for all files opened as specified by haskell98. Sharing mode would be really nice to have in Windows, as would security attributes. But as you say, these are hard things to specify because not everyone has those features.

So, at least it works nicely in posixy-systems, eh?

Best regards,
--Esa Ilari Vuokko
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] getContents and lazy evaluation
On Fri, Sep 01, 2006 at 11:47:20PM +0100, Duncan Coutts wrote:
> On Fri, 2006-09-01 at 17:36 -0400, Robert Dockins wrote:
> > Well, AFAIK, the behavior is officially undefined, which is my
> > real beef. I agree that it _should_ throw an exception.
>
> Ah, I had thought it was defined to simply truncate. It being
> undefined isn't good. It seems that it would be straightforward to
> define it to have the truncation behaviour. If Haskell-prime gets
> imprecise exceptions then that could be changed.

Fortunately, the undefined behavior in this case is unrelated to the lazy IO. On windows, the removal of the file will fail, while on posix systems there won't be any failure at all. The same behavior would show up if you opened the file for non-lazy reading, and tried to read part of the file, then delete it, then read the rest.

The "undefinedness" in this example, isn't in the haskell language, but in the filesystem semantics, and that's not something we want the language specifying (since it's something over which it has no control).

Lazy IO definitely works much more nicely with posix filesystems, but that's unsurprising, since posix filesystem semantics are much nicer than those of Windows.
-- 
David Roundy
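The non-lazy scenario described in this message (read part of a file, delete it, read the rest) can be sketched roughly as follows. This is only an illustration, not code from the thread: the file name `strict-demo.txt` and the function `readAroundDelete` are made up, and `System.Directory` is assumed to be available (it comes from the `directory` package that normally ships with GHC). Whether the deletion succeeds depends on the platform, so the sketch records the outcome instead of assuming it.

```haskell
import System.IO
import System.Directory (removeFile)
import Control.Exception (IOException, try)

-- Hypothetical helper: read one line strictly, attempt to delete the
-- file, then read the next line through the same handle.
-- The Bool reports whether removeFile succeeded.
readAroundDelete :: FilePath -> IO (String, Bool, String)
readAroundDelete path = do
  h <- openFile path ReadMode
  first <- hGetLine h                  -- ordinary, non-lazy read
  r <- try (removeFile path) :: IO (Either IOException ())
  second <- hGetLine h                 -- read again after the delete attempt
  hClose h
  return (first, either (const False) (const True) r, second)

main :: IO ()
main = do
  -- Assumption: we may create and delete this scratch file.
  writeFile "strict-demo.txt" "line one\nline two\n"
  readAroundDelete "strict-demo.txt" >>= print
```

With a file this small the second line may well be served from the Handle buffer, so the sketch shows the shape of the experiment more than it stresses the filesystem.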
Re: [Haskell-cafe] getContents and lazy evaluation
Quoth Julien Oster <[EMAIL PROTECTED]>:
...
| But what happens when two processes use the same file and one process is
| writing into it using lazy IO which didn't happen yet? The other process
| wouldn't see its changes yet.

That's actually a much more general problem, one that I imagine applies to hPutStr et al. too. Application level writes are ordinarily buffered in process space by the I/O library, so output from an ordinary C program may not appear on disk (or in kernel space disk I/O buffer) until just before the program exits.

| As for two processes writing to the same file at the same time, very bad
| things may happen anyway. Sure, lazy IO prevents doing communication
| between running processes using plain files, but why would you do
| something like that?

Quite a few reasons, depending on how you define communication. You might even be tempted to use hGetContents in such cases. For example, one common way to share a file is to interlock around some resource, and when you acquire the lock, you read the file (get its contents) and release the lock.

	Donn Cave, [EMAIL PROTECTED]
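Both points in this message can be illustrated with a small sketch, under stated assumptions. `writeFlushed` shows an explicit `hFlush` pushing buffered output out of the process; `readUnderLock` shows why a lazy read is a poor fit for the lock-read-unlock pattern unless the contents are forced while the lock is still held. The names `writeFlushed`, `readUnderLock` and `shared.txt` are hypothetical, and the lock is not modelled: the `acquire` and `release` arguments are placeholders for whatever interlock an application really uses.

```haskell
import System.IO
import Control.Exception (bracket_, evaluate)

-- Write a string and explicitly flush the Handle buffer.
writeFlushed :: FilePath -> String -> IO ()
writeFlushed path s = do
  h <- openFile path WriteMode
  hSetBuffering h (BlockBuffering Nothing)
  hPutStr h s        -- may still sit in the process-level buffer
  hFlush h           -- hand the data to the operating system now
  hClose h

-- Read a file completely between acquire and release.
-- acquire/release are stand-ins for a real lock (an assumption).
readUnderLock :: IO () -> IO () -> FilePath -> IO String
readUnderLock acquire release path =
  bracket_ acquire release $ do
    s <- readFile path           -- lazy read
    _ <- evaluate (length s)     -- force all of it before the lock is released
    return s

main :: IO ()
main = do
  writeFlushed "shared.txt" "state=1\n"
  s <- readUnderLock (putStrLn "lock") (putStrLn "unlock") "shared.txt"
  putStr s
```

Without the `evaluate` line, the actual reading could happen after `release` has run, which is exactly the situation the lock was meant to prevent.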
Re: [Haskell-cafe] getContents and lazy evaluation
Duncan Coutts wrote:

Hi,

> In practise I expect that most programs that deal with file IO strictly
> do not handle the file disappearing under them very well either. At best
> they probably throw an exception and let something else clean up.

And at least in the Unix world, they just don't disappear. Normally, if you delete a file, you just delete its directory entry. If there still is something with an open handle to it, i.e. your program, the corresponding "inode" (that's basically the file itself without its name or names) still happily exists for your seeking, reading and writing. Then, when your program closes the file and there really is no remaining directory entry and no other process accessing it, the inode is removed as well.

One trick for temporary files on unix is opening a new file, immediately deleting it but still using it to write and read data.

So no problem here. But what happens when two processes use the same file and one process is writing into it using lazy IO which didn't happen yet? The other process wouldn't see its changes yet.

I'm not sure if it matters, however, since sooner or later that IO will happen. And I believe that lazy IO still means that for one operation actually taking place, all prior operations take place in the right order beforehand as well, no?

As for two processes writing to the same file at the same time, very bad things may happen anyway. Sure, lazy IO prevents doing communication between running processes using plain files, but why would you do something like that?

Regards,
Julien
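The temporary-file trick mentioned above might look like this in Haskell. It is a sketch that assumes a POSIX system (on Windows the `removeFile` call is expected to fail while the handle is open) and assumes `System.Directory` from the `directory` package; `roundTrip` and the template name `scratch.tmp` are invented for the example.

```haskell
import System.IO
import System.Directory (getTemporaryDirectory, removeFile)

-- Create a temp file, delete its directory entry at once, and keep
-- using the open handle for writing and then reading.
roundTrip :: String -> IO String
roundTrip payload = do
  dir <- getTemporaryDirectory
  (path, h) <- openTempFile dir "scratch.tmp"   -- opened read/write
  removeFile path               -- POSIX: only the name goes away
  hPutStr h payload
  hSeek h AbsoluteSeek 0        -- back to the start for reading
  s <- hGetContents h
  length s `seq` hClose h       -- force the whole read before closing
  return s

main :: IO ()
main = roundTrip "still here\n" >>= putStr
```

The contents are forced before `hClose` because `hGetContents` is itself lazy; closing first would cut the read short.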
Re: [Haskell-cafe] getContents and lazy evaluation
On Fri, 2006-09-01 at 17:36 -0400, Robert Dockins wrote:

> Perhaps I should be more clear. When I said "advanced" above I meant
> "any use whereby you treat a file as random access, read/write
> storage, or do any kind of directory manipulation (including deleting
> and/or renaming files)". Lazy I/O (as it currently stands) doesn't
> play very nice with those use cases.

Indeed, it can't be used in that case.

> I agree generally with the idea that lazy I/O is good. The problem is
> that it is a "leaky abstraction"; details are exposed to the user that
> should ideally be completely hidden. Unfortunately, the leaks aren't
> likely to get plugged without pretty tight operating system support,
> which I suspect won't be happening anytime soon.

Yes it is leaky.

> Well, AFAIK, the behavior is officially undefined, which is my real
> beef. I agree that it _should_ throw an exception.

Ah, I had thought it was defined to simply truncate. It being undefined isn't good. It seems that it would be straightforward to define it to have the truncation behaviour. If Haskell-prime gets imprecise exceptions then that could be changed.

Duncan
Re: [Haskell-cafe] getContents and lazy evaluation
On Friday 01 September 2006 18:01, Donn Cave wrote:
> On Fri, 1 Sep 2006, Robert Dockins wrote:
> > On Friday 01 September 2006 16:46, Duncan Coutts wrote:
> ...
> >> Note also, that with lazy IO we can write really short programs that are
> >> blindingly quick. Lazy IO allows us to save a copy through the Handle
> >> buffer.
>
> (Never understood why some people think it would be such a good thing
> to be blinded, but as long as it's you and not me ... )
>
> >> BTW in the above case the "bad thing that will happen" is that contents
> >> will be truncated. As I said, I think it's better to throw an exception,
> >> which is what Data.ByteString.Lazy.hGetContents does.
> >
> > Well, AFAIK, the behavior is officially undefined, which is my real beef.
> > I agree that it _should_ throw an exception.
>
> Is this about Microsoft Windows? On UNIX, I would expect deletion of
> a file to have no effect on I/O of any kind on that file. I thought
> the problems with hGetContents more commonly involve operations on
> the file handle, e.g., hClose.

Ahh... I think you're right. However, this just illustrates the problem. The point is that the answer to the question "what happens when I do " is "it depends". And to the obvious followup question "what does it depend on?" the answer is "well it's complicated".

> 	Donn Cave, [EMAIL PROTECTED]

-- 
Rob Dockins

Talk softly and drive a Sherman tank.
Laugh hard, it's a long way to the bank.
       -- TMBG
Re: [Haskell-cafe] getContents and lazy evaluation
On Fri, 1 Sep 2006, Robert Dockins wrote:
> On Friday 01 September 2006 16:46, Duncan Coutts wrote:
...
>> Note also, that with lazy IO we can write really short programs that are
>> blindingly quick. Lazy IO allows us to save a copy through the Handle
>> buffer.

(Never understood why some people think it would be such a good thing to be blinded, but as long as it's you and not me ... )

>> BTW in the above case the "bad thing that will happen" is that contents
>> will be truncated. As I said, I think it's better to throw an exception,
>> which is what Data.ByteString.Lazy.hGetContents does.
>
> Well, AFAIK, the behavior is officially undefined, which is my real beef. I
> agree that it _should_ throw an exception.

Is this about Microsoft Windows? On UNIX, I would expect deletion of a file to have no effect on I/O of any kind on that file. I thought the problems with hGetContents more commonly involve operations on the file handle, e.g., hClose.

	Donn Cave, [EMAIL PROTECTED]
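The hClose interaction mentioned here can be sketched by contrasting two orderings. The names `closeThenUse`, `useThenClose` and `sample.txt` are hypothetical. What `closeThenUse` yields is left open on purpose: depending on the implementation and version, the delayed read may see truncated contents or raise an exception, so the sketch catches either outcome and does not claim one.

```haskell
import System.IO
import Control.Exception (SomeException, evaluate, try)

-- Closes the handle before anything has demanded the contents.
closeThenUse :: FilePath -> IO (Either SomeException Int)
closeThenUse path = do
  h <- openFile path ReadMode
  s <- hGetContents h
  hClose h                        -- contents not yet read
  try (evaluate (length s))       -- the delayed read happens (or fails) here

-- Demands the whole contents first, then closes.
useThenClose :: FilePath -> IO Int
useThenClose path = do
  h <- openFile path ReadMode
  s <- hGetContents h
  n <- evaluate (length s)        -- forces the full read
  hClose h
  return n

main :: IO ()
main = do
  writeFile "sample.txt" "twelve chars"
  r <- closeThenUse "sample.txt"
  putStrLn (either (const "exception on delayed read")
                   (\n -> "close first: saw " ++ show n ++ " chars") r)
  n <- useThenClose "sample.txt"
  putStrLn ("force first: saw " ++ show n ++ " chars")
```

Only the second ordering is guaranteed to see the full twelve characters.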
Re: [Haskell-cafe] getContents and lazy evaluation
On Friday 01 September 2006 16:46, Duncan Coutts wrote:
> On Fri, 2006-09-01 at 16:28 -0400, Robert Dockins wrote:
> > On Friday 01 September 2006 15:19, Tamas K Papp wrote:
> > > Hi,
> > >
> > > I am a newbie, reading the Gentle Introduction. Chapter 7
> > > (Input/Output) says
> > >
> > >   Pragmatically, it may seem that getContents must immediately read an
> > >   entire file or channel, resulting in poor space and time performance
> > >   under certain conditions. However, this is not the case. The key
> > >   point is that getContents returns a "lazy" (i.e. non-strict) list of
> > >   characters (recall that strings are just lists of characters in
> > >   Haskell), whose elements are read "by demand" just like any other
> > >   list. An implementation can be expected to implement this
> > >   demand-driven behavior by reading one character at a time from the
> > >   file as they are required by the computation.
> > >
> > > So what happens if I do
> > >
> > >   contents <- getContents handle
> > >   putStr (take 5 contents)   -- assume that the implementation
> > >                              -- only reads a few chars
> > >   -- delete the file in some way
> > >   putStr (take 500 contents) -- but the file is not there now
> > >
> > > If an IO function is lazy, doesn't that break sequentiality? Sorry if
> > > the question is stupid.
> >
> > This is not a stupid question at all, and it highlights the main problem
> > with lazy IO. The solution is, in essence "don't do that, because Bad
> > Things will happen". It's pretty unsatisfactory, but there it is. For
> > this reason, lazy IO is widely regarded as somewhat dangerous (or even as
> > an outright misfeature, by a few).
> >
> > If you are going to be doing simple pipe-style IO (i.e., read some data
> > sequentially, manipulate it, spit out the output), lazy IO is very
> > convenient, and it makes putting together quick scripts very easy.
> > However, if you're doing something more advanced, you'd probably do best
> > to stay away from lazy IO.
>
> Since working on Data.ByteString.Lazy I'm now even more of a pro-lazy-IO
> zealot than I was before ;-)
>
> In practise I expect that most programs that deal with file IO strictly
> do not handle the file disappearing under them very well either.

That's probably true, except for especially robust applications where such a thing is a regular (or at least expected) event.

> At best they probably throw an exception and let something else clean
> up. The same can be done with lazy IO, though it requires using
> imprecise exceptions which some people grumble about. So I would
> contend that lazy IO is actually applicable in rather a wider range of
> circumstances than you might. :-)

Perhaps I should be more clear. When I said "advanced" above I meant "any use whereby you treat a file as random access, read/write storage, or do any kind of directory manipulation (including deleting and/or renaming files)". Lazy I/O (as it currently stands) doesn't play very nice with those use cases.

I agree generally with the idea that lazy I/O is good. The problem is that it is a "leaky abstraction"; details are exposed to the user that should ideally be completely hidden. Unfortunately, the leaks aren't likely to get plugged without pretty tight operating system support, which I suspect won't be happening anytime soon.

> Note also, that with lazy IO we can write really short programs that are
> blindingly quick. Lazy IO allows us to save a copy through the Handle
> buffer.

> BTW in the above case the "bad thing that will happen" is that contents
> will be truncated. As I said, I think it's better to throw an exception,
> which is what Data.ByteString.Lazy.hGetContents does.

Well, AFAIK, the behavior is officially undefined, which is my real beef. I agree that it _should_ throw an exception.

> Duncan

-- 
Rob Dockins

Talk softly and drive a Sherman tank.
Laugh hard, it's a long way to the bank.
       -- TMBG
Re: [Haskell-cafe] getContents and lazy evaluation
On Fri, 2006-09-01 at 16:28 -0400, Robert Dockins wrote:
> On Friday 01 September 2006 15:19, Tamas K Papp wrote:
> > Hi,
> >
> > I am a newbie, reading the Gentle Introduction. Chapter 7
> > (Input/Output) says
> >
> >   Pragmatically, it may seem that getContents must immediately read an
> >   entire file or channel, resulting in poor space and time performance
> >   under certain conditions. However, this is not the case. The key
> >   point is that getContents returns a "lazy" (i.e. non-strict) list of
> >   characters (recall that strings are just lists of characters in
> >   Haskell), whose elements are read "by demand" just like any other
> >   list. An implementation can be expected to implement this
> >   demand-driven behavior by reading one character at a time from the
> >   file as they are required by the computation.
> >
> > So what happens if I do
> >
> >   contents <- getContents handle
> >   putStr (take 5 contents)   -- assume that the implementation
> >                              -- only reads a few chars
> >   -- delete the file in some way
> >   putStr (take 500 contents) -- but the file is not there now
> >
> > If an IO function is lazy, doesn't that break sequentiality? Sorry if
> > the question is stupid.
>
> This is not a stupid question at all, and it highlights the main problem with
> lazy IO. The solution is, in essence "don't do that, because Bad Things will
> happen". It's pretty unsatisfactory, but there it is. For this reason, lazy
> IO is widely regarded as somewhat dangerous (or even as an outright
> misfeature, by a few).
>
> If you are going to be doing simple pipe-style IO (i.e., read some data
> sequentially, manipulate it, spit out the output), lazy IO is very
> convenient, and it makes putting together quick scripts very easy. However,
> if you're doing something more advanced, you'd probably do best to stay away
> from lazy IO.

Since working on Data.ByteString.Lazy I'm now even more of a pro-lazy-IO zealot than I was before ;-)

In practise I expect that most programs that deal with file IO strictly do not handle the file disappearing under them very well either. At best they probably throw an exception and let something else clean up. The same can be done with lazy IO, though it requires using imprecise exceptions which some people grumble about. So I would contend that lazy IO is actually applicable in rather a wider range of circumstances than you might. :-)

Note also, that with lazy IO we can write really short programs that are blindingly quick. Lazy IO allows us to save a copy through the Handle buffer.

BTW in the above case the "bad thing that will happen" is that contents will be truncated. As I said, I think it's better to throw an exception, which is what Data.ByteString.Lazy.hGetContents does.

Duncan
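As an illustration of the kind of short lazy-IO program meant here, a line counter over standard input might be written as below. This is not a program from the thread, just a sketch; it assumes the `bytestring` package that ships with GHC, and `countLines` is a made-up name.

```haskell
import qualified Data.ByteString.Lazy.Char8 as L

-- Count newline characters; the lazy ByteString is consumed chunk by
-- chunk, so the whole input never has to be in memory at once.
countLines :: L.ByteString -> Int
countLines = fromIntegral . L.count '\n'

main :: IO ()
main = L.getContents >>= print . countLines
```

It could be run as, say, `runghc Count.hs < somefile`, where the file name is whatever input is at hand.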
Re: [Haskell-cafe] getContents and lazy evaluation
On Friday 01 September 2006 15:19, Tamas K Papp wrote:
> Hi,
>
> I am a newbie, reading the Gentle Introduction. Chapter 7
> (Input/Output) says
>
>   Pragmatically, it may seem that getContents must immediately read an
>   entire file or channel, resulting in poor space and time performance
>   under certain conditions. However, this is not the case. The key
>   point is that getContents returns a "lazy" (i.e. non-strict) list of
>   characters (recall that strings are just lists of characters in
>   Haskell), whose elements are read "by demand" just like any other
>   list. An implementation can be expected to implement this
>   demand-driven behavior by reading one character at a time from the
>   file as they are required by the computation.
>
> So what happens if I do
>
>   contents <- getContents handle
>   putStr (take 5 contents)   -- assume that the implementation
>                              -- only reads a few chars
>   -- delete the file in some way
>   putStr (take 500 contents) -- but the file is not there now
>
> If an IO function is lazy, doesn't that break sequentiality? Sorry if
> the question is stupid.

This is not a stupid question at all, and it highlights the main problem with lazy IO. The solution is, in essence "don't do that, because Bad Things will happen". It's pretty unsatisfactory, but there it is. For this reason, lazy IO is widely regarded as somewhat dangerous (or even as an outright misfeature, by a few).

If you are going to be doing simple pipe-style IO (i.e., read some data sequentially, manipulate it, spit out the output), lazy IO is very convenient, and it makes putting together quick scripts very easy. However, if you're doing something more advanced, you'd probably do best to stay away from lazy IO.

Welcome to Haskell, BTW :-)

> Thanks,
>
> Tamas

-- 
Rob Dockins

Talk softly and drive a Sherman tank.
Laugh hard, it's a long way to the bank.
       -- TMBG
[Haskell-cafe] getContents and lazy evaluation
Hi,

I am a newbie, reading the Gentle Introduction. Chapter 7 (Input/Output) says

  Pragmatically, it may seem that getContents must immediately read an
  entire file or channel, resulting in poor space and time performance
  under certain conditions. However, this is not the case. The key
  point is that getContents returns a "lazy" (i.e. non-strict) list of
  characters (recall that strings are just lists of characters in
  Haskell), whose elements are read "by demand" just like any other
  list. An implementation can be expected to implement this
  demand-driven behavior by reading one character at a time from the
  file as they are required by the computation.

So what happens if I do

  contents <- getContents handle
  putStr (take 5 contents)   -- assume that the implementation
                             -- only reads a few chars
  -- delete the file in some way
  putStr (take 500 contents) -- but the file is not there now

If an IO function is lazy, doesn't that break sequentiality? Sorry if the question is stupid.

Thanks,

Tamas
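For anyone wanting to try the experiment in this question, a runnable sketch follows. It assumes the intended call was `hGetContents`, since the Prelude's `getContents` takes no handle argument and reads standard input. The file name `demo.txt` and the helper `firstChars` are hypothetical, and `System.Directory` is assumed to be available from the `directory` package. What the last line prints after the delete attempt depends on the platform and implementation, so no output is claimed for it.

```haskell
import System.IO
import System.Directory (removeFile)
import Control.Exception (IOException, try)

-- Hypothetical helper: the prefix the question asks for.
firstChars :: Int -> String -> String
firstChars = take

main :: IO ()
main = do
  -- Assumption: "demo.txt" is a scratch file we may create and delete.
  writeFile "demo.txt" (concat (replicate 1000 "0123456789"))
  handle <- openFile "demo.txt" ReadMode
  contents <- hGetContents handle       -- lazy: nothing need be read yet
  putStrLn (firstChars 5 contents)      -- forces only a prefix
  -- The delete may fail, e.g. on Windows while the handle is open.
  r <- try (removeFile "demo.txt") :: IO (Either IOException ())
  putStrLn (either (const "removeFile failed")
                   (const "removeFile succeeded") r)
  putStrLn (firstChars 500 contents)    -- forces more of the lazy list
```

The `try` wrapper is there so the program reaches the second read whether or not the deletion is permitted.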