Re: Unsafe hGetContents
On 10/10/2009 18:59, Iavor Diatchki wrote:

> Hello,
>
> well, I think that the fact that we seem to have a program context that can distinguish f1 from f2 is worth discussing because I would have thought that in a pure language they are interchangeable. The question is, does the context in Oleg's example really distinguish between f1 and f2? You seem to be saying that this is not the case: in both cases you end up with the same non-deterministic program that reads two numbers from the standard input and subtracts them but you can't assume anything about the order in which the numbers are extracted from the input---it is merely an artifact of the GHC implementation that with f1 the subtraction always happens the one way, and with f2 it happens the other way.
>
> I can (sort of) buy this argument, after all, it is quite similar to what happens with asynchronous exceptions (f1 (error 1) (error 2) vs f2 (error 1) (error 2)). Still, the whole thing does not smell right: there is some impurity going on here, and trying to offload the problem onto the IO monad only makes reasoning about IO computations even harder (and it is pretty hard to start with). So, discussion and alternative solutions should be strongly encouraged, I think.

Duncan has found a definition of hGetContents that explains why it has surprising behaviour, and that's very nice because it lets us write the compilers that we want to write, and we get to tell the users to stop moaning because the strange behaviour they're experiencing is allowed according to the spec. :-)

Of course, the problem is that users don't want the hGetContents that has non-deterministic semantics, they want a deterministic one. And for that, they want to fix the evaluation order (or something). The obvious drawback with fixing the evaluation order is that it ties the hands of the compiler developers, and makes a fundamental change to the language definition.

Things will get a lot worse in the future as we experiment with more elaborate compiler optimisations and evaluation strategies. I predict that eventually we'll have to ditch hGetContents, at least in its current generality.

Cheers,
Simon

> -Iavor
>
> On Sat, Oct 10, 2009 at 7:38 AM, Duncan Coutts duncan.cou...@googlemail.com wrote:
>> On Sat, 2009-10-10 at 02:51 -0700, o...@okmij.org wrote:
>>>> The reason it's hard is that to demonstrate a difference you have to get the lazy I/O to commute with some other I/O, and GHC will never do that.
>>>
>>> The keyword here is GHC. I may well believe that GHC is able to divine programmer's true intent and so it always does the right thing. But writing in the language standard ``do what the version x.y.z of GHC does'' does not seem very appropriate, or helpful to other implementors.
>>
>> With access to unsafeInterleaveIO it's fairly straightforward to show that it is non-deterministic. These programs that bypass the safety mechanisms on hGetContents just get us back to having access to the non-deterministic semantics of unsafeInterleaveIO.
>>
>>>> Haskell's IO library is carefully designed to not run into this problem on its own. It's normally not possible to get two Handles with the same FD...
>>>
>>> Is this behavior specified somewhere, or is this just an artifact of a particular GHC implementation?
>>
>> It is in the Haskell 98 report, in the design of the IO library. It does not mention FDs of course. The IO/Handle functions it provides give no (portable) way to obtain two read handles on the same OS file descriptor. The hGetContents behaviour of semi-closing is to stop you from getting two lazy lists of the same read Handle. There's nothing semantically wrong with you bypassing those restrictions (eg openFile /dev/fd/0) it just means you end up with a non-deterministic IO program, which is something we typically try to avoid.
>>
>> I am a bit perplexed by this whole discussion. It seems to come down to saying that unsafeInterleaveIO is non-deterministic and that things implemented on top are also non-deterministic. The standard IO library puts up some barriers to restrict the non-determinism, but if you walk around the barrier then you can still find it. It's not clear to me what is supposed to be surprising or alarming here.
>>
>> Duncan

___
Haskell-prime mailing list
Haskell-prime@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-prime
Re: Unsafe hGetContents
On Tue, 2009-10-20 at 13:58 +0100, Simon Marlow wrote:

> Duncan has found a definition of hGetContents that explains why it has surprising behaviour, and that's very nice because it lets us write the compilers that we want to write, and we get to tell the users to stop moaning because the strange behaviour they're experiencing is allowed according to the spec. :-)

:-)

> Of course, the problem is that users don't want the hGetContents that has non-deterministic semantics, they want a deterministic one. And for that, they want to fix the evaluation order (or something). The obvious drawback with fixing the evaluation order is that it ties the hands of the compiler developers, and makes a fundamental change to the language definition.

I've not yet seen anyone put forward any practical programs that have confusing behaviour but were not written deliberately to be as wacky as possible and avoid all the safety mechanisms.

The standard use case for hGetContents is reading a read-only file, or stdin, where it really does not matter when the read actions occur with respect to other IO actions. You could do it in parallel rather than on-demand and it'd still be ok.

There's the beginner mistake where people don't notice that they're not actually demanding anything before closing the file; that's nothing new of course.

Duncan
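
The beginner mistake mentioned above (closing the file before anything has demanded the lazy contents) is usually avoided by forcing the list while the handle is still open. The following is only a minimal sketch of that idea; the name readFileStrictly and the file name demo.txt are hypothetical, chosen for illustration.

```haskell
import Control.Exception (evaluate)
import System.IO

-- Read a whole file and force the lazy list before the handle is closed.
-- (Hypothetical helper; not part of any library discussed in the thread.)
readFileStrictly :: FilePath -> IO String
readFileStrictly path =
  withFile path ReadMode $ \h -> do
    s <- hGetContents h
    -- Demand every character while h is still open. Without this line,
    -- withFile would close h before any read had actually happened.
    _ <- evaluate (length s)
    return s

main :: IO ()
main = do
  writeFile "demo.txt" "hello\nworld\n"
  s <- readFileStrictly "demo.txt"
  putStr s
```

Dropping the evaluate line reproduces the mistake: the string is returned unevaluated and the reads are attempted only after the handle has been closed. What happens then depends on the version of the base library, so no particular outcome is claimed here.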
Re: Unsafe hGetContents
On Tue, 2009-10-20 at 15:45 +0100, Simon Marlow wrote:

>> I've not yet seen anyone put forward any practical programs that have confusing behaviour but were not written deliberately to be as wacky as possible and avoid all the safety mechanisms.
>>
>> The standard use case for hGetContents is reading a read-only file, or stdin, where it really does not matter when the read actions occur with respect to other IO actions. You could do it in parallel rather than on-demand and it'd still be ok.
>>
>> There's the beginner mistake where people don't notice that they're not actually demanding anything before closing the file; that's nothing new of course.
>
> If the parallel runtime reads files eagerly, that might hide a resource problem that would occur when the program is run on a sequential system, for example.

That's true, but we have the same problem without doing any IO. There are many ways of generating large amounts of data.

Duncan
Re: Unsafe hGetContents
On 11/10/2009 09:26, Florian Weimer wrote:

> * Simon Marlow:
>
>>> Oleg's example is quite close, don't you think?
>>> URL: http://www.haskell.org/pipermail/haskell/2009-March/021064.html
>>
>> Ah yes, if you have two lazy input streams both referring to the same underlying stream, that is enough to demonstrate a problem. As for whether Oleg's example is within the rules, it depends whether you consider fdToHandle as unsafe:
>
> Is relying on seq to show the difference allowed, according to your rules on an insecurity proof?

Absolutely.

> What about handles from System.Process? Do they count as well?

Sure - we hopefully don't consider System.Process to be unsafe.

Cheers,
Simon
Re: Unsafe hGetContents
* Simon Marlow:

>> Oleg's example is quite close, don't you think?
>> URL: http://www.haskell.org/pipermail/haskell/2009-March/021064.html
>
> Ah yes, if you have two lazy input streams both referring to the same underlying stream, that is enough to demonstrate a problem. As for whether Oleg's example is within the rules, it depends whether you consider fdToHandle as unsafe:

Is relying on seq to show the difference allowed, according to your rules on an insecurity proof?

What about handles from System.Process? Do they count as well?
Re: Unsafe hGetContents
Hmm,

Don't you think forkIO deserves some of the same complaints as unsafeInterleaveIO? Things happen in a nondeterministic order!

I think what irritates us about unsafeInterleaveIO is that it's IO that tinkers with the internals of the Haskell evaluation system. The OS can't do it: in a C program it might, because there's libc and debuggers and all kinds of things that understand compiled C to some extent. But the Haskell runtime system is pretty much obfuscated to anyone except ourselves. This obfuscation maintains its conceptual purity to a greater extent than is really guaranteed by the standards. This obfuscation is supported in our minds by the fact that functions (->) cannot be compared for equality or deconstructed or serialized in any way, only applied.

forkIO causes IO to happen in a nondeterministic order. So does unsafeInterleaveIO. But for unsafeInterleaveIO, the nondeterminism depends in part on how pure functions are written: partly because there is a compiler that makes arbitrary choices, and also partly affected by the strictness properties of the functions. This feels disconcerting to us.

And worse: I am not sure if forkIO has a formal guarantee that its IO will complete, but we tend to assume that it will, sooner or later; unsafeInterleaveIO might not happen at all, and frequently does not, due to the observations of how pure functions are written. It's disconcerting. It can affect how we choose to write our pure code, the same way that efficiency and memory concerns can.

But if 'catch' can catch a different exception depending even, conceptually, on the phase of the moon, it is a similarly strange stretch to imagine unsafeInterleaveIO doing so. It plays with chronology (like forkIO does) and with the ways Haskell functions are written (like 'catch' does) at the same time. A result is that it makes a lot of people confused when they do something they didn't intend with it.

Also, it's a powerful enough tool that when you want to replace its formal nondeterminism with something more precise, you may have quite a bit of work cut out for you, restructuring your code (like Darcs did, IIRC).

-Isaac
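
The remark that an action deferred with unsafeInterleaveIO "might not happen at all" can be seen in a few lines. This is a small sketch under stated assumptions: an IORef flag stands in for a real side effect, and the names neverDemanded and demanded are hypothetical.

```haskell
import Control.Exception (evaluate)
import Data.IORef
import System.IO.Unsafe (unsafeInterleaveIO)

-- Defer a write and never demand its result: the write does not run.
neverDemanded :: IO Bool
neverDemanded = do
  ran <- newIORef False
  _ <- unsafeInterleaveIO (writeIORef ran True)
  readIORef ran

-- Defer the same write, then force the deferred result: now it runs.
demanded :: IO Bool
demanded = do
  ran <- newIORef False
  u <- unsafeInterleaveIO (writeIORef ran True)
  evaluate u
  readIORef ran

main :: IO ()
main = do
  neverDemanded >>= print  -- flag still False: nothing forced the action
  demanded >>= print       -- flag is True: evaluate forced it
```

Whether the effect happens is decided purely by whether some later code demands the value, which is the dependence on "how pure functions are written" described above.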
Re: Unsafe hGetContents
Hello,

well, I think that the fact that we seem to have a program context that can distinguish f1 from f2 is worth discussing because I would have thought that in a pure language they are interchangeable. The question is, does the context in Oleg's example really distinguish between f1 and f2? You seem to be saying that this is not the case: in both cases you end up with the same non-deterministic program that reads two numbers from the standard input and subtracts them but you can't assume anything about the order in which the numbers are extracted from the input---it is merely an artifact of the GHC implementation that with f1 the subtraction always happens the one way, and with f2 it happens the other way.

I can (sort of) buy this argument, after all, it is quite similar to what happens with asynchronous exceptions (f1 (error 1) (error 2) vs f2 (error 1) (error 2)). Still, the whole thing does not smell right: there is some impurity going on here, and trying to offload the problem onto the IO monad only makes reasoning about IO computations even harder (and it is pretty hard to start with). So, discussion and alternative solutions should be strongly encouraged, I think.

-Iavor

On Sat, Oct 10, 2009 at 7:38 AM, Duncan Coutts duncan.cou...@googlemail.com wrote:
> On Sat, 2009-10-10 at 02:51 -0700, o...@okmij.org wrote:
>>> The reason it's hard is that to demonstrate a difference you have to get the lazy I/O to commute with some other I/O, and GHC will never do that.
>>
>> The keyword here is GHC. I may well believe that GHC is able to divine programmer's true intent and so it always does the right thing. But writing in the language standard ``do what the version x.y.z of GHC does'' does not seem very appropriate, or helpful to other implementors.
>
> With access to unsafeInterleaveIO it's fairly straightforward to show that it is non-deterministic. These programs that bypass the safety mechanisms on hGetContents just get us back to having access to the non-deterministic semantics of unsafeInterleaveIO.
>
>>> Haskell's IO library is carefully designed to not run into this problem on its own. It's normally not possible to get two Handles with the same FD...
>>
>> Is this behavior specified somewhere, or is this just an artifact of a particular GHC implementation?
>
> It is in the Haskell 98 report, in the design of the IO library. It does not mention FDs of course. The IO/Handle functions it provides give no (portable) way to obtain two read handles on the same OS file descriptor. The hGetContents behaviour of semi-closing is to stop you from getting two lazy lists of the same read Handle. There's nothing semantically wrong with you bypassing those restrictions (eg openFile /dev/fd/0) it just means you end up with a non-deterministic IO program, which is something we typically try to avoid.
>
> I am a bit perplexed by this whole discussion. It seems to come down to saying that unsafeInterleaveIO is non-deterministic and that things implemented on top are also non-deterministic. The standard IO library puts up some barriers to restrict the non-determinism, but if you walk around the barrier then you can still find it. It's not clear to me what is supposed to be surprising or alarming here.
>
> Duncan
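
The f1/f2 question can be reproduced in miniature without touching stdin. What follows is only a sketch and not Oleg's program: the two lazy input streams are replaced by a shared IORef counter (an assumption made so the example is self-contained), and tick, demo, f1 and f2 are hypothetical names.

```haskell
import Data.IORef
import System.IO.Unsafe (unsafeInterleaveIO)

-- A deferred action: when its result is first demanded, bump the shared
-- counter and return the new value.
tick :: IORef Int -> IO Int
tick ref = unsafeInterleaveIO $ do
  modifyIORef' ref (+ 1)
  readIORef ref

-- Two pure functions that differ only in which argument they mention first.
-- Note that seq does not promise an evaluation order.
f1, f2 :: Int -> Int -> Int
f1 x y = x `seq` y `seq` x - y
f2 x y = y `seq` x `seq` x - y

-- Apply a pure function to two deferred reads of the same counter.
demo :: (Int -> Int -> Int) -> IO Int
demo f = do
  ref <- newIORef 0
  a <- tick ref
  b <- tick ref
  return $! f a b

main :: IO ()
main = do
  demo f1 >>= print
  demo f2 >>= print
```

Whichever argument is forced first sees 1 and the other sees 2, so each call yields either -1 or 1. Which of the two appears for f1 and for f2 is left to the implementation and its optimisation settings, which is the point under discussion; the sketch deliberately claims no particular output.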
Re: Unsafe hGetContents
On 03/10/2009 19:59, Florian Weimer wrote:

> * Nicolas Pouillard:
>
>> Excerpts from Florian Weimer's message of Wed Sep 16 22:17:08 +0200 2009:
>>> Are there any plans to get rid of hGetContents and the semi-closed handle state for Haskell Prime? (I call hGetContents unsafe because it adds side effects to pattern matching, strictly speaking invalidating most of the transformations which are expected to be valid in a pure language.)
>>
>> Would you consider something like [1] as an acceptable replacement?
>>
>> [1]: http://hackage.haskell.org/package/safe-lazy-io
>
> It only addresses two known issues with lazy I/O, doesn't it? It still injects input operations into pure code not in the IO monad.

While what you say is true, and I've complained about the same thing myself in the past, it turns out to be quite difficult to demonstrate the unsafety. Try it! Here's the rules.

- write a program that gives different results when compiled with different optimisation flags only. (one exception: you're not allowed to take advantage of -fno-state-hack).
- Using exceptions is not allowed (they're non-deterministic).
- A difference caused by resources (e.g. stack overflow) doesn't count.
- The only unsafe operation you're allowed to use is hGetContents.
- You're allowed to use any other I/O operations, including from libraries, as long as they're not unsafe, and as long as the I/O itself is deterministic.

The reason it's hard is that to demonstrate a difference you have to get the lazy I/O to commute with some other I/O, and GHC will never do that. If you find a way to do it, then we'll probably consider it a bug in GHC.

You can get lazy I/O to commute with other lazy I/O, and perhaps with some cunning arrangement of pipes (or something) that might be a way to solve the puzzle. Good luck!

Cheers,
Simon
Re: Unsafe hGetContents
Excerpts from Simon Marlow's message of Tue Oct 06 14:59:06 +0200 2009:
> On 03/10/2009 19:59, Florian Weimer wrote:
>> * Nicolas Pouillard:
>>
>>> Excerpts from Florian Weimer's message of Wed Sep 16 22:17:08 +0200 2009:
>>>> Are there any plans to get rid of hGetContents and the semi-closed handle state for Haskell Prime? (I call hGetContents unsafe because it adds side effects to pattern matching, strictly speaking invalidating most of the transformations which are expected to be valid in a pure language.)
>>>
>>> Would you consider something like [1] as an acceptable replacement?
>>>
>>> [1]: http://hackage.haskell.org/package/safe-lazy-io
>>
>> It only addresses two known issues with lazy I/O, doesn't it? It still injects input operations into pure code not in the IO monad.
>
> While what you say is true, and I've complained about the same thing myself in the past, it turns out to be quite difficult to demonstrate the unsafety. Try it! Here's the rules.
>
> - write a program that gives different results when compiled with different optimisation flags only. (one exception: you're not allowed to take advantage of -fno-state-hack).
> - Using exceptions is not allowed (they're non-deterministic).
> - A difference caused by resources (e.g. stack overflow) doesn't count.
> - The only unsafe operation you're allowed to use is hGetContents.
> - You're allowed to use any other I/O operations, including from libraries, as long as they're not unsafe, and as long as the I/O itself is deterministic.
>
> The reason it's hard is that to demonstrate a difference you have to get the lazy I/O to commute with some other I/O, and GHC will never do that. If you find a way to do it, then we'll probably consider it a bug in GHC.
>
> You can get lazy I/O to commute with other lazy I/O, and perhaps with some cunning arrangement of pipes (or something) that might be a way to solve the puzzle. Good luck!

Oleg's example is quite close, don't you think?

URL: http://www.haskell.org/pipermail/haskell/2009-March/021064.html

Cheers,

-- 
Nicolas Pouillard
http://nicolaspouillard.fr
Re: Unsafe hGetContents
On 06/10/2009 14:18, Nicolas Pouillard wrote:
> Excerpts from Simon Marlow's message of Tue Oct 06 14:59:06 +0200 2009:
>> On 03/10/2009 19:59, Florian Weimer wrote:
>>> * Nicolas Pouillard:
>>>
>>>> Excerpts from Florian Weimer's message of Wed Sep 16 22:17:08 +0200 2009:
>>>>> Are there any plans to get rid of hGetContents and the semi-closed handle state for Haskell Prime? (I call hGetContents unsafe because it adds side effects to pattern matching, strictly speaking invalidating most of the transformations which are expected to be valid in a pure language.)
>>>>
>>>> Would you consider something like [1] as an acceptable replacement?
>>>>
>>>> [1]: http://hackage.haskell.org/package/safe-lazy-io
>>>
>>> It only addresses two known issues with lazy I/O, doesn't it? It still injects input operations into pure code not in the IO monad.
>>
>> While what you say is true, and I've complained about the same thing myself in the past, it turns out to be quite difficult to demonstrate the unsafety. Try it! Here's the rules.
>>
>> - write a program that gives different results when compiled with different optimisation flags only. (one exception: you're not allowed to take advantage of -fno-state-hack).
>> - Using exceptions is not allowed (they're non-deterministic).
>> - A difference caused by resources (e.g. stack overflow) doesn't count.
>> - The only unsafe operation you're allowed to use is hGetContents.
>> - You're allowed to use any other I/O operations, including from libraries, as long as they're not unsafe, and as long as the I/O itself is deterministic.
>>
>> The reason it's hard is that to demonstrate a difference you have to get the lazy I/O to commute with some other I/O, and GHC will never do that. If you find a way to do it, then we'll probably consider it a bug in GHC.
>>
>> You can get lazy I/O to commute with other lazy I/O, and perhaps with some cunning arrangement of pipes (or something) that might be a way to solve the puzzle. Good luck!
>
> Oleg's example is quite close, don't you think?
>
> URL: http://www.haskell.org/pipermail/haskell/2009-March/021064.html

Ah yes, if you have two lazy input streams both referring to the same underlying stream, that is enough to demonstrate a problem. As for whether Oleg's example is within the rules, it depends whether you consider fdToHandle as unsafe:

Haskell's IO library is carefully designed to not run into this problem on its own. It's normally not possible to get two Handles with the same FD, however GHC.IO.Handle.hDuplicate also lets you do this.

Cheers,
Simon
Re: Unsafe hGetContents
On Tue, 2009-10-06 at 15:18 +0200, Nicolas Pouillard wrote:

>> The reason it's hard is that to demonstrate a difference you have to get the lazy I/O to commute with some other I/O, and GHC will never do that. If you find a way to do it, then we'll probably consider it a bug in GHC.
>>
>> You can get lazy I/O to commute with other lazy I/O, and perhaps with some cunning arrangement of pipes (or something) that might be a way to solve the puzzle. Good luck!
>
> Oleg's example is quite close, don't you think?
>
> URL: http://www.haskell.org/pipermail/haskell/2009-March/021064.html

I didn't think that showed very much. He showed two different runs of two different IO programs where he got different results after having bypassed the safety switch on hGetContents.

It shows that lazy IO is non-deterministic, but then we knew that. It didn't show anything was impure.

As a software engineering thing, it's recommended to use lazy IO in the cases where the non-determinism has a low impact, ie where the order of the actions with respect to other actions doesn't really matter. When it does matter then your programs will probably be more comprehensible if you do the actions more explicitly.

For example we have the shoot-yourself-in-the-foot restriction that you can only use hGetContents on a handle a single time (this is the safety mechanism that Oleg turned off) and after that you cannot write to the same handle. That's not because it'd be semantically unsound if those restrictions were not there, but it would let you write some jolly confusing non-deterministic programs.

Duncan
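
The single-use restriction described above can be observed directly: once hGetContents has put a handle into the semi-closed state, a second hGetContents on it is rejected. A minimal sketch, assuming GHC's standard System.IO; the helper name secondRead and the file name demo2.txt are hypothetical.

```haskell
import Control.Exception (IOException, try)
import System.IO

-- Call hGetContents twice on one handle and report what the second call does.
-- (Hypothetical helper, for illustration only.)
secondRead :: FilePath -> IO (Either IOException String)
secondRead path =
  withFile path ReadMode $ \h -> do
    _first <- hGetContents h   -- h is now semi-closed
    try (hGetContents h)       -- the safety mechanism refuses a second lazy list

main :: IO ()
main = do
  writeFile "demo2.txt" "some input\n"
  r <- secondRead "demo2.txt"
  case r of
    Left e  -> putStrLn ("second hGetContents rejected: " ++ show e)
    Right _ -> putStrLn "second hGetContents was accepted"
```

The exact wording of the error is implementation-specific, so the sketch only distinguishes rejection from acceptance.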
Re: Unsafe hGetContents
On 16/09/2009 21:17, Florian Weimer wrote:

> Are there any plans to get rid of hGetContents and the semi-closed handle state for Haskell Prime? (I call hGetContents unsafe because it adds side effects to pattern matching, strictly speaking invalidating most of the transformations which are expected to be valid in a pure language.)

There is no current proposal for this, no. Feel free to start one; information about the process for Haskell Prime proposals is here:

http://hackage.haskell.org/trac/haskell-prime/wiki/Process

Cheers,
Simon
Re: Unsafe hGetContents
On 17/09/2009 13:58, Nicolas Pouillard wrote:
> Excerpts from Florian Weimer's message of Wed Sep 16 22:17:08 +0200 2009:
>> Are there any plans to get rid of hGetContents and the semi-closed handle state for Haskell Prime? (I call hGetContents unsafe because it adds side effects to pattern matching, strictly speaking invalidating most of the transformations which are expected to be valid in a pure language.)
>
> Would you consider something like [1] as an acceptable replacement?
>
> [1]: http://hackage.haskell.org/package/safe-lazy-io

I rather like this as a workaround for the most common practical problems with lazy I/O, those of resource control. It doesn't address the deeper concern that lazy I/O requires a particular evaluation order and is therefore a bit warty as a language feature - implementing lazy I/O properly in GHC's parallel mutator was somewhat tricky.

I'm not of the opinion that we should throw out lazy I/O, but it's still a problematic area in Haskell.

Cheers,
Simon
Re: Unsafe hGetContents
Excerpts from Simon Marlow's message of Mon Sep 21 11:41:38 +0200 2009:
> On 16/09/2009 21:17, Florian Weimer wrote:
>> Are there any plans to get rid of hGetContents and the semi-closed handle state for Haskell Prime? (I call hGetContents unsafe because it adds side effects to pattern matching, strictly speaking invalidating most of the transformations which are expected to be valid in a pure language.)
>
> There is no current proposal for this, no. Feel free to start one; information about the process for Haskell Prime proposals is here:
>
> http://hackage.haskell.org/trac/haskell-prime/wiki/Process

An alternate proposition (instead of removing it) would be to move it to System.IO.Unsafe.

-- 
Nicolas Pouillard
http://nicolaspouillard.fr
Re: Unsafe hGetContents
Excerpts from Simon Marlow's message of Mon Sep 21 11:52:41 +0200 2009:
> On 17/09/2009 13:58, Nicolas Pouillard wrote:
>> Excerpts from Florian Weimer's message of Wed Sep 16 22:17:08 +0200 2009:
>>> Are there any plans to get rid of hGetContents and the semi-closed handle state for Haskell Prime? (I call hGetContents unsafe because it adds side effects to pattern matching, strictly speaking invalidating most of the transformations which are expected to be valid in a pure language.)
>>
>> Would you consider something like [1] as an acceptable replacement?
>>
>> [1]: http://hackage.haskell.org/package/safe-lazy-io
>
> I rather like this as a workaround for the most common practical problems with lazy I/O, those of resource control. It doesn't address the deeper concern that lazy I/O requires a particular evaluation order and is therefore a bit warty as a language feature

When using safe-lazy-io we no longer rely (or rely a lot less) on the evaluation order (assuming you mean the order of side-effects), since the way of combining the different inputs is statically chosen by the user.

> - implementing lazy I/O properly in GHC's parallel mutator was somewhat tricky.
>
> I'm not of the opinion that we should throw out lazy I/O, but it's still a problematic area in Haskell.

Maybe the 'unsafeGetContents' feature required by safe-lazy-io would be less problematic; in particular it does not have to ignore exceptions.

Best regards,

-- 
Nicolas Pouillard
http://nicolaspouillard.fr
Re: Unsafe hGetContents
Excerpts from Florian Weimer's message of Wed Sep 16 22:17:08 +0200 2009:
> Are there any plans to get rid of hGetContents and the semi-closed handle state for Haskell Prime? (I call hGetContents unsafe because it adds side effects to pattern matching, strictly speaking invalidating most of the transformations which are expected to be valid in a pure language.)

Would you consider something like [1] as an acceptable replacement?

[1]: http://hackage.haskell.org/package/safe-lazy-io

-- 
Nicolas Pouillard
http://nicolaspouillard.fr
Re: Unsafe hGetContents
fw:
> Are there any plans to get rid of hGetContents and the semi-closed handle state for Haskell Prime? (I call hGetContents unsafe because it adds side effects to pattern matching, strictly speaking invalidating most of the transformations which are expected to be valid in a pure language.)

Isn't this a broader complaint about lazy IO in general?

-- Don
Re: Unsafe hGetContents
* Don Stewart:

> fw:
>> Are there any plans to get rid of hGetContents and the semi-closed handle state for Haskell Prime? (I call hGetContents unsafe because it adds side effects to pattern matching, strictly speaking invalidating most of the transformations which are expected to be valid in a pure language.)
>
> Isn't this a broader complaint about lazy IO in general?

Yes, sort of. But doesn't lazy input derive its justification from being present in the prelude?