Re: behaviour change in getDirectoryContents in GHC 7.2?
On 07/11/2011 17:57, Ian Lynagh wrote: On Mon, Nov 07, 2011 at 05:02:32PM +, Simon Marlow wrote: Basically, imagine a reversible transformation: encode :: String -> [Word8] decode :: [Word8] -> String this transformation is applied in the appropriate direction by the IO library to translate filesystem paths into FilePath and vice versa. No information is lost I think that would be great if it were true, but it isn't: $ touch `printf '\x80'` $ touch `printf '\xEE\xBE\x80'` $ ghc -e 'System.Directory.getDirectoryContents "." >>= print' ["\61312",".","\61312",".."] Both of those filenames get encoded as \61312 (U+EF80). Ouch, I missed that. I was under the impression that we guaranteed roundtripping, but it seems not. Max - we need to fix this. Cheers, Simon ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
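The collision Ian demonstrates can be reproduced in a few lines. This is a sketch of my reconstruction of the escaping scheme; the 0xEF00 offset is inferred from the output above, not taken from GHC's source:

```haskell
import Data.Char (chr)
import Data.Word (Word8)

-- Reconstruction of the escaping scheme under discussion: an undecodable
-- byte b is mapped to the code point 0xEF00 + b (offset inferred from
-- the output above, not taken from GHC's source).
escapeByte :: Word8 -> Char
escapeByte b = chr (0xEF00 + fromIntegral b)

-- The lone byte 0x80 is invalid UTF-8, so it escapes to U+EF80. But the
-- valid three-byte sequence EE BE 80 already denotes U+EF80, since
-- (0x0E * 4096) + (0x3E * 64) + 0x00 == 0xEF80. Both filenames therefore
-- decode to the same String, and encoding back cannot tell them apart:
-- the roundtrip is lossy.
collision :: Bool
collision = escapeByte 0x80 == chr (0x0E * 4096 + 0x3E * 64)
```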
Re: behaviour change in getDirectoryContents in GHC 7.2?
On 07/11/2011 17:32, John Millikin wrote: On Mon, Nov 7, 2011 at 09:02, Simon Marlow <marlo...@gmail.com> wrote: I think you might be misunderstanding how the new API works. Basically, imagine a reversible transformation: encode :: String -> [Word8] decode :: [Word8] -> String this transformation is applied in the appropriate direction by the IO library to translate filesystem paths into FilePath and vice versa. No information is lost; furthermore you can apply the transformation yourself in order to recover the original [Word8] from a String, or to inject your own [Word8] file path. Ok? I understand how the API is intended / designed to work; however, the implementation does not actually do this. My argument is that this transformation should be in a high-level library like directory, and the low-level libraries like base or unix ought to provide functions which do not transform their inputs. That way, when an error is found in the encoding logic, it can be fixed by just pushing a new version of the affected library to Hackage, instead of requiring a new version of the compiler. I am also not convinced that it is possible to correctly implement either of these functions if their behavior is dependent on the user's locale. All this does is mean that the common case where you want to interpret file system paths as text works with no fuss, without breaking anything in the case when the file system paths are not actually text. As mentioned earlier in the thread, this behavior is breaking things. Due to an implementation error, programs compiled with GHC 7.2 on POSIX systems cannot open files unless their paths also happen to be valid text according to their locale. It is very difficult to work around this error, because the paths-are-text logic was placed at a very low level in the library stack. So your objection is that there is a bug? What if we fixed the bug?
It would probably be better to have an abstract FilePath type and to keep the original bytes, decoding on demand. But that is a big change to the API and would break much more code. One day we'll do this properly; for now we have this, which I think is a pretty reasonable compromise. Please understand, I am not arguing against the existence of this encoding layer in general. It's a fine idea for a simplistic high-level filesystem interaction library. But it should be *optional*, not part of the compiler or base. Ok, so I was about to reply and say that the low-level API is available via the unix and Win32 packages, and then I thought I should check first, and I discovered that even using System.Posix you get the magic encoding behaviour. I really think we should provide the native APIs. The problem is that the System.Posix.Directory API is all in terms of FilePath (=String), and if we gave that a different meaning from the System.Directory FilePaths then confusion would ensue. So perhaps we need to add another API to System.Posix with filesystem operations in terms of ByteString, and similarly for Win32. Cheers, Simon
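The "abstract FilePath type that keeps the original bytes, decoding on demand" could look roughly like the following sketch. All names here are hypothetical, and the decoder is a deliberately simplified UTF-8 decoder (no overlong or surrogate checks), just to show the shape of the idea:

```haskell
import Data.Bits ((.&.))
import Data.Char (chr)
import Data.Word (Word8)

-- Hypothetical sketch, not the actual proposal: keep the raw bytes as
-- stored by the filesystem and decode to text only on demand, reporting
-- failure instead of escaping.
newtype OsPath = OsPath [Word8]

toText :: OsPath -> Maybe String
toText (OsPath bs0) = go bs0
  where
    -- Simplified UTF-8 decoder (no overlong/surrogate checks).
    go :: [Word8] -> Maybe String
    go [] = Just ""
    go (b : bs)
      | b < 0x80               = (chr (fromIntegral b) :) <$> go bs
      | b >= 0xC2 && b <= 0xDF = cont 1 (fromIntegral b .&. 0x1F) bs
      | b >= 0xE0 && b <= 0xEF = cont 2 (fromIntegral b .&. 0x0F) bs
      | b >= 0xF0 && b <= 0xF4 = cont 3 (fromIntegral b .&. 0x07) bs
      | otherwise              = Nothing
    cont :: Int -> Int -> [Word8] -> Maybe String
    cont 0 acc rest = (chr acc :) <$> go rest
    cont n acc (c : rest)
      | c >= 0x80 && c < 0xC0 =
          cont (n - 1) (acc * 64 + fromIntegral (c .&. 0x3F)) rest
    cont _ _ _ = Nothing
```

With this shape, an undecodable byte such as a lone 0x80 surfaces as `Nothing` instead of silently colliding with a file name that really contains U+EF80.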
Re: behaviour change in getDirectoryContents in GHC 7.2?
On 02/11/2011 21:40, Max Bolingbroke wrote: On 2 November 2011 20:16, Ian Lynagh <ig...@earth.li> wrote: Are you saying there's a bug that should be fixed? You can choose between two options: 1. Failing to roundtrip some strings (in our case, those containing the 0xEFNN byte sequences) 2. Having GHC's decoding functions return strings including codepoints that should not be allowed (i.e. lone surrogates) At the time I implemented this there was significant support for 2, so that is what we have. Don't you mean 1 is what we have? At the time I was convinced that 2 was the right thing to do, but now I'm more agnostic. But anyway the current behaviour is not really a bug -- it is by design :-) Failing to roundtrip in some cases, and doing so silently, seems highly suboptimal to me. I'm sorry I didn't pick up on this at the time (Unicode is a swamp :). Cheers, Simon
Should GHC default to -O1 ?
On the haskell-cafe as well as the beginners mailing lists, there frequently (for some value of frequent) are posts where the author inquires about a badly performing programme, in the form of stack overflows, space leaks or slowness. Often this is because they compiled their programme without optimisations; simply recompiling with -O or -O2 yields a decently performing programme. So I wonder, should ghc compile with -O1 by default? What would be the downsides?
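A minimal sketch of the kind of programme behind those reports: a lazy left fold that is fragile without optimisations but fine with them.

```haskell
import Data.List (foldl')

-- Compiled without optimisations, foldl builds ~1,000,000 nested (+)
-- thunks before anything is forced; with -O, strictness analysis usually
-- makes the accumulator strict so the fold runs in constant space.
lazySum :: Int
lazySum = foldl (+) 0 [1 .. 1000000]

-- The source-level fix, independent of -O, is the strict fold:
strictSum :: Int
strictSum = foldl' (+) 0 [1 .. 1000000]
```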
Re: Records in Haskell
On Monday, 07.11.2011, 21:41 +, Barney Hilken wrote: The problem with this approach is that different labels do not have different representations at the value level. I think this is an advantage, because it means you don't have to carry this stuff about at runtime. This allows me to pattern match records, since I can construct record patterns that contain fixed labels: X :& MyName1 := myValue1 :& MyName2 := myValue2 I cannot see how this could be done using kind String. Do you see a solution for this? A similar problem arises when you want to define a selector function. You could implement a function get that receives a record and a label as arguments. However, you could not say something like the following then: get myRecord MyName1 Instead, you would have to write something like this: get myRecord (Label :: MyName1) Just define a constant myName1 = Label :: MyName1 for each label you actually use, and you can use it in both get and pattern matching You cannot use such a constant in a pattern. You need a data constructor if you want to use pattern matching. Best wishes, Wolfgang
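Wolfgang's last point can be illustrated in a couple of lines (hypothetical label type, not the records package's actual encoding): a data constructor is valid in a pattern, while a constant in pattern position merely binds a fresh variable.

```haskell
-- A label as a data type with a nullary constructor: usable in patterns.
data MyName1 = MyName1

greet :: MyName1 -> String
greet MyName1 = "matched the label constructor"

-- A constant of the same type: fine as an argument to a selector
-- function, but in pattern position it would only bind a fresh
-- variable shadowing the name, not match this value.
myName1 :: MyName1
myName1 = MyName1
```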
Re: Should GHC default to -O1 ?
On Tue, Nov 8, 2011 at 6:31 AM, Daniel Fischer <daniel.is.fisc...@googlemail.com> wrote: On the haskell-cafe as well as the beginners mailing lists, there frequently (for some value of frequent) are posts where the author inquires about a badly performing programme, in the form of stack overflows, space leaks or slowness. Often this is because they compiled their programme without optimisations, simply recompiling with -O or -O2 yields a decently performing programme. So I wonder, should ghc compile with -O1 by default? What would be the downsides? I think this is an excellent idea. Even better, -O2.
Re: Compiling using gmake
On 05/11/2011 23:41, Christian Brolin wrote: I try to set up a GNU makefile for compiling Haskell programs with GHC. I want to generate dependencies automatically and I want to put my object (.o) files in binary-specific directories to be able to compile for different architectures. The problem is that when GHC derives the dependencies it names the object file for the Main module Main.o and not filename.o as it does if I don't specify an odir. This gives me two problems: first, I cannot have more than one Main module in the same directory, as I often need, e.g. for different test programs. The second problem is that it doesn't match my compile command, which always names the object files after the source files by just changing extensions from .hs to .o. So gmake does not recognize dependencies from my Main modules to other modules. I am stuck here. Any ideas? In GHC's build system we use explicit -o options, as well as -odir and -hidir. Cheers, Simon
Re: behaviour change in getDirectoryContents in GHC 7.2?
On Tue, Nov 8, 2011 at 03:04, Simon Marlow <marlo...@gmail.com> wrote: As mentioned earlier in the thread, this behavior is breaking things. Due to an implementation error, programs compiled with GHC 7.2 on POSIX systems cannot open files unless their paths also happen to be valid text according to their locale. It is very difficult to work around this error, because the paths-are-text logic was placed at a very low level in the library stack. So your objection is that there is a bug? What if we fixed the bug? My objection is that the current implementation provides no way to work around potential bugs. GHC is software. Like all software, it contains errors, and new features are likely to contain more errors. When adding behavior like automatic path encoding, there should always be a way to avoid or work around it, in case a severe bug is discovered. It would probably be better to have an abstract FilePath type and to keep the original bytes, decoding on demand. But that is a big change to the API and would break much more code. One day we'll do this properly; for now we have this, which I think is a pretty reasonable compromise. Please understand, I am not arguing against the existence of this encoding layer in general. It's a fine idea for a simplistic high-level filesystem interaction library. But it should be *optional*, not part of the compiler or base. Ok, so I was about to reply and say that the low-level API is available via the unix and Win32 packages, and then I thought I should check first, and I discovered that even using System.Posix you get the magic encoding behaviour. I really think we should provide the native APIs. The problem is that the System.Posix.Directory API is all in terms of FilePath (=String), and if we gave that a different meaning from the System.Directory FilePaths then confusion would ensue. So perhaps we need to add another API to System.Posix with filesystem operations in terms of ByteString, and similarly for Win32.
+1 I think most users would be OK with having System.Posix treat FilePath differently, as long as this is clearly documented, but if you feel a separate API is better then I have no objection. As long as there's some way to say "I know what I'm doing, here's the bytes" to the library. The Win32 package uses wide-character functions, so I'm not sure whether bytes would be appropriate there. My instinct says to stick with chars, via withCWString or equivalent. The package maintainer will have a better idea of what fits with the OS's idioms.
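For reference, the wide-character marshalling John mentions is already available from base's Foreign.C.String; a small sketch (the file name is just an example):

```haskell
import Foreign.C.String (peekCWString, withCWString)

-- Marshal a String to a NUL-terminated wchar_t buffer, as the Win32
-- bindings do for the *W family of API calls, and read it back;
-- non-ASCII code points survive the roundtrip.
roundtripWide :: String -> IO String
roundtripWide s = withCWString s peekCWString
```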
Re: Way to expose BLACKHOLES through an API?
On 07/11/2011 14:50, Ryan Newton wrote: Hi GHC users, When implementing certain concurrent systems-level software in Haskell it is good to be aware of all potentially blocking operations. Presently, blocking on an MVar is explicit (it only happens when you do a takeMVar), but blocking on a BLACKHOLE is implicit and can potentially happen anywhere. If there are known thunks where we, the programmers, know that contention might occur, would it be possible to create a variant of Control.Exception's evaluate that allows us to construct non-blocking software: evaluate :: a -> IO a evaluateNonblocking :: a -> IO (Maybe a) It would simply return Nothing if the value is BLACKHOLE'd. Of course it may be helpful to also distinguish the evaluated and unevaluated states. Further, the above simple version allows data races (it may become blackhole'd right after we evaluate). An extreme version would actively blackhole it to lock the thunk... but maybe that's overkill and there are some other good ideas out there. A mechanism like the one proposed should, for example, allow us to consume just as much of a lazy ByteString as has already been computed by a producer, WITHOUT blocking and waiting on that producer thread, or migrating the producer computation over to our own thread (blowing its cache). The problem is that a thunk may depend on other thunks, which may or may not themselves be BLACKHOLEs. So you might be a long way into evaluating the argument and have accumulated a deep stack before you encounter the BLACKHOLE. Hmm, but there is something you could do. Suppose a thread could be in a mode in which instead of blocking on a BLACKHOLE it would just throw an asynchronous exception WouldBlock. Any computation in progress would be safely abandoned via the usual asynchronous exception mechanism, and you could catch the exception to implement your evaluateNonBlocking operation. I'm not sure this would actually be useful in practice, but it's certainly doable.
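Until something like the proposed mechanism exists in the RTS, the API shape can be approximated with a timer. This is only my sketch and is semantically weaker: it abandons any evaluation that exceeds the budget, whether or not a BLACKHOLE was involved, whereas the proposal would abandon evaluation precisely via an asynchronous WouldBlock exception.

```haskell
import Control.Exception (evaluate)
import System.Timeout (timeout)

-- Approximation only: give evaluation to WHNF a tiny time budget and
-- read a timeout as "this would have blocked". The budget is in
-- microseconds; the number here is arbitrary.
evaluateNonblocking :: a -> IO (Maybe a)
evaluateNonblocking x = timeout 1000 (evaluate x)
```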
Cheers, Simon
Re: Should GHC default to -O1 ?
On Tue, Nov 8, 2011 at 7:11 AM, David Fox <dds...@gmail.com> wrote: On Tue, Nov 8, 2011 at 6:31 AM, Daniel Fischer <daniel.is.fisc...@googlemail.com> wrote: On the haskell-cafe as well as the beginners mailing lists, there frequently (for some value of frequent) are posts where the author inquires about a badly performing programme, in the form of stack overflows, space leaks or slowness. Often this is because they compiled their programme without optimisations, simply recompiling with -O or -O2 yields a decently performing programme. So I wonder, should ghc compile with -O1 by default? What would be the downsides? I think this is an excellent idea. Even better -O2. FWIW gcc defaults to -O0.
Re: Should GHC default to -O1 ?
On 08/11/2011 14:31, Daniel Fischer wrote: On the haskell-cafe as well as the beginners mailing lists, there frequently (for some value of frequent) are posts where the author inquires about a badly performing programme, in the form of stack overflows, space leaks or slowness. Often this is because they compiled their programme without optimisations, simply recompiling with -O or -O2 yields a decently performing programme. So I wonder, should ghc compile with -O1 by default? What would be the downsides? I understand the problem. However, -O has a couple of serious downsides: 1. it costs about 2x compile time and memory usage 2. it forces a lot more recompilation to happen after changes, due to inter-module optimisations. most people know about 1, but I think 2 is probably less well-known. When in the edit-compile-debug cycle it really helps to have -O off, because your compiles will be so much quicker due to both factors 1 & 2. So the default -O setting is a careful compromise, trying to hit a good compile-time/runtime tradeoff. Perhaps we're more sensitive in Haskell because -O can easily give you an order of magnitude or more speedup, whereas in C you're likely to get a pretty consistent 30% or so. The difference between -O and -O2 is another careful tradeoff. Also bear in mind that using GHCi gives you another 2x speedup in compilation (approx), but 30x slowdown in runtime (varies wildly from program to program though). And subsequent recompiles are much faster because GHCi has cached a lot of interfaces. I suppose we should really run an up-to-date set of benchmarks on some real Haskell programs (i.e. not nofib) and reconsider how we set these defaults. I really doubt that we'll want to turn on -O by default, though. Cheers, Simon
Re: Should GHC default to -O1 ?
On Tue, Nov 8, 2011 at 15:31, Daniel Fischer wrote: Often this is because they compiled their programme without optimisations, simply recompiling with -O or -O2 yields a decently performing programme. So I wonder, should ghc compile with -O1 by default? What would be the downsides? Previous discussion on this topic: http://thread.gmane.org/gmane.comp.lang.haskell.glasgow.user/18540 Regards, Sean
Re: Should GHC default to -O1 ?
On Tue, Nov 8, 2011 at 8:16 AM, Simon Marlow <marlo...@gmail.com> wrote: On 08/11/2011 14:31, Daniel Fischer wrote: On the haskell-cafe as well as the beginners mailing lists, there frequently (for some value of frequent) are posts where the author inquires about a badly performing programme, in the form of stack overflows, space leaks or slowness. Often this is because they compiled their programme without optimisations, simply recompiling with -O or -O2 yields a decently performing programme. So I wonder, should ghc compile with -O1 by default? What would be the downsides? I understand the problem. However, -O has a couple of serious downsides: 1. it costs about 2x compile time and memory usage 2. it forces a lot more recompilation to happen after changes, due to inter-module optimisations. most people know about 1, but I think 2 is probably less well-known. When in the edit-compile-debug cycle it really helps to have -O off, because your compiles will be so much quicker due to both factors 1 & 2. So the default -O setting is a careful compromise, trying to hit a good compile-time/runtime tradeoff. Perhaps we're more sensitive in Haskell because -O can easily give you an order of magnitude or more speedup, whereas in C you're likely to get a pretty consistent 30% or so. The difference between -O and -O2 is another careful tradeoff. Also bear in mind that using GHCi gives you another 2x speedup in compilation (approx), but 30x slowdown in runtime (varies wildly from program to program though). And subsequent recompiles are much faster because GHCi has cached a lot of interfaces. I suppose we should really run an up-to-date set of benchmarks on some real Haskell programs (i.e. not nofib) and reconsider how we set these defaults. I really doubt that we'll want to turn on -O by default, though. We should remember that we are only talking about which default leads to the best outcome when, due to inexperience, someone fails to set the option the way they want it.
Re: Should GHC default to -O1 ?
On Tuesday 08 November 2011, 17:16:27, Simon Marlow wrote: most people know about 1, but I think 2 is probably less well-known. When in the edit-compile-debug cycle it really helps to have -O off, because your compiles will be so much quicker due to both factors 1 & 2. Of course. So defaulting to -O1 would mean one has to specify -O0 in the .cabal or Makefile resp. on the command line during development, which certainly is an inconvenience. So the default -O setting is a careful compromise, trying to hit a good compile-time/runtime tradeoff. Perhaps we're more sensitive in Haskell because -O can easily give you an order of magnitude or more speedup, It can even make the difference between a smoothly running programme and a dying one, if one is naively using the right (wrong) constructs. So the nub of the question is, which downside is worse? My experience is limited, so I haven't sufficient data to form a reasoned opinion on that, hence I ask. whereas in C you're likely to get a pretty consistent 30% or so. The difference between -O and -O2 is another careful tradeoff. I suppose we should really run an up to date set of benchmarks on some real Haskell programs (i.e. not nofib) and reconsider how we set these defaults. I really doubt that we'll want to turn on -O by default, though. I suppose there are no cheap but effective optimisations one could move to the default behaviour, or that would've been done :(
RE: Records in Haskell
On Monday, 07.11.2011, 23:30 +, Simon Peyton-Jones wrote: Wolfgang, is there a wiki page giving a specific, concrete design for the proposal you advocate? Something at the level of detail of http://hackage.haskell.org/trac/ghc/wiki/Records/OverloadedRecordFields? Well, I don’t propose a new record system as a language feature. Instead, I’ve implemented a record system as a library. The paper at http://www.informatik.tu-cottbus.de/~jeltsch/research/ppdp-2010-paper.pdf describes this in detail, and the records package at http://hackage.haskell.org/package/records is the actual library. My stance is that it is possibly better if we do not try to include a one-size-fits-all record system into the language, but if the language provided support for basic things that almost all record system *libraries* would need. In my opinion, there is at least one such thing that should get language support: field labels. There is already the proposal at http://hackage.haskell.org/trac/haskell-prime/wiki/FirstClassLabels for first-class field labels. I am unsure whether you regard it as an alternative to the above, or something that should be done as well. And if the former, how does it relate to the challenge articulated on http://hackage.haskell.org/trac/ghc/wiki/Records, namely how to make Haskell's existing named-field system work better? I don’t think that everyone should use my record system. I see it as one member of a family of reasonable record systems. My intention, when developing my record system, was not to make the existing system better, since I needed quite a lot of advanced features that anything near Haskell’s existing record system couldn’t give me. So I started something completely new. Thanks Simon Best wishes, Wolfgang
Re: Records in Haskell
My stance is that it is possibly better if we do not try to include a one-size-fits-all record system into the language, but if the language provided support for basic things that almost all record system *libraries* would need. Agreed. To the extent that such libraries could be improved by sugar, a general solution for such library-specific sugar might be sought. But the record libraries tended to be quite useful, modulo (off the top of my head, probably incomplete): - first-class labels I wonder whether the recent lifting of values into the type-level (towards typed types) offers sufficient convenience? Do they address the sharing issue? Haven't had the time to read yet; - soundness All the type-class based libraries work in the grey area of things that GHC allows for pragmatic reasons and that Hugs disallowed for soundness reasons; the success of GHC shows that Hugs was too careful, but I'd prefer if GHC either acquired safe features that could replace the current interplay of FDs and Overlapping Instances, or if someone proved the set of features in use safe; - optimization there is no reason why record libraries need to be slow to run, and the compilation time increases needed to make it so might be optimized away, too; but someone needs to do the work, or record field selection will -naively and in overloaded style- involve linear lookup; If these were to be addressed, record libraries would be more widely acceptable than they are today (though I recommend playing with the ones that exist, and reporting on their strengths and weaknesses in practice); initially, everyone would use their favorite, but I am hopeful that a common API would emerge eventually, from use. We haven't had any luck agreeing on a common API before use, and none of the many good proposals have managed to sway everyone, which is why I agree on not settling on a single design just yet - just pave the road for wider adoption of record libraries).
Back to the side-line for me;-) Claus
Re: behaviour change in getDirectoryContents in GHC 7.2?
On 11/8/11 6:04 AM, Simon Marlow wrote: I really think we should provide the native APIs. The problem is that the System.Posix.Directory API is all in terms of FilePath (=String), and if we gave that a different meaning from the System.Directory FilePaths then confusion would ensue. So perhaps we need to add another API to System.Posix with filesystem operations in terms of ByteString, and similarly for Win32. +1. It'd be nice to have an abstract FilePath. But until that happens, it's important to distinguish the automagic type from the raw type. H98's FilePath=String vs ByteString seems a good way to do that. -- Live well, ~wren
Re: Should GHC default to -O1 ?
On Tue, Nov 8, 2011 at 3:01 PM, Daniel Fischer <daniel.is.fisc...@googlemail.com> wrote: On Tuesday 08 November 2011, 17:16:27, Simon Marlow wrote: most people know about 1, but I think 2 is probably less well-known. When in the edit-compile-debug cycle it really helps to have -O off, because your compiles will be so much quicker due to both factors 1 & 2. Of course. So defaulting to -O1 would mean one has to specify -O0 in the .cabal or Makefile resp. on the command line during development, which certainly is an inconvenience. AFAIK, Cabal already uses -O1 by default. Cheers, -- Felipe.
Re: Should GHC default to -O1 ?
Quoting Conrad Parker <con...@metadecks.org>: I don't think compile time is an issue for new users when building HelloWorld.hs and getting the hang of basic algorithms and data structures. Anyone could explicitly set -O0 if they are worried about compile times for a larger project. I don't agree that GHC's user interface should be optimized for newcomers to Haskell. GHC is an industrial-strength compiler with some very advanced features; the majority of its target audience is professional programmers. Let its interface reflect that fact. As Simon explained, GHC's current defaults are a very nice point in the programming space for people who are actively building and changing their programs. ~d