Re: [Haskell-cafe] Some random newbie questions
I'm constantly surprised hearing from so many people about their space problems. I cannot remember having space problems with my programs. I don't know what everybody else is doing wrong :-) At least two common cases. Extracting compact data structures from large files. The contents of the large file is read as a linked list (ugh) of pointers (double ugh) to 32-bit Chars (triple ugh) -- twelve times the size of the file, if my calculations are correct. The contents can't be GC'ed before the extracted data is fully evaluated. (Now if the file was an mmap'ed array, it wouldn't be so bad, perhaps in the next generation IO that people are discussing this will be easier?) Naive use of foldl. I tend to think the default foldl should be strict (ie. replaced by foldl') -- are there important cases where it needs to be lazy? Indeed, extracting a compact data structure from a large data structure can easily cost much space because of lazy evaluation. foldl is probably used mostly for that purpose. Me not having space problems is probably related to the kind of programs I write. Most of my programs are program transformations that take an abstract syntax tree as input and produce a different abstract syntax tree (again a large structure). Trying to be lazy then means trying to produce as much output as possible with processing as little output as possible. More formally that means if there is some partial input for a function such that for all completions of this partial input to fully defined inputs all outputs of the function have a common prefix, then the function should already yield this prefix as output for the partial input. In the example that I mentioned in my previous posting I did actually originally compute size information for a large data structure, so did extract something compact from something large. However, I then realised that I only did some very basic arithmetic with the size information before generating another large data structure of this computed size. So then I decided to not to compute integers at all but do the arithmetic directly on the algebraic data type. Gone was the space problem, without using seq. You might also want to look at Colin Runciman's paper What about the Natural Numbers? in the Journal of Functional Programming in which he argues for a type of lazy natural numbers, with the same semantics as data Nat = Zero | Succ Nat. It fits much better for computing the size of a lazy data structure. I don't claim that all space problems can easily be dealt with. Olaf ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Some random newbie questions
On Friday 07 January 2005 12:03, Ketil Malde wrote: Naive use of foldl. I tend to think the default foldl should be strict (ie. replaced by foldl') -- are there important cases where it needs to be lazy? Hi, One simple example would be, reverse = foldl (flip (:)) [] J.A. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Some random newbie questions
Jorge Adriano Aires [EMAIL PROTECTED] writes: Naive use of foldl. I tend to think the default foldl should be strict (ie. replaced by foldl') -- are there important cases where it needs to be lazy? Hi, One simple example would be, reverse = foldl (flip (:)) [] No, it would work with strict foldl too. In fact in the absence of optimization it would work better (uses less time and space). The optimization required is inlining and strictness analysis. A function which requires lazy foldl for correctness would have to be sometimes lazy in its first argument, and at the same time some partial results would have to be undefined. By function I mean the first argument of foldl, treated as a binary function. Here this doesn't apply because flip (:) x y is always defined. And another common case for foldl, sum, doesn't apply because (+) is usually strict on both arguments (although in principle it does not have to be true because of overloading, which implies that a compiler can only optimize particular specializations of sum, not generic sum). I don't know of any real-life example. Perhaps there are cases where evaluating partial results is correct but inefficient. I don't know such case either. -- __( Marcin Kowalczyk \__/ [EMAIL PROTECTED] ^^ http://qrnik.knm.org.pl/~qrczak/ ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Some random newbie questions
On Friday 07 January 2005 12:03, Ketil Malde wrote: Naive use of foldl. I tend to think the default foldl should be strict (ie. replaced by foldl') -- are there important cases where it needs to be lazy? Hi, One simple example would be, reverse = foldl (flip (:)) [] A better example would be building some other lazy structure that is strict on it's elements... J.A. --- module Test where import Data.List data L = E | !Int :+: L deriving Show -- my head h (x:+:xs) = x h E= error ops -- rev1 = foldl (flip (:+:)) E rev2 = foldl' (flip (:+:)) E l= [error , error , 1::Int] -- *Test h (rev1 l) 1 (0.00 secs, 264560 bytes) *Test h (rev2 l) *** Exception: (0.01 secs, 264524 bytes) ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Some random newbie questions
No, it would work with strict foldl too. In fact in the absence of optimization it would work better (uses less time and space). The optimization required is inlining and strictness analysis. Is this also true if your just going to use the first few elements after reversing it? A function which requires lazy foldl for correctness would have to be sometimes lazy in its first argument, and at the same time some partial results would have to be undefined. By function I mean the first argument of foldl, treated as a binary function. Here this doesn't apply because flip (:) x y is always defined. And another common case for foldl, sum, doesn't apply because (+) is usually strict on both arguments (although in principle it does not have to be true because of overloading, which implies that a compiler can only optimize particular specializations of sum, not generic sum). I don't know of any real-life example. Yes you are right, my bad. I was thinking as if lists were not lazy on their elements, therefore my second example... But yes, now it is not a common example anymore. Perhaps there are cases where evaluating partial results is correct but inefficient. I don't know such case either. Just replace the errors on my second example by some big computations... J.A. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Some random newbie questions
Jorge Adriano Aires [EMAIL PROTECTED] writes: No, it would work with strict foldl too. In fact in the absence of optimization it would work better (uses less time and space). The optimization required is inlining and strictness analysis. Is this also true if your just going to use the first few elements after reversing it? Yes. A strict fold would evaluate cons cells of the result while they are constructed, not list elements. They are all defined (flip (:) x y is always defined), so a strict foldl is correct. Making a cons cell should be not more expensive than making a thunk which will make a cons cell when evaluated. Well, unless the implementation doesn't inline flip and thus making these thunks is actually faster than running them. In this case a lazy foldl is more efficient than a strict foldl, as long as a sufficiently small part of the result is used; it's always less efficient if the whole result is examined. -- __( Marcin Kowalczyk \__/ [EMAIL PROTECTED] ^^ http://qrnik.knm.org.pl/~qrczak/ ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Some random newbie questions
On Sunday 09 January 2005 21:30, Marcin 'Qrczak' Kowalczyk wrote: Jorge Adriano Aires [EMAIL PROTECTED] writes: No, it would work with strict foldl too. In fact in the absence of optimization it would work better (uses less time and space). The optimization required is inlining and strictness analysis. Is this also true if your just going to use the first few elements after reversing it? Yes. A strict fold would evaluate cons cells of the result while they are constructed, not list elements. They are all defined (flip (:) x y is always defined), so a strict foldl is correct. Yes, now I was refering to the efficiency issue only. Making a cons cell should be not more expensive than making a thunk which will make a cons cell when evaluated. Ok, I wasn't sure about this... Well, unless the implementation doesn't inline flip and thus making these thunks is actually faster than running them. In this case a lazy foldl is more efficient than a strict foldl, as long as a sufficiently small part of the result is used; it's always less efficient if the whole result is examined. Right. J.A. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Some random newbie questions
(+) is usually strict on both arguments (although in principle it does not have to be true because of overloading, which implies that a compiler can only optimize particular specializations of sum, not generic sum). Since you mention it, there was some talk about this in the #haskell channel, and I wondered why aren't sum and product members of Num with default instances, just like mconcat is also a member of Data.Monoid.Monoid. From the docs: mconcat :: [a] - a Fold a list using the monoid. For most types, the default definition for mconcat will be used, but the function is included in the class definition so that an optimized version can be provided for specific types J.A. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Some random newbie questions
Many thanks to everyone for the very helpful answers to my queries! - Benjamin ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
RE: [Haskell-cafe] Some random newbie questions
| * As far as I can determine, there is no way to check pattern matches for | exhaustiveness. Coming from OCaml, this feels like losing a significant | safety net! How do people program so as not to be getting dynamic match | failures all the time? GHC has -fwarn-incomplete-patterns and -fwarn-overlapped-patterns. But the code implementing these checks is old and crufty, and the warnings are sometimes a bit wrong -- at least when guards and numeric literals are involved. I think they are accurate when you are just using ordinary pattern matching. Cleaning up this bit of GHC is a long-standing to-do item, if anyone feels motivated to undertake it. It's a well-defined task, with plenty of well-written papers explaining how to do it -- but it's tricker than it seems at first! Simon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Some random newbie questions
[EMAIL PROTECTED] writes: I'm constantly surprised hearing from so many people about their space problems. I cannot remember having space problems with my programs. I don't know what everybody else is doing wrong :-) At least two common cases. Extracting compact data structures from large files. The contents of the large file is read as a linked list (ugh) of pointers (double ugh) to 32-bit Chars (triple ugh) -- twelve times the size of the file, if my calculations are correct. The contents can't be GC'ed before the extracted data is fully evaluated. (Now if the file was an mmap'ed array, it wouldn't be so bad, perhaps in the next generation IO that people are discussing this will be easier?) Naive use of foldl. I tend to think the default foldl should be strict (ie. replaced by foldl') -- are there important cases where it needs to be lazy? I do disagree with people recommending strictness annotations (seq etc). In contrast, I make my programs as lazy as possible. ...but no lazier :-) -kzm -- If I haven't seen further, it is by standing in the footprints of giants ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Some random newbie questions
Benjamin Pierce wrote: OK, I'm taking the plunge and using Haskell in a course I'm teaching this semester. To get ready, I've been doing quite a bit of Haskell programming myself, and this has raised a few questions... * What are the relative advantages of Hugs and GHC, beyond the obvious (Hugs is smaller and easier for people not named Simon to modify, while GHC is a real compiler and has the most up-to-date hacks to the type checker)? Do people generally use one or the other for everything, or are they similar enough to use Hugs at some moments and GHC at others? I taught our FP class this fall using Hugs, but in the end wish that I had used GHC. There are lots of little reasons for this, but a big one was a problem with unpredictable space utilization. I don't have the examples at my fingertips, but there were simple variations of the same program that, by all common-sense reasoning, should have behaved in the opposite way with respect to space than what they exhibited. Indeed, the problem that you report in your Sierpinkski Carpet may likely be a problem with Hugs, and not the graphics lib, and Jacob Nelson's message seems to bear this out. SOEGraphics, by the way, is built on top of HGL, a general graphics lib written by Alastair Reid. At the time, it was the best option that we had, but Alastair no longer has time to maintain it, although I believe that Ross Paterson may be maintaining it now. In any case, SOEGraphics has grown a big buggy with respect to portability across platforms and compilers. I am about to update the SOE webpage with our current best shot at a portable and bug-free version of this, but ultimately I'd like to port everything over to something like wxHaskell. -Paul ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Some random newbie questions
On Thu, Jan 06, 2005 at 09:11:13AM -0800, Benjamin Pierce wrote: * As far as I can determine, there is no way to check pattern matches for exhaustiveness. Coming from OCaml, this feels like losing a significant safety net! How do people program so as not to be getting dynamic match failures all the time? ghc does give warnings when pattern matches aren't exhaustive, at least when called with the compile flags used with darcs. It seems that you may be interested in the -fwarn-incomplete-patterns compile flag with ghc. -- David Roundy http://civet.berkeley.edu/droundy/ ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Some random newbie questions
On Thu, 6 Jan 2005, Benjamin Pierce wrote: * What are the relative advantages of Hugs and GHC, beyond the obvious (Hugs is smaller and easier for people not named Simon to modify, while GHC is a real compiler and has the most up-to-date hacks to the type checker)? Do people generally use one or the other for everything, or are they similar enough to use Hugs at some moments and GHC at others? Hugs is compiles faster, that is, it detects type errors faster than GHC and thus it starts program execution earlier. So I use Hugs for fast type checking and simple scripts, that should start quickly rather than run short. I'm using GHC for maximum execution speed and to track down type errors, because its error messages are more detailed. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Some random newbie questions
Benjamin Pierce [EMAIL PROTECTED] writes: * What are the relative advantages of Hugs and GHC, beyond the obvious (Hugs is smaller and easier for people not named Simon to modify, while GHC is a real compiler and has the most up-to-date hacks to the type checker)? Do people generally use one or the other for everything, or are they similar enough to use Hugs at some moments and GHC at others? Hugs is written in C, it's easy to build and doesn't use much ram/cpu/drivespace. GHC can be difficult to bootstrap for less popular setups (IBM Mainframes, BeOS, Amiga, etc), and both building and using GHC can eat ram/cpu/drivespace. On the feature side, Hugs is just that, a Haskell User's Gofer System. GHC is more like a hotrod research compiler, there's always some neat new feature in CVS that does really cool stuff. (ie Software Transactional Memory) If you have a Sharp Zaurus, Hugs will work but GHC won't. * HUnit and QuickCheck seem to offer very nice -- but different -- testing facilities. Has anyone thought of combining them? (In fact, is HUnit actually used? The last revision seems to be a couple of years ago.) I hacked up a test-first version of QuickCheck that saves failing test cases and checks them again on the next run. That is effectively a combination of HUnit and QuickCheck. I sent in my code when the call for QuickCheck2 ideas happened. I know there was a recent presentation on QC2 at Chalmers, but I don't know if the test-first idea will be integrated, or when QC2 will be released. My code is an inflexible hack I wrote as a proof of concept, it's definitely not ready for real use. PS. TaPL was great, on #haskell we call it The Brick Book Does it already have a standard nickname? -- Shae Matijs Erisson - http://www.ScannedInAvian.com/ - Sockmonster once said: You could switch out the unicycles for badgers, and the game would be the same. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Some random newbie questions
Benjamin Pierce wrote: * What are the relative advantages of Hugs and GHC, beyond the obvious (Hugs is smaller and easier for people not named Simon to modify, while GHC is a real compiler and has the most up-to-date hacks to the type checker)? Do people generally use one or the other for everything, or are they similar enough to use Hugs at some moments and GHC at others? snip * I wrote a little program for generating Sierpinkski Carpets, and was astonished to find that it runs out of heap under Hugs (with standard settings -- raising the heap size with -h leads to a happier result). As one data point, I don't think SOEGraphics works with GHC or recent versions of Hugs (http://www.haskell.org/soe/graphics.htm). I also tried a modified version of your Sierpinkski carpet program (changed to spit out a PostScript file, since I don't have SOEGraphics). Hugs chokes without increasing the stack, while my copy of GHC 6.2.1 runs the program below quite fine, even without enabling optimizations. Greg Buchholz --Floating point PostScript version of Sierpinkski Carpet fillSquare x y s = putStr $ x1 ++ y2 ++ x1 ++ y1 ++ x2 ++ y1 ++ x2 ++ y2 ++ box\n where x1 = (show x)++ x2 = (show (x+s)) ++ y1 = (show y)++ y2 = (show (y+s)) ++ carpet x y s = if s 1 then fillSquare x y s else let s' = s / 3 in do carpet xys' carpet (x+s') ys' carpet (x+s'*2) ys' carpet x(y+s') s' carpet (x+s'*2) (y+s') s' carpet x(y+s'*2) s' carpet (x+s') (y+s'*2) s' carpet (x+s'*2) (y+s'*2) s' psPreamble = putStr $ %!PS-Adobe-2.0\n ++ /box\n ++ { newpath moveto lineto lineto lineto closepath fill} ++ def\n 0.05 setlinewidth\n main = do psPreamble carpet 50 250 500 putStr showpage\n ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Some random newbie questions
On Thu, 6 Jan 2005, Greg Buchholz wrote: As one data point, I don't think SOEGraphics works with GHC or recent versions of Hugs (http://www.haskell.org/soe/graphics.htm). I had trouble with this recently, and a friend of a friend suggested I use the latest GHC from CVS, and import Graphics.SOE, rather than SOEGraphics. I ran the original code under GHCi 6.3, importing Graphics.SOE, without problems. Jacob Nelson ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Some random newbie questions
OK, I'm taking the plunge and using Haskell in a course I'm teaching this semester. To get ready, I've been doing quite a bit of Haskell programming myself, and this has raised a few questions... * What are the relative advantages of Hugs and GHC, beyond the obvious (Hugs is smaller and easier for people not named Simon to modify, while GHC is a real compiler and has the most up-to-date hacks to the type checker)? Do people generally use one or the other for everything, or are they similar enough to use Hugs at some moments and GHC at others? All the below is my personal opinion: My impression is that GHC is by far the most used Haskell implementation. In my opinion, the only reason Hugs should be used is if it's the only implementation that will run on your system, otherwise GHC or NHC will likely make a better choice. Some of the differences between GHC and the last version of Hugs I've looked at are: 1) GHCi views it's repl as being in a do-block, with the most important consequence being the ability to define functions interactively, Hugs views the input as an expression so functions can only be defined locally. Neither are rather impressive interactive environments (not when compared to Squeak or CL listeners), but GHCi's is definitely more convenient. 2) GHC is generally acknowledged to do a (significantly) better job with error messages (both type and run-time errors). To me, the difference is so significant that, all other things being equal, GHC would still win hands-down. 3) Most development of third-party libraries and tools target GHC first, part of that has to do with 4) Beyond enhancements to type checking, GHC has many other extensions such as: template haskell, preemptive concurrency (Hugs does have cooperative concurrency), asynchronous exceptions, built-in arrow notation, support for generics, and more. 5) And of course, (3) is also caused by second order effects of itself, e.g. Yi uses GHC because it uses hs-plugins which is intimately related to GHC. 6) GHC has support for profiling. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe