[Haskell-cafe] Re: MonadFix
Daniel Fischer wrote:

> apfelmus writes:
>
>>     | r == 0    = p : f (p:ps) q
>>     | p*p > n   = [n]
>>     | otherwise = f ps n
>
> However, when you do the sensible thing (which Joost did) and have the
> intsqrt as a parameter of the function, like in
>
>     factorize :: Integer -> [Integer]
>     factorize n = f n (intsqrt n) primes'
>       where
>         primes' = something more or less clever here
>         f m sr (p:ps)
>             | r == 0    = p : f q (intsqrt q) (p:ps)
>             | p > sr    = if m == 1 then [] else [m]
>             | otherwise = f m sr ps
>           where (q,r) = m `quotRem` p
>
> then you only have the expensive intsqrt for each prime factor, and
> the test for each candidate is only one comparison instead of one
> multiplication and one comparison.

Ah thanks, sorry for not seeing it earlier. My thinking was that intsqrt q is calculated multiple times (for q, q/p, q/p^2, ...) per prime candidate p when n is divisible by a large power of p. But I didn't see that, in return, intsqrt q stays the same for candidates that aren't factors of n. Compared to that, p*p is only calculated once per candidate, but then for every candidate. The former is clearly preferable to the latter.

In fact, thanks to lazy evaluation, the above code (testing r == 0 before p > sr) evaluates intsqrt q at most once per prime candidate and thus combines both advantages.

Regards,
apfelmus
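For reference, here is a self-contained, runnable rendering of this scheme. The candidate list and the Double-based intsqrt are crude illustrative stand-ins for the "something more or less clever" above (the latter is fine for moderate inputs, not for very large ones):

    -- a minimal sketch of the trial-division scheme discussed above
    factorize :: Integer -> [Integer]
    factorize n = f n (intsqrt n) primes'
      where
        -- all primes plus a few harmless extra odd numbers
        primes' = 2 : 3 : scanl (+) 5 (cycle [2,4])
        f m sr (p:ps)
            | r == 0    = p : f q (intsqrt q) (p:ps)  -- divide out the factor p
            | p > sr    = if m == 1 then [] else [m]  -- remainder is 1 or prime
            | otherwise = f m sr ps
          where (q,r) = m `quotRem` p

    -- crude stand-in; exact enough for word-sized inputs
    intsqrt :: Integer -> Integer
    intsqrt = floor . sqrt . (fromInteger :: Integer -> Double)

    main :: IO ()
    main = print (factorize 600851475143)  -- [71,839,1471,6857]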
[Haskell-cafe] Re: MonadFix
Joost Behrends wrote:

> apfelmus writes:
>> Huh? p < intsqrt n is evaluated just as often as p*p > n, with
>> changing n. Why would that be less expensive? Btw, the code above
>> tests for r == 0 first, which means that the following p*p > n is
>> tested exactly once for every prime candidate p.
>
> No. One point in the introduction of DivIter is that intsqrt dividend
> is stored there and only recomputed when a new factor is found.

Yes, I'm sorry, it didn't occur to me that recomputing per actual prime factor (with multiplicities, i.e. p^5 counted 5 times) is better than recomputing per candidate factor (without multiplicities, i.e. p^5 counted only once).

> And concerning my cycled lists of summands like [2,4] or
> [2,4,2,4,2,4,2,6,2,6]: an easily computable function stepping through
> all primes can only be a function which yields primes plus some more
> odd numbers. This is what I tried.

Yes, this scheme was my intention, too. The list primes' doesn't need to (and indeed shouldn't) be a list of actual primes, just a good guess like

    primes' = 2:[3,5..]
    primes' = 2:3:scanl (+) 5 (cycle [2,4])

or something with [2,4,2,4,2,4,2,6,2,6]. So, it's an infinite list of numbers that qualify as candidates for being a prime factor of n (which I called "prime candidates"; not a good name, since they don't need to be actual prime numbers). A tiny check of this is sketched after this post.

What I want to say is that using such an infinite list is a nice way to separate the generate-prime-factor-candidates logic from the test-all-those-candidates loop. It's not necessary to hard-code it with a predicate like

    iterdivisors x
        | x == 0    = 3
        | x == 1    = 5
        | otherwise = iterdivisors (x-1) + ((cycle [2,4]) !! x)

(which, in this particular form, is hopelessly inefficient) or special step functions like

    d2 :: DivIter -> DivIter
    d2 x | dividend x `mod` divisor x > 0 = x { divisor = divisor x + 2 }
         | otherwise                      = found x

    d4 :: DivIter -> DivIter
    d4 x | dividend x `mod` divisor x > 0 = x { divisor = divisor x + 4 }
         | otherwise                      = found x

    d6 :: DivIter -> DivIter
    d6 x | dividend x `mod` divisor x > 0 = x { divisor = divisor x + 6 }
         | otherwise                      = found x

    divisions :: DivIter -> DivIter
    divisions y
        | or [divisor y == 3, divisor y == 5] = divisions (d2 y)
        | divisor y <= bound y = divisions (d6$d2$d6$d4$d2$d4$d2$d4 y)
        | otherwise            = y

It's not even necessary to treat 2 in a special way like

    twosect :: (Integer,[Integer]) -> (Integer,[Integer])
    twosect m | odd  (fst m) = m
              | even (fst m) = twosect (div (fst m) 2, snd m ++ [2])

does; the primes' list handles it all.

Regards,
apfelmus
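To make the separation concrete, here is a tiny self-contained check (the name candidates is mine) that the scanl formulation really enumerates all primes plus a few extra odd numbers:

    candidates :: [Integer]
    candidates = 2 : 3 : scanl (+) 5 (cycle [2,4])

    main :: IO ()
    main = print (take 12 candidates)
    -- [2,3,5,7,11,13,17,19,23,25,29,31]; 25 is the first non-prime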
[Haskell-cafe] Re: Printing and Referential transparency excuse
Cristian Baboi wrote:

> While reading the Haskell language report I noticed that function type
> is not an instance of class Read. I was told that one cannot define
> them as an instance of class Show without breaking referential
> transparency or printing a constant.
>
>     f :: (a -> b) -> String
>     f x = "bla bla bla"
>
> How can I define a function to do the inverse operation?
>
>     g :: String -> (a -> b)
>
> This time I cannot see how referential transparency will deny it.
> What's the excuse now?

The new excuse is that a better name for g is

    full-fledged-compiler :: String -> (Int -> Int)

(the function returned by g better not have a polymorphic type). Which programming language should the argument String be written in?

Regards,
apfelmus
[Haskell-cafe] Re: Wikipedia on first-class object
Wolfgang Jeltsch wrote:

> If x doesn’t equal y, x == y is False, but if x equals y, x == y might
> be True or undefined.

x == y may be _|_ for the False case, too, depending on its implementation (like first comparing all list elements at even indices and then comparing all list elements at odd indices). But the standard == for lists has indeed the stated property.

Regards,
apfelmus
[Haskell-cafe] Re: Wikipedia on first-class object
Achim Schneider wrote:

> Jonathan Cast wrote:
>> More importantly, we can prove that [1..] == [1..] = _|_, since
>>
>>     [1..] == [1..]
>>       = LUB (n >= 1) ([1..n] ++ _|_ == [1..n] ++ _|_)
>>       = LUB (n >= 1) _|_
>>       = _|_
>
> As far as I understand http://www.haskell.org/haskellwiki/Bottom ,
> only computations which cannot be successful are bottom, not those
> that can be successful but aren't. Kind of idealizing reality, that
> is.

Ah, that's only a glitch in the wording. [1..] == [1..] is still _|_ since it loops forever. For more about _|_, see also

    http://en.wikibooks.org/wiki/Haskell/Denotational_semantics

Regards,
apfelmus
[Haskell-cafe] Re: Wikipedia on first-class object
Cristian Baboi wrote:

> http://en.wikipedia.org/wiki/First-class_object
>
> The term was coined by Christopher Strachey in the context of
> “functions as first-class citizens” in the mid-1960's. [1]
> Depending on the language, this can imply:
>
> 1. being expressible as an anonymous literal value
> 2. being storable in variables
> 3. being storable in data structures
> 4. having an intrinsic identity (independent of any given name)
> 5. being comparable for equality with other entities
> 6. being passable as a parameter to a procedure/function
> 7. being returnable as the result of a procedure/function
> 8. being constructable at runtime
> 9. being printable
> 10. being readable
> 11. being transmissible among distributed processes
> 12. being storable outside running processes
>
> I'll guess that 5, 9, 12 do not apply to Haskell functions.

Exactly, together with 10 and 11 (when the distributed processes are on different machines). But there is good reason that those things can't be done in Haskell. With extensional equality (two functions are considered equal if they yield the same result on every possible argument), number 5 is undecidable. Similarly, there cannot be functions

    print   :: (Int -> Int) -> String
    compile :: String -> (Int -> Int)

with

    compile . print = id

A print function based on an intensional representation (assembly, byte code, etc.) would have to distinguish extensionally equal functions,

    print f ≠ print g   although   f = g

which is not allowed.

More importantly, I don't quite understand your question. If you definitively need 9-12 for a practical problem at hand, then you may want to take a look at the functional language Clean

    http://clean.cs.ru.nl/

which is similar to Haskell but offers 9-12 in some form. In all other cases, an email thread is not a good (often not even successful) way to get a coherent world view on Haskell (or on anything else), since this necessarily involves nitpicking philosophical questions. In my experience, interrogating one person in real time in audio and interrogating books are the best ways to do that. Concerning books, maybe "The Haskell Road to Logic, Maths and Programming"

    http://www.cwi.nl/~jve/HR

is for you. More books on

    http://haskell.org/haskellwiki/Books

You don't have to buy them, borrow them from a library.

Regards,
apfelmus
[Haskell-cafe] Re: Wikipedia on first-class object
Jules Bean wrote:

> Cristian Baboi wrote:
>> But I guess it won't work because the compiler will optimize it and
>> the call will disappear.
> I'm not quite sure what you're trying to say here.

(That's the "world view on Haskell" issue I mean. I know this feeling from physics: what the heck are they talking about, this is just mathematically wrong, ill-defined or otherwise void of contents?)

Cristian Baboi wrote:

> How can Clean be similar to Haskell and at the same time satisfy 9-12?

In Clean, print has the type

    print :: (Int -> Int) -> Dynamic

but there is (hopefully) no equality on Dynamic. But it can be stored in a file or something,

    store :: Dynamic -> IO ()

and loaded back. Thanks to IO, we can think of the file contents as a non-deterministically chosen intensional representation for a value with extensional equality. I don't know whether Clean really does store it that way; it may do more and hence break the extensional semantics a bit.

Regards,
apfelmus
[Haskell-cafe] Re: Difference lists and ShowS
Henning Thielemann wrote:

> I can't see it. If I consider (x++y) but I do not evaluate any element
> of (x++y) or only the first element, then this will need constant
> time. If I evaluate the first n elements I need n computation time
> units. How is (.) on difference lists faster than (++) here?

That's a very good question. Basically, the problem is: how to specify the time complexity of an operation under lazy evaluation? You argue that (++) is constant time in the sense that evaluating (x ++ y) to WHNF is O(1) when x and y are already in WHNF. Same for (.). This is indeed correct but apparently fails to explain why (.) is any better than (++). Help!

Of course, this very paradox shows that just looking at WHNF is not enough. The next best description is to pretend that our language is strict and to consider full normal forms:

    x in NF, y in NF  ==>  (x ++ y) evaluates to NF in O(length x) time

Even when x and y are not in normal form, we know that evaluating the expression (x ++ y) takes

    O(x ++ y) ~ O(length x) + O(x) + O(y)

time to evaluate to NF. Here, O(e) is the time needed to bring the expression e into NF first. This approach now explains that (++) takes quadratic time when used left-associatively:

    O((x ++ y) ++ z)
      ~ O(length x + length y) + O(length x) + O(x) + O(y) + O(z)

instead of the expected

    O((x ++ y) ++ z) ~ O(x) + O(y) + O(z)

or something (only up to constant factors and stuff, but you get the idea). Note that considering NFs is still only an approximation, since

    O(head (qsort xs)) ~ O(n) + O(xs)            where n = length xs

instead of the expected

    O(head (qsort xs)) ~ O(qsort xs)
                       ~ O(n log n) + O(xs)      where n = length xs

thanks to lazy evaluation. Also note that despite considering full normal forms, we can express some laziness with this by giving timings for an expression in different contexts like

    O(take n ys)
    O(head ys)

instead of only O(ys). Same for parameters, with something like O(const x) ~ O(1) instead of the anticipated O(const x) ~ O(x). (For lazy data structures, there are better ways to take laziness into account.)

> With difference lists I write
>
>     shows L . (shows T . shows R)
>     (shows LL . (shows LT . shows LR)) . (shows T . shows R)
>     ((shows LLL . (shows LLT . shows LLR)) . (shows LT . shows LR))
>         . (shows T . shows R)
>
> I still need to resolve three (.) until I get to the first character
> of the result string, but for the subsequent characters I do not need
> to resolve those dots. In the end, resolution of all (.) may need some
> time but then concatenation is performed entirely right-associative.
> Seems to me that this is the trick ...

So far so good, but the problem now is that analyzing (.) with full normal forms is doomed, since this would mean to evaluate things under the lambda, which may take less time than doing call-by-need reductions. Still, we somehow have

    O(x . y) ~ O(x) + O(y)

which is better than O(x ++ y), but I'm not quite sure how to make this exact yet. In the end, the above O(e)s are just random doodles conjured out of the hat; I don't know a formalism for easy reasoning about time in a lazy language. Anyone any pointers? Note that the problem is already present for difference lists in strict languages.

Regards,
apfelmus
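As a concrete reminder of the two representations being compared here: a difference list is just a function that prepends its contents, exactly like the Prelude's ShowS. The helper names below are mine:

    type DiffList a = [a] -> [a]   -- same idea as the Prelude's ShowS

    fromDiff :: DiffList a -> [a]
    fromDiff f = f []

    -- Left-nested (++) re-traverses the left operand again and again,
    -- while the composed version only builds closures and performs all
    -- actual (:) work right-associatively when applied to [].
    left, left' :: [Int]
    left  = (([1,2] ++ [3,4]) ++ [5,6]) ++ [7,8]
    left' = fromDiff (((([1,2]++) . ([3,4]++)) . ([5,6]++)) . ([7,8]++))

    main :: IO ()
    main = print (left == left')  -- True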
[Haskell-cafe] Re: An interesting monad: Prompt
Felipe Lessa wrote:

> Ryan Ingram wrote:
>> [snip]
>> data Prompt (p :: * -> *) :: (* -> *) where
>>     PromptDone :: result -> Prompt p result
>>     -- a is the type needed to continue the computation
>>     Prompt :: p a -> (a -> Prompt p result) -> Prompt p result
>> [snip]
>> runPromptM :: Monad m => (forall a. p a -> m a) -> Prompt p r -> m r
>> runPromptM _ (PromptDone r) = return r
>> runPromptM f (Prompt pa c)  = f pa >>= runPromptM f . c
>> [snip]
>
> How can we prove that (runPromptM prompt === id)? I was trying to go
> with

You probably mean runPromptM prompt = id.

>     runPromptM prompt (PromptDone r) = return r = PromptDone r
>
>     runPromptM prompt (Prompt pa c)
>       = prompt pa >>= runPromptM prompt . c
>       = Prompt pa return >>= runPromptM prompt . c
>       = Prompt pa ((>>= (runPromptM prompt . c)) . return)
>       = Prompt pa (runPromptM prompt . c)
>
> and I got stuck here. It seems obvious that we can strip out the
> 'runPromptM prompt' down there to finish the proof, but that doesn't
> sound very nice, as I'd need to suppose what I'm trying to prove. Am I
> missing something here?

You want to deduce

    runPromptM prompt (Prompt pa c) = Prompt pa c

from

    runPromptM prompt (Prompt pa c) = Prompt pa (runPromptM prompt . c)

by somehow assuming that runPromptM prompt = id, at least when applied to c. If it were a problem about lists like

    foldr (:) [] = id

you could use mathematical induction. You can do the same here, but you need to use coinduction. For more, see also

    http://www.cs.nott.ac.uk/~gmh/bib.html#corecursion

Regards,
apfelmus
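For completeness, the derivation above uses prompt pa = Prompt pa return together with the Monad instance for Prompt; a sketch of both, reconstructed from the quoted snippet (the Functor/Applicative boilerplate is added so it compiles on current GHC; it was not needed at the time of this thread):

    import Control.Monad (ap, liftM)

    instance Functor (Prompt p) where
        fmap = liftM

    instance Applicative (Prompt p) where
        pure  = PromptDone
        (<*>) = ap

    instance Monad (Prompt p) where
        -- left identity (return x >>= k = k x) is used in the proof
        PromptDone r >>= k = k r
        Prompt pa c  >>= k = Prompt pa (\a -> c a >>= k)

    -- the lifting whose interpretation should be the identity
    prompt :: p a -> Prompt p a
    prompt pa = Prompt pa return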
[Haskell-cafe] Re: US Homeland Security program language security risks
Achim Schneider wrote:

> That's an interesting task: design a non-Turing-complete, restricted
> language in which every expression is decidable, without making the
> language unusable for usual programming problems.

Have a look at dependently typed languages like Epigram:

    http://www.e-pig.org/

Regards,
apfelmus
[Haskell-cafe] Re: Basic question concerning the category Hask (was: concerning data constructors)
Yitzchak Gale wrote:

> I wrote:
>> ...it was recently claimed on this list that tuples are not products
>> in that category.
> I've not been convinced yet.

I'm going to try to convince you :)

The crucial problem with Haskell's product is that

    (_|_,_|_) ≠ _|_

but that the two projections

    fst :: (A,B) -> A
    snd :: (A,B) -> B

cannot distinguish between both values. But if (,) were a categorical product, fst and snd would completely determine it. We would have the universal property that for every

    f :: C -> A
    g :: C -> B

there is a _unique_ morphism

    f &&& g :: C -> (A,B)

subject to

    f = fst . (f &&& g)
    g = snd . (f &&& g)

In other words, there is a unique function

    (&&&) :: forall c . (c -> A) -> (c -> B) -> (c -> (A,B))
    f &&& g = \c -> (f c, g c)

In the particular case of C = (A,B), f = fst and g = snd, the identity function is such a morphism, which means

    fst &&& snd = id

due to uniqueness. But

    id _|_ ≠ id (_|_,_|_)

while clearly

    (fst &&& snd) _|_ = (fst &&& snd) (_|_,_|_)

Derek Elkins wrote:

> Also, there is a Haskell-specific problem at the very get-go. The most
> obvious choice for the categorical composition operator assuming the
> obvious choice for the arrows and objects does not work...
> ...This can easily be fixed by making the categorical (.) strict in
> both arguments and there is no formal problem with it being different
> from Haskell's (.), but it certainly is not intuitively appealing.

Note that the problem with (.) is seq's fault (pun intended :) Otherwise, it would be impossible to distinguish _|_ from its eta-expansion \x -> _|_.

Regards,
apfelmus
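The failure of uniqueness can be observed directly in code: seq distinguishes _|_ from (_|_,_|_), so (fst &&& snd) cannot be id. A minimal demonstration (&&& is the one from Control.Arrow):

    import Control.Arrow ((&&&))  -- f &&& g = \c -> (f c, g c)

    bottom :: (Int, Int)
    bottom = undefined

    main :: IO ()
    main = do
        -- (fst &&& snd) rebuilds a lazy pair, so seq succeeds here ...
        putStrLn ((fst &&& snd) bottom `seq` "pair constructor rebuilt")
        -- ... but forcing bottom itself throws: hence fst &&& snd /= id
        putStrLn (bottom `seq` "never reached")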
[Haskell-cafe] Re: Concurrency questions
Andrew Coppin wrote:

> 2. You have to take the data out of an MVar to read it. In other
> words, only 1 thread can read an MVar at once [by design]. This isn't
> truly a problem in the current case, but it's irritating in principle
> that I can't make it so that once the cell is written, multiple
> threads can read it simultaneously...

This is also known as an I-structure, i.e. an IVar. I think you can simulate them via MVar with the readMVar function?

Regards,
apfelmus
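A minimal sketch of such a write-once cell on top of MVar, along these lines (the IVar names are my own; note that readMVar briefly takes and restores the value, so readers are not fully simultaneous, but all of them block until the cell is written and then get the value):

    import Control.Concurrent.MVar

    newtype IVar a = IVar (MVar a)

    newIVar :: IO (IVar a)
    newIVar = fmap IVar newEmptyMVar

    -- blocks until the cell is full, then leaves the value in place
    readIVar :: IVar a -> IO a
    readIVar (IVar m) = readMVar m

    -- succeeds only for the first writer
    writeIVar :: IVar a -> a -> IO Bool
    writeIVar (IVar m) x = tryPutMVar m x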
[Haskell-cafe] Re: PROPOSAL: Some more 'Applicative' combinators
Sterling Clover wrote:

> The more general question, which stems from a certain degree of
> ignorance on my part, is what uses folks have found for Alternative at
> all outside of parsing.

Alternative is the generalization of MonadPlus, so empty and <|> are useful for lists and Maybe at least. But I'm not sure whether

    many :: Alternative f => f a -> f [a]

and friends have any uses outside of parsing.

Regards,
apfelmus
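Tiny examples of the list and Maybe uses just mentioned (nothing beyond Control.Applicative):

    import Control.Applicative (Alternative(..))

    firstHit :: Maybe Int
    firstHit = Nothing <|> Just 1 <|> Just 2   -- Just 1: first success wins

    both :: [Int]
    both = [1,2] <|> [3]                       -- [1,2,3]: <|> is (++) on lists

    main :: IO ()
    main = print (firstHit, both)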
[Haskell-cafe] Re: Difference lists and ShowS
Albert Y. C. Lai wrote:

> apfelmus wrote:
>> I don't know a formalism for easy reasoning about time in a lazy
>> language. Anyone any pointers? Note that the problem is already
>> present for difference lists in strict languages.
>
> http://homepages.inf.ed.ac.uk/wadler/topics/strictness-analysis.html
> especially "strictness analysis aids time analysis".

Ah, of course, thanks. Together with

    D. Sands. Complexity Analysis for a Lazy Higher-Order Language.
    http://citeseer.ist.psu.edu/291589.html

for the higher-order case, a satisfactory analysis can be put together.

The formalism is basically as follows: for a function f, let f^T denote the time needed to execute it to weak normal form given that its arguments are already in weak normal form. Weak normal form = full normal form for algebraic values and lambda-abstraction for functions = what you'd expect in a strict language. Plain values = nullary functions. For instance,

    (++)^T [] ys     = 1 + ys^T = 1
    (++)^T (x:xs) ys = 1 + (x:(xs ++ ys))^T
                     = 1 + (++)^T xs ys + xs^T + ys^T   -- (:) is free
                     = 1 + (++)^T xs ys

    ==> (++)^T xs ys = O(length xs)

Substituting a function application by the function body is counted as 1 time step; that's where the "1 +" comes from. For difference lists, we have

    (.)^T f g = O(1)

since it immediately returns the lambda-abstraction \x -> f (g x).

Now, we missed something important about difference lists, namely the function

    toList f = f []

that turns a difference list into an ordinary list, and this function is O(n). In contrast, the pendant for ordinary lists, i.e. the identity function, is only O(1). Why is it O(n)? Well, (.) itself may be O(1), but it constructs a function that needs lots of time to run. In particular,

    (f . g)^T [] = ((\x -> f (g x)) [])^T
                 = 1 + (f (g []))^T
                 = 1 + f^T (g []) + (g [])^T
                 = 1 + f^T (g []) + g^T []

So, to analyze higher-order functions, we simply have to keep track of the size of the returned functions (more precisely, Sands uses cost-closures). The above reduces to

    (f . g)^T [] = 1 + f^T [] + g^T []

since our difference lists don't care what they are prepended to:

    f^T xs = f^T []

Cheating a bit with the notation, we can write

    toList^T (f . g) = 1 + toList^T f + toList^T g

This means that a difference list built out of n elements by m applications of (.) will take O(n + m) time. This is the same as O(m) because m >= n, our lists being concatenations of singletons. That's not O(n) as anticipated, but it's alright: a concatenation of m empty lists is empty but clearly takes O(m) time, so the number of concatenations matters.

Since difference lists offer such a good concatenation, why not replace ordinary lists entirely? Well, the problem is that we have another function that should run fast, namely head. For ordinary lists,

    head^T xs = O(1)

but for difference lists, we have

    (head . toList)^T f = O(m)

which is >= O(n) in the worst case, lazy evaluation notwithstanding.

How to analyze lazy evaluation? Wadler's approach is to add an extra argument to every expression which says how much of the expression is to be evaluated. This extra information can be encoded via projections. But I think it's sufficient here to let (head expr)^T symbolize the time to reduce expr to weak head normal form. For example,

    (head . toList)^T (f . g) = 1 + (head . toList)^T f

assuming that f is nonempty. But due to the "1 +", any left-nested composition like

    head . toList $ (((a . b) . c) . d) . e

still needs O(m) time. So, difference lists are no eierlegende Wollmilchsau either.
Regards,
apfelmus
[Haskell-cafe] Re: Difference lists and ShowS
Achim Schneider wrote:

> Henning Thielemann wrote:
>> apfelmus wrote:
>>> So, difference lists are no eierlegende Wollmilchsau either.
>> LEO's forum suggests 'swiss army knife' as translation. :-)
>
> But you really need one with 5 differently-sized blades plus three
> specialized carving blades, an USB stick, microscope, 13 kinds of
> torx, imbus etc. drivers each, a TV set (analogue/digital) with
> unfoldable touchscreen, at least 3-band GSM and WiFi connectivity,
> hydraulic car jack and chain saw to award it with the term
> "egg-laying woolmilkpig".

But even such knives still can't lay eggs :(

Regards,
apfelmus
[Haskell-cafe] Re: Displaying steps in my interpreter
Victor Nazarov wrote:

>> import Control.Monad.Writer
>>
>> -- use a difference list or something for better performance
>> type Trace v = [Zipper v]
>>
>> whnf :: Term v -> Writer (Trace v) (Term v)
>> whnf t = whnf' ([],t)
>>     where
>>     whnf' (AppL e':cxt, Lam x e) = tell [(cxt, App (Lam x e) e')]
>>                                    >> whnf' (cxt, subst x e' e)
>>     whnf' (cxt, App f e)         = whnf' (AppL e:cxt, f)
>>     whnf' z                      = return $ unwind z
>>
>> The definition of whnf is basically left unchanged, except that a
>> redex is recorded via tell whenever a beta-reduction is about to be
>> performed. The recorded terms can be printed afterwards
>>
>> printTrace :: Writer (Trace v) (Term v) -> IO ()
>> printTrace w = let (t, ts) = runWriter w
>>                in  putStrLn . unlines . map show $ ts
>
> Is this function lazy? Can I run this code on a term without nf and
> print the n first steps:
>
> printTraceN :: Int -> Writer (Trace v) (Term v) -> IO ()
> printTraceN n w = let (t, ts) = runWriter w
>                   in  putStrLn . unlines . map show $ take n ts
>
> Will this work:
>
> printTraceN 5 (read "(\x. x x x) (\x. x x x)")

Yes, it should (you can just try, right? :). That's because

    tell w >> something

is basically

    let (w', x) = something in (w ++ w', x)

Now, something may loop forever, but w ++ w' doesn't care; it prepends w no matter what w' is. Of course, asking for x (the normal form in our case) instead of w ++ w' won't succeed when something loops forever.

(Cave: this is only true for the writer monad in Control.Monad.Writer.Lazy, which is imported by default. The writer monad in Control.Monad.Writer.Strict intentionally behaves differently.)

Regards,
apfelmus
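The claimed laziness of the default writer monad is easy to check in isolation. A toy example using mtl's Control.Monad.Writer, where loop stands in for tracing a term without a normal form:

    import Control.Monad.Writer

    -- never terminates, yet produces its log incrementally
    loop :: Int -> Writer [Int] ()
    loop n = tell [n] >> loop (n + 1)

    main :: IO ()
    main = print (take 5 (snd (runWriter (loop 0))))  -- [0,1,2,3,4]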
[Haskell-cafe] Re: [newbie question] Memoization automatic in Haskell?
Henning Thielemann wrote:

> David Benbennick wrote:
>> But how can I implement memoization for a more complicated function?
>> For example, perhaps I want to memoize
>>
>>     f :: String -> Int -> Double -> String -> Bool
>
> There was a long thread about a sophisticated technique called "blue
> prints", which allows you to use binary search trees as memoizing
> structure.
> http://www.haskell.org/pipermail/haskell-cafe/2006-September/018204.html

That's not what blueprints were for. You want generalized tries here,

    Ralf Hinze. Generalizing generalized tries.
    http://www.informatik.uni-bonn.de/~ralf/publications/GGTries.ps.gz

as Luke pointed out.

Regards,
apfelmus
[Haskell-cafe] Re: [newbie question] Memoization automatic in Haskell?
Luke Palmer wrote:

> David Benbennick wrote:
>> It would be nice if I could just tell the compiler "I command you to
>> memoize this function", and have it produce the required code
>> automatically.
>
> Tru dat! But it's not clear what the best way for the compiler writer
> to do that is. For example, if I know the access patterns of the
> function, I can design the aforementioned data structure to favor
> those. Also, not every type admits memoization, for example functions.

Indeed. There are plenty of choices of data structures for memo tables, and hash tables are not the best of them. Such choices are better left to the programmer.

Regards,
apfelmus
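For a concrete feel of "memo table = lazy data structure", here is the standard list-based trick (not from this thread; a balanced trie would improve the O(n) lookup):

    -- memoize an Int-indexed function by an infinite, lazily filled list
    memoList :: (Int -> a) -> (Int -> a)
    memoList f = (table !!)
      where table = map f [0..]   -- shared between all calls

    fib :: Int -> Integer
    fib = memoList fib'
      where
        fib' 0 = 0
        fib' 1 = 1
        fib' n = fib (n-1) + fib (n-2)   -- recurse through the memo table

    main :: IO ()
    main = print (fib 100)  -- instantaneous, unlike the naive version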
[Haskell-cafe] Re: MonadPrompt + Gtk2Hs = ?
?

Regards,
apfelmus
[Haskell-cafe] Re: practice problems?
Tamas K Papp wrote:

> I am looking for small to medium sized practice problems, preferably
> with solutions.

Hi Tamas,

writing a Haskell library is a very good idea yet requires some confidence. So, for your first somewhat useful programs, you may want to try olympiad / competition problems in Haskell. Of course, most are not specifically oriented towards Haskell as the programming language, and the harder ones are about tricky algorithms. But they are the kind of small yet intelligent task you are looking for. Not all have solutions, but usually google will reveal quite a few.

The ultimate Haskell challenge is of course the ICFP contest

    http://icfpcontest.org/

There is also the International ACM Programming Contest

    http://acm.uva.es/problemset/

Your country surely has some kind of high school computing science competition to get problems from.

Regards,
apfelmus
[Haskell-cafe] Re: Monad laws
Deokhwan Kim wrote:

> What is the practical meaning of the monad laws? (M, return, >>=) is
> not qualified as a category-theoretical monad if the following laws
> are not satisfied:
>
>   1. (return x) >>= f == f x
>   2. m >>= return == m
>   3. (m >>= f) >>= g == m (\x -> f x >>= g)

The third law contains a typo and must be

    3. (m >>= f) >>= g == m >>= (\x -> f x >>= g)

> But what practical problems can unsatisfying them cause? In other
> words, I wonder if declaring an instance of the Monad class but not
> checking it for the monad laws may cause any problems, except for not
> being qualified as a theoretical monad?

This question is likely to be a result of an unlucky introduction to monads where they are introduced top down: "Hear ye, a monad, this is some mystic thing obeying the spiritual laws 1., 2. and 3.", isn't it? It is this way that monads get the attribute "theoretical".

Asking what the practical meaning of the monad laws might be is like asking what the practical meaning of the laws for natural number addition could be: what does

    i)   a + (b+c) == (a+b) + c

mean? How can I understand

    ii)  a + 0 == a

? What does

    iii) a + b == b + a

signify? These questions are unlikely to arise because you have an intuition of what a natural number is: a number of bullets in a sack, coins in your pocket, people on the mailing list etc. With this knowledge, you will most likely not have any problems explaining the laws i), ii), iii) to somebody else, and most likely you will have no doubt about *why* they must be true.

For monads, my intuition is as follows: a value of type (M a) is an "action", something producing a value of type a and (or by) executing a side-effect like drawing on the screen or screwing up the hard drive. With the operator >>=, I can execute such actions in a specific sequence. For the sequence, it is of course unimportant how I group my actions: I can group actions act1 and act2 first and then postpend act3, or I can group act2 and act3 first and then prepend act1.

To simplify writing down a formula corresponding to this fact, we introduce the operator >> defined by

    act1 >> act2 = act1 >>= \x -> act2

which sequences actions but for simplicity discards the computed value x of type a. It is only the side-effect of act1 we are interested in. Now, the thought about grouping written down as a formula is just

    (act1 >> act2) >> act3 == act1 >> (act2 >> act3)

and this is the simplified version of law 3. Of course, we know that this is coined "associativity". The actual law 3 is just a formulation for >>= that takes proper care of the intermediate calculation result x.

With (return x), we can create an action which computes the value x but has absolutely no side-effects. This can also be stated in formulas, as Mr. return explains:

1. "if I am prepended to guys doing side-effects, I give them the value x but do not take any responsibility for side-effects happening"

    (return x) >>= (\y -> f y) == f x

2. "if I am postponed to an action which computes a value x, I don't do any additional side-effects but just return the value I have been given"

    m >>= (\x -> return x) == m

which is of course equivalent to

    m >>= return == m

So to answer your question:

> In other words, I wonder if declaring an instance of the Monad class
> but not checking it for the monad laws may cause any problems, except
> for not being qualified as a theoretical monad?

A thing you declare to be an instance of the Monad class, but one that does not fulfill the three laws above, simply does not match the intuition behind a monad. I.e. your definitions of (>>=) and return are most likely to be void of the intended meaning.
Regards,
apfelmus
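The laws can at least be tested mechanically for a concrete monad. A sketch with QuickCheck (version >= 2.7 assumed, using its Fun generator for random functions; this only tests, it doesn't prove):

    import Test.QuickCheck

    prop_leftId :: Int -> Fun Int (Maybe Int) -> Bool
    prop_leftId x (Fun _ f) = (return x >>= f) == f x

    prop_rightId :: Maybe Int -> Bool
    prop_rightId m = (m >>= return) == m

    prop_assoc :: Maybe Int -> Fun Int (Maybe Int) -> Fun Int (Maybe Int)
               -> Bool
    prop_assoc m (Fun _ f) (Fun _ g) =
        ((m >>= f) >>= g) == (m >>= \x -> f x >>= g)

    main :: IO ()
    main = do
        quickCheck prop_leftId
        quickCheck prop_rightId
        quickCheck prop_assoc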
[Haskell-cafe] Re: evaluate vs seq
Michael Shulman wrote:

> The GHC documentation says that (evaluate a) is not the same as
> (a `seq` return a). Can someone explain the difference to me, or point
> me to a place where it is explained?

(evaluate a) is weaker than (a `seq` return a): for a == _|_, the latter is _|_ itself, whereas (evaluate a) is not _|_ but an action that throws an exception when executed. The appended code shows the differences.

Regards,
apfelmus

    import Prelude hiding (return,catch)
    import qualified Control.Monad as M
    import Control.Exception

    a = undefined :: ()
    return = M.return :: a -> IO a

    e 0 = return a
    e 1 = a `seq` return a
    e 2 = evaluate a

    t x = e x >> return ()
    -- t 0 == return ()
    -- t 1 == throwIO something
    -- t 2 == throwIO something
    -- to check this, evaluate the expressions
    -- (t x >>= print) and (t x `seq` ())

    u x = e x `seq` return ()
    -- u 0 == return ()
    -- u 1 == undefined
    -- u 2 == return ()
    -- to check this, type (u x >>= print) into hugs/ghci

    v x = catch (e x) (\_ -> return ()) >>= print
    -- v 0 == throwIO something
    -- v 1 == print ()
    -- v 2 == print ()
[Haskell-cafe] Re: evaluate vs seq
Hello Michael,

you are correct. Only

> * (a `seq` return a) = evaluate a *right now*, then produce an IO
>   action which, when executed, returns the result of evaluating a.
>   Thus, if a is undefined, throws an exception right now.

is a bit misleading, as there is no evaluation "right now". It's better to say that (a `seq` return a) is _|_ (bottom, i.e. undefined) when a == _|_. The subtle point is the difference between the action of throwing an exception (throw some_error :: IO a) and an undefined value (_|_). (throw ...) will cause your program to crash when executed, but it's still a well-defined value of type (IO a).

For a more detailed semantics of exceptions in Haskell, see

    Tackling the awkward squad: monadic input/output, concurrency,
    exceptions, and foreign-language calls in Haskell
    http://research.microsoft.com/%7Esimonpj/Papers/marktoberdorf/

I further think that (evaluate x) can be implemented as follows:

    evaluate x = catch (x `seq` return x) throw

Regards,
apfelmus
[Haskell-cafe] Re: evaluate vs seq
Michael Shulman wrote:

> On 9/11/06, [EMAIL PROTECTED] wrote:
>>> * (a `seq` return a) = evaluate a *right now*, then produce an IO
>>> action which, when executed, returns the result of evaluating a.
>>> Thus, if a is undefined, throws an exception right now.
>>
>> is a bit misleading as there is no evaluation "right now". It's
>> better to say that (a `seq` return a) is _|_ (bottom, i.e. undefined)
>> when a == _|_.
>
> Sure... but what about when a is not _|_? I would also like to
> understand the difference between `seq` and `evaluate` for arguments
> that are defined. How would you describe that without talking about
> when expressions are evaluated?

Ah well, the discussion is about denotational semantics of Haskell. Unfortunately, I did not find a good introductory website about this. Personally, I found the explanation of infinite lists in Bird and Wadler's book very enlightening.

The game roughly goes as follows: denotational semantics was invented to explain what a recursive definition should mean. I mean, thinking of functions as things that map (mathematical) sets to sets excludes self-referencing definitions (Russell's paradox!). The solution is to think of functions as fixed points of an iterative process: factorial is the fixed point of

    (fac f) n = if n == 0 then 1 else n * f (n-1)

which means that

    (fac factorial) n == factorial n

Now, the iterative process goes as follows:

    factorial_0 n = _|_
    factorial_1 n = fac factorial_0 n = if n == 0 then 1 else _|_
    factorial_2 n = fac factorial_1 n = case n of
                                            0 -> 1
                                            1 -> 1
                                            _ -> _|_

and so on. Every time, a more refined version of the ultimate goal "factorial" is created; one says that

    _|_ == factorial_0  <=  factorial_1  <=  factorial_2  <=  ...

(<= means "less than or equal"). That's why _|_ is called "bottom": it's the smallest thing in the hierarchy.

This was about functions. In a lazy language, normal values can show a similar behavior. For instance, we have

    _|_  <=  1:_|_  <=  1:2:_|_  <=  1:2:3:_|_  <=  ...  <=  [1..]

That's how the infinite list [1..] is approximated. The inequalities follow from the fact that bottom is below everything and that all constructors (like (:)) are monotone (by definition), i.e.

    1:x <= 1:y   iff   x <= y

A function f is called *strict* if it fulfills

    f _|_ = _|_

which means that it does not produce a constructor (information) without knowing what its argument is.

Back to your original question: we can now talk about functions in terms of _|_ and *data constructors*. As an example, we want to think about the meaning of (take 2 [1..]). What should this be? I mean, one argument is an infinite list! By tracing the definition of (take 2), one finds

    take 2 _|_             == _|_      -- (btw, (take 2) is strict)
    take 2 (1:_|_)         == 1:_|_
    take 2 (1:(2:_|_))     == 1:(2:[])
    take 2 (1:(2:(3:_|_))) == 1:(2:[])

and so on. The right hand side remains equal for all further refinements, so we must conclude

    take 2 [1..] == 1:(2:[])

For the evaluate and `seq` problem, we simplify things by specializing the polymorphic type to ([Int] -> IO [Int]). Then, we introduce two constructors (ThrowException :: IO [Int]) and (Return :: [Int] -> IO [Int]) with the obvious meaning. The semantics of `seq` is now as follows:

    _|_    `seq` x == _|_
    []     `seq` x == x
    (y:ys) `seq` x == x

So `seq` forces its first argument. When we define

    f x = x `seq` (Return x)

we thereby get

    f _|_    == _|_
    f []     == Return []
    f (x:xs) == Return (x:xs)

To compare, the semantics of evaluate is

    evaluate _|_    == ThrowException  =/=  _|_
    evaluate []     == Return []
    evaluate (x:xs) == Return (x:xs)

That should answer your question.
Please note that this answer is actually a lie, as functions must be monotone (halting problem, fix id == _|_), but evaluate is not:

    _|_ <= []

yet

    evaluate _|_ == ThrowException   is not <=   Return [] == evaluate []

This can be fixed by departing from denotational semantics and giving all (IO a)-things an operational meaning. Further explanations and further references can be found in the paper I pointed out last time.

Regards,
apfelmus
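The iteration described above can be played with directly: Haskell's fix computes exactly this least fixed point, and the finite approximations can be written out by hand (approx is an illustrative name):

    import Data.Function (fix)

    -- the functional whose least fixed point is factorial
    fac :: (Integer -> Integer) -> (Integer -> Integer)
    fac f n = if n == 0 then 1 else n * f (n-1)

    -- factorial_k: fac applied k times to the totally undefined function
    approx :: Int -> (Integer -> Integer)
    approx 0 = \_ -> undefined
    approx k = fac (approx (k-1))

    factorial :: Integer -> Integer
    factorial = fix fac

    main :: IO ()
    main = do
        print (approx 3 2)    -- 2: already defined after three refinements
        print (factorial 10)  -- 3628800
        -- print (approx 3 5) -- _|_: not yet refined far enough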
[Haskell-cafe] Re: Optimization problem
Bertram Felgenhauer wrote:

> splitSeq' :: Ord a => Map a () -> [(a,b)] -> ([(a,[b])], Map a [b])
> splitSeq' bp [] = ([], map (const []) bp)
> splitSeq' bp ((a,b):xs) = case lookup a bp bp of
>     Just _ -> let (l, m) = splitSeq' bp xs in (l, update a (b:) bp m)
>     _      -> let (bp', m) = insert a bp m'
>                   (l, m')  = splitSeq' bp' xs
>               in  ((a, b : (fromJust $ lookup a bp' m')) : l, m)
>
> splitSeq' takes a blueprint for a map with all keys seen so far, and a
> list tail. It returns the result list for all new keys, and a map
> (corresponding to the given blueprint) with the tails of the seen
> elements. In the base case it just fills the blueprint with empty
> lists and returns to the caller.
>
> If a new element is seen, insert is used to create a new blueprint,
> including the new key a, which is passed to the recursive call of
> splitSeq'. The resulting map from that call is threaded back into
> insert, which gives us a new map without the a key which matches the
> structure of the original blueprint.

Very interesting! So the map with the seen tails matches the blueprint and therefore throws away information (the future keys) from the future. This is what effectively allows the key-tree structure to be rebalanced. If one would return the tails-map with all future keys, it would be _|_, because the key order in the tree depends on the future keys and changes every time a new key is found.

I thought that there could only be a "static" solution, i.e. like the one Ross Paterson presented where the structure (I mean the outermost constructors) of the returned tree is determined before the future. This obviously excludes rebalancing.

I found a static solution by trying to fit the key-tails pairs into an infinite tails-map which is filled first come, first served:

    map ! 1 := (first key seen, tails)
    map ! 2 := (second key seen, tails)

by keeping another key-position-map around which records where each key has been inserted, so that the future knows where to put the tails. The point is that the structure of the tails-map is known before the future comes, as its keys are just 1,2,3,... and so on.

It remains to construct such an infinite random access list, but this turns out to be even easier than finite random access lists (when using non-uniform recursion from Okasaki's chapter 10) :)

    data Imp a = Imp a (Imp (a,a)) deriving (Show)

    instance Functor Imp where
        fmap h ~(Imp x xs) = Imp (h x) (fmap (\(x,y) -> (h x, h y)) xs)

    update :: (a -> a) -> Position -> Imp a -> Imp a
    update f 1 ~(Imp x xs) = Imp (f x) xs
    update f n ~(Imp x xs) = Imp x $ update f2 (n `div` 2) xs
        where f2 ~(y,z) = if even n then (f y, z) else (y, f z)

Note that we can use irrefutable patterns without hesitation, as there is only one constructor. Folding over an infinite thing is strange, but here we are:

    fold :: (a -> b -> b) -> Imp a -> b
    fold f ~(Imp x xs) = f x (fold f2 xs)
        where f2 (x,y) z = f x (f y z)

It's not so strange anymore when we realize that this can be used to convert it into an infinite list:

    toList = fold (:)

The implementation of fromList is fun as well, so I won't tell it. As fold and unfold can be defined generically for (Mu f) where f is a functor, I wonder how to deal with it in the case of non-uniform recursion.

For splitStreams, the key-position-map is needed in both directions, so we quickly define a bijective map

    type BiMap a b = (Map.Map a b, Map.Map b a)

    switch :: BiMap a b -> BiMap b a
    switch (x,y) = (y,x)

with the usual operations (empty, member, size etc.) omitted here.
Now comes splitStreams itself:

    splitStreams :: Ord a => [(a,b)] -> [(a,[b])]
    splitStreams xs = takeWhile (not . null . snd) $ toList $
                      splitStreams' empty xs

    splitStreams' :: Ord a => BiMap a Position -> [(a,b)] -> Imp (a,[b])
    splitStreams' bimap [] =
        fmap (\i -> (findWithDefault undefined i $ switch bimap, [])) $
        fromList [1..]
    splitStreams' bimap ((a,b):xs) = update fun pos $ splitStreams' bimap' xs
        where
        fun ~(_,bs) = (a, b:bs)
        sz     = size bimap
        pos    = findWithDefault (sz+1) a bimap
        bimap' = (if member a bimap then id else insert a (sz+1)) bimap

Note that update actually generates fresh constructors, so the structure of our tails-Imp is not really static. But this is no problem, as the form of the constructors is completely known: there is only one.

Regards,
apfelmus
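Combined with the Imp, fmap, update, fold and toList definitions above, the structure can be exercised without fromList (Position = Int is my assumption, since the post omits its definition; constImp is an illustrative helper that sidesteps the fromList puzzle):

    type Position = Int

    -- an Imp holding the same value at every position
    constImp :: a -> Imp a
    constImp x = Imp x (constImp (x,x))

    main :: IO ()
    main = print . take 8 . toList . update (+100) 5 $ constImp (0 :: Int)
    -- [0,0,0,0,100,0,0,0]: position 5 of the list view was updated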
[Haskell-cafe] Re: Optimization problem
Ross Paterson wrote:

> On Thu, Sep 14, 2006 at 05:22:05PM +0200, Bertram Felgenhauer wrote:
>> [much subtle code]
>> We can now build the splitStream function, using the following helper
>> function:
>> splitSeq' :: Ord a => Map a () -> [(a,b)] -> ([(a,[b])], Map a [b])
>
> This works for infinite lists if there is no balancing, but if insert
> does balancing, the top of the map will not be available until the
> last key is seen, so splitSeq' could only be used for finite chunks.
> Then you'll need a way to put the partial answers together. It might
> be possible to amortize the cost of that for an appropriate choice of
> chunk length. It would also cost some laziness: the chunked version
> would work for infinite lists, but wouldn't produce all the output for
> a partial list.

No. The point is that in

    let blueprint = empty
        (_,m)     = splitSeq' blueprint $ concat $ repeat [(1,'a'),(2,'b')]

the map m contains only as many keys as there are in blueprint, i.e. not a single one! After munching the first (1,'a'), the first recursive call to splitSeq' returns a map m' that fulfills

    toList m' == [(1, 'a':...)]

The trick is to throw away all keys from the future, so that there is no need to wait on them.

Also, if your argument were correct, then the version without balancing wouldn't work either, because insert already exchanges Leafs for Nodes in m. So the top of the map would be unavailable until all Leafs have been exchanged.

Regards,
apfelmus
[Haskell-cafe] Re: foreach
Bulat Ziganshin wrote:

> because REAL code is somewhat larger than examples. try to rewrite the
> following:
>
> directory_blocks <- (`mapM` splitBy (opt_group_dir command) files_to_archive)
>     ( \filesInOneDirectory -> do
>         datablocks <- (`mapM` splitToSolidBlocks filesInOneDirectory)
>             ( \filesInOneDataBlock -> do
>                 [...]

This particular snippet contains too many undefined identifiers to be rewritten effectively, but I'm very sure that the whole program can be restructured to great effect. Maybe by designing a binary-block-combinator language which calculates padding bytes and length headers automatically and fiddles out scheduling for fast writing to a pipe, something like that. Eventually, a binary parser combinator library which can read single bit flags and things is a must here. It may even be possible to combine the two, providing a bijection between abstract file tree, tar-ed blocks and compressed binary file.

Separate your concerns, ban IO as much as possible, and any function that takes more than 15 lines is a wart. I admit that real-world applications are not a good exercise for practicing functional programming, but once acquired, advanced functional tactics prove very powerful. An example might be WASH/CGI, which successfully abstracts session state over HTTP, a problem where Perl scripts are doomed and all kinds of imperative buzz like JavaBeans and so on have been invented but somewhat fail to solve it for non-trivial cases.

Regards,
apfelmus
[Haskell-cafe] Re: Optimization problem
[EMAIL PROTECTED] wrote:

> type BiMap a b = (Map.Map a b, Map.Map b a)

Actually, BiMap is not needed at all; it suffices to have

    splitStreams :: Ord a => [(a,b)] -> [(a,[b])]
    splitStreams xs = takeWhile (not . null . snd) $ toList $
                      splitStreams' Map.empty xs

    splitStreams' :: Ord a => Map.Map a Position -> [(a,b)] -> Imp (a,[b])
    splitStreams' map [] = fmap (const (undefined,[])) $ fromList [1..]
    splitStreams' map ((a,b):xs) = update fun pos $ splitStreams' map' xs
        where
        fun ~(_,bs) = (a, b:bs)
        sz   = Map.size map
        pos  = Map.findWithDefault (sz+1) a map
        map' = (if Map.member a map then id else Map.insert a (sz+1)) map

Regards,
apfelmus
[Haskell-cafe] Re: Optimization problem
Ross Paterson wrote:

> How embarrassing. Still, your code is quite subtle. As penance, here's
> my explanation, separating the main idea from the knot-tying. The
> ingredients are a map type with an insertion function [...]
>
>     insert :: Ord k => k -> a -> Map k a -> Map k a
>
> with a partial inverse
>
>     uninsert :: Ord k => k -> Map k () -> Map k a -> (a, Map k a)
>
> [...] Applying a tupling transformation to insert+uninsert gives your
> version. It's interesting that these composed transformations don't
> seem to cost too much.

That the composed transformations are indeed cheap is not necessarily disturbing. Let's call Bertram's original insert version insertdelete from now on. When analyzing the magic behind it, you do

    ross_analysis :: ((bp,map') -> (bp',map))
                  -> (bp -> bp', (bp,map') -> map)
    ross_analysis insertdelete = (insert, uninsert)

Of course, a tupling transformation will reverse this:

    tupletiple :: (bp -> bp', (bp,map') -> map)
               -> ((bp,map') -> (bp',map))
    tupletiple (insert, uninsert) = insertdelete

But as @djinn will tell you, ross_analysis cannot be realized :) So the original version possesses additional computational power: it can reverse everything it's doing with bp on the map'. uninsert does not have information about the single steps that have been done; it only knows what should come out. From that, it would have to reconstruct quickly what happened, if it wants to be fast.

Regards,
apfelmus
[Haskell-cafe] Re: Optimization problem
Ross Paterson wrote:

> On Sun, Sep 17, 2006 at 01:08:04PM +0200, [EMAIL PROTECTED] wrote:
>> Ross Paterson wrote:
>>> It's interesting that these composed transformations don't seem to
>>> cost too much.
>> That the composed transformations are indeed cheap is not necessarily
>> disturbing.
>
> I meant the composition of the transformations of the tree (update or
> reverse insert) that Bertram does for each list element. They're cheap
> because they're bounded by the depth of the tree, and even cheaper if
> you're probing some other part of the tree.

Me too :) I tried to point out that separating uninsert from insert increases time complexity. In general, for

    uninsert :: Ord k => k -> Map k () -> Map k a -> (a, Map k a)

we have

    fst $ uninsert _ bp m == differenceWith (curry snd) bp m

which needs O(size of blueprint) time. This is to be compared with O(log (size of blueprint)) for the combined insertdelete. That there (likely) must be such a difference can already be seen from the types of insertdelete and (insert, uninsert)!

But you already pointed out that something like

    insertdelete :: Ord k => k -> Map shape k
                 -> (exists shape' . (Map shape' k,
                                      Map shape' a -> (a, Map shape a)))

is the best type for insertdelete. Here, it is clear that insertdelete likely can do a fast uninsert.

Btw, why are there no irrefutable patterns for GADTs? I mean, such a sin should be a shame for a non-strict language...

Regards,
apfelmus
[Haskell-cafe] Re: Optimization problem
Conor McBride wrote:

> Just imagine
>
>     data Eq a b where
>         Refl :: Eq a a
>
>     coerce :: Eq a b -> a -> b

Robert Dockins wrote:

>     coerce ~Refl x = x
>
>     coerce undefined True :: String
>
> Bang, you're dead. Or rather... Twiddle, you're dead. Mom, he's
> scaring me! ;)

Who says this function may not be strict in the first argument despite the irrefutable pattern? Also, for the sarcastically inclined, death is semantically equivalent to _|_. As is (fix id :: a -> b), too. Alas, one can say

    dontcoerce :: Eq a (Maybe b) -> a -> Maybe b
    dontcoerce ~Refl x = Just x

and

    dontcoerce _|_ == \x -> Just (x :: b(~Refl))
                   == \(x :: _|_) -> Just (x :: _|_)
                   == \_ -> Just _|_

The type of x on the right-hand side inherently depends on ~Refl and should be _|_ if ~Refl is. Type refinement makes the types depend on values, after all. For our optimization problem, it's only a matter of constructors on the right-hand side. They should pop out before one looks at any arguments, so it's safe to cry "so you just know, I'm a Just".

Maybe one can remedy things by introducing a _|_ on the type level with the only inhabitant _|_ :: _|_. That would treat the objections Ross Paterson referenced:

> See the second restriction. Irrefutable patterns aren't mentioned
> there, but they're the general case. See also
> http://www.haskell.org/pipermail/haskell-cafe/2006-May/015609.html
> http://www.haskell.org/pipermail/glasgow-haskell-users/2006-May/010278.html

Admittedly, it's thin ice to trample on, but otherwise, for this special splitSeq problem, one runs into the "more haste, less speed" dilemma (I mean Wadler's paper). Bertram's lazy algorithm actually is an online algorithm, and it should remain one when made type safe.

Regards,
apfelmus
[Haskell-cafe] irrefutable patterns for existential types / GADTs
Ross Paterson wrote:

> The story so far:
>
> apfelmus: why are there no irrefutable patterns for GADTs?
> Conor: because you could use them to write unsafeCoerce
> Ross: how about irrefutable patterns (or newtypes) for existential
>       types?
> Simon: Try giving the translation into System F + (existential) data
>        types

I'd like to add that I see no problem with

    coerce :: Eq a b -> a -> b
    coerce ~Refl x = x

as long as we have

    coerce _|_ x === _|_

The wish is that

    f = \refl -> Just . coerce refl
      = \~Refl x -> Just x

should satisfy

    f _|_ x === Just _|_
    f _|_ x =/= _|_

and likewise for any other constructor than Just.

Regards,
apfelmus
[Haskell-cafe] Re: irrefutable patterns for existential types / GADTs
> {-# OPTIONS -fglasgow-exts #-}
>
> module Main where
>
> import Data.IORef
>
> data T a where
>     Li :: Int  -> T Int
>     Lb :: Bool -> T Bool
>     La :: a    -> T a
>
> writeInt :: T a -> IORef a -> IO ()
> writeInt v ref = case v of
>     ~(Li x) -> writeIORef ref (1::Int)
>
> readBool :: T a -> IORef a -> IO ()
> readBool v ref = case v of
>     ~(Lb x) -> readIORef ref >>= (print . not)
>
> tt :: T a -> IO ()
> tt v = case v of
>     ~(Li x) -> print "OK"
>
> main = do
>     tt (La undefined)
>     ref <- newIORef undefined
>     writeInt (La undefined) ref
>     readBool (La undefined) ref

This code is more intricate than

    data Eq a b where
        Refl :: Eq a a

    coerce :: Eq a b -> a -> b
    coerce ~Refl x = x

but I think it amounts to exactly the same thing: ref and x are forced to a particular type witnessed by the GADT. But I think that something still can be squeezed out; strictness is not absolutely necessary (see the upcoming mail on this thread).

Regards,
apfelmus
[Haskell-cafe] Re: irrefutable patterns for existential types / GADTs
Here is a formulation of what exactly I require from irrefutable pattern matches for GADTs. The problem arose from the "Optimization problem" thread. In short, here is a GADT-using, type safe version of Bertram's solution (without balancing):

    -- a binary search tree with witness about its shape
    data Map s k a where
        Leaf :: Map () k a
        Node :: k -> a -> Map s k a -> Map t k a -> Map (s,t) k a

    empty :: Map () k a
    empty = Leaf

    member :: Ord k => k -> Map s k a -> Bool
    member _ Leaf = False
    member k (Node k' _ l r) = case compare k k' of
        LT -> member k l
        EQ -> True
        GT -> member k r

    -- a wrapper for an existential type
    data Undoer s k a where
        Undoer :: (Map t k a) -> (Map t k a -> (a, Map s k a))
               -> Undoer s k a

    -- insert key element blueprint: yields the new blueprint and a
    -- function that undoes the insertion
    insert :: Ord k => k -> a -> Map s k a -> Undoer s k a
    insert k a Leaf = Undoer (Node k a Leaf Leaf)
                             (\(Node k a Leaf Leaf) -> (a, Leaf))
    insert k a (Node k' b (l :: Map l k a) (r :: Map r k a) :: Map s k a) =
        case compare k k' of
            LT -> case insert k a l of
                Undoer (m :: Map t k a) f -> Undoer
                    (Node k' b m r :: Map (t,r) k a)
                    (\(Node k' b' m' r' :: Map (t,r) k a) ->
                        let (a,l') = f m'
                        in  (a, Node k' b' l' r' :: Map s k a))
            EQ -> error "inserting existing element"
            GT -> case insert k a r of
                Undoer (m :: Map t k a) f -> Undoer
                    (Node k' b l m :: Map (l,t) k a)
                    (\(Node k' b' l' m' :: Map (l,t) k a) ->
                        let (a,r') = f m'
                        in  (a, Node k' b' l' r' :: Map s k a))

    update :: Ord k => k -> (a -> a) -> Map s k a -> Map s k a
    -- the culprit, to be defined later

    splitSeq :: Ord a => [(a,b)] -> [(a,[b])]
    splitSeq = fst . splitSeq' empty

    splitSeq' :: Ord a => Map s a [b] -> [(a,b)]
              -> ([(a,[b])], Map s a [b])
    splitSeq' bp []         = ([], bp)
    splitSeq' bp ((a,b):xs) = case member a bp of
        True -> let (l, m) = splitSeq' bp xs in (l, update a (b:) m)
        _    -> case insert a [] bp of
            Undoer bp' f -> let (rs, m)  = splitSeq' bp' xs
                                (bs, m') = f m
                            in  ((a, b:bs) : rs, m')

To make this work in a lazy manner (it becomes an online algorithm then and works for infinite lists), I'd like to have

    update :: Ord k => k -> (a -> a) -> Map s k a -> Map s k a
    update k f ~(Node k' a l r) = case compare k k' of
        LT -> Node k' a (update k f l) r
        EQ -> Node k' (f a) l r
        GT -> Node k' a l (update k f r)

reasoning that the Node constructor should be output before one inspects the incoming ~Node. I thought that

    (l, m) = splitSeq' bp xs

witnesses that bp and m have the same shape s, but this is the point where the not-normalizing argument throws in: the type of splitSeq' claims to be a proof that bp and m have the same s, but who knows whether it really terminates?

So, I'm opting for a different update which is more along the lines of Bertram's original:

    update :: Ord k => k -> (a -> a) -> Map s k a -> Map s k a -> Map s k a
    update k f (Node k' _ l' r') ~(Node _ a l r) = case compare k k' of
        LT -> Node k' a (update k f l' l) r
        EQ -> Node k' (f a) l r
        GT -> Node k' a l (update k f r' r)

The blueprint gives immediate witness that splitSeq' preserves shape, so this update should be fine.

To summarize, the main problem is to get a lazy/online algorithm (the problem here falls into the "more haste, less speed" category) while staying more type safe.

@Conor: how does this issue look in Epigram?

Regards,
apfelmus
[Haskell-cafe] Re: irrefutable patterns for existential types / GADTs
> But that makes it refutable! For the above, either coerce _|_ x === x
> or the notation is being abused.

Making a pattern irrefutable does not mean that the function in question will become lazy:

    fromJust ~(Just x) = x

    fromJust _|_ === _|_

The point with coerce is that it looks very much like being lazy in its first argument, but in fact it is not.

> The trouble is that GADT pattern matching has an impact on types, as
> well as being a selector-destructor mechanism, and for the impact on
> types to be safe, the match must be strict. I think it's the extra
> power of GADTs to tell you more about type variables already in play
> which does the damage.

> But I think that something still can be squeezed out, strictness is
> not absolutely necessary.

I thought of something along the lines of

    f :: Eq a b -> a -> Maybe b
    f ~Refl x = Just x

with

    f _|_ x === Just _|_

The point is that one can always output the constructor Just; it does not inspect the type of x. Now, I don't think anymore that this is feasible, as the type of (Just x) still depends on the type of x (even if the constructor Just does not mention it). Nevertheless, I still want to remove strictness; see my next mail in this thread.

> For existentials, I'm not sure, but it seems to me that there's not
> such a serious issue. Isn't the only way you can use the type which
> allegedly exists to project out some dictionary/other data which is
> packed inside the existential? Won't this projection cause a nice safe
> _|_ instead of a nasty unsafe segfault?

I agree. The only practical problem I can imagine is that GHC internally treats existentials as GADTs.

Regards,
apfelmus
[Haskell-cafe] Re: irrefutable patterns for existential types / GADTs
Thanks for asking! [...] Online algorithms do look like a good place to try to get some leverage from this, but we haven't really been in a position to experiment so far. I'm sure that will change.

Well, I asked because laziness with memoization gives a definite edge in terms of algorithmic time complexity over strictness (proved for online algorithms in "more haste, less speed") and I wonder how this goes together with totality and dependent types. A forthcoming ultimate programming language should retain this edge, shouldn't it? ;-)

It isn't necessary to perform constructor discrimination when it's statically known that exactly one constructor is possible, so those patterns can always be made irrefutable, with matching replaced by projection.

Thanks for pinpointing the culprit, because translated to GADTs, it means that whenever we know from the type index that only one constructor is possible, it can be made irrefutable. So this is the kind of irrefutable pattern one could safely add. In particular, this can be weakened to: whenever we know the type index, i.e. there is no type refinement depending on the value, an irrefutable pattern is safe. But once no type refinement happens, the irrefutable pattern can already be obtained via let bindings in existing Haskell! Below is a lazy version of Bertram's solution to the "Optimization problem" following Ross' suggestions, with existentials and GADTs:

    {-# OPTIONS_GHC -fglasgow-exts #-}

    data Map s k a where
        Leaf :: Map () k a
        Node :: k -> a -> Map s k a -> Map t k a -> Map (s,t) k a

(Map s k a) is a tree which makes its structure explicit in the type index s.

    unNode :: Map (s,t) k a -> (k, a, Map s k a, Map t k a)
    unNode (Node k a l r) = (k,a,l,r)

unNode knows the type index and is to be used for a lazy pattern match: let (k,a,l,r) = unNode ..

    empty :: Map () k a
    empty = Leaf

    member :: Ord k => k -> Map s k a -> Bool
    member _ Leaf = False
    member k (Node k' _ l r) = case compare k k' of
        LT -> member k l
        EQ -> True
        GT -> member k r

    data Undoer s k a where
        Undoer :: (Map t k a) -> (Map t k a -> (a, Map s k a)) -> Undoer s k a

Undoer is just (exists t. ...) wrapped into a digestible type.

    -- insert key element blueprint map (blueprint, result, map)
    insert :: Ord k => k -> a -> Map s k a -> Undoer s k a
    insert k a Leaf = Undoer (Node k a Leaf Leaf)
        (\map -> let (_,a,_,_) = unNode map in (a, Leaf))

Note how the type system proves that the function (\map -> ..) always has to expect (map) == Node ... Then, unNode binds k',b',... /lazily/. Same goes for the rest of insert:

    insert k a (Node k' b (l :: Map l k a) (r :: Map r k a) :: Map s k a) =
        case compare k k' of
            LT -> case insert k a l of
                Undoer (m :: Map t k a) f -> Undoer
                    (Node k' b m r :: Map (t,r) k a)
                    (\map -> let (k',b',m',r') = unNode map
                                 (a,l') = f m'
                             in  (a, Node k' b' l' r' :: Map s k a))
            EQ -> error "inserting existing element"
            GT -> case insert k a r of
                Undoer (m :: Map t k a) f -> Undoer
                    (Node k' b l m :: Map (l,t) k a)
                    (\map -> let (k',b',l',m') = unNode map
                                 (a,r') = f m'
                             in  (a, Node k' b' l' r' :: Map s k a))

    -- update k fun blueprint map
    update :: Ord k => k -> (a -> a) -> Map s k a -> Map s k a -> Map s k a
    update k f (Node k' _ l' r') map = case compare k k' of
        LT -> Node k' a (update k f l' l) r
        EQ -> Node k' (f a) l r
        GT -> Node k' a l (update k f r' r)
        where
        (_,a,l,r) = unNode map

Again a lazy pattern match on (map). Note that the type system enforces that blueprint and actual (map) have the same shape s. The type refinement for GADTs happens in the strict pattern match on the blueprint, and this allows us to lazily match the (map).
    splitSeq :: Ord a => [(a,b)] -> [(a,[b])]
    splitSeq = fst . splitSeq' empty

    splitSeq' :: Ord a => Map s a [b] -> [(a,b)] -> ([(a,[b])], Map s a [b])
    splitSeq' bp [] = ([], bp)
    splitSeq' bp ((a,b):xs) = case member a bp of
        True -> let (l, m) = splitSeq' bp xs in (l, update a (b:) bp m)
        _    -> case insert a [] bp of
            Undoer bp' f ->
                let (rs,m)  = splitSeq' bp' xs
                    (bs,m') = f m
                in  ((a, b:bs) : rs, m')

To summarize, the problem can be solved without irrefutable patterns for GADTs: the code above works for infinite lists. Yet, they are handy and can be introduced safely in the case where the type indices are known in advance and no type refinement happens.

Regards, apfelmus
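PS: a quick sanity check (an untested sketch of mine, assuming the module above compiles as given) that the algorithm is indeed online:

    testOnline :: [(Char, [Int])]
    testOnline = map (\(k, vs) -> (k, take 3 vs)) $ take 2 $
                 splitSeq $ zip (cycle "ab") [1..]
    -- expected: [('a',[1,3,5]),('b',[2,4,6])], terminating even
    -- though the input list is infinite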
[Haskell-cafe] Re: question - which monad to use?
Tamas K Papp wrote: Hi, I have a computation where a function is always applied to the previous result. However, this function may not return a value (it involves finding a root numerically, and there may be no zero on the interval). The whole problem has a parameter c0, and the function is also parametrized by the number of steps that have been taken previously. To make things concrete,

    type Failmessage = Int  -- this might be something more complex
    data Result a = Root a | Failure Failmessage  -- guess I could use Either too

    f :: Double -> Int -> Double -> Result Double
    f c0 0 _ = Root c0
    f c0 j x = {- computation using x, parameters calculated from c0 and j -}

Then c1 = f c0 0 c0, c2 = f c0 1 c1, c3 = f c0 2 c2, up to cn. I would like to 1) stop the computation when a Failure occurs, and store that failure, and 2) keep track of intermediate results up to the point of failure, i.e. have a list [c1,c2,c3,...] at the end, which would go to cn in the ideal case of no failure. I think that a monad would be the cleanest way to do this. I think I could try writing one (it would be a good exercise, I haven't written a monad before). I would like to know if there is a predefined one which would work.

There are several ways to achieve your goal; most do not use monads.

*a) The underappreciated unfold*

    unfoldr :: (a -> Maybe (b,a)) -> a -> [b]

basically iterates a function g :: a -> Maybe (b,a) and collects the [b] until g fails with Nothing. With your given function f, one can define

    g c0 (j,c) = case f c0 j c of
        Root c' -> Just (c', (j+1,c'))
        _       -> Nothing

and get the job done by

    results = unfoldr (g c0) (0,c0)

The only problem is that the failure message is lost. You can write your own unfold, though:

    unfoldE :: (a -> Either Failmessage a) -> a -> ([a], Maybe Failmessage)

*b) tying the knot, building an infinite list*

    cs = Root c0 : [f c0 j ck | (j, Root ck) <- zip [0..] cs]

will yield cs == [Root c0, Root c1, ..., Failure i] ++ _|_. Then, you just have to collect results:

    collect xs = (failure, [ck | Root ck <- ys])
        where
        isFailure (Failure i) = True
        isFailure _           = False
        (ys, failure:_) = break isFailure xs

    results = collect cs

Note that in this case, you always have to end your values with a failure (success as failure). Alas, you didn't mention a stopping condition, did you?

*c) the monadic way*

This is not the preferred solution and I'll only sketch it here. It only makes sense if you have many different f whose calling order depends heavily on their outcomes. Basically, your monad does: 1) keep track of results (MonadWriter) and 2) may yield an error (MonadError). Note that you want to keep track of results even if an error is yielded, so you end up with

    type MyMonad a = ErrorT Failmessage (Writer [Double]) a

where ErrorT and Writer are from the Control.Monad.* modules.

    f :: Double -> Int -> Double -> MyMonad Double
    f c0 j ck = do
        {computation}
        if {screwed up}
            then fail "you too, Brutus"
            else do
                tell {c_{k+1}}
                return {c_{k+1}}

*d) reconsider your definition of f, separate concerns*

The fact that the computation of ck depends on the iteration count j makes me suspicious. If you are using j for convergence tests etc. only, then it's not good.
The most elegant way is to separate concerns: first generate an infinite list of approximations

    f :: Double -> Double -> Double
    f c0 ck = {c_{k+1}}

    cs = iterate (f c0) c0

and then look for convergence

    epsilon = 1e-12

    takeUntilConvergence []  = []
    takeUntilConvergence [x] = [x]
    takeUntilConvergence (x:x2:xs) =
        if abs (x - x2) <= epsilon
            then [x]
            else x : takeUntilConvergence (x2:xs)

or anything else (irregular behaviour, ...). If it's difficult to tell from the cs whether things went wrong, but easy to tell from within f (division by almost 0.0 etc.), you can always blend the separate-concerns approach into a) and b):

    -- iterate as infinite list
    iterate f x0 = let xs = x0 : map f xs in xs

    -- iterate as unfoldr
    iterate f x0 = unfoldr g x0
        where g x = Just (x, f x)

Regards, apfelmus
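PS: the custom unfoldE from a) was only given a type; here is one possible implementation together with a toy usage (an untested sketch of mine; the Newton step and the failure condition are made up for illustration):

    type Failmessage = Int

    unfoldE :: (a -> Either Failmessage a) -> a -> ([a], Maybe Failmessage)
    unfoldE g x = case g x of
        Left  e  -> ([], Just e)
        Right x' -> let (xs, err) = unfoldE g x' in (x' : xs, err)

    -- Newton step for sqrt 2, failing with code 1 outside [0,3]
    step :: Double -> Either Failmessage Double
    step x
        | x < 0 || x > 3 = Left 1
        | otherwise      = Right ((x + 2/x) / 2)

    approximations :: [Double]
    approximations = fst (unfoldE step 1)

    main :: IO ()
    main = print (take 5 approximations)

Thanks to laziness, (fst) can consume the approximations before it is known whether a failure ever occurs.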
[Haskell-cafe] Re: How would you replace a field in a CSV file?
Pete Kazmier wrote:

    import Data.ByteString.Lazy.Char8 as B hiding (map,foldr)
    import Data.List (map)
    import Data.Map as M hiding (map)

    -- This will be populated from a file
    dict = M.singleton (B.pack "Pete") (B.pack "Kazmier")

    main = B.interact $ B.unlines . map doline . B.lines
        where
        doline = B.join comma . mapIndex fixup . B.split ','
        comma  = B.singleton ','
        fixup 3 s = M.findWithDefault s s dict
        fixup n s = s

    -- f is supplied the index of the current element being processed
    mapIndex :: (Int -> ByteString -> ByteString) -> [ByteString] -> [ByteString]
    mapIndex f xs = m xs 0
        where
        m [] _      = []
        m (x:xs') i = f i x : (m xs' $! i+1)

How about

    import Data.ByteString.Lazy.Char8 as B hiding (map,foldr)
    import Data.List (map)
    import Data.Map as M hiding (map)

    dict = M.singleton (B.pack "Pete") (B.pack "Kazmier")

    main = B.interact $ B.unlines . map doline . B.lines
        where
        doline  = B.join comma . zipWith ($) fixup9 . B.split ','
        comma   = B.singleton ','
        fixup9  = fixup 9
        fixup n = replicate n id
                  ++ [\s -> M.findWithDefault s s dict]
                  ++ repeat id

Note that fixup9 is shared intentionally across different invocations of doline. The index n starts at 0. Also note that because (compare :: (Byte)String -> ..) takes time proportional to the string length, the use of Map will inevitably introduce a constant factor. But I'm still happy you didn't use arrays or hash tables (urgh!) :) In any case, tries are *the* natural data structure for (in)finite maps in functional languages, see also

    Ralf Hinze. Generalizing generalized tries. Journal of Functional
    Programming, 10(4):327-351, July 2000
    http://www.informatik.uni-bonn.de/~ralf/publications/GGTries.ps.gz

Regards, apfelmus
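PS: to see the zipWith ($) trick in isolation, a tiny sketch with plain Strings (my own toy example, upcasing field 2 instead of consulting a dictionary):

    import Data.Char (toUpper)

    -- identity everywhere, except at index n
    fixup :: Int -> [String -> String]
    fixup n = replicate n id ++ [map toUpper] ++ repeat id

    main :: IO ()
    main = print $ zipWith ($) (fixup 2) ["a","b","c","d"]
    -- prints ["a","b","C","d"]; the infinite (repeat id) tail is
    -- harmless because zipWith stops at the shorter list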
[Haskell-cafe] Re: Problematic irrefutable pattern matching of existentials
[EMAIL PROTECTED] wrote: I have come to realize that irrefutable pattern matching of existentials may indeed be problematic. Let us consider the following existential data type:

    data FE = forall a. Typeable a => Foo a
            | forall a. Typeable a => Bar a

The following tests type and run (the latter raising the expected exception):

    test1 = case (Foo ())   of Foo x -> show $ typeOf x
    test2 = case (Bar True) of Foo x -> show $ typeOf x

Let us suppose that irrefutable pattern matching of existentials is allowed. What would be the value of the expression

    case (Bar True) of ~(Foo x) -> show $ typeOf x

then?

Interesting, interesting. Without those unsafe Typeables and further simplified, we can say

    class Foo a where foo :: a -> Int
    instance Foo ()   where foo = const 1
    instance Foo Bool where foo = const 2

    data Bar = forall a . Foo a => Bar a

    culprit :: Bar -> Int
    culprit ~(Bar x) = foo x

Now, hugs -98 gives

    > culprit (Bar (undefined :: ()))
    1
    > culprit (Bar (undefined :: Bool))
    2

The interesting part, however, is

    > culprit undefined
    Program error: Prelude.undefined

    > culprit (Bar undefined)
    ERROR - Unresolved overloading
    *** Type       : Foo a => Int
    *** Expression : culprit (Bar undefined)

But both should be the same, shouldn't they? In the first case, the implementation gets an undefined dictionary. But in the second case, the type system complains. If we used explicit dictionaries as in

    data Bar = forall a . Bar (a -> Int, a)

the second case would yield _|_, too. One may claim that the existential pattern match also binds an implicit dictionary argument, which is demanded above. That explanation is quite unsatisfactory, because it breaks the abstraction of type classes. Indeed, the above shows the subtle difference: with type classes, one may not pass an undefined dictionary, as this is an unresolved overloading. Irrefutable patterns for existential types somehow disturb things.

Regards, apfelmus
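PS: the explicit-dictionary variant spelled out as a small sketch (mine, in modern GHC syntax; GHC insists on a strict match for existentials, which does not affect the point here):

    {-# LANGUAGE ExistentialQuantification #-}

    -- the dictionary made explicit: a plain function in the package
    data Bar = forall a . Bar (a -> Int, a)

    culprit :: Bar -> Int
    culprit (Bar (foo, x)) = foo x

    main :: IO ()
    main = print $ culprit (Bar (undefined :: (Int -> Int, Int)))
    -- dies with Prelude.undefined: a plain _|_ and no unresolved
    -- overloading, since there is no implicit dictionary to resolve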
[Haskell-cafe] Re: irrefutable patterns for existential types / GADTs
John Meacham wrote: the reason being that id translates internally to

    id = /\a . \x::a . x

since you cannot pull out an appropriate type to pass to id without evaluating the 'irrefutable' pattern, you end up with _|_ instead of 4. basically, allowing irrefutable matching on existentials would expose whether the underlying implementation performs type-erasure. even if an implementation does type erasure like ghc. suddenly odd things occur, like adding an unused class constraint to a type signature might change the behavior of a program. (since it will suddenly have to pull out a dictionary)

So you mean that id = /\a . \x :: a . x is /strict/ in its first argument, i.e. in the /type/ a. So to speak, type erasure is equivalent to strictness in all and every type argument. For an irrefutable existential pattern, the type argument should be lazy, of course. This allows you to pull out an appropriate type, only that it's pulled out only when needed. The point is that

    const c = /\a . \x :: a . c

has no problems with being lazy in its type argument.

Strictness in type arguments is of course a consequence of the fact that there is no _|_ type. A systematic way to add irrefutable pattern matches to existentials (or even GADTs) would therefore be the introduction of lazy type arguments (alas, the introduction of a type _|_). Type erasure then becomes a form of /strictness analysis/. I remember that Wadler's projection based strictness analysis can discover unused values, and that's what happens very often with types, as they are seldom calculated with at runtime. So you can erase them by virtue of your strictness analyzer.

Thinking further, this would even allow one to free type classes from the dictionary approach: an overloaded function like

    (+) = /\a . \x :: a . \y :: a . x+y

performs a case analysis on its type argument and selects the right specialized function. In case this type is known at compile time, the /inliner/ will select the right specialization. This type based dispatch is more natural for type classes than dictionaries, because the latter would in principle allow one to supply different dictionaries for one and the same type.

Regards, apfelmus
[Haskell-cafe] Re: irrefutable patterns for existential types / GADTs
yeah, that is what I mean. however, since we don't have explicit type passing in haskell, reasoning about the lazyness of types would be quite tricky, leading to odd things like changing type signatures (without changing the underlying types) can change the behavior of a program.

You mean f.i. by specializing a polymorphic type via type signature? So that

    data Foo = forall a . Foo a

    f ~(Foo x) = const 4 (id x)

would behave differently for (const :: a -> Int) and (const :: (a,a) -> Int)? At least the latter does not even typecheck. I suspect that all possible program behavior changes are already caught at the type/implementation level: when all types are strict, there just cannot be irrefutable patterns for existentials :)

not to mention the optimizations that would be excluded by having to preserve the strictness in types behavior... worrying about it in values is tricky enough. :)

True. But you don't have different optimizations for types and values, do you?

It would actually not be difficult to add lazy types to jhc at all. I just don't see any use for them when dealing with haskell source. but another front end.. (omega or cayenne perhaps?) certainly might make use of them.

In Haskell, only irrefutable patterns for existentials and perhaps GADTs come to mind. But it would be for the good health of our poor computer programs ("whaa, My mind just exploded.") :)

again, this is precisely how jhc implements type classes. There are numerous benefits, in particular a single scrutinization of a type will determine concrete methods for _every_ class used on that type, in effect you get run-time specialization for free. The huge disadvantage is that it does not play nicely at all with separate compilation, this is an active area of research for me in jhc.

Mh, type classes are in essence polymorphic variants on the type level. As types and values do not really differ, you could gain something from existing polymorphic variant implementations like in some ML dialects. A quick google revealed

    http://caml.inria.fr/pub/papers/garrigue-polymorphic_variants-ml98.pdf

but I don't know if it helps. Polymorphic recursion makes things difficult. As a side effect, once you get type classes separately compiled, you get some form of polymorphic variants on the value level for free! Then, it's only a step backward to support the open data types proposed by Löh and Hinze:

    http://www.informatik.uni-bonn.de/~ralf/publications/OpenDatatypes.pdf

Regards, apfelmus
[Haskell-cafe] Re: Haskell performance (again)!
thing only partly needed) work. But this doesn't use tail recursion/accumulation (which seems to me like a complete hack that is a mandatory price to pay for using FP)

I never payed it :) You actually fell into the trap by rewriting your code: performance got worse! Because of laziness, your first version can deliver the first coefficient of the sum without calculating the others. Your tail recursive version must calculate the whole sum before returning anything. This is because the list constructor (:) is lazy in its second argument. addPoly1 falls into the (large -> large) category. Of course, if you have to sum Ints over a list (which is (large -> small) and all list elements are needed), then it is wise to be strict and a seq will come in handy (foldl' (+) 0 in Haskell). On the other hand, if you are 'or'-ing Boolean values over a list ((any) in Haskell), then laziness is the right thing. Why?

Your second version only adds overhead by introducing additional parameters and so on; it won't get better than the minor laziness overhead. Test both functions with hugs +s and be disappointed about the number of reductions/cells used.

Concerning overhead in general, the simplest version will always do, and the compiler will figure out the rest with automatic strictness analysis, inlining, unboxing, tail recursion etc. To the programmer, tail recursion is of no importance in a lazy language.

And my last remark, again about data structures: (mutable) arrays and hash tables don't fit into a functional language. Don't use them. Use Data.Map for random access. That being said, static arrays whose entries do not change (you only have a single shot) can be very handy for Dynamic Programming. In this case, it is crucial that they are boxed, i.e. the entries are calculated lazily. If you need any other data structure, see

    http://haskell.org/ghc/docs/latest/html/libraries/
    http://haskell.org/haskellwiki/Libraries_and_tools/Data_structures
    http://www.eecs.tufts.edu/~rdocki01/edison.html

For (in)finite maps, tries are *the* natural data structure for functional languages, see also

    Ralf Hinze. Generalizing generalized tries. Journal of Functional
    Programming, 10(4):327-351, July 2000
    http://www.informatik.uni-bonn.de/~ralf/publications/GGTries.ps.gz

If you are in a true need for speed, they'll outperform everything else. Should you need a custom data structure (appendable priority search queue, interval/region tree, sequence, ...), you can build one from the all-rounding Finger Tree, see also

    Finger Trees: A Simple General-purpose Data Structure
    Ralf Hinze and Ross Paterson.
    in Journal of Functional Programming 16:2 (2006), pages 197-217
    http://www.soi.city.ac.uk/~ross/papers/FingerTree.pdf

Regards, apfelmus
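PS: the two list cases side by side, as a small self-contained sketch (mine):

    import Data.List (foldl')

    -- (large -> small): summing demands every element,
    -- so accumulate strictly
    sumInts :: [Int] -> Int
    sumInts = foldl' (+) 0

    -- 'or'-ing lazily: foldr may stop at the first True
    anyP :: (a -> Bool) -> [a] -> Bool
    anyP p = foldr (\x b -> p x || b) False

    main :: IO ()
    main = do
        print $ sumInts [1..1000000]                     -- constant stack
        print $ anyP even (1 : 2 : error "never forced") -- True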
[Haskell-cafe] Re: optimization help
at length in September on this list. The problem here is that writeFile currently cannot be interleaved lazily; this has to be simulated with appendFile. We can read files lazily, but we cannot output them lazily.

Can this be remedied? Can there be a version of writeFile which is, in a sense, dual to getContents?

Regards, apfelmus
[Haskell-cafe] Re: tail-recursing through an associative list
Malcolm Wallace wrote: Seth Gordon [EMAIL PROTECTED] wrote: "almost-entirely-functional code ... that in its first draft, took about three seconds to process 2,000 rows, eight minutes to process 20,000 rows, and overflowed a 1-MB stack when processing 200,000 rows. Oops." Which just goes to show that your algorithm is non-linear in the size of the input. I doubt that your posted snippet is the cause of the problem, since it is certainly linear in the AList it is given.

The linear time in the length of the AList may well be the problem :) You seem to use AList as a key-value mapping (I know the word "associative" only as a mathematical property, please correct me). The Key acts as key for the grouping you mentioned and the [MetaThingie] is the actual group of MetaThingies, is that so? That means that with each call to myFunction, the AList roughly grows by one element. For logarithmic access times, you should use a binary search tree like Data.Map or similar.

The problem in your case could be that matchKeys is only approximate and your keys cannot be ordered in a suitable fashion. Then you need a clever algorithm which somehow exploits extra structure of the keys (perhaps they are intervals, then you can use interval trees etc.). The C++ code is likely to do some magic along these lines. In such a case, stunning functional power may come from

    Finger Trees: A Simple General-purpose Data Structure
    Ralf Hinze and Ross Paterson.
    in Journal of Functional Programming 16:2 (2006), pages 197-217
    http://www.soi.city.ac.uk/~ross/papers/FingerTree.pdf

I profiled the code and observed that the most-called-upon function in the program (200 million entries for those 20,000 rows)

By optimisation, you can only make this function a constant factor faster. You need to work out how to call it less often in the first place.

Almost true. I think that recursive calls are counted as proper calls, so that each top level call to myFunction will result in a count of calls to myFunction linear in the length of the AList. Thus "in the first place" alias top level is not enough.

    type AList = [(Key, [MetaThingies])]

    myFunction :: AList -> Thingie -> AList
    myFunction [] x = [(key x, [meta x])]
    myFunction ((k, v):tail) x
        | matchKeys k (key x) = case tryJoining v x of
            Nothing -> (k, v) : (myFunction tail x)
            Just v' -> (k', v') : tail
        | otherwise = (k, v) : (myFunction tail x)
        where v' = bestKey k (key x)

where the last line should probably (?) read

        where k' = bestKey k (key x)

Regards, apfelmus
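PS: the Data.Map suggestion as a minimal sketch (mine, assuming the keys can be ordered exactly rather than matched approximately):

    import qualified Data.Map as M

    type Key         = String   -- stand-ins for the poster's types
    type MetaThingie = String

    -- O(log n) per insertion instead of O(n)
    addThingie :: M.Map Key [MetaThingie] -> (Key, MetaThingie)
               -> M.Map Key [MetaThingie]
    addThingie m (k, v) = M.insertWith (++) k [v] m

    main :: IO ()
    main = print $ foldl addThingie M.empty
                         [("a","x"), ("b","y"), ("a","z")]
    -- fromList [("a",["z","x"]),("b",["y"])]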
[Haskell-cafe] Re: optimization help
A better solution would be to begin output before the whole input is read, thus making things more lazy. This can be done the following way: from the input, construct a lazy list of (date,line) pairs. Then, let foldM thread a map from dates to corresponding output file pointers through the list and, at the same time, use the file pointers to output the line in question via appendFile. This way, every line consumed is immediately dispatched to its corresponding output file and things should only require memory for the different dates, besides buffering.

I tried this approach previously and it seems to be unacceptably slow. I thought the slowness was just due to the fact that file operations are slow, but I'll include my code here (cleaned up to take some of your previous comments into account) just in case there is something subtle I'm missing which is slowing down the code (B, M, and myRead are as above):

    dates' file nRows = do
        (cols, rows) <- myRead file
        dateIx <- M.lookup (B.pack "\"Date\"") cols
        let writeDate row = B.appendFile (dataDir ++ fmt date) row
                where
                date = (B.split ',' row) !! dateIx
                fmt  = B.unpack
                     . B.map (\x -> if x == '-' then '_' else x)
                     . B.takeWhile (/= ' ')
        oldFiles <- getDirectoryContents dataDir
        mapM_ (\f -> catch (removeFile $ dataDir ++ f) (const $ return ()))
              oldFiles
        mapM_ writeDate $ take nRows rows

This code takes over 20 minutes to process 100MB on my machine.

No wonder, as this opens and closes the file on every row. The operating system will be kept quite busy that way! In some sense, you are outsourcing the row-collecting M.Map to the OS... Of course, you want to open the files once and dispatch the rows to the different open handles. Here is a version (untested) which either does the read-all-then-write approach (group'n'write) or opens the output files simultaneously (group'n'write2). Note also that there is no need to use M.Map for finding the "Date" keyword in the CSV header (which even hurts performance), though the effects are completely negligible.

    main = do
        args <- getArgs
        case args of
            [file, nRows] -> dates file (read nRows)

    dates file nRows =
        B.readFile file >>= group'n'write . sugarWithDates . take nRows . B.lines

    sugarWithDates (header:rows) =
        map (\r -> ((B.split ',' r) !! dateIx, r)) rows
        where
        Just dateIx = Data.List.lookup (B.pack "\"Date\"") $
                      zip (B.split ',' header) [0..]

    formatDate = B.unpack
               . B.map (\x -> if x == '-' then '_' else x)
               . B.takeWhile (/= ' ')

    date2filename = (dataDir ++) . formatDate

    group'n'write = mapM_ writeDate . M.toList . foldl' addDate M.empty
        where
        addDate mp (date,row) = M.insertWith (\_ old -> row:old) date [row] mp
        writeDate (date,rows) = B.writeFile (date2filename date) $ B.unlines rows

    group'n'write2 xs =
        foldM addDate M.empty xs >>= mapM_ hClose . M.elems
        where
        addDate mp (date,row) = do
            (fp, mp') <- case M.lookup date mp of
                Just fp -> return (fp, mp)
                _       -> do
                    fp <- openFile (date2filename date) WriteMode
                    return (fp, M.insert date fp mp)
            B.hPut fp row
            return mp'

The thing that bugs me is that one cannot separate

    group'n'write2 = write2 . group

where (group) is a pure function. I think some kind of lazy writeFile could allow this.

thanks for your help,

No problem. :)

Regards, apfelmus
[Haskell-cafe] Re: function result caching
Carl Witty wrote: Instead of using an infinite list, you can use an infinite binary tree, with a cached result at every node.

This, also known as a patricia tree, is indeed the canonical answer. In general, one can always construct an (in)finite map from keys (Ints in the present case) to values as a trie. See also

    Ralf Hinze. Generalizing generalized tries. Journal of Functional
    Programming, 10(4):327-351, July 2000
    http://www.informatik.uni-bonn.de/~ralf/publications/GGTries.ps.gz

This paper uses generic programming, but that should not be the problem, as concrete cases can always be constructed by hand in plain Haskell. To apply this to Ints, one should view them as

    type Int = [Digit]
    data Digit = Zero | One

Also note that there is absolutely no balancing involved (which would break infinite and lazy stuff).

Regards, apfelmus
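PS: such a trie for non-negative Ints, spelled out as a small self-contained sketch (mine), here memoizing the Fibonacci function:

    -- one cached value for every non-negative Int, indexed by
    -- the binary digits of the key
    data Trie a = Trie a (Trie a) (Trie a)

    build :: (Int -> a) -> Trie a
    build f = Trie (f 0) (build (\n -> f (2*n+1)))
                         (build (\n -> f (2*n+2)))

    fetch :: Trie a -> Int -> a
    fetch (Trie x _ _) 0 = x
    fetch (Trie _ l r) n
        | odd n     = fetch l ((n-1) `div` 2)
        | otherwise = fetch r (n `div` 2 - 1)

    -- memoized fibonacci: the trie is built lazily, no balancing
    fib :: Int -> Integer
    fib = fetch table
        where
        table = build f
        f 0 = 0
        f 1 = 1
        f n = fib (n-1) + fib (n-2)

    main :: IO ()
    main = print (fib 1000)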
[Haskell-cafe] Re: optimization help
The (almost) point-free versions run faster than my fast imperative version and take up significantly less heap space; even the version which reads everything and then writes takes up about 1/3 the heap space of my version.

That was not intended, though I'm very pleased :-D

I get the impression that point-free style is a preventive measure against space leaks.

Ah, good point, I didn't think about it that way. Point-free style makes somewhat sure that old values are not leaking around, as they are not passed through the pipe. Yet, I'm a bit astonished. I thought that when compiling with -O2, cosmetic changes should become negligible. Perhaps the strict foldl' has an effect?

Regards, apfelmus
[Haskell-cafe] Re: optimization help
Paul Hudak wrote: In fact avoiding space leaks was one of the motivations for our moving to an arrow framework for FRP (now called Yampa). Arrows amount to a point-free coding style, although with arrow syntax the cumbersomeness of programming in that style is largely alleviated.

I think that's an entirely different thing. You changed the representation of signal transformers from

    newtype SF a b = SF ([a] -> [b])

to

    data SF a b = SF (a -> (b, SF a b))

By enforcing a synchronous processing, you avoid leaking Signals. The latter type cannot be isomorphic to a function type (Signal a -> Signal b) for an appropriate Signal, so this implies a point-free style, as there is no way to hold stuff of type (Signal a) in variable bindings. This does not mean that there is no point-wise syntax for arrows; it just means that point-wiseness cannot be achieved via variables in the context of function application, i.e. via lambda abstraction. In fact, the main point about Arrows is not that they are an abstraction for computations, but that they allow a point-wise syntactic sugar (which stems from their computational being, of course)!

The optimization problem here uses (almost) one and the same representation (pure (a -> b), sometimes packed in (a -> IO b)) and point-free turns out to be space friendlier than point-wise. That's very puzzling and I think ghc -O2 should eliminate this.

Regards, apfelmus

PS: IMHO the semantics of (SF a b) need a real cleanup. (Time -> a) -> (Time -> b) is too crude, because these map transformers cannot even be made an instance of ArrowChoice. Also, Dirac-like peaks (that is, Events) do not fit in.
[Haskell-cafe] Re: optimization help
I think that you misunderstood what I said. When we went from FRP to Yampa, we changed from using signals directly, i.e. Signal a, to using signal functions, i.e.:

    SF a b = Signal a -> Signal b

When we did this, we made SF abstract, and we adopted the arrow framework to allow composing, etc. signal functions. This meant that you could not get your hands on Signals directly, which helped to prevent space leaks. What you describe above is a change that we made in the /implementation/ of signal functions (specifically, from streams to continuations), which indeed is an entirely different thing.

You mean that only the fact that (Signal a -> Signal b) got abstract prevented space leaks? Can you give an example? That the implementation with continuations avoids space leaks is clear. The question is whether the old one does when using the new primitives. In fact, this amounts to the question whether (inject) as defined in

    newtype SF' a b = SF' ([a] -> [b])
    data    SF  a b = SF (a -> (b, SF a b))

    inject :: SF a b -> ([a] -> [b])
    inject (SF f) (a:as) = let (b,g) = f a in b : inject g as

preserves space and time complexity:

    inject (sf `o` sf')  =same space & time=  (inject sf) . (inject sf')

and the same for all other operations you provide besides `o`.

Regards, apfelmus
[Haskell-cafe] Re: optimization help
Time to write actual code and do some measurements :) The code attached at the end of the message gets compiled with -O2. Writing a sample test file happens with (writeTest #columns #rows), like in writeTest 4 50 (~7 seconds, ~770MB heap (=:-o), 3MB test file). I assume the heap spaces from writeTest are everything added together, as 'top' does not report any memory bursts.

The following matrix represents the performance comparisons between four possibilities. The times get produced by calling (test #matrixrow (filtertest #matrixcol)):

                  ++[nl]            >>
    map           2.76,2.87,2.83    2.72,2.88,2.96
    transpose     2.72,2.92,2.87    2.82,2.79,2.88

No significant differences. In a test with more rows, (>>) seems to perform slightly better (as expected). Transpose is a bit better, too. With writeTest 10 75 (~24 seconds, ~2.8GB heap (=:-o), 15MB test file):

                  ++[nl]            >>
    map           3.50,3.59,3.42    3.23,3.26,3.29,3.19
    transpose     3.38,3.41,3.41    3.19,3.14,3.23

Looks like my measurements somewhat disagree with yours. Odd. But note that by discriminating the to-be-tested functionality at run time, the compiler gets no chance to optimize things for the particular case. So in reality, (++[nl]) could trigger a good code transformation whereas (>>) does not. Also note that transpose is very lazy and is far cheaper than it looks.

Somehow, the 2.8 and 3.5 seconds are not in proportion with respect to the inputs of 3MB and 15MB or the outputs of 590KB and 400KB (yes, the smaller input produces a larger output). Your 13 seconds versus 90 seconds makes this even more puzzling. But it looks like writing a CSV file is far more expensive than reading one. Maybe it's not a good idea to call hPut very often.

the mask (map (`elem` tags) cols) is only computed once (shouldn't the compiler do that automatically since the expression is constant?) [...]

    col x cols row = row !! i
        where Just i = lookup x $ zip cols [0..]

One has to be careful,

    col x cols = \row -> row !! i
        where Just i = lookup x $ zip cols [0..]

is different, as it shares i across all rows. The compiler is likely not to do this easy transformation (the full laziness transformation) for col, because it can introduce space leaks. These are things the programmer should have control over, so no optimization here. See also

    http://haskell.org/haskellwiki/GHC/FAQ#When_can_I_rely_on_full_laziness.3F

I think the reason given there is wrong; it's not about efficiency but about space leaks. The (map) showcase suggests that (map (`elem` tags) cols) is computed only once, though personally, I don't rely on that (yet).

Regards, apfelmus

    module CSV where

    import qualified Data.ByteString.Lazy.Char8 as B
    import Data.List
    import System.IO

    {- ------------------------------------------------------------
        Reading and writing CSV (comma separated value) files
    ------------------------------------------------------------ -}
    readCSV :: FilePath -> IO [[B.ByteString]]
    readCSV file = do
        v <- B.readFile file
        return $ map (B.split ',') $ B.lines v

    writeCSV :: Int -> FilePath -> [[B.ByteString]] -> IO ()
    writeCSV i file tbl = do
        h <- openFile file WriteMode
        mapM_ (writeRow i h) tbl
        hClose h
        where
        writeRow j = case j of
            1 -> \h -> mapM_ (B.hPut h) . (++ [nl]) . intersperse comma
            2 -> \h row -> (mapM_ (B.hPut h) $ intersperse comma row)
                           >> B.hPut h nl
        comma = B.singleton ','
        nl    = B.singleton '\n'

    {- ------------------------------------------------------------
        Processing [[ByteString]]
    ------------------------------------------------------------ -}
    select j targs test (cols : rows) =
        narrow $ cols : filter (test cols) rows
        where
        narrow = colmap j (map snd . filter fst . zip mask)
        mask   = map (`elem` targs) cols

    colmap :: Int -> (forall a . [a] -> [a]) -> [[a]] -> [[a]]
    colmap 1 f = map f
    colmap 2 f = transpose . f . transpose

    col x cols = \row -> row !! i
        where Just i = lookup x $ zip cols [0..]
    if12 = ((== B.pack "2") .) . col (B.pack "1")

    filtertest j = select j (map B.pack ["1","2","4"]) if12

    test i f = readCSV "test.csv" >>= writeCSV i "result.csv" . f

    {- ------------------------------------------------------------
        Test cases
    ------------------------------------------------------------ -}
    rotated :: Int -> [[B.ByteString]]
    rotated n = map (take n) . iterate (drop n)
              . concat . repeat . map (B.pack . show) $ [1..(n+1)]

    writeTest c r = writeCSV 1 "test.csv" . take r . rotated $ c
[Haskell-cafe] Re: Lexically scoped type variables
I think the problem comes from the fact that

    let (x :: a) = rhs in expr

has two interpretations: either the programmer mainly specifies a judgment (\Gamma |- rhs :: a) (f.i. as a guide for type inference), or he just introduces a new variable and sees let as a sugar for

    (\(x :: a) -> expr) rhs

with the viewpoint that the program is a huge expression and all top-level declarations are just (somewhat superfluous) lambdas which lead to a final goal.

In both cases, one would assume that the main variable binder is our good old lambda. Thus, the (::) in (\(x::a) -> ..) behaves like a constructor and the type variable on the right hand side is just a pattern:

    f (x :: [a]) = ...

is not any different from

    f (Cons x [a]) = ...

When f is applied to an argument, it's the type checker that performs pattern matching. Of course, one needs nonlinear type patterns:

    (f :: a -> b) $ (x :: a) = f x

So the type variables in lambdas behave like normal variables, and things like

    g :: Int -> Int
    g (x :: a) = x + 1 :: a

are entirely fine. All in all, type variables in lambdas are real variables.

The first interpretation of let amounts to the Haskell98 intuition: the code

    x :: a -> a
    x = rhs

means a judgment that x has type (forall a . a -> a). This also means that this does not bring any new type variables into scope; only lambdas are capable of introducing new variables. GHC 6.6 rules are not compatible with that. They somehow admit variable bindings by judgments, only to discover that this doesn't work for existential types. But note that we have (x :: ...) and (x = ...) as separate statements, and we may well keep the second interpretation:

    f :: Int -> Int
    (f :: a -> a) = \x -> (x+1 :: a)

The intuition is that stand-alone statements (x :: ...) are exactly what they look like: judgments for the type system. The only odd thing about them is that they are stand-alone and do not add to the main program expression. IMHO, the most natural thing would be that type judgments appear on the rhs of a let or a lambda, and that type variable bindings appear on the lhs of a let or a lambda. In the definition of f above, (f :: Int -> Int) and (x+1 :: a) are judgments and (f :: a -> a) is a variable binding. Any confusion between judgment and binding is just like a confusion between constructor application and pattern.

Regards, apfelmus
[Haskell-cafe] Re: optimization help
Hello Bulat, sorry, I just completely forgot to write down the answer for your post.

Can this be remedied? Can there be a version of writeFile which is, in a sense, dual to getContents?

this can be solved in other way. here is a program that reads stdin and puts to stdout lines starting with ">" and to stderr the rest. note that our main processing function is pure:

    main = do
        a <- getContents
        let b = map (processing stdout stderr) (lines a)
        mapM_ (\(file,line) -> hPutStrLn file line) b

    processing file1 file2 line =
        if ">" `isPrefixOf` line
            then (file1, line)
            else (file2, line)

(processing) does the grouping, but only a part of it. The task of collecting the writes to the different files is still left to the operating system. Furthermore, this code will have a hard time extending to a dynamic count of files. I imagined that there might be a way where (processing) already does the grouping work and trashes the order in which things were read in, and a lazy write would reconstruct the order by laziness and interleave appropriate write calls. But I now think this is not possible. So any grouping function like

    group :: Input -> Data.Map Key [Value]

trashes the order in which the different data was read, and there cannot be a reconstruction. Of course, if one uses a different grouping data structure which still keeps track of the order of data arrival, i.e.

    group :: Input -> Data.Magic.Groupy Key Value

reconstruction becomes possible. Indeed, (IO a) can be used as such a grouping data structure by virtue of

    insert :: Key -> Value -> IO ()
    insert = writeFile

and this is too short to merit the boilerplate of an additional purely functional abstract data structure between the grouping and the file-writing part. One simply does not gain additional clarity.

Regards, apfelmus
[Haskell-cafe] Re: don't: a 'do' for comonads?
Donald Bruce Stewart wrote: As seen on #haskell, from an idea by Malcolm,

    14:42 <dons> ?let top'n'tail = ("<pre>"++) . (++"</pre>")
    14:42 <lambdabot> Defined.
    14:43 <dons> > L.top'n'tail "foo me now"
    14:43 <lambdabot> "<pre>foo me now</pre>"
    14:43 <mauke> that reminds me, haskell needs don't
    14:43 <dons> yes!
    14:44 <pkhuong-> mm. the opposite of do, eh? do for comonads? :)

So now a prize to the person who comes up with the best use for the identifier:

    don't :: ?

-- Don

    don't :: IO a -> a

    example :: ()
    example = don't (do erase "/dev/hda")

Regards, apfelmus
[Haskell-cafe] Re: aggressiveness of functional dependencies
Nicolas Frisby wrote: The inferred type for rite_t1 is

    rite_t1 :: (Iso (Either Char a) (Either f' g')) => () -> Either f' g'

Why isn't the inferred type of rite_t1 the same as the ascribed type of rite_t1'?

    rite_t1' :: Iso b b' => () -> Either MyChar b'
    rite_t1' () = rite t1

I think GHC does not know whether the given instance declaration

    instance ... => Iso (Either a b) (Either a' b')

even applies to the special case of (a = Char), because it mostly ignores the preconditions on the left side of (=>). Hugs is much different. Maybe throwing away undecidable instances will drastically change things.

Last post until a response, I promise! Another demonstration:

    bar () = runIdentity . flip runStateT 0 $ return 'c'

Inferred signature:

    bar :: (Monad (StateT s Identity), Num s) => () -> (Char, s)

Why not

    bar :: Num s => () -> (Char, s)

I am not coming up with an s that could prevent (StateT s Identity) from being a monad. Is there one?

The same might go on in this case. By not looking at the preconditions in the instance declaration

    instance Monad m => Monad (StateT s m)

GHC concludes that (Monad (StateT s Identity)) might or might not hold, depending on s.

Regards, apfelmus
[Haskell-cafe] Re: aggressiveness of functional dependencies
Nicolas Frisby wrote: First off, thanks for the reply. I am accustomed to GHC ignoring instance contexts as you mentioned. It's the "mostly" part that I'm curious about: "mostly" implies there are some phases that don't ignore the context. Is that only the checking of the types of the method definitions and absolutely nothing else?

So it seems... I just said "mostly" because I don't know exactly... Though I strongly suspect, like you, that the preconditions are only used in type inference/checking and not for overlapping instances and similar questions related to uniqueness of instance declarations.

My project is rather committed to GHC, but could you elaborate on your reference to Hugs being different?

By tradition from Gofer, Hugs implements type classes more flexibly than GHC does. I once experimented with the following code using overlapping instances:

    data Lift a = Lift a
    type Test = Char

    class Message m o where
        send :: m -> o -> Test

    instance Message (o -> Test) o where
        send m o = m o

    instance Message m o => Message m (Lift o) where
        send m (Lift o) = send m o

    msg :: Test -> Test
    msg = id

    r1 = send msg 'a'
    r2 = send msg (Lift 'b')

It implements some kind of subtyping. GHC won't typecheck this, but hugs -98 +m will. If I remember correctly, the problem was with

    instance Message (Lift a -> Test) (Lift a)

Does this follow from the first or from the second instance declaration? GHC ignores the precondition in the second declaration, thus believes it follows from both and consequently rejects it. But Hugs has no problems with that: it follows the precondition and sees that the second declaration does not apply to the culprit, because there is no instance (Message (Lift a -> Test) a). Note that if you add this instance later on (perhaps in a different module), things will break.

The flexibility comes at a price: Gofer's type class system was unsound, and I don't know how close Hugs comes to this.

Regards, apfelmus
[Haskell-cafe] Re: Difficult memory leak in array processing
Ah, yet another UndeadArray necromancer exhausting his stack of bones. May the forces of light suggest to structure the incantation of darkness?

    modifyArray arr i f = readArray arr i >>= \y -> writeArray arr i (f y)

    accumM :: (MArray a e m, Ix i) => (e -> e' -> e) -> a i e -> [(i, e')] -> m ()
    accumM f arr xs = mapM_ chg xs
        where chg (i,x) = modifyArray arr i (flip f x)

    twodice (x:x':xs) = (x+x') `div` 2 : twodice xs
    noise rng gen     = twodice $ randomRs rng gen

    main = do
        let bnds = (0, 10000000)
        buf <- newArray_ bnds :: IO Buffer
        gen <- getStdGen
        accumM (curry snd) buf $ zip (range bnds) $ noise (2,12) gen

I absolutely don't know why there is no (accumM) function in the standard libraries. And having the ByteString API (maybe even the fusion) for general arrays would be very nice. Maybe (modifyArray) is missing, too.

Regards, apfelmus

PS: did you try worker low (i `seq` i-1)?
PPS: The strictness analyzer is likely to insert that automatically if you compile with -O or -O2.

Niko Korhonen wrote: Hi everyone, I have the following code whose purpose is to add dither (noise) to a given array. The code looks very straightforward but apparently it has a memory leak somewhere. Here I try to run the algorithm for an array of 10,000,000 integers. Ten million unboxed strict integers should equal 40MB, which should pose no problems to any modern system. However, the program fails with a stack overflow error. I'm using GHC 6.6 on Windows with 1 GB of RAM. I've tried applying seq and some other strictness tricks (such as x == x) pretty much everywhere on the code with no results. Could you please help me understand what is going on here? Have I misunderstood something critical in how Haskell works? Here is the relevant portion of the code:

    module Main where

    import Data.Array.IO
    import System.Random

    type Buffer = IOUArray Int Int

    -- | Triangular Probability Density Function, equivalent to a roll
    -- of two dice. The number sums have different probabilities of
    -- surfacing.
    tpdf :: (Int, Int) -> IO Int
    tpdf (low, high) = do
        first  <- getStdRandom (randomR (low, high))
        second <- getStdRandom (randomR (low, high))
        return ((first + second) `div` 2)

    -- | Fills an array with dither generated by the specified function.
    genSeries :: Buffer -> ((Int, Int) -> IO Int) -> (Int, Int) -> IO ()
    genSeries buf denfun lims =
        let worker low i
                | i >= low = do
                    r <- denfun lims
                    writeArray buf i r
                    worker low (i - 1)
                | otherwise = return ()
        in do
            (lo, hi) <- getBounds buf
            worker lo hi

    main = do
        -- This should allocate a 40 MB array
        buf <- newArray_ (0, 10000000) :: IO Buffer
        -- Fill the array with dither
        genSeries buf tpdf (2, 12)
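PS: as a usage example for accumM itself, here is a sketch (untested, mine) that accumulates a histogram of the noise values instead of storing them all:

    -- assuming modifyArray, accumM and noise from above are in scope
    histogram :: IO ()
    histogram = do
        hist <- newArray (2,12) 0 :: IO (IOUArray Int Int)
        gen  <- getStdGen
        -- count each of the first 10000 noise values into its bucket
        accumM (\n () -> n + 1) hist
               [ (v, ()) | v <- take 10000 (noise (2,12) gen) ]
        getAssocs hist >>= print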
[Haskell-cafe] Re: Difficult memory leak in array processing
Udo Stenzel wrote: Niko Korhonen wrote: I have the following code whose purpose is to add dither (noise) to a given array. The code looks very straightforward but apparently it has a memory leak somewhere. No, it doesn't. It can't, because it doesn't even compile. After correcting the obvious

    (lo, hi) <- getBounds buf

to

    let (lo,hi) = bounds buf

The interface changed between GHC 6.4.2 and 6.6. But no honorable Haskell paladin would ever dare to use UndeadArrays.

it just works and needs 40MB plus epsilon. Your problem has to be somewhere else.

The strictness analyzer likes Udo more than Niko, doesn't it?

Regards, apfelmus
[Haskell-cafe] Re: Optimizing a hash function
Yesterday evening I had a go at porting Bob Jenkins' hash function (http://www.burtleburtle.net/bob/c/lookup3.c) to Haskell.

If you need hash functions, I hope that you don't need them to become a hash table necromancer: this dark data structure simply does not fit well into Haskell. Tries are a better alternative:

    Ralf Hinze. Generalizing generalized tries. Journal of Functional
    Programming, 10(4):327-351, July 2000
    http://www.informatik.uni-bonn.de/~ralf/publications/GGTries.ps.gz

Otherwise, take care to conduct hash experiments inside Otiluke's Resilient Sphere only.

Regards, apfelmus
[Haskell-cafe] Re: Difficult memory leak in array processing
nor that I would really understand the code you've posted. On the positive side of things, you've given me a lot to think about. Maybe in the fullness of time I shall return and say 'Lo! I can write leakless Haskell code!'. But alas, that time seems so distant now.

I tried to show how the code can be rewritten in a more declarative manner by the use of the higher order function (accumM), which is akin to the accumulation function (Data.Array.accum) for immutable arrays. This removes the need for traversing the array by hand with (worker). As Claus Reinke already pointed out, the accumulating parameter (acc) in your full example has to be evaluated strictly or it will overflow the stack. This is the same situation as in the famous example

    length' []     n = n
    length' (_:xs) n = length' xs (n+1)

By avoiding (worker) and using higher order functions, you can avoid this kind of accumulating parameter altogether:

    length xs = foldr (\_ n -> 1 + n) 0 xs

Besides, writing things explicitly tail recursive does not help much in Haskell.

In the following, I'm going to explain the restructured code I posted. First of all, reducing the amount of IO is always good for code sanity. Formulated with an emotional undertone: IO is evil. Why do you need IO? There are two reasons: the mutable array and the random numbers. Random numbers clearly are a side effect, but there is a wonderful trick: you simply fetch a lazy infinite list of random numbers and work with that. With System.Random, you can say

    do
        -- get the default pseudo-random number generator,
        -- it already has a good random seed
        gen <- getStdGen
        let rs     = randomRs (2,12) gen
            result = dostuffwith rs
        return result

(rs) is an infinite list of random numbers and (dostuffwith) is a pure function, no IO involved. In our case,

    twodice (x:x':xs) = (x+x') `div` 2 : twodice xs
    noise rng gen     = twodice $ randomRs rng gen

is a combination of (rs) and (dostuffwith). (noise) simply supplies an infinite list of random numbers to (twodice). (twodice) processes this list by taking pairs and averaging them. Together, both cover the functionality your (tpdf) offers.

Concerning mutable arrays ("I can say neither that I have any idea what an 'undead array' is"): UndeadArray is a bowdlerization of "Unboxed Array", which is the type you're using as Buffer. They generally are to be considered evil as well, hence the renaming. Their only use is to store huge amounts of memory, like their most prominent example (ByteString) does. If you want to add noise to an actual audio channel, then they can be adequate. But if you only want statistics about random numbers, they're completely out of place. In that case,

    countNumber k xs = length $ filter (k ==) xs

    main = do
        gen <- getStdGen
        print $ countNumber 7 $ take 10000000 $ noise (2,12) gen

will do everything you need. If mutable arrays are really unavoidable (hence this is only for necromancers), the use of higher order functions is mandatory. One useful higher order function is (accum) from Data.Array. The adaption to the mutable case uses the helper function

    modifyArray arr i f = readArray arr i >>= \y -> writeArray arr i (f y)

which applies f to the array element at position i. Then,

    accumM f arr xs = mapM_ chg xs
        where chg (i,x) = modifyArray arr i (flip f x)

"takes an array and an association list and accumulates pairs from the list into the array with the accumulating function f" (documentation from Data.Array.accum).
For example, if arr[0] == 0, arr[1] == 1, arr[2] == 2, arr[3] == 3, then

    brr = accum (+) arr [(1,2),(1,3),(2,3)]

yields brr[0] == 0, brr[1] == 6, brr[2] == 5, brr[3] == 3. As another example, (accum (curry snd) arr xs) replaces the array entries by those listed in xs. Finally, countNumber can be expressed as a fold over the array. In general, every higher order function for lists can be translated to one for arrays.

However, this necromancy business really does sound like an exciting new career prospect. Interesting job opportunities, respect of the community, flexible hours and extremely loyal peers and other commandlings that will literally work just for the Brain Food.

Regards, Nik The Blak, Necromancer of the Glorious Forces of Evil

This is indeed very tempting :) Though I suspect that the glory of forces built on (IO a) will be very limited.

Regards, apfelmus, Golden Delicious of the Shining Bulbs of Light
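PS: the remark that countNumber can be expressed as a fold over the array, spelled out as a sketch (untested, mine):

    import Data.Array.IO

    -- a strict left fold over a mutable array, mirroring foldl'
    foldArray :: (MArray a e m, Ix i) => (b -> e -> b) -> b -> a i e -> m b
    foldArray f z arr = do
        bnds <- getBounds arr
        let go acc []     = return acc
            go acc (i:is) = do
                x <- readArray arr i
                let acc' = f acc x
                acc' `seq` go acc' is   -- strict, or the stack bites again
        go z (range bnds)

    countNumber :: Int -> IOUArray Int Int -> IO Int
    countNumber k = foldArray (\n x -> if x == k then n + 1 else n) 0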
[Haskell-cafe] Re: Difficult memory leak in array processing
I personally find doing higher order functions with IO extremely difficult, and the resulting compiler errors are often downright scary. But this is probably a direct consequence of my rather limited understanding of the monadic operators that the do notation hides. It seems that one has to be /very/ good in Haskell before one can do IO effectively. Hmm.

Well, as good as a necromancer has to be :) But seriously, I think the most mind boggling point about IO actions is that those are higher order, too. I for myself have taken the point of view that values of type (IO a) are just plans that will be executed when the main Haskell program has done its work. The program's task is to assemble the main plan by sequencing smaller plans with (>>=).

But back to the point, using Haskell lists for representing a largish buffer of, say, 16-bit samples that is constantly being refreshed from hard disk and sent into the sound system for playback would probably be inefficient beyond imagination.

Lists are indeed not suited for massive byte crunching. If they fit into a black box, you can of course outsource things to C. In case you were writing a track scheduler à la iTunes, the black box most likely would be a single function (playSoundFile) which does the job of handling data from disk to the sound card. Actual sound processing with Haskell functions is more involved. As already said, specifying loops over the samples as higher order functions will save you a lot of headaches. The point is that one just cannot start writing Haskell code and hope that it runs as fast as a tight loop in C. Instead, one should do aggressive optimizations at a few critical points only. And these are exactly the higher order loops.

I have to admit that (accumM) is not very fast, because it is able to change data at arbitrary indexes which therefore have to be kept around. Most often, you want to process each index exactly once, which is better expressed as a (map) or a (fold). As Bulat and Duncan already said, Data.ByteString does exactly this for arrays of bytes. Using it or the generalization promised by Duncan is likely to be the best way to go.

On the implementation level, lazy evaluation is in the way when crunching bytes. So be sure to tell GHC to make things strict and to unbox and inline everything it can put its hands on by giving appropriate command line options. As others already pointed out, the details are on

    http://haskell.org/haskellwiki/Performance

As a first test, you may want to compile your original code with

    -O3 -optc-O3 -funfolding-use-threshold=16

and explore what happens. GHC does a good job with strictness analysis and it's of no use to drown your code in strictness annotations. Of course, some well placed ones may hint GHC about things it overlooked.

To mention yet another approach, the image synthesis library Pan

    http://conal.net/pan/

pretends that the programmer writes ordinary Haskell functions, but under the hood, it's a tiny programming language that gets compiled to C++. Of course, the main point is that the image synthesis combinators are strictly less powerful than full Haskell. But as the full power is unneeded in that context, this doesn't hurt. While there are audio related projects,

    http://haskell.org/haskellwiki/Libraries_and_tools/Music_and_sound

I don't know whether they focus on speed.

Regards, apfelmus
[Haskell-cafe] Re: Difficult memory leak in array processing
Duncan Coutts wrote: On Wed, 2006-11-29 at 20:27 +0100, [EMAIL PROTECTED] wrote: On the implementation level, lazy evaluation is in the way when crunching bytes. Something I rather enjoyed when hacking on the ByteString lib is finding that actually lazy evaluation is great when crunching bytes, though you do need to know exactly when to use it. Lazy ByteStrings rely on lazy evaluation of course. Demanding a lazy ByteString alternates between strictly filling in big chunks of data in memory and lazily suspending before producing the next chunk. As many people have observed before, FP optimisation is to a great extent about thinking more carefully about a better evaluation order for a computation and making some bits stricter and some bits lazier to get that better evaluation order.

I completely agree. My statement was not well formulated; I actually meant that the overhead implied by lazy evaluation occurring at every single byte to be crunched is in the way. In this case, the cost is too high to pay off, as the bytes are most likely consumed anyway. The detailed account keeping about every byte (is it _|_ or not?) is unnecessary for a (map), which invariably does look at every byte. The situation is already different for a (fold), though:

    any p = foldr (\x b -> p x || b) False

Here, the computation may stop at any position in the list. In a sense, lazy ByteStrings just reduce the lazy-evaluation-cost / byte ratio by grouping bytes strictly. Bookkeeping becomes cheaper because one doesn't look it up so often. Of course, with a stricter fold, (any) gets more costly. The aim is to make the former ratio smaller while not raising the latter too much. One may say that ByteString makes explicit what the Optimistic Haskell Compiler aimed to make implicit.

IMHO, lazy evaluation is always the better choice (in theory). In practice, the only problem with lazy evaluation is the overhead (which hurts mostly at (large -> small)), which is *not* a consequence of "no free lunch" but stems from the fact that current machine architecture is not very friendly to purely functional things. In a sense, the natural complexity measure in Haskell is the number of reductions in hugs +s, whereas the natural complexity measure on RAM machines is the number of operations in 0xDEADBEEF-arithmetic. Unfortunately, it's the latter which is inside Intel.

Regards, apfelmus
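PS: the chunk-wise laziness in action, as a tiny sketch (mine):

    import qualified Data.ByteString.Lazy.Char8 as B

    main :: IO ()
    main = print $ B.elem 'b' (B.cycle (B.pack "ab"))
    -- prints True although the ByteString is infinite: only the
    -- first chunk is ever forced, the rest stays a lazy suspension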
[Haskell-cafe] Re: Generate 50 random coordinates
Huazhi (Hank) Gong wrote: Hello, all. My intention is to generate 50 random coordinates like (x,y).

myrand :: Int
myrand = randomRIO(1::Int, 100)

rf = [(myrand, myrand) | a <- [1..50]]

My short program is like this. However, GHCi says that the return type of randomRIO is IO a while the function defined by me is Int. Since I only need an integral type as my coordinate, could you tell me how to fix this?

Taral wrote:

do let myrand = randomRIO (1 :: Int, 100)
   rf <- replicateM 50 (liftM2 (,) myrand myrand)

Jason Dagit wrote: When we look at the type of randomRIO we see:

randomRIO :: forall a. (Random a) => (a, a) -> IO a

You're giving it a tuple of Int, so we can substitute Int for 'a' in that type signature: myrand :: IO Int

rf = [(myrand, myrand) | a <- [1..50]]

Here you are creating a list of tuples. We see from above that the type of the tuples would be (IO Int, IO Int), so rf :: [(IO Int, IO Int)]. This is because we have not run the IO action to generate the Int yet.

My short program is like this. However, GHCi says that the return type of randomRIO is IO a while the function defined by me is Int. Since I only need an integral type as my coordinate, could you tell me how to fix this?

Your type signature tries to make a claim that myrand has type Int, but the compiler will disagree because of that pesky IO type.

Dons wrote: Try initialising your random generator in 'main', creating a list of infinite randoms, take the number you need, then feed that to the functions that need the list:

import System.Random
import Text.Printf
import Data.Word

main = do
    g <- newStdGen        -- initialise a random generator
    let (a,b) = split g   -- create two separate generators
        as = randoms a    -- one infinite list of randoms
        bs = randoms b    -- another
        rs = zip as bs    -- an infinite list of pairs
    dump (take 50 rs)     -- take 50, and consume them

-- The IO --

Who rides so late through the bits and the bytes? It's Haskell with his child Hank; He has the boy type safe in his arm, He holds him pure, he holds him warm.

My son, what makes you hide your face in fear? - Father, don't you see the IO? The IO with randomRIO? - My son, it's a wisp of the outside world. -

You dear child, do come along with me! Such lovely replicateMs I'll do with you; Many colorful liftM2s are to be done, My Taral does have many a golden suggestion!

My father, my father, and do you not hear What the IO promises me so softly? - Be quiet, stay quiet, my child; I know he won't treat you good. -

Don't you come along with me, my fine boy? My Jason shall do explain to you so nicely. My Jason will do tutor you to understand 'return', And he'll do help you and do show you and do guide you to >>=.

My father, my father, and do you not read over there IO's minions in that dark post? - My son, my son, I see it most definitely: It's the imperative paradigm looking so grey. -

I do love you; I'm charmed by your beautiful mind; And if you're not willing, then I'll do use imperative force! My father, my father, now he's grabbing hold of me! IO has done, IO did do me harm! -

Haskell shudders, he rides swiftly, He holds in his arms the moaning child. He reaches Dons' stronghold with effort and urgency.
With the following code, the child will not fall:

import System.Random

randPairs :: (RandomGen g, Random a) => (a,a) -> g -> [(a,a)]
randPairs range gen = zip as bs
  where
    (a,b) = split gen        -- create two separate generators
    as = randomRs range a    -- one infinite list of randoms
    bs = randomRs range b    -- another

seed  = 13561956 :: Int
mygen = mkStdGen seed

coords :: [(Int,Int)]
coords = take 50 $             -- 50 random coordinates derived
    randPairs (1,100) mygen    -- from the random seed above

Regards, apfelmus

PS: As you may have guessed, any similarity with living people is either randomRIO or accidental ... I hope that you accept my apologies for the latter.

___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
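For completeness, a tiny check of the purity point, assuming the definitions above are in scope (the driver below is my addition, not part of the post):

    main :: IO ()
    main = do
        print (take 3 coords)   -- the same three pairs on every run,
                                -- since everything depends only on 'seed'
        print (length coords)   -- 50

Because mkStdGen is pure, 'coords' is an ordinary constant; only code that needs a fresh generator per run has to touch IO via newStdGen.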
[Haskell-cafe] Re: (a -> [b]) -> [a -> b] ?
While pondering over the four fours problem, I wondered: Is there a function of type (a -> [b]) -> [a -> b]? It looks a bit like sequence when applied in the ((->) a) monad:

sequence :: [a -> b] -> (a -> [b])

but I was looking for the other direction. I came up with:

\g -> map (\n a -> g a !! n) [0..]

which has the desired type and functionality, but it looks rather inelegant and messy. Any better ideas?

While you showed that there exists unsequence :: (a -> [b]) -> [a -> b], I doubt that it's very useful. The point is that (sequence) is not surjective: (a -> [b]) has strictly more functions than [a -> b]. (a -> [b]) has the freedom to adjust the length of the resulting list depending on a, whereas [a -> b] hasn't. (sequence) converts an Applicative Functor to a Monad but there is no canonical other way round. In other words,

unsequence . sequence === id
but sequence . unsequence =/= id

Regards, apfelmus

___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
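A small self-contained illustration of the asymmetry, writing the specialized sequence as sequence' to keep it apart from the Prelude's (the example functions g and twoFuns are mine, not from the thread):

    -- sequence in the ((->) a) monad, specialized to lists of functions
    sequence' :: [a -> b] -> (a -> [b])
    sequence' fs x = map ($ x) fs

    unsequence :: (a -> [b]) -> [a -> b]
    unsequence g = map (\n x -> g x !! n) [0..]

    -- unsequence . sequence' agrees with the original list at every
    -- index the original defines (indices past the end are undefined)
    twoFuns :: [Int -> Int]
    twoFuns = [(+ 1), (* 2)]

    -- but sequence' . unsequence loses information: the result of
    -- sequence' (unsequence g) is infinite for every argument, so a g
    -- whose output length varies with its input cannot be recovered
    g :: Int -> [Int]
    g x = replicate x x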
[Haskell-cafe] Re: How to combine Error and IO monads?
Cat Dancer wrote: I have a program that performs a series of IO operations, each which can result in an error or a value. If a step returns a value I usually want to pass that value on to the next step, if I get an error I want to do some error handling but usually want to skip the remaining steps. Thus I have a lot of functions with return types like IO (Either String x), where x might be (), Integer, or some other useful value type, and a lot of case statements like

You are on the right track. The point is that (IO (Either String a)) is a Monad, too. This allows you to write the ever repeating case statements once and for all:

newtype ErrorIO a = ErrorIO (IO (Either String a))

instance Monad ErrorIO where
    return x = ErrorIO (return (Right x))
    ErrorIO f >>= g = ErrorIO $ do
        ex <- f
        case ex of
            Left e  -> return (Left e)
            Right x -> case g x of ErrorIO h -> h

It happens that you can parametrize this on IO:

newtype ErrorT m a = ErrorT (m (Either String a))
type    ErrorIO a  = ErrorT IO a

instance Monad m => Monad (ErrorT m) where
    ... -- same as above

And you just rediscovered monad transformers.

Regards, apfelmus

PS: In the special case of IO, you can also use exceptions. But using ErrorT is better style.

___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
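Spelling out the "same as above" part, a minimal sketch of the transformer instance, together with the lifting function that lets plain IO actions run inside it (modulo naming, this is what the mtl's ErrorT provides):

    newtype ErrorT m a = ErrorT { runErrorT :: m (Either String a) }

    instance Monad m => Monad (ErrorT m) where
        return x = ErrorT (return (Right x))
        m >>= g  = ErrorT $ do
            ex <- runErrorT m
            case ex of
                Left e  -> return (Left e)   -- short-circuit on error
                Right x -> runErrorT (g x)   -- pass the value on

    -- run an action of the underlying monad, marking it as successful
    lift :: Monad m => m a -> ErrorT m a
    lift m = ErrorT (do { a <- m; return (Right a) })

With this, a pipeline of IO (Either String x) steps becomes an ordinary do block in (ErrorT IO), and the error plumbing disappears from the call sites.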
[Haskell-cafe] Re: Why so slow?
Lyle Kopnicky wrote: The code below is using way more RAM than it should. It seems to only take so long when I build the 'programs' list - the actual reading/parsing is fast. For a 5MB input file, it's using 50MB of RAM! Any idea how to combat this?

(ethereal voice) ... Children of Amaunator ... heed my words ... in no long, the world ... will perish, will ... crumble under the blackened forces of IO ... but there is ... hope ... i can feel that ...

ensure :: (a -> Bool) -> String -> a -> a
ensure b s x = if b x then x else error s

... or switching to ... monadic parser combinators ... like Text.ParserCombinators.Parsec ... can hold strong the light for ... another aeon or two ...

Regards, apfelmus

___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Well-typed functions with nonexisting signatures [Was: type variable question]
[EMAIL PROTECTED] wrote: Is there a canonical example that exhibits this behavior? Yes, it was discussed back in 2003. Here's the canonical form:

g :: (Show a, Show b) => a -> b
g = undefined

-- h :: Show a => b -> a
h x = g (h x)

Both Hugs and GHC (in the pure Haskell98 mode!) are happy with h. Both compilers infer the type for h, shown in the comment. If you give h the type signature -- the one that is inferred by the compilers themselves -- both compilers complain.

Strangely, ghci accepts it on the command line:

$ ghci
[...]
GHC Interactive, version 6.4.2, for Haskell 98.
[...]
Prelude> let g :: (Show a, Show b) => a -> b; g = undefined; h :: Show a => b -> a; h x = g (h x) in h 1
*** Exception: Prelude.undefined

but not from a file to be loaded. Hugs +98 does not accept it on the command line, as expected. What's going on?

Regards, apfelmus

___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Seeking advice on a style question
Steve Schafer wrote: In my text/graphics formatting work, I find myself doing a lot of pipeline processing, where a data structure will undergo a number of step-by-step transformations from input to output. For example, I have a function that looks like this (the names have been changed to protect the innocent--and to focus on the structure):

process :: a -> b -> c -> d -> e
process x1 x2 x3 x4 =
    let y01 = f01 x1 x2 x3
        y02 = f02 x1
        y03 = f03 y02
        y04 = f04 y03
        y05 = f05 x1 y01 y04
        y06 = f06 x2 y05
        (y07,y08) = f07 y01 y06
        y09 = f08 y07
        (y10,y11) = f09 x2 x4 y09 y08
        y12 = f10 y10
        y13 = f11 y12
        y14 = f12 x1 x2 x3 y01 y13
        y15 = f13 y14
        y16 = f14 y15 y11
    in  y16

[...] In principle, it could be managed with a bunch of nested StateT monads, but my attempts to do so seem to get so caught up in the bookkeeping that I lose the advantages mentioned above. [...] So here's the question: Is there a reasonable way to express this kind of process (where I suppose that reasonable means in keeping with Haskell-nature) that preserves the advantages mentioned above, but avoids having to explicitly pass all of the various bits of state around?

To me, it looks more like MonadReader than MonadState, because I have the impression that x1 and x2 are environments to fetch something from. (Btw, MonadReader is best used as an Applicative Functor, but that's a different story.) But in general, it's futile trying to simplify things without knowing their meaning: names are *important*. I assume that your proper goal is not to structure pipeline processes in full generality, but to simplify the current one at hand. Even if you wanted to simplify the general structure, I think you'd have to make the types of the different yk explicit. Otherwise, the problem is underspecified and/or one has to assume that they're all different (modulo some equalities implied by type correctness).

Regards, apfelmus

___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
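One way to cash in the MonadReader remark, sketched with invented names and types since the real ones are hidden: bundle the inputs in a record and write each stage as a function from that environment. The Applicative instance of ((->) Env) then wires the environment through invisibly.

    import Control.Applicative ((<$>), (<*>))

    -- hypothetical environment bundling two of the inputs
    data Env = Env { x1 :: Int, x2 :: Int }

    -- intermediate results, written as readers of the environment;
    -- (+) and (*2) stand in for the real f01, f02
    y01, y02 :: Env -> Int
    y01 = (+) <$> x1 <*> x2   -- f01 x1 x2, env passed implicitly
    y02 = (*2) <$> y01        -- f02 y01

    result :: Env -> Int
    result = (+) <$> y01 <*> y02

This removes the plumbing for the environment-like inputs; the genuinely sequential y-to-y dependencies still read best as a plain let chain.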
[Haskell-cafe] Re: Seeking advice on a style question
version is much clearer than any diagram with boxes and arrows. At least, you should keep names like 'questions', 'bands' and 'pages'. Only cumbersome derivations like 'groupedBands' or names with additional ticks are really redundant. Regards, apfelmus ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Seeking advice on a style question
layout of the parent question is not generated by a simple traversal over its children, for example. So, for all of the processing that follows, a parent question (one with child questions) looks just like any other question, and any parent question-specific details remain hidden inside.

Again, I'd say that the algorithm and now more than ever the meaning dictates the data structure. Assuming that processing children of different parents is independent and that processing children of the same parent is *not* independent, I'd group families right together in a data structure. Whether it's a simple traversal (I interpret this as independent?) or not, at some point you have to mess with the whole group at once anyway, so you can put it together right now.

validateQuestionContent :: [Question] -> [Question]

Uh, I think the type is plain wrong. Doesn't the name suggest 'Question -> Bool' and a fatal error when a question content is invalid? No. The idea is to never fail to assemble the questionnaire. If there is a question with invalid content, then it is replaced by a dummy question [...]

Ah, of course you are right, I didn't think of enhanced error processing. I guess that (validateQuestionContent) is not a filter, because you have to check non-local parent-child relations as well? If so, then I suggest grouping them beforehand to make it a filter.

(numberedQuestions,questionCategories) = numberQuestions pagemaster questions;

Another piece of miscellaneous information contained within the pagemaster is the starting question number. You can still automatically number questions depending on a first number by overloading the (Num) class:

newtype RelativeInteger = RI { unRI :: Integer -> Integer }

instance Num RelativeInteger where ...

mkAbsolute :: Integer -> RelativeInteger -> Integer
mkAbsolute pointOfReference relint = unRI relint pointOfReference

(Some questionnaires start with a question number other than 1 because there is a post-processing step where various front ends are pasted onto variable back ends--another example of where a hierarchical approach would have made more sense, but couldn't be adopted because the database people couldn't cope.)

Uh, that doesn't sound good. I assume that the post-processing is not implemented in Haskell? Otherwise, you could incorporate this stuff into (process) and choose suitable interfaces. IMHO, dealing with some modestly expressive interface which still only offers medium abstraction (like object orientation) is a pain in a type system as powerful as Haskell's.

bands' = resolveCrossReferences bands questionCategories;

Questions are cross-referenced by question number. For example, question 4 might be in the Sales category, while question 22 might be Detailed Sales. The last item of question 22 might be Total; should equal the value reported in (4). In order to make the layouts as reusable as possible, rather than hard-coding (4) in that last item in (22), there is a tag that looks something like this:

<text>Total; should equal the value reported in <question-ref category="Sales"/>.</text>

Fine, though I don't see exactly why this isn't done after the questions have been transformed to printable things but before they are distributed across pages. So the references cannot refer to page numbers, yet must be processed after transforming questions to rectangles?
groupedBands = groupBands bands'; (can't guess on that)

In order to implement widow/orphan control, not every band is allowed to start a new page (keep with previous and keep with next, in effect). Before being handed off to the paginator, the bands are grouped so that each group of bands begins with a band that _is_ allowed to start a page, followed by the next n bands that aren't allowed to start a page. Each grouped band is then treated by the paginator as an indivisible entity. (At this point, the grouped bands could be coalesced into single bands, but doing so adds a bit of unnecessary overhead to the rendering phase.)

Maybe (paginate) can be given a type along the lines of

paginate :: Rectangle a => [a] -> Pages a

and perhaps you could merge several bands into a single rectangle simply by saying

instance Rectangle [Band] where ...

To conclude, I think that (process) can be roughly factorized as follows:

process = buildPages . questions2rectangles . expandMacros

Now, you get 2/3 of TeX or another desktop publishing system for free, you only have to replace (questions2rectangles) by (text2rectangles).

Regards, apfelmus

Footnote: * Well, it is possible to recover insert, but only by introducing a contradiction into the logic of types with the help of (undefined):

insert x bp = foo x (bp, (undefined :: map'))

This is clearly unsafe and heavily depends on the implicit knowledge that the returned (BluePrint') ignores its arguments.

___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
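A possible completion of the RelativeInteger sketch from earlier in this post -- a guess at what the elided instance could look like, interpreting a RelativeInteger as an offset from a point of reference; this is my reading, not Steve's actual code:

    newtype RelativeInteger = RI { unRI :: Integer -> Integer }

    instance Num RelativeInteger where
        fromInteger n = RI (+ n)                      -- "n past the reference point"
        RI f + RI g   = RI (\p -> f p + g p - p)      -- offsets add
        RI f * RI g   = RI (\p -> p + (f p - p) * (g p - p))
        negate (RI f) = RI (\p -> p - (f p - p))
        abs    (RI f) = RI (\p -> p + abs (f p - p))
        signum (RI f) = RI (\p -> p + signum (f p - p))

    mkAbsolute :: Integer -> RelativeInteger -> Integer
    mkAbsolute pointOfReference (RI f) = f pointOfReference

With this reading, numeric literals in the numbering code stay relative, and the pagemaster's starting number is supplied once at the end via mkAbsolute.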
[Haskell-cafe] Re: Possible (GHC or HGL) bug or ??
Calvin Smith wrote: This basically works, in that it does exactly what I want in Hugs, but GHC sometimes pauses partway through rendering, and does not continue rendering until I type any key (except space, which exits) or unfocus/refocus the window, or move the mouse pointer across the window. Sometimes, more often the first time in a GHCI session, it renders completely with no pauses, and it seems to pause more and more if I evaluate main, then close the window, evaluate again in the same GHCI session, repeatedly. The same pausing behavior is observed in a GHC-compiled executable. When the problem occurs, there is a message to the console that says: thread blocked indefinitely.

I can reproduce this on OS X with ghc-6.4.2, X11-1.1 and HGL-3.1. The console message is rare but I also got it once. This looks like a bug in HGL, perhaps some issue with polling the event queue in a threaded fashion.

p.s. Any stylistic or other comments about the code welcome too.

The infinite list of colors is a very good idea. It might also be a good idea not to mess with trigonometry when creating the snowflake. These things can be put into a single function (rotate) which rotates a point around the origin by a specified number of degrees. The following code demonstrates this. Note that the resulting snowflake has slightly different proportions than your original one, but it shouldn't be a problem to adjust this.

module Main where

import Graphics.SOE

main = runGraphics $ do
    w <- openWindow "Snowflake Fractal" (600, 600)
    drawInWindow w $ snowflake (300,300) 200 (cycle $ enumFrom Blue)
    spaceClose w

spaceClose w = do
    k <- getKey w
    if k == ' ' then closeWindow w else spaceClose w

rotate :: Double -> Point -> Point
rotate deg (x,y) = (truncate $ c*x' - s*y', truncate $ s*x' + c*y')
  where
    (x',y') = (fromIntegral x, fromIntegral y)
    rad     = deg * pi / 180
    (s,c)   = (sin rad, cos rad)

translate :: (Int, Int) -> Point -> Point
translate (dx,dy) (x,y) = (x + dx, y + dy)

minSize = 2 :: Int

snowflake :: Point -> Int -> [Color] -> Graphic
snowflake _ h _ | h <= minSize = emptyGraphic
snowflake pos h (c:cs) = overGraphics $
       map (\pos -> snowflake pos (h `div` 3) cs) (mkPoints corners)
    ++ map (withColor c . polygon . mkPoints) [triangle1, triangle2]
  where
    -- things get specified by their angle
    -- with respect to the y-axis
    mkPoints  = map $ translate pos . flip rotate (0,h)
    triangle1 = [0, 120, 240]
    triangle2 = map (180+) triangle1
    corners   = map (60*) [0..5]

Also note that I eschewed (drawInWindow) in favor of (overGraphics), but I think that SOE will introduce that at some point, too. A minor hint is to use Double instead of Float. It doesn't really matter, but today's computers internally favor Double (double precision floating point numbers).

Regards, apfelmus

___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Composing functions with runST
Neil Mitchell wrote: As for beginner issues with rank-2 types, I've been learning Haskell for years now, and have never felt the need for a rank-2 type. If the interface for some feature requires rank-2 types I'd call that an abstraction leak in most cases. It certainly means that you can't properly Hoogle for it, can't compile it with Yhc, can't do full type inference etc.

I think that the term abstraction leak is too harsh. In some sense, you may as well call strong typing an abstraction leak, because one can do the stuff as well in a dynamically typed language, and adding strong typing means that you can't compile it with current compilers, you need to implement type checking/inference etc. Of course, this analogy has flaws, as higher rank types go to the edge of computability whereas strong typing can be implemented.

Concerning interfaces, higher rank types offer crucial static checking that cannot be achieved without them. The prominent example is ST. The next example is the parsing library frisby. In both cases, it would be easy to wreak havoc in case the interface would not use higher rank types. The same analogy as above applies: one uses strong typing because one does not want to wreak havoc. I would not call this an abstraction leak.

Concerning implementation, higher rank types are even more widespread: almost everything involving continuations needs them: ReadP, Exceptions (as opposed to Either), deforestation etc. In fact, it is quite possible to throw away algebraic data types altogether and build everything you need with higher rank types. A prominent example is

[a] ~= (forall b . (a -> b -> b) -> b -> b) ~= (forall b . (Maybe (a,b) -> b) -> b)

The denotational semantics do not change, but the time and space behavior is much different.

Perhaps the above remarks misinterpret your statement and you meant abstraction leak in the sense that, because higher rank types are available, the interface author used them without thinking whether the same effect can be achieved in ordinary Haskell. Alas, such problems are not tied to higher rank types: proper interface design is an art and does not come for free, not to mention interface documentation[1]. One could easily go berserk: why does this library use String and doesn't abstract it with a type class? Why does that interface only provide IO, why isn't this available as a library of pure functions? What do these obviously crappy semantics mean? In this case, higher rank types are a symptom, not the problem. If one wants to cure the problem by disallowing the symptom, then I suggest to also erase the symptom IO. Thoroughly.

Of course, the drawbacks of higher rank types you mentioned remain. In the case of hoogleability, I'm confident that it is possible to implement them, it's only that someone has to think hard about it. Implementing higher rank types in Yhc is even harder but not impossible. Sure, type inference is the most difficult thing, and one has to accept glitches and drawbacks to make it work. Compared to these difficulties, I think that the remark

So I can't just tell someone who's just starting to learn Haskell that f $ g y is equivalent to f (g y); I have to say those two are *almost always* equivalent, but if you use $ and the compiler complains about not being able to match the expected and the inferred type and a type signature in the error message has the word 'forall', try rewriting that expression without the $ and see if it compiles. Eeeww.

posted in this thread is too harsh.
That's life, every language has its flaws and glitches: parts of the layout rule, pattern guards, I want a better records system, views, generic programming, etc. But, when code has to be finished, those glitches or annoying things are best countered with a shrug: they are not life-threatening. A programming language with a nonexistent type system and ugly semantics is. And much to our joy, Haskell is far from this. In that sense, dear reader of this post, just rewrite that expression without $ and see if it compiles. The compiler authors don't want to annoy you, it's just that the exact reasons why this cannot yet be put to work are damn hard. You are encouraged to learn about System F to get a grasp of what is going on, but spending this one $ will be much cheaper.

Regards, apfelmus

[1] Concerning library documentation, I think that literate Haskell sources have the drawback that they are either tied to TeX (\begin{code}..\end{code}) or that every line has to start with a '>'. I'd suggest to add a <code>..</code> or something else. The point is that while (La)TeX can be cranked up to a publishing system, it is not suited for many tasks such as media-dependent processing. TeX is a macro language, not a structured document type. And for the strongly typed Haskell programmer used to referential transparency, macros are a nightmare.

___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
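To illustrate the claim from earlier in this post that algebraic data types can be rebuilt from higher rank types, a standard sketch of the Church/fold encoding of lists (a textbook construction, spelled out here for concreteness):

    {-# LANGUAGE RankNTypes #-}

    -- a list is identified with its right fold
    newtype FoldList a = FoldList (forall b. (a -> b -> b) -> b -> b)

    nil :: FoldList a
    nil = FoldList (\_ z -> z)

    cons :: a -> FoldList a -> FoldList a
    cons x (FoldList xs) = FoldList (\c z -> c x (xs c z))

    -- convert back by folding with the real constructors
    toList :: FoldList a -> [a]
    toList (FoldList xs) = xs (:) []

    -- toList (cons 1 (cons 2 nil))  ==  [1,2]

The denotation matches ordinary lists, but as the post says, the operational behavior differs: here, consumption is fused with construction.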
[Haskell-cafe] Re: Composing functions with runST
Yitzchak Gale wrote: Well, it turns out that using Data.Sequence or Data.IntMap to shuffle a list becomes prohibitive if you might have more than about 10^5 elements in your list. So in that case you will need to use a mutable array, and you now need ST. [..] Wouldn't it be nice if instead you could just write:

shuffle :: (RandomGen g, MonadState g m) => [a] -> m [a]
shuffle = stToState . shuffleST

and then just use that directly inside a calculation that is otherwise purely non-ST?

It seems, what you really want is

shuffleST :: RandomGen g => [a] -> StateT g (ST s) [a]

Actually, I tried that. It didn't help - it was just one more layer I had to peel away to get at the ST inside. There seems to be no way to avoid the fact that you think about state in two very different ways in these two monads. Every program is written in either one style or the other. Occasionally, you require an isolated use of the opposite style, and I am looking for ways of simplifying the resulting mess. StateT st ST and MonadST look like ways of combining the two, but in practice I find that they just seem to get in the way.

I don't get what exactly you want. If you want to carry your state named MyState (f.i. type MyState = (Cards, Players)) around in a monad, you use Control.Monad.State MyState. If (and only if) you have an algorithm (like depth-first search) that carries an array as state around (nodes already visited) and you know that this array is used in a single threaded fashion, it might be worth to update the array in place. For that, you use Control.Monad.ST and Data.Array.ST and you can be confident that the state carrying has been strictness analyzed and fine tuned to match the machine. In short, you get updates in place without selling your soul to IO; runST is your protection from evil and will keep you pure. ST does not really have more uses than this one (besides being the foundation for IO). For more info on ST, see http://research.microsoft.com/Users/simonpj/Papers/state-lasc.ps.gz

Note that you can now achieve the array thing as well with Data.Array.Diff. This is a purely functional interface to an array type that uses destructive updates internally and keeps a history to become persistent. However, I doubt that an array makes a difference over Data.IntMap for all but the most special cases.

I am starting to be convinced that the only way to write the function I want is by using unsafeRunST. Or type it as

stToState :: MonadState st m => (st -> ST s (a, st)) -> m a

and then write in the documentation that the user is required to write

do r <- newSTRef x
   ...
   y <- readSTRef r
   return (z, y)

by hand every time. Yuck.

If the programmer needs to adhere to a policy, the type system may well enforce it for him. No unsafeRunST. It's far better to struggle with the safety device than to discover the hard way that running without it will directly lead into the debugging hell.

Regards, apfelmus

___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
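For reference, a sketch of the kind of mutable-array shuffle under discussion (Fisher-Yates), written so that runST hides all the ST plumbing and the generator is threaded in and out as an ordinary value -- the details are my construction, not Yitzchak's code:

    import Control.Monad (forM_)
    import Control.Monad.ST (ST, runST)
    import Data.Array.ST (STArray, newListArray, readArray, writeArray, getElems)
    import Data.STRef (newSTRef, readSTRef, writeSTRef)
    import System.Random (RandomGen, randomR)

    shuffle :: RandomGen g => g -> [a] -> ([a], g)
    shuffle g0 xs = runST (do
        arr  <- newSTArray (0, n - 1) xs
        gRef <- newSTRef g0
        forM_ [n - 1, n - 2 .. 1] $ \i -> do
            g <- readSTRef gRef
            let (j, g') = randomR (0, i) g   -- pick a swap partner
            writeSTRef gRef g'
            vi <- readArray arr i            -- swap in place
            vj <- readArray arr j
            writeArray arr i vj
            writeArray arr j vi
        xs' <- getElems arr
        g'  <- readSTRef gRef
        return (xs', g'))
      where
        n = length xs
        -- pin down the array type without scoped type variables
        newSTArray :: (Int, Int) -> [e] -> ST s (STArray s Int e)
        newSTArray = newListArray

From the outside this is a pure function of type g -> [a] -> ([a], g), so it slots directly into a MonadState g computation via the usual get/put.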
[Haskell-cafe] Re: Seeking advice on a style question
, then that's 65 pages that they have to inspect. But if the system were to produce ten 11-page questionnaires, even though the first five pages of each questionnaire are generated from exactly the same data using exactly the same software, that's 110 pages that they have to inspect. X-)

Thanks for all of the discussion. I think I have a lot to ponder.

May the λ guide your path ;) And of course, you can always outsource some pondering to the mailing list.

Regards, apfelmus

___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: State monad strictness - how?
Strictness is needed because it exerts control over memory (and time). That's something best left to the compiler. Of course, the compiler's job is not to turn an O(n^2) algorithm into one that runs in O(n) time, this is something only the programmer can do. But the compiler's job is to figure out the `seq`s, fusions and inline definitions, because I am too lazy to mark them explicitly.

Regards, apfelmus

___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: advice on architecting a library (and haskell software in general)
Yang wrote: hi all, i'm looking for advice on how to architect a simple widget library for vty/hscurses, but these beginner-level questions should apply to haskell applications in general. input/requests on any aspect of my design would be greatly appreciated - in return perhaps you'll get something you'd actually want to use! i have studied in detail various haskell curses/vty apps, including yi, hmp3, riot, and hscurses.widgets/contactmanager. my immediate goal is to produce a set of composable widgets and a flexible framework that - as much as possible - reduces the amount of code a user has to write. my eventual goal is to make a functional gui library, along the lines of fruit/wxfruit/fg. to this end i've also read their literature, but i still do not understand enough about arrows/yampa/afrp. Currently, the design of a functional UI library (be it graphical or for tty) is still an open research problem. Somehow, the arrow based approaches like Fruit etc. are not yet satisfying. A predecessor to this approach is FranTk. The early library Fudgets is in essence based on arrows but precedes their invention. The most recent development in this direction is Phooey. In the mean time, the medium level libraries wxHaskell and gtk2hs have gathered acceptance, mostly because they implement a full widget set while still being reasonably succinct. A predecessor is HToolkit drawing from ObjectIO. Despite being close to their imperative cousins, they already supersede them in expressiveness. They make an excellent base for further experiments, be it for arrows (wxFruit) or other recent approaches like PropLang. You have two options: * Search for the grail. Most likely, this doesn't involve much coding but much searching and has the risk of not finding it. But as the tty doesn't have many different widgets, you can concentrate on the high level ideas, i.e. how to get rid of IO and IORefs. Pumping your brain with ideas - as you already do - surely helps. * Implement an existing design. This way, you'll actually program something. I'd propose to implement something medium level along the lines of wxHaskell that can later be utilized in a high level approach. Maybe you can even create a cross platform interface, i.e. one that works for a tty and for graphical UIs at the same time. The author of HToolkit wrote a proposal on how to transparently enclose Windows, Mac and Gnome. try to use it for a real application that has little to do with parsing or other purported strengths of the language. Well, Haskell's only strength is that it gives you enough power to express your ideas, i.e. to compose great things from small ones. The strength of your ideas is your business :) In this sense, monadic parser combinators are not an inherent strength of the language, they happen to be powerful by themselves. Regards, apfelmus ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: different performance of equivalent expression
I've run into a strange effect that I cannot explain. I have a simple expression that can be written in two equivalent ways. However, one way gives much better performance than the other. Here is an example:

-- apply function many times (tail-recursive)
many n f x = if n == 0 then x else many (n-1) f $! (f x)

-- first adder function
adder1 = let r = many 500 sin 1.0 in \x -> x + r

-- second adder function
adder2 = \x -> x + many 500 sin 1.0

If you run the program, it thinks for some seconds performing math, and then prints 4 results immediately. But with the adder2 function, it performs the calculation in every call, which can be seen visually. It seems that the compiler is able to cache in some way the long computation in the first case, but not in the second.

This is indeed the case and entirely reasonable. The effect is called memoization, a key ingredient to lazy evaluation. To simplify the explanation, let's take the examples

adder1 = let r = many 5000 (1+) 0 in \x -> x + r
adder2 = \x -> let s = many 5000 (1+) 0 in x + s

The evaluation of (adder1 3) proceeds as follows:

adder1 3
= (let r = many 5000 (1+) 0 in \x -> x + r) 3
= let r = many 5000 (1+) 0 in 3 + r
= let r = 5000 in 3 + r
= 5003

Now, (r) will be updated with the result (5000) after it has been calculated and subsequent access to (r) will retrieve the updated value as in

adder1 4
= (let r = 5000 in \x -> x + r) 4
= let r = 5000 in 4 + r
= 5004

Every incarnation of (adder1) shares the same r. For (adder2), things are different. Here, (s) will be updated as well, but different incarnations of (adder2) do not share the same (s) because (s) is only in scope after supplying some argument (x). Hence, every (adder2 3) and (adder2 4) (re)calculates its own (s).

I always thought that let a = b in x + a is just syntactic sugar for x + b. Is it wrong?

This is correct but doesn't apply to the case above. The question here is whether

let a = b in \x -> x + a
and
\x -> let a = b in x + a

are equivalent. Considering the result, they are. But considering running time and memory footprint, they are not. The first trades memory for speed, the second trades speed for memory. In general, the compiler is reluctant to switch between those two versions, i.e. it does not perform much common subexpression elimination or let floating (see the GHC manual). The choice must be up to the programmer.

Regards, apfelmus

___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: strange performance of expression evaluators
I've done some more experiments. The following program defines a simple arithmetic expression with indexed variables. I've written four different ways to evaluate them: - eval1 is a simple monadic evaluator - eval2 is the obvious straight-forward implementation - compile1 is an attempt to perform compilation - compile2 is a refined compile1 with sub-expressions explicitly separated via let bindings. The test evaluates the same expression in 100 different environments. The results are: - eval1 - 17.47 sec - eval2 - 3.71 sec - compile1 - 3.79 sec - compile2 - 3.74 sec These figures are completely mysterious to me. 1) I expected eval1 and eval2 to perform equally. In fact, eval1 is 4.7 times slower than eval2. 2) I expected compile{1,2} to perform much faster than eval{1,2}. However, the compilation attempt does not give any speed up at all.

Your intention is that (compile2 test) should analyze the expression tree of (test) only once when evaluating it for different environments. I'm not sure whether the constructors (Add), (Mul) etc. get replaced once and for all by (+) and (*) or whether this really matters, because (eval2), (compile1) and (compile2) have the same running time. I think that memoization (as explained in my previous post) only takes place for values not of function type, i.e. partially evaluated functions aren't memoized. It may also be that the compiler optimizes things for the concrete expression (test) you gave in your code. So supplying the expression interactively could show a difference between (eval2), (compile1) and (compile2). Ironically, (eval1) does compile as much as you intend (compile2) to do. But it looks like the overhead imposed by appealing to Control.Monad.Reader doesn't get inlined away completely.

Currently, you don't do much work per expression, it just gets evaluated. To take advantage of memoization, you need to do more expensive analysis on a per expression basis. For example, you might want to precalculate stuff that doesn't depend on the environment:

data ConstVar a = Const a | Var (Env -> a)

eval :: ConstVar a -> Env -> a
eval (Const x) = const x
eval (Var f)   = f

-- addition, multiplication etc. do precalculation
-- when the constituents are known beforehand
instance Num a => Num (ConstVar a) where
    (Const x) + (Const y) = Const (x + y)
    x + y = Var (\e -> eval x e + eval y e)
    ...

data Expr a = Value a | Variable Name | Add (Expr a) (Expr a) | Mul (Expr a) (Expr a)

compile :: Num a => Expr a -> ConstVar a
compile (Value c)    = Const c
compile (Variable v) = Var (\e -> e ! v)
compile (Add x y)    = (compile x) + (compile y)
compile (Mul x y)    = (compile x) * (compile y)

testexpr = (Mul (Value 1) (Value 2)) `Add` (Variable 1)
test = eval . compile $ testexpr

Of course, this can be improved. For instance, it currently does not know about the associative law like in (Add (Value 1) (Add (Value 2) (Variable 1))). Now, it is clear that analyzing the expression again and again every time it needs to be evaluated (interpretation) is wasted work.

Regards, apfelmus

PS:

data Expr = Const !Value | Var !Int | Add !Expr !Expr | Sub !Expr !Expr | Mul !Expr !Expr

You'd better leave out the strictness annotations (!).

___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Article review: Category Theory
Ulf Norell wrote: In the section on the category laws you say that the identity morphism should satisfy

f . idA = idB . f

This is not strong enough. You need

f . idA = f = idB . f

Unfortunately, fixing this means that the category Hask is no longer a category since

_|_ . id = \x -> _|_ =/= _|_

Neil Mitchell wrote: Isn't _|_ = \x -> _|_?

_|_ `seq` () = _|_
(\x -> _|_) `seq` () = ()

Whether this is the fault of seq or not is your call...

Subtle, subtle. From the point of view of denotational semantics, the functions (x \mapsto _|_) and _|_ are the same, as equality and the semantic approximation order are defined point-wise. Usually, the morphisms of some category arising from a (non-normalizing or polymorphic) lambda calculus are given by such partial functions. The key point about lambda calculi is that the external morphism sets can be internalized, i.e. represented as objects of the category themselves. So, the set of morphisms from 'Integer' to 'Integer' can be represented by the type 'Integer -> Integer'. But, as the example with `seq` shows, this is not entirely true. Apparently, Haskell represents function types in a boxed way, i.e. they are lifted by an extra _|_:

newtype ClosureInt2Int = Closure (Integer -> Integer)#

Thus, Hask is not a category, at least not as defined in the article. The problem is that (either) morphisms or the morphism composition ('.') are not internalized correctly in Haskell.

Regards, apfelmus

___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: seq (was: Article review: Category Theory)
Lennart Augustsson wrote: And this is why some of us think that adding polymorphic seq to Haskell was a mistake. :(

To ease the pain, (oca)ML has the same problem/feature: function types are lifted:

let rec f (x : int) : int = f x ;;
let g y = let x = 1 / 0 in f ;;
let const y = 1 ;;

# const f ;;
- : int = 1
# const (g 1) ;;
Exception: Division_by_zero.

The reason is, of course, that one cannot be strict in a function argument (taking _|_ = \x -> _|_) because this is undecidable (and nonsense with extensional equality). But because the ML-equivalent of (.) is strict, it still does a correct internalization of morphism composition.

Regards, apfelmus

___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: [Haskell] [Fwd: Re: Computer Language Shootout]
Neil Mitchell wrote: That would be very neat. Another neat trick would be generalising optimisations so that there are fewer and more independent passes, this would make it easier to understand (and is what I was working on for Yhc).

Well, it's the nature of repeatedly applying local reductions that you will neither really know what's going on nor truly understand how to improve it. One particular example is the GHC rules mechanism. It's much better than doing nothing, but it often fails to fuse yet another list. I think that one can only achieve deterministic optimization by carefully choosing and exploiting the type system of the intermediate language. For instance, short cut fusion and strictness analysis can be expressed as type inference. If you have an intermediate language where things like boxing and forcing of thunks are explicit and typed, you have a chance to eliminate such expensive constructs by type inference and conventional inlining magic. One point is that local optimization is hard to extend across the boundaries imposed by recursion and the fixed point combinator. But I can imagine that switching to a core language with non-totality (lifted types) manifest in the types, like System F with a fixed point combinator

fix :: Pointed a => (a -> a) -> a

is key to crossing this boundary.

Yhc has intermediate code that is substantially more Haskell like, and with the command:

loadCore "file.ycr" >>= writeFile "file.html" . coreHtml

You can generate an active Html document that lets you explore the Core in a more interactive way - including hyperlinks for function names, syntax highlighting etc. i.e: http://www.cs.york.ac.uk/fp/yhc/Roman.html All of these things make playing with Yhc Core much more fun than playing with GHC Core. There is absolutely no reason you couldn't add all these things to GHC Core, then perhaps you'd save some time when it does come to the examine core level.

Wow, the core looks really cool! One look and you see it all. I would even rename the local variables to single letters like a,b,c because the cryptic numbers are quite hard to track. This is something coreHtml can do. Also, I'd add more color, like making punctuation marks grey, but that's a very personal taste.

Concerning the shootout, it deserves much laud, for it is certainly not an easy task to set it up for so many languages and keep it running. But I'm not very fond of the benchmarks themselves. The goal seems to be to measure how fast different languages can execute a hard wired algorithm which specifies memory management and data layout. While this is a valid goal, I don't think it is a worthy one. It simply does not get to the interesting facts, namely how fast the natural algorithms for each language are. It just compares highly artificial algorithm implementations. Random examples are the nsieves (count the prime numbers from 2 to M [...] create a sequence of M boolean flags) and the k-nucleotide (define a procedure/function to update a hashtable of k-nucleotide keys) benchmarks. Both boolean flags and hash tables are completely alien to Haskell. The former would be naturally implemented by a list of primes, the latter naturally with a generalized trie. Of course, disallowing unboxed arrays will certainly move Haskell down the ranking. But what do we gain from unlimited array necromancy? Do we get a fair account on how fast and short day to day Haskell really is? And how to tweak the compilers to make day to day Haskell an even more pleasant experience? I think not.
(Do not misunderstand me, ByteStrings are fine, for it is the purely functional style that counts.) And sorry, but using the number of gzipped bytes for comparing the code length is just ridiculous.

Regards, apfelmus

___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] OOP parametric polymorphism
Donald Bruce Stewart wrote: deliverable: ...In the tradition of the letters of an ignorant newbie... What's the consensus on the OOP in Haskell *now*? There're some libraries such as OOHaskell, O'Haskell, and Haskell 98's own qualified type system with inheritance. If I have GHC, which way to do anything OOP-like is considered right today?

Using existentials and typeclasses to do some OO things wouldn't be considered unidiomatic (particularly, using existentials to package up interfaces to values). In general though, using a functional approach will produce better (simpler) Haskell code, and make it more likely others will understand it. Personally, I run in fear from OO Haskell ;)

Instead of OOP, Haskell uses (parametric) polymorphism which is more powerful than OOP. For instance, the function

length :: [a] -> Int

or, with an explicit forall,

length :: forall a . [a] -> Int

counts the number of elements in a list [a], regardless of what type a those elements have. Moreover, it is guaranteed that length does not inspect or change the elements, because it must work for all types a the same way (this is called parametricity). Another example is

map :: (a -> b) -> [a] -> [b]

which maps a function over all elements in the list. In addition, Haskell has type classes (which are similar to interfaces in OOP). The most basic example is

class Eq a where
    (==) :: a -> a -> Bool

Thus, you have an equality test available on all types that are instances of this class. For example, you can test whether two Strings are equal, because String is an instance of Eq. More generally, you say whether two lists are equal if you know how to test elements for equality:

instance Eq a => Eq [a] where
    []     == []     = True
    (x:xs) == (y:ys) = (x == y) && (xs == ys)
    _      == _      = False

The important thing I want to point out in this post is that parametric polymorphism is indeed more powerful than OOP: already a concept like Eq is impossible to implement in OOP. The problem is best illustrated with the class Ord (*), which provides an ordering relation. Let's concentrate on the smaller than function

(<) :: Ord a => a -> a -> Bool

Can I create an interface that expresses the same thing?

public interface Comparable {
    boolean smaller_than(?? y);
}

No, because there is no type I can attribute to the second argument of smaller_than. The problem is that I can only compare to values of the *same* type, i.e. the type which implements the interface. Can I create a class that expresses the same thing?

public class Comparable {
    boolean smaller_than(Comparable y);
}

This seems like a solution, but it is not. The problem is subtyping: if I make integers and strings members of this class, I would be able to compare the number 1 against the string "hello", which should be reported as a type error. I have no formal proof, but I think that the function cannot be expressed in a type correct way in OOP. AFAIK, only Java Generics can express the requirement we want:

interface Ord<A> {
    boolean smaller_than(A x, A y);
}

class String implements Ord<String> { ... }

But Generics are a considerable extension to OOP. In fact, there is nothing really object oriented in here anymore, we're just on our way to parametric polymorphism. My final remark is about what this means for the existential quantifier in Haskell. Because of the injection

inject :: forall a . a -> (exists a . a)

the existential quantifier can be thought of as implementing some form of subtyping, i.e. (exists a . a) is a supertype of every a. The point now is: given
type ExistsOrd = exists a . Ord a => a

there is *no*

instance Ord ExistsOrd where ...

because we could compare arbitrary subtypes of ExistsOrd then. In the end, the existential quantifier has limited use for data abstraction, it's forall that makes things happen.

Regards, apfelmus

(*) We don't consider Eq because given a test on type equality, we can generalize the signature of (==)

(==) :: (Eq a, Eq b) => a -> b -> Bool

Indeed, this is what OOP equality does.

___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
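In GHC notation, the 'exists' pseudo-type above is written with a forall on a data constructor; a minimal sketch of the same point (ExistsOrd and badCompare are illustrative names):

    {-# LANGUAGE ExistentialQuantification #-}

    -- the Haskell way to write  exists a . Ord a => a
    data ExistsOrd = forall a. Ord a => ExistsOrd a

    -- the injection: any Ord value can be hidden inside
    inject :: Ord a => a -> ExistsOrd
    inject = ExistsOrd

    -- but no useful  instance Ord ExistsOrd  is possible:
    -- unpacking two values yields two *distinct* hidden types,
    -- so (<) cannot be applied across them
    -- badCompare (ExistsOrd x) (ExistsOrd y) = x < y   -- type error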
[Haskell-cafe] Re: Channel9 Interview: Software Composability and the Future of Languages
Michael T. Richter wrote: And in, you know, the real world of programming you don't face mathematical problems as your bread and butter.

Can you prove that? ;)

You face problems in a messy world of too-short deadlines, too-few resources, too-poorly-communicated requirements and too-many-hours work.

In its essence, the way of mathematics is to solve an infinite number of problems at once by generalizing and simplifying them until they read 1 == 1. But that's exactly the kind of stuff you need: thanks to generalization, you already implemented all requirements before the customer can even conceive them, and thanks to simplification, needed resources and hours of work shrink to reasonable amounts, resulting in deadlines becoming harmless :)

Well, I mean it seriously.

- Coding a complicated configuration system, no doubt baroque, can it be simplified by basing it on a simple but powerful macro language with simple and sane semantics? Can it be based on the lambda calculus? Is this general enough? Maybe you want to assure that every macro terminates?

- Coding a graphical user interface with lots of forms, can they be reduced to their essence and generated automatically from a data type? Perhaps in the style of Functional Forms (www.st.cs.ru.nl/papers/2005/eves2005-FFormsIFL04.pdf)? Are they general enough? If they require immediate feedback or interactive error checking, may tangible values (http://conal.net/papers/Eros/) be an idea to base on?

- Coding a dynamic website and having to control caching and data mining, can this logic be separated out, restricting yourself to a programming model that allows this to happen transparently? Can you plunder Google's Map Reduce model for that (www.cs.vu.nl/~ralf/MapReduce/paper.pdf)?

- Coding data base access or a package management system, can data integrity be assured by again restricting yourself to a less general programming model? Like Software Transactional Memory? Or is it just enough to use strong typing and a simple yet clever data structure (http://www.galois.com/cufp/slides/2006/CliffordBeshers.pdf)?

The structures behind the repetitions, the generalizations to rule them all, the simplifications to find them, they all lie there. But they may resist discovery and you may need cleverness and, well, mathematics to find them. The point about Haskell is that its type system is pure and rich enough to enable you to actually express the proof, the insight, as a program. Only few programming languages can do that. And you know: computers and Haskell itself are products of mathematics.

Regards, apfelmus

___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Paths to tree
of 'insertMaybe :: (v -> v -> v) -> v -> Maybe v -> Maybe v'). Now what about 'MapString v', how do we get this? Well, your implementation corresponds to the choice

type MapString v = [(String,v)]

But in our case, we can apply the same trick again! We have 'String = [Char]' and given an implementation of

data MapChar v = ...

we can use exactly the same code from 'MapPath v' to implement 'MapString v'! (This reuse can be abstracted into a type class, but I'll not cover that here.) Of course, we need 'MapChar v' now. But yet again, we can think of Char as

Char ^= Int ^= [Bool]

where the '[Bool]' means the list of digits in binary representation. So, given 'MapBool v', we can implement 'MapChar v' with yet again the same code that we used for the preceding finite maps! Fortunately, the recursion ends here because a finite map for 'Bool'eans is just the pair

type MapBool v = (Maybe v, Maybe v)

In case your head does not yet hurt too much :), further information about tries in Haskell and a detailed explanation of why the code above works can be found in

Ralf Hinze. Generalizing generalized tries. Journal of Functional Programming, 10(4):327-351, July 2000. http://www.informatik.uni-bonn.de/~ralf/publications/GGTries.ps.gz

Regards, apfelmus

import Data.Tree
import Control.Monad

data ArcData = ArcData { name :: String } deriving Show

type ArcTree   = Tree ArcData
type ArcForest = Forest ArcData

pathsToForest :: [[String]] -> ArcForest
pathsToForest paths = mergeForest $ concat $ map pathToTree paths

mergeForest :: ArcForest -> ArcForest
mergeForest []     = []
mergeForest (x:xs) = merge x (mergeForest xs)
  where
    merge :: ArcTree -> ArcForest -> ArcForest
    merge tree []     = [tree]
    merge tree (y:ys) =
        if sameTreeName tree y
            then merge tree { subForest = mergeForest ((subForest tree) ++ (subForest y)) } ys
            else (y : merge tree ys)

treeName :: ArcTree -> String
treeName tree = name $ rootLabel $ tree

sameTreeName :: ArcTree -> ArcTree -> Bool
sameTreeName treeLeft treeRight = treeName treeLeft == treeName treeRight

pathToTree :: [String] -> ArcForest
pathToTree []             = []
pathToTree (name:subpath) =
    [ Node { rootLabel = ArcData { name = name }
           , subForest = pathToTree subpath } ]

prettyPrint' :: ArcForest -> [String]
prettyPrint' []     = []
prettyPrint' (x:xs) =
    [name $ rootLabel $ x]
    ++ (map ("  " ++) (prettyPrint' $ subForest x))
    ++ prettyPrint' xs

prettyPrint :: ArcForest -> IO ()
prettyPrint forest = do forM_ (prettyPrint' forest) putStrLn

___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
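To make the construction above concrete, here is a minimal sketch of the finite map on lists that it describes -- a value for the empty list plus one sub-trie per first element -- using Data.Map for the element maps instead of recursing all the way down to MapBool (names are mine, not from the thread):

    import qualified Data.Map as Map

    data MapList k v = MapList (Maybe v) (Map.Map k (MapList k v))

    emptyML :: MapList k v
    emptyML = MapList Nothing Map.empty

    insertML :: Ord k => (v -> v -> v) -> [k] -> v -> MapList k v -> MapList k v
    insertML f []     x (MapList mv m) = MapList (Just (maybe x (f x) mv)) m
    insertML f (k:ks) x (MapList mv m) =
        -- descend into the branch for k, creating it if necessary
        MapList mv (Map.insertWith (\_new old -> insertML f ks x old)
                                   k (insertML f ks x emptyML) m)

    lookupML :: Ord k => [k] -> MapList k v -> Maybe v
    lookupML []     (MapList mv _) = mv
    lookupML (k:ks) (MapList _ m)  = Map.lookup k m >>= lookupML ks

Instantiating k to String gives exactly the 'MapPath v' of the thread; instantiating it to Char gives 'MapString v'.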
[Haskell-cafe] Re: Paths to tree
appears to work. I also find the union function incredibly easy to understand. I only hope I got it right.

union :: (Ord k) => Trie k v -> Trie k v -> Trie k v
union (Trie k0 v0) (Trie k1 v1) = Trie k0 v
  where v = Map.unionWith union v0 v1

Well, once you found such a really concise function, it can only be correct :) While it is not relevant in your case, note that 'union' can be extended to be applicable with the recursive trick. But the extension suggests itself by noting that you used 'Map.unionWith' instead of 'Map.union'. So, you could do

unionWith :: (Ord k) => (v -> v -> v) -> Trie k v -> Trie k v -> Trie k v
unionWith f (Trie k0 v0) (Trie k1 v1) = Trie (f k0 k1) v
  where v = Map.unionWith (unionWith f) v0 v1

union = unionWith (\_ y -> y)

Regards, apfelmus

___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
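Tying this back to the paths-to-tree problem, a possible way to build a whole trie from paths with the union above -- assuming a Trie shape matching the pattern match in that code (the data declaration and function names here are my guess):

    import qualified Data.Map as Map

    -- assumed shape: a label at the node plus a map of children
    data Trie k v = Trie v (Map.Map k (Trie k v))

    -- a trie containing exactly one path, labelled with a default value
    fromPath :: v -> [k] -> Trie k v
    fromPath def []     = Trie def Map.empty
    fromPath def (k:ks) = Trie def (Map.singleton k (fromPath def ks))

    -- merge many paths into one trie using 'union' from above
    fromPaths :: Ord k => v -> [[k]] -> Trie k v
    fromPaths def = foldr (union . fromPath def) (Trie def Map.empty)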
[Haskell-cafe] Re: Haskell Cookbook?
Alexy Khrabrov wrote: Reflecting why it's harder to pick up Haskell than say Ruby or Python, here's what I found -- those languages deal with a typical domain available to any programmer -- his own computer/system/shell. The artifacts are files, directories, timestamps, etc. The stuff every programmer understands in their sleep. APIs. So I loved the shell-script beautification thread.

Note that there is an inherent difficulty with files, directories, pipes: they are not purely functional in style. While it's unavoidable that

writeFile :: FilePath -> String -> IO ()

is in IO, I think that

readFile :: FilePath -> IO String

would not need to be in IO if the file system were persistent (not ephemeral). Pipes are an ugly way to implement lazy evaluation. For instance, take the pipe ls -l | grep '*.hs'. To achieve the effect that grep immediately feeds on the data ls -l spits out, both programs are run concurrently and blocking keeps them in line. This does not generalize so well to multiple data sources, something lazy evaluation has no problems with. And one of the main problems is that current OS see programs as things of type String -> IO String. Of course, pure and strongly typed programs would be preferred. So, instead of calling

system ("ghc -O2 -c " ++ show filepath)

we want

GHC.desugar  :: GHC.Haskell' -> GHC.Core
GHC.optimize :: GHC.Core -> GHC.Core
GHC.assemble :: GHC.Core -> OS.Program

In a sense, current OS are not ready for Haskell yet.

Regards, apfelmus

___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: ANNOUNCE: The Monad.Reader - Issue 6
Wouter Swierstra wrote: I'm pleased to announce that the latest issue of The Monad.Reader is now available: [...]

Hooray, the long awaited new issue of my favorite breakfast reading is out!

* Bernie Pope - Getting a Fix from the Right Fold [...]

Concerning the strictness of dwBackwards, it suffices to make the pattern match on (ys,xs) irrefutable:

dwBackwards predicate = fst . dwPairs predicate

dwPairs :: (a -> Bool) -> [a] -> ([a], [a])
dwPairs predicate = foldr combine base
  where
    combine next ~(ys, xs)
        | predicate next = (ys, next:xs)
        | otherwise      = (next:xs, next:xs)
    base = ([], [])

Regards, apfelmus

___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
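A quick way to see what the lazy pattern buys, assuming dwBackwards and dwPairs as written above (the demo is mine, not from the article):

    -- with the irrefutable pattern, the fold streams:
    demo :: [Int]
    demo = take 5 (dwBackwards even [1..])   -- [1,2,3,4,5]
    -- nothing is dropped here: every even element is eventually followed
    -- by an odd one. With a strict pattern (ys, xs) instead of ~(ys, xs),
    -- even 'take 1' diverges, because matching the pair forces the next
    -- 'combine', which forces the next, and so on forever.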
[Haskell-cafe] Re: GADTs: the plot thickens?
Its partial inverse, thickening, has the spec

thicken i i = Nothing
thicken i (thin i j) = Just j

thicken :: Fin (S n) -> Fin (S n) -> Maybe (Fin n)
thicken Fz     Fz     = Nothing
thicken Fz     (Fs j) = Just j
thicken (Fs i) Fz     = Just Fz
thicken (Fs i) (Fs j) = fmap Fs (thicken i j)

[...] So you really need a safety check there. Another way to do it, crudely but avoiding the run time numbers, is this

thicken :: Fin (S n) -> Fin (S n) -> Maybe (Fin n)
thicken Fz          Fz     = Nothing
thicken Fz          (Fs j) = Just j
thicken (Fs Fz)     Fz     = Just Fz
thicken (Fs Fz)     (Fs j) = fmap Fs (thicken Fz j)
thicken (Fs (Fs i)) Fz     = Just Fz
thicken (Fs (Fs i)) (Fs j) = fmap Fs (thicken (Fs i) j)

Isn't the problem simply that in the former 'thicken', the compiler cannot guarantee that the case

thicken (Fs i) Fz = Just Fz

never shows up when we have 'thicken :: Fin (S Z) -> ...'? I mean, the subtle point about Fin is that 'Fz' is the only inhabitant of 'Fin (S Z)'. At least, this is what Epigram can deduce. But due to _|_, every constructor that is declared may appear in the form of

Fs _|_

That's why only the latter 'thicken' can be correct.

Regards, apfelmus

___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Channel9 Interview: Software Composability and the Future of Languages
Bulat Ziganshin wrote:

>>> there are also many other similar issues, such as lack of good syntax for for, while, break and other well-known statements
>>
>> On the other hand you have an ability to define your own control structures.
>
> i have a lot, but their features are limited, both in terms of automatic lifting and overall syntax. let's consider
>
>     while (hGetBuf h buf bufsize == bufsize)
>       crc := updateCrc crc buf bufsize
>       break if crc == 0
>     print crc

I guess that the crc is a simple fold over the single bytes:

    crc xs = foldl' (\crc word8 -> crc `xor` word8) 0 xs

You do not need xs to be an inefficient String; Data.ByteString.Lazy gives you single byte access (f.i. via a fold) but internally reads stuff in a chunked way, just like you now do manually with hGetBuf. Lazy evaluation is very handy for separating the two concerns of reading the chunks of bytes and presenting them in a non-chunked way.

Regards,
apfelmus
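PS: A minimal, self-contained sketch of that idea (the plain xor fold stands in for a real CRC, and the file name is made up):

    import qualified Data.ByteString.Lazy as B
    import Data.Bits (xor)
    import Data.Word (Word8)

    -- B.readFile reads lazily in chunks; B.foldl' consumes the bytes
    -- one by one, so no explicit hGetBuf loop is needed
    checksum :: FilePath -> IO Word8
    checksum path = fmap (B.foldl' xor 0) (B.readFile path)

    main :: IO ()
    main = checksum "input.dat" >>= print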
[Haskell-cafe] Re: Let's welcome the Ruby hackers!
Alexis wrote:

> In contrast with other IT-related communities i've experienced, i've found the Haskell community (both here and on IRC) to generally be helpful, good-humoured and mercifully lacking in flames and alpha behaviours. :-)

I have to reject this claim, because there are quite many alphas in here. For instance, ∀α.α notoriously tries to creep into every discussion, just because he thinks that he is principally more general than the others. Of course, he's a blatant liar. Another well known troll is ∀α.α → α. While at least not throwing in contradictory posts, he greatly overestimates his role. Most often, you can just elide his contributions, as he only repeats prior arguments. Sometimes, he even signs his posts with the pseudonym (∀α.α → α) → (∀α.α → α) to rise in rank, but this is easily seen through.

The list once tried to employ alpha-conversion to get rid of them. But the only effect was that now the betas annoy us as well! A particularly persistent offspring is ∀αβ.α → β → β, giving rise to much debate at regular intervals: he managed to subvert parametricity. Also, the mischievous ∀αβ.α → β even plotted with evil IO to get the attention he thinks he deserves.

In the end, the alphas and betas are noisy braggarts, talking very long about what they want to do without doing anything at all. It's the lambdas who do all the real work. Fortunately, they most often don't need the signature from their alpha bosses.

Regards,
apfelmus

PS: This mail is best viewed with Unicode (UTF-8).
[Haskell-cafe] Re: (a -> [b]) vs. [a -> b]
Chad Scherrer wrote:

> Unfortunately, I was trying to give a simplification of the real problem, where the monad is STM instead of []. Based on apfelmus's observation of why they can't be isomorphic, I'm guessing I'm out of luck. http://www.haskell.org/pipermail/haskell-cafe/2006-December/020041.html So in reality, I'm trying to construct something like
>
>     f :: (a -> STM b) -> STM (a -> b)

Indeed, such an f most likely does not exist. What is the task you tried to solve with the help of f? I guess that either there is a way without it, or it just cannot be solved for mathematical reasons.

Regards,
apfelmus
[Haskell-cafe] Re: Generalizing three programs
Andrew Wagner wrote:

> Hi everyone, I've got an interesting problem here I'm trying to solve. Actually, I've got several problems which seem to have a very similar structure. I want to find a way to abstract them to solve other problems which can be thought about in the same way. Here they are: http://hpaste.org/307 http://hpaste.org/308 http://hpaste.org/309 Note that these are just sketches, the programs aren't done yet. The general structure of the problem is that an object enters a system, interacts with different parts of the system, and eventually leaves, and we want to monitor some statistics about the interaction, so that we can then make some changes, and run it again, and hopefully improve it. Thanks in advance for any suggestions!

I'm unsure whether it's a good idea to simulate the situations; I'd prefer a more denotational approach. To that end,

    http://haskell.org/haskellwiki/Research_papers/Data_structures#Probablistic_Programming

may help. Also, I think that in all three problems, the interesting probability distributions (like the time a random customer has to wait at the register) - perhaps depending on a chosen scheduling strategy - can be calculated without sampling. At least, the sampling can be integrated transparently into the probabilistic functional programming framework cited above.

Besides, it's not specified what efficiency means in the grocery store problem. The mean time a customer has to wait is not the only possible cost measure. Customers have different processing times, and one could weight mean waiting time with that, so that people buying only a few items have much shorter waiting times than people with several full shopping carts.

Regards,
apfelmus
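PS: To give a flavour of that framework, a minimal sketch of a probability monad (the names Dist, die and expectation are mine, not the framework's):

    import Control.Monad (ap, liftM)

    newtype Dist a = Dist { unDist :: [(a, Rational)] }

    instance Functor Dist where
        fmap = liftM

    instance Applicative Dist where
        pure x = Dist [(x, 1)]
        (<*>)  = ap

    instance Monad Dist where
        Dist d >>= f = Dist [ (y, p * q) | (x, p) <- d, (y, q) <- unDist (f x) ]

    -- a fair die, and the expected value of a distribution
    die :: Dist Rational
    die = Dist [ (k, 1/6) | k <- [1..6] ]

    expectation :: Dist Rational -> Rational
    expectation (Dist d) = sum [ x * p | (x, p) <- d ]

    main :: IO ()
    main = print (expectation die)  -- prints 7 % 2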
[Haskell-cafe] Re: Optimization fun
Creighton Hogg wrote:

> Hello Haskell-ers, So a friend and I were thinking about making code faster in Haskell, and I was wondering if there was a way to improve the following method of generating the list of all prime numbers. It takes about 13 seconds to run, meanwhile my friend's C version took 0.1. I'd love to learn a bit more about how to optimize Haskell code.

Of course, the best optimization is a better algorithm. In case this is what you're after, have a look at

    Colin Runciman, Lazy Wheel Sieves and Spirals of Primes
    http://citeseer.ist.psu.edu/runciman97lazy.html

While Haskell makes it possible to express very complicated algorithms in simple and elegant ways, you have to expect to pay a constant factor (roughly 2x-10x) when competing against the same algorithm in low-level C.

>     -- Naive way to calculate prime numbers, testing each new n
>     -- to see if it has prime factors less than sqrt(n).
>     import Data.List
>     primes = 2:(foldr (\x y -> if isPrime x then x:y else y) [] [3..])
>         where isPrime x = foldl' (\z y -> z && (if x `mod` y == 0 then False else True))
>                                  True (take (floor $ sqrt $ fromIntegral x) primes)

Your code has two glitches and a serious flaw: foldl' is strict but not fast, use foldr instead. Premature strictness is the root of all evil :) To see what went wrong, I take the freedom to rewrite the code as

    primes    = 2 : filter isPrime [3..]
    isPrime x = all' (\p -> x `mod` p /= 0) . take sqrtn $ primes
        where sqrtn = floor $ sqrt $ fromIntegral x

    all' prop = foldl' (\z y -> z && prop y) True

The first and most minor glitch is the missing type signature. Every Haskell compiler will default your integers to arbitrary precision Integers:

    Prelude> :t primes
    primes :: [Integer]

I doubt that your C friend uses arbitrary precision arithmetic, so you'd better write down

    primes  :: [Int]
    isPrime :: Int -> Bool

The second glitch is that you 'take sqrtn primes'. This is not the same as 'takeWhile (<= sqrtn) primes', i.e. taking primes as long as they are smaller than the square root of x. I guess you know that the latter results in far fewer primes taken.

The glitches may have been unintentional, but the flaw intentionally degrades performance: you should not use a strict all', but the lazy

    all prop = foldr (\y z -> prop y && z) True

from the Prelude. The point is that the lazy version stops inspecting the elements of the remaining list whenever (prop y) turns False for the first time. This is because (&&) is lazy:

    False && x = False

for whatever x we supply. For example, take the list

    [True, False, True, True] ++ replicate 100 True

Here, all returns False after inspecting the first two elements, while all' inspects every one of the 104 list elements just to return False afterwards. As every second number is even, your all' is busy wasting time like crazy.

Regards,
apfelmus
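PS: A minimal demonstration of the difference (my own example, not from the thread): on an infinite list, the lazy all still answers, while the strict all' never terminates.

    import Data.List (foldl')

    all' :: (a -> Bool) -> [a] -> Bool
    all' prop = foldl' (\z y -> z && prop y) True

    main :: IO ()
    main = do
        print (all id ([True, False] ++ repeat True))    -- prints False at once
        -- print (all' id ([True, False] ++ repeat True))  -- would loop forever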
[Haskell-cafe] Re: Optimization fun
You're right, 'fix' is *not* a fix for non-termination; this is better fixed in the type system (with the right fixed points, or you're in a fix) ;)

Fixed regards,
apfelmus

Lennart Augustsson wrote:

> This is actually a pretty good algorithm. And also a rather subtle one when it comes to termination. :)
>
>> Of course, the best optimization is a better algorithm. In case this is what you're after, have a look at
>>
>>     Colin Runciman, Lazy Wheel Sieves and Spirals of Primes
>>     http://citeseer.ist.psu.edu/runciman97lazy.html
[Haskell-cafe] Re: Foldr tutorial, Inspired by Getting a Fix from a Fold
Bernie Pope wrote:

> Lennart Augustsson wrote:
>> Sure, but we also have
>>
>>     para f e xs = snd $ foldr (\x ~(xs, y) -> (x:xs, f x xs y)) ([], e) xs
>
> Nice one.

Nice one is a euphemism, it's exactly solution one :)

Regards,
apfelmus
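PS: A small usage example of this para (my own illustration, with the definition repeated for completeness): Data.List.tails is a paramorphism, since each step may use both the tail and the recursive result.

    para :: (a -> [a] -> b -> b) -> b -> [a] -> b
    para f e xs = snd $ foldr (\x ~(xs, y) -> (x:xs, f x xs y)) ([], e) xs

    tails' :: [a] -> [[a]]
    tails' = para (\x xs ys -> (x:xs) : ys) [[]]

    main :: IO ()
    main = print (tails' [1, 2, 3 :: Int])  -- [[1,2,3],[2,3],[3],[]]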
[Haskell-cafe] Re: Foldr tutorial, Inspired by Getting a Fix from a Fold
Lennart Augustsson wrote:

>>     para f e xs = snd $ foldr (\x ~(xs, y) -> (x:xs, f x xs y)) ([], e) xs
>
> I thought solution one was missing the ~ ?

Yes, that's irrefutably right ;) I mean solution one modulo the laziness bug.

Regards,
apfelmus
[Haskell-cafe] Re: Strange memory consumption problems in something that should be tail-recursive
Jefferson Heard wrote:

> Argh, bitten by the scheme bug! Right -- NO tail recursion... So that leaves me with some rather non-intuitive strategies for achieving execution time efficiency. Anyone care to point me in the direction of a document on efficiency in Haskell?

Besides, proper tail recursion in Haskell needs strictness annotations, but the best way is to forget the two words "tail recursive" altogether :)

It always helps to do a rough calculation of how much time you have to expect it to run. Processing 1 TB with a 1 GHz processor and 16 = 2^4 machine instructions in the inner loop (must be quite short, the loop) takes

    2^40 / (2^30 / 16) = 2^14 seconds ~ 4.5 hours

Of course, these 4.5 hours are quite sensitive to the 2^4 factor and might well be 3 or 9 hours. Assuming that you ran alex on a String, the reported 36 hours are entirely reasonable, in the sense of alex not being overly slow.

Regards,
apfelmus
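PS: To make "needs strictness annotations" concrete, a minimal sketch (my own example): the bang patterns keep the accumulator evaluated, so the tail recursive loop runs in constant space instead of building a huge chain of thunks.

    {-# LANGUAGE BangPatterns #-}

    -- tail recursive AND strict in the accumulator
    sumTR :: [Int] -> Int
    sumTR = go 0
      where
        go !acc []     = acc
        go !acc (x:xs) = go (acc + x) xs

    main :: IO ()
    main = print (sumTR [1 .. 1000000])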
[Haskell-cafe] Re: Is lazyness make big difference?
Allan Clark wrote:

> For me one of the best examples of this is that of logging within a compiler. Consider a compiler which operates roughly as such
>
>     compileProgram :: String -> IO ()
>     compileProgram program =
>         let (log,  abstract)            = parse program
>             (log2, typedProgram)        = typeCheck abstract log
>             (log3, convertedProgram)    = convertToIntermediate typedProgram log2
>             (log4, convertedToAssembly) = convertToAssembly convertedProgram log3
>         in do writeFile "a.asm" (show convertedToAssembly)
>               writeFile "a.log" (show log4)
>
> Now each of the intermediate transforming calls will produce some logging information which is added to the current log.

It's a bit OT (off thread), but I think that it's better to appeal to the monoid structure of logs:

    let (log,  abstract)            = parse program
        (log2, typedProgram)        = typeCheck abstract
        (log3, convertedProgram)    = convertToIntermediate typedProgram
        (log4, convertedToAssembly) = convertToAssembly convertedProgram
    in  show (mconcat [log, log2, log3, log4])

i.e. to use Monad.Writer instead of Monad.State. The point is that, for example, 'typedProgram' does not really depend on the contents of 'log', but the dependencies in your code don't express this. One should switch from

    Log -> (a, Log)

to

    (a, Log -> Log)

or even (a, Log) if Log already has a natural monoid structure.

Regards,
apfelmus
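PS: A minimal sketch of the Writer variant with stub passes (all pass names and bodies here are made up for illustration):

    import Control.Monad.Writer

    type Log = [String]

    parse' :: String -> Writer Log String
    parse' s = tell ["parsed"] >> return s

    typeCheck' :: String -> Writer Log String
    typeCheck' s = tell ["type checked"] >> return s

    compileProgram' :: String -> IO ()
    compileProgram' program = do
        -- the logs of the passes are concatenated behind the scenes
        let (asm, logs) = runWriter (parse' program >>= typeCheck')
        writeFile "a.asm" asm
        writeFile "a.log" (unlines logs)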
[Haskell-cafe] Re: Is lazyness make big difference?
Nick wrote:

> The question is the following: how big is the gap between strict languages with lazy constructs and Haskell? Does the default laziness have an irrefutable advantage over default strictness?

Laziness is needed to achieve true compositionality. This point is elaborated in

    John Hughes. Why functional programming matters.
    http://haskell.org/haskellwiki/Research_papers#Overview

I also think that the laziness in Haskell is already so implicit that 90% of the Haskell code written so far will simply break irreparably if you experimentally remove it.

By the way, lazy evaluation is strictly more powerful than eager evaluation (in a pure language, that is) with respect to asymptotic complexity:

    Richard Bird, Geraint Jones and Oege de Moor. More Haste, Less Speed.
    http://web.comlab.ox.ac.uk/oucl/work/geraint.jones/morehaste.html

Regards,
apfelmus
[Haskell-cafe] Re: Is lazyness make big difference?
Nick wrote:

>     main       = print primes
>     primes     = 2 : filter is_prime [3,5..]
>     is_prime n = all (\p -> n `mod` p /= 0) (takeWhile (\p -> p*p <= n) primes)
>
> We can rewrite this in strict languages with lazy constructs. For example, in Scala (of course, streams are not the only lazily evaluated thing there):
>
>     def main(args: Array[String]): Unit = {
>         val n = Integer.parseInt(args(0))
>         System.out.println(primes(ints(2)) take n toList)
>     }
>     def primes(nums: Stream[Int]): Stream[Int] =
>         Stream.cons(nums.head, primes((nums tail) filter (x => x % nums.head != 0)))
>     def ints(n: Int): Stream[Int] =
>         Stream.cons(n, ints(n + 1))

Aha, I finally recovered some of the examples from which the claim "laziness is needed to achieve true compositionality" stems. The first is already present in your example above and also showed up some time ago in the thread "Optimization fun". The point is that the function 'all' used in

    is_prime n = all (\p -> n `mod` p /= 0) (takeWhile (\p -> p*p <= n) primes)

works only because we have lazy *Bool*eans. Your Scala version accidentally (?) circumvents it by using a different algorithm, namely

    primes' = sieve [2..]
    sieve (x:xs) = x : filter (\y -> y `mod` x /= 0) (sieve xs)

Thanks to laziness, 'all' stops as soon as one element does not fulfill the condition. True compositionality allows us to define

    all p = foldr (&&) True . map p

and get the lazy behavior. You cannot reuse a strict (&&) in such a way. Of course, given some support for lazy constructs, you could define a lazy version of (&&) just as you define a lazy version of lists (called Streams), but not having laziness as default means that you have to think about whether your function is intended to be re-used (i.e. you have to provide a lazy interface) or not *before* you write your function.

The second folklore example is lazy mergesort:

    mergesort [] = []
    mergesort xs = foldtree1 merge $ map return xs

    foldtree1 f [x] = x
    foldtree1 f xs  = foldtree1 f $ pairs xs
        where
        pairs []        = []
        pairs [x]       = [x]
        pairs (x:x':xs) = f x x' : pairs xs

    merge []     ys = ys
    merge xs     [] = xs
    merge (x:xs) (y:ys) =
        if x <= y then x : merge xs (y:ys) else y : merge (x:xs) ys

The point about this 'mergesort' is that while it sorts a complete list in O(N log N) time, it may return the minimum element in O(N) time already. Thus, we can be bold and reuse 'mergesort' as in

    minimum = head . mergesort

and still get the desired O(N) asymptotic complexity.

Note: The function 'foldtree1' folds the elements of a list as if they were in a binary tree:

    foldtree1 f [1,2,3,4,5,6,7,8]
        == ((1 `f` 2) `f` (3 `f` 4)) `f` ((5 `f` 6) `f` (7 `f` 8))

The O(N) stuff works because 'foldtree1' constructs this expression in O(N + N/2 + N/4 + N/8 + ...) = O(N) time. I'm not entirely sure, but I think that the more common 'splitAt (length xs `div` 2)' and 'deal (x:x':xs) = (x:..., x':...)' approaches both take O(N log N) time for the same task. This makes them unusable for the point here. Besides, only 'foldtree1' can easily be transformed into a proof for dependent types, but that's another story, told by Conor McBride in 'Why dependent types matter'.

There has been another example circulating on #haskell. I think it was something with

    substrings = concatMap tails . inits

but I can't remember it now. Cale, can you help? Anyway, the point is that with domain specific embedded languages, re-usability without time penalties is crucial. So far, only default laziness can achieve this.
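A quick way to convince yourself of the O(N) claim (my own check, using the definitions above):

    -- head . mergesort forces only one merge step per list element,
    -- so taking the head costs O(N) comparisons in total
    minimum' :: Ord a => [a] -> a
    minimum' = head . mergesort

    main :: IO ()
    main = print (minimum' [5, 3, 8, 1, 9, 2 :: Int])  -- prints 1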
>> I also think that the laziness in Haskell is already so implicit that 90% of the Haskell code written so far will simply break irreparably if you experimentally remove it.
>
> Yes, I understand that the present Haskell code heavily bases on laziness, but I'm going into the problem in general: how much do I get if I switch from default strictness to default laziness in my hypothetical language L? Or, from the other side, how much do I throw away in the reverse case?

Yes, what I meant with "laziness in Haskell is already so implicit" is that the re-use I exemplified above happens subconsciously. So indeed, it looks like - and only looks like - one could easily turn a lazy language into a strict one. Isn't that the good thing about laziness, that nobody notices it in the code?

Regards,
apfelmus
[Haskell-cafe] Re: stateful walk through a tree?
David Tolpin wrote:

> I am looking for help in design of a stateful tree walker.

I think that you can use Data.Traversable to great effect. Related to that are Control.Applicative and Data.Foldable. The papers that are mentioned in the Haddocks explain what these modules do and what they are useful for.

> How would I do that in Haskell? I'd like to avoid using mutable variables for now (mostly for didactic purposes).

Well, Haskell doesn't have mutable variables the way LISP or ML do. In the end, avoiding mutable variables is more useful for non-didactic purposes :)

Regards,
apfelmus
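PS: To make this concrete, a minimal sketch of a stateful walk via Data.Traversable (the Tree type and the numbering task are my own illustration):

    {-# LANGUAGE DeriveFunctor, DeriveFoldable, DeriveTraversable #-}

    import Control.Monad.State

    data Tree a = Leaf a | Node (Tree a) (Tree a)
        deriving (Show, Functor, Foldable, Traversable)

    -- number the leaves from left to right by threading a counter
    -- through the traversal; no mutable variables needed
    number :: Tree a -> Tree (Int, a)
    number t = evalState (traverse step t) 0
        where
        step x = do
            n <- get
            put (n + 1)
            return (n, x)

    main :: IO ()
    main = print (number (Node (Leaf 'a') (Node (Leaf 'b') (Leaf 'c'))))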