Re: [Haskell-cafe] Re: optimization help
Hello, The (almost) point-free versions run faster than my "fast" imperative version and take up significantly less heap space-- even the version which reads everything and then writes takes up about 1/3 the heap space as my version. I get the impression that point-free style is a preventive measure against space leaks. thanks again, Jeff ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] function result caching
On Thu, Oct 12, 2006 at 04:01:14PM -0700, Carl Witty wrote: > Instead of using an infinite list, you can use an infinite binary tree, > with a cached result at every node. Construct a binary tree with the > following property: Consider the path from the root to a node, where > left branches are called "0" and right branches are called "1". This > sequence of 0's and 1's is the binary expansion of the key whose cached > value is stored at that node (with the least-significant-bit at the root > of the tree). (Actually constructing this tree, and looking things up > in it, is left as an exercise for the reader; but it isn't very hard.) > > This generalizes to any kind of key that can be uniquely serialized into > bits; looking up the corresponding value then takes O(# of bits in the > key) extra time and space, which is far better than the O(2^(# of bits > in the key)) that an infinite list would use. it is too bad IntSet and IntMap are strict in their subtrees, it would have been nice to provide things like infiniteMap :: (Int -> a) -> IntMap a infiniteMap = ... cachingSet :: (Int -> Bool) -> IntSet cachingSet = ... out of curiosity, why are IntMap and IntSet strict in their subtrees. since their subtrees are not of CPR form, I would think the benefit would not be that great and might even hurt in some situations... was there testing of the 'lazy' version? John -- John Meacham - ⑆repetae.net⑆john⑈ ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] nightnightita
hope you're feeing better too. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Nim Game in Haskell - Someone plz?
This sounds like a request for homework help. http://www.haskell.org/haskellwiki/Homework_help Mike Luis Felipe wrote: Hi, I need help to develop an implementation of nim game in Haskell. Could anyone send me a implementation of this game in haskell?? thanks ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Nim Game in Haskell - Someone plz?
Hi, I need help to develop an implementation of nim game in Haskell. Could anyone send me a implementation of this game in haskell?? thanks -- Luis ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Having typing problem with "tan" function.
Okay, I think I get it. The thing to realize is that if you do not declare type, Haskell, being smart makes some inferences of the type of values depending on the expression and usage of the value. To solve the problem I tested a variation based on this understanding and it seems to have worked. I declared type for the expressions. Thanks for the help and the comment about the indenting. (Indenting is a little confusing at this point.) triangle :: Window -> (Point,Point) -> Int -> IO () triangle window ((x1,y1), (x2,y2)) size = let height :: Float height = fromIntegral(y2 - y1) * 3 / 2 halfWidth :: Float halfWidth = height * (tan (pi / 3 )) in drawInWindow window (withColor (sizeColorMap size) (polygon [(x1,y1),( x1 - (round halfWidth), round height ), (x1 + (round halfWidth), round height)] )) On Thu, 2006-12-10 at 20:45 -0400, Cale Gibbard wrote: > The compiler is complaining that there's no instance of the Floating > class for Int because Floating is the class where pi and tan are > defined, and it manages to infer that halfWidth must be an Int. How > does it do this? > > Well, assuming that you are using SOE or a similar graphics library, > the polygon function expects a list of Points, and type Point = (Int, > Int). So, in particular, x1 - halfWidth must be an Int, and so > halfWidth needs to be an Int. > > Hence, you probably want to change the definition of halfWidth to be: > halfWidth = round (height * tan (pi / 3)) > > Also, just as a side note, your indentation is a little unusual. It's > more common to align the 'let' with the 'in'. This would also be an > appropriate place to use 'where'. > > hope this helps, > - Cale > > On 12/10/06, Edward Ing <[EMAIL PROTECTED]> wrote: > > Hi, > > I am new to Haskell, so the following problem is probably easy for you to > > spot but difficult for me. > > Any pointers would be appreciated. > > In GHCI on a load I am getting the following message: > > > > SSnowflake.hs:41:45: > > No instance for (Floating Int) > > arising from use of `pi' at Snowflake.hs:41:45-46 > > Probable fix: add an instance declaration for (Floating Int) > > In the first argument of `(/)', namely `pi' > > In the first argument of `tan', namely `(pi / 3)' > > In the second argument of `(*)', namely `(tan (pi / 3))' > > Failed, modules loaded: none. > > > > > > When I put the expression "tan ( pi / 3)" at the GHCI interpreter prompt I > > get a value, but this is not accepted on a load the code. > > > > > > > > Here is the code: > > code > > > > triangle :: Window -> (Point,Point) -> Int -> IO () > > triangle window ((x1,y1), (x2,y2)) size = > > letheight = fromIntegral(y2 - y1) * 3 / 2 > > halfWidth = height * (tan (pi / 3 )) in > > drawInWindow window (withColor (sizeColorMap size) > > (polygon [(x1,y1),( x1 - halfWidth, height ), (x1 > > + halfWidth, height)] )) > > > > ___ > > Haskell-Cafe mailing list > > Haskell-Cafe@haskell.org > > http://www.haskell.org/mailman/listinfo/haskell-cafe > > ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Having typing problem with "tan" function.
The compiler is complaining that there's no instance of the Floating class for Int because Floating is the class where pi and tan are defined, and it manages to infer that halfWidth must be an Int. How does it do this? Well, assuming that you are using SOE or a similar graphics library, the polygon function expects a list of Points, and type Point = (Int, Int). So, in particular, x1 - halfWidth must be an Int, and so halfWidth needs to be an Int. Hence, you probably want to change the definition of halfWidth to be: halfWidth = round (height * tan (pi / 3)) Also, just as a side note, your indentation is a little unusual. It's more common to align the 'let' with the 'in'. This would also be an appropriate place to use 'where'. hope this helps, - Cale On 12/10/06, Edward Ing <[EMAIL PROTECTED]> wrote: Hi, I am new to Haskell, so the following problem is probably easy for you to spot but difficult for me. Any pointers would be appreciated. In GHCI on a load I am getting the following message: SSnowflake.hs:41:45: No instance for (Floating Int) arising from use of `pi' at Snowflake.hs:41:45-46 Probable fix: add an instance declaration for (Floating Int) In the first argument of `(/)', namely `pi' In the first argument of `tan', namely `(pi / 3)' In the second argument of `(*)', namely `(tan (pi / 3))' Failed, modules loaded: none. When I put the expression "tan ( pi / 3)" at the GHCI interpreter prompt I get a value, but this is not accepted on a load the code. Here is the code: code triangle :: Window -> (Point,Point) -> Int -> IO () triangle window ((x1,y1), (x2,y2)) size = letheight = fromIntegral(y2 - y1) * 3 / 2 halfWidth = height * (tan (pi / 3 )) in drawInWindow window (withColor (sizeColorMap size) (polygon [(x1,y1),( x1 - halfWidth, height ), (x1 + halfWidth, height)] )) ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Having typing problem with "tan" function.
Am Freitag, 13. Oktober 2006 02:22 schrieb Edward Ing: > Hi, > I am new to Haskell, so the following problem is probably easy for you to > spot but difficult for me. Any pointers would be appreciated. > In GHCI on a load I am getting the following message: > > SSnowflake.hs:41:45: > No instance for (Floating Int) > arising from use of `pi' at Snowflake.hs:41:45-46 > Probable fix: add an instance declaration for (Floating Int) > In the first argument of `(/)', namely `pi' > In the first argument of `tan', namely `(pi / 3)' > In the second argument of `(*)', namely `(tan (pi / 3))' > Failed, modules loaded: none. > > > When I put the expression "tan ( pi / 3)" at the GHCI interpreter prompt I > get a value, but this is not accepted on a load the code. > > Here is the code: > code > > triangle :: Window -> (Point,Point) -> Int -> IO () > triangle window ((x1,y1), (x2,y2)) size = >letheight = fromIntegral(y2 - y1) * 3 / 2 > halfWidth = height * (tan (pi / 3 )) in > drawInWindow window (withColor (sizeColorMap size) >(polygon [(x1,y1),( x1 - halfWidth, height ), (x1 + > halfWidth, height)] > )) > The co-ordinates must have type Int, I believe, tan has type Floating a => a -> a. However, Int is not an instance of the Floating class. What you ought to do is convert height and halfWidth to Int via round, floor or ceiling. Then it should work. Cheers, Daniel > ___ > Haskell-Cafe mailing list > Haskell-Cafe@haskell.org > http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Having typing problem with "tan" function.
Hi, I am new to Haskell, so the following problem is probably easy for you to spot but difficult for me. Any pointers would be appreciated. In GHCI on a load I am getting the following message: SSnowflake.hs:41:45: No instance for (Floating Int) arising from use of `pi' at Snowflake.hs:41:45-46 Probable fix: add an instance declaration for (Floating Int) In the first argument of `(/)', namely `pi' In the first argument of `tan', namely `(pi / 3)' In the second argument of `(*)', namely `(tan (pi / 3))' Failed, modules loaded: none. When I put the expression "tan ( pi / 3)" at the GHCI interpreter prompt I get a value, but this is not accepted on a load the code. Here is the code: code triangle :: Window -> (Point,Point) -> Int -> IO () triangle window ((x1,y1), (x2,y2)) size = letheight = fromIntegral(y2 - y1) * 3 / 2 halfWidth = height * (tan (pi / 3 )) in drawInWindow window (withColor (sizeColorMap size) (polygon [(x1,y1),( x1 - halfWidth, height ), (x1 + halfWidth, height)] )) ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] function result caching
On Fri, 2006-10-13 at 01:27 +0300, Silviu Gheorghe wrote: > it does, thank you very much for the quick answer, unfortunately as I > understand it, it doesn't work well on ints :( > > for just now i created a list > > slowFunctionCacheList= [slowFunction (i) | i <-[0..500]] > and use "slowFunctionCacheList !! i" instead of "slowFunction (i)" > > it helped alot (i mean i stoped the program after "3 hours still > working" and got the result in 2 minutes :)) > > i am still curious about a better method (and a general one), because > this is ugly, and it only works on ints as i see it. I can describe a method that's uglier, faster, and more general; is that better or not? You're using an infinite list to store your cached results. (Well, your list is actually finite, but an infinite list would work just as well.) Instead of using an infinite list, you can use an infinite binary tree, with a cached result at every node. Construct a binary tree with the following property: Consider the path from the root to a node, where left branches are called "0" and right branches are called "1". This sequence of 0's and 1's is the binary expansion of the key whose cached value is stored at that node (with the least-significant-bit at the root of the tree). (Actually constructing this tree, and looking things up in it, is left as an exercise for the reader; but it isn't very hard.) This generalizes to any kind of key that can be uniquely serialized into bits; looking up the corresponding value then takes O(# of bits in the key) extra time and space, which is far better than the O(2^(# of bits in the key)) that an infinite list would use. Carl Witty ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] function result caching
it does, thank you very much for the quick answer, unfortunately as I understand it, it doesn't work well on ints :(for just now i created a list slowFunctionCacheList= [slowFunction (i) | i <-[0..500]] and use "slowFunctionCacheList !! i" instead of "slowFunction (i)"it helped alot (i mean i stoped the program after "3 hours still working" and got the result in 2 minutes :)) i am still curious about a better method (and a general one), because this is ugly, and it only works on ints as i see it.but then again thank you for telling me it doesn't do it, because i had the false impresion it does and i wouldn't stop it otherwise On 10/13/06, Tom Phoenix <[EMAIL PROTECTED]> wrote: On 10/12/06, Silviu Gheorghe <[EMAIL PROTECTED]> wrote:> I'd like to know if the results are "cached" by the compilerHardly ever, as I understand things. > if they are not I'd like to know what is the best way to cache them> manually, and where can I read more about this, and the optimizations the> compiler does, because I've searched the web before and i found very little > on this topic.You need to search for the word "memoize" (or "memoise"). Here's apage about a memo function for GHC. http://www.haskell.org/ghc/docs/6.4.2/html/hslibs/memo-library.htmlHope this helps!--Tom Phoenix ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] function result caching
On 10/12/06, Silviu Gheorghe <[EMAIL PROTECTED]> wrote: I'd like to know if the results are "cached" by the compiler Hardly ever, as I understand things. if they are not I'd like to know what is the best way to cache them manually, and where can I read more about this, and the optimizations the compiler does, because I've searched the web before and i found very little on this topic. You need to search for the word "memoize" (or "memoise"). Here's a page about a memo function for GHC. http://www.haskell.org/ghc/docs/6.4.2/html/hslibs/memo-library.html Hope this helps! --Tom Phoenix ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: optimization help
>> A better solution would be to begin output before the the whole input is >> read, thus making things more lazy. This can be done the following way: >> from the input, construct a lazy list of (date,line) pairs. Then, let >> foldM thread a map from dates to corresponding output file pointers >> through the list and, at the same time, use the file pointers to output >> the line in question via appendFile. This way, every line consumed is >> immediately dispatched to its corresponding output file and things >> should only require memory for the different dates, besides buffering. >> > I tried this approach previously and it seems to be unacceptably slow. > I thought the slowness was just due to the fact that file operations > are slow, but I'll include my code here (cleaned up to take some of > your previous comments into account) just in case there is something > subtle I'm missing which is slowing down the code (B, M, and myRead > are as above): > > dates' file nRows = do > (cols, rows) <- myRead file > dateIx <- M.lookup cols $ B.pack "\"Date\"" > let writeDate row = B.appendFile (dataDir++fmt date) row where > date = (B.split ',' row)!!dateIx > fmt = B.unpack . B.map (\x -> if x == '-' then '_' else x) . > B.takeWhile (/= ' ') > oldFiles <- getDirectoryContents dataDir > mapM_ (\f -> catch (removeFile $ dataDir++f) $ const $ return ()) oldFiles > mapM_ writeDate $ take nRows rows > > This code takes over 20 minutes to process 100MB on my machine. No wonder, as this opens and closes the file on every row. The operating system will be kept quite busy that way! In some sense, your are outsourcing the row collecting M.Map to the OS... Of course, you want to open the files once and dispatch the rows to the different open handles. Here is a version (untested) which either does the read all then write approach (group'n'write) or opens the output files simultaneously (group'n'write2). Note also that there is no need to use M.Map for finding the "Date" keyword in the CSV header (which even hurts performance) though the effects are completely negligible. main = do args <- getArgs case args of ["dates",file,nRows] -> dates file (read nRows) dates file nRows = B.readFile file >>= group'n'write . sugarWithDates . take nRows . B.lines sugarWithDates (header:rows) = map (\r -> (B.split ',' r) !! dateIx, r)) rows where Just dateIx = Data.List.lookup (B.pack "\"Date\"") $ zip (B.split "," header) [0..] formatDate= B.unpack . B.map (\x -> if x == '-' then '_' else x) . B.takeWhile (/= ' ') date2filename = (dataDir ++) . formatDate group'n'write = mapM_ writeDate . M.toList . foldl' addDate M.empty where addDate mp (date,row) = M.insertWith date (\new old -> row:old) [] mp writeDate (date,rows) = B.writeFile (date2filename date) $ B.unlines rows group'n'write2 = foldM addDate M.empty >>= mapM_ hClose . M.elems where addDate mp (date,row) = do (fp,mp) <- case M.lookup date mp of Just fp -> return (fp,mp) _ -> do fp <- openFile (date2filename date) WriteMode return (fp, M.insert date fp mp) hPut fp row return mp The thing that bugs me is that one cannot separate group'n'write2 = write2 . group where (group) is a pure function. I think some kind of lazy writeFile could allow this. > thanks for your help, No problem. :) Regards, apfelmus ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] function result caching
hello,I'm new to Haskell and I'm using it to do some simulations for a game, and I have a some functions that have as argument just one int (the current situation), but they do a lot of computations after that (future evolutions etc) I'd like to know if the results are "cached" by the compiler (there are only a few thousand values i call the functions on, but they are distributed on a fairly large interval (0-100), because of the codification. if they are not I'd like to know what is the best way to cache them manually, and where can I read more about this, and the optimizations the compiler does, because I've searched the web before and i found very little on this topic. thank you very muchSilviuP.S. sorry for my English :) ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Is Haskell a Keynesian language?
It often seems to me that the Wildeian dichotomy of "charming" vs. "tedious" applies especially well to programming languages. On Oct 12, 2006, at 5:02 PM, mvanier wrote: I prefer the terms "awesome" and "crappy", respectively, but sure, whatever works for you ;-) Mike David F. Place mailto:[EMAIL PROTECTED] ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Is Haskell a Keynesian language?
I prefer the terms "awesome" and "crappy", respectively, but sure, whatever works for you ;-) Mike Henning Thielemann wrote: Here is another approach of questionable classification of languages. :-) A lazy functional program is demand driven, an imperative program is supply driven. That is, if I request some information by calling a function in GHCi or Hugs, the interpreter develops a plan a how to produce the information I need and then executes the necessary steps. In contrast to that, an imperative program executes what's next on the schedule, whether it is need or not. So is Haskell a Keynesian language and C++ a Say language? ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: tail-recursing through an associative list
[EMAIL PROTECTED] wrote: > For logarithmic access times, you should use a binary search tree like > Data.Map or similar. The problem in your case could be that matchKeys is > only approximate and your keys cannot be ordered in suitable fasion. That is precisely the problem that I was dealing with. I've been ruminating on some way to use a map to solve the problem, although since this project has been demoted from "cool side project for work" to "cool side project for my Copious Free Time", I don't know when I'll be able to implement it. Thanks for all your suggestions--at least I feel like I am on the right track. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Gtk2Hs and Glade Tutorial
Hia Hans, That's great that you're interested in doing/improving Gtk2Hs tutorial material. I think actually this is one of the weakest parts of Gtk2Hs at the moment - the lack of a decent basic intro tutorial. We've often talked about starting one, and we had various ideas floating around, but you know how it is, anyway we've never got round to writing one. Part of the problem is that as experts in the system and knowing all the internals etc, we're probably not the best people to write the entry level tutorial because we can't remember how we learned the stuff. We sketched out some ideas on the haskell.org wiki: http://haskell.org/haskellwiki/Gtk2Hs/Tutorials/Intro I'm not actually sure I can read .odt files because OOo doesn't build on my box. You couldn't export to pdf for me could you? :-) So, yes, fantastic. Lets chat about what you plan to do and how we can help and perhaps if we can get more people involved. Duncan On Thu, 2006-10-12 at 18:16 +0200, Hans van Thiel wrote: > Hello, > > There is a very nice tutorial on the Glade Interface Designer at > http://www.writelinux.com/glade and I've been using it with 'An > Introduction to Gtk2Hs, A Haskell GUI Library', by Kenneth Hoste at > http://haskell.org/~shae/memory.pdf to learn Gtk2Hs and Glade. > The 'Gtk2Hs Introduction', however, is not really aimed at the absolute > beginner and so I've written down my experience as an 'appendix' to the > glade tutorial, which really is for getting started. > I've just asked the author, Eddy Ahmed, for permission to 'translate' > his tutorial to Haskell Gtk2Hs, but meanwhile I'd like to submit my > separate concept (attached) for comments. > > Thanks, > > Hans van Thiel > > ___ > Haskell-Cafe mailing list > Haskell-Cafe@haskell.org > http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Is Haskell a Keynesian language?
Henning Thielemann <[EMAIL PROTECTED]> writes: > Here is another approach of questionable classification of languages. :-) > > A lazy functional program is demand driven, an imperative program is > supply driven. > So is Haskell a Keynesian language and C++ a Say language? Great, now we can talk about the Invisible Hand performing evaluations... Alice: "The Invisible Hand is holding up more memory than I thought. My program is using O(n) space just to compute length!" Bob: "You've violated Nash equilibrium!" ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A type in search of a name...
"Brian Hulley" <[EMAIL PROTECTED]> writes: > You'll never believe it but I've been struggling last night and all of > today to try and think up a name for the following type and I'm still > nowhere near a solution: > > data ??? = VarId | VarSym | ConId | ConSym Perhaps Atom. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] [off-topic / administrative] List Reply-to
On 11/10/06, Mikael Johansson <[EMAIL PROTECTED]> wrote: * It penalizes the person with a reasonable mailer in order to coddle those running brain-dead software. I don't agree. I view pine as something that should be classified as reasonable, and I feel penalized by non-munging. When you press R to reply, you should be asked whether you would like to reply to all recipients or not. Choose yes, and the message should be sent to the list. At least, that's how I remember pine working, though it's been about a year now since I used it. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] tail-recursing through an associative list
On 10/12/06, Seth Gordon <[EMAIL PROTECTED]> wrote: (it was easier to do that then to learn enough about Cabal to get HSQL recompiled with profiling), Hmm...That should just require: runghc Setup.hs configure -p Doing that tells cabal to build both the normal library and profiling enabled copy so that when you run: runghc Setup.hs build runghc Setup.hs install Both versions (normal and profiled) should be installed side by side. If that didn't work for you, maybe there is some other problem. I actually compiled HSQL from scratch + HSQL-MySQL just two days ago without a problem using the above commands. Perhaps you need to add a line in the .cabal file to tell ghc which profiling options to use. Something like: ghc-prof-options: -caf-all On a side note, cabal is fairly simple still (from a user point of view), meaning it's easy to learn 80% of the functionality (or maybe I mean learn how to do 80% of what you need). I think what we lack (or what the cabal userguide lacks) is a section on concrete examples taking you from cabalizing a HelloWorld project on up through the work needed to do fancy cross platform stuff. I know there are some examples, but something isn't quite right yet to make it accessible enough for beginners. And I'm not sure what, I just have some fuzzy ideas on what might make it better :) For me once I got comfortable with cabal I found that darcs + cabal makes a mean team. You can quickly pull together different libraries and get some serious hacking done. Much praise to both. I can't wait till cabal-install/hackage is mainstream. HTH, Jason ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: optimization help
Hello, When using addDate in foldM like below, you certainly don't want to search the cols for the string "Date" again and again everytime addDate is called. The index of the "Date" field is a number determined when parsing the header. That and only that number has to be plugged in here. Good catch. The main thing in the code that makes me feel very very ill is the fact that the code is quite "impure" (many many dos). The next line promptly bites back: > res <- foldM addDate M.empty $ take nRows rows Did you notice this appeal to addDate makes its callee getCol live in the IO-Monad? From the use of M.lookup in getColId, I think you intended to have getCol :: _ -> Maybe _, do you? M.lookup recently got the more general type M.lookup :: Monad m => _ -> m a, so it happily lives in IO. I was aware of this and counted on the lookup causing the program to stop if the column didn't exist. The following line does unnecessary work: myRead splits a row to get access to the date, but now you join it without having changed any field. It would be wiser to split for the date but to keep an intact copy of the line so that you can pass it here without join. This will reduce memory footprint. Another good observation which I missed. Your current solutions reads the input and "calculates" all output files before writing them to disk in a final step. This means that the contents of the output files has to be kept in memory. Thus you need least a constant * 100MB of memory. I don't know how ByteString interacts with garbage collection, but it may well be that by keeping the first line (you "cols") in memory, the entire input file contents is also kept which means an additional constant * 100 MB. It is likely that both can be shared if one resolves the code quirks mentioned above. I intentionally chose this design to minimize the amount of file access which seems to be quite slow (see below). After fixing the two slips you pointed out, my code works as expected, processing 100MB in about 1 minute using around 550MB of heap. Here is the good version (where B is Data.ByteString.Lazy.Char8 and M is Data.Map): myRead file = do v <- B.readFile file let (cols' : rows) = B.lines v cols = foldl' (\mp (k,v) -> M.insert k v mp) M.empty $ zip (B.split ',' cols') [0 ..] return (cols, rows) dates file nRows = do (cols, rows) <- myRead file dateIx <- M.lookup (B.pack "\"Date\"") cols let addDate mp row = M.insert date (row:old) mp where date = (B.split ',' row)!!dateIx old = M.findWithDefault [] date mp res = foldl addDate M.empty $ take nRows rows mapM_ writeDate $ M.toList res where fmt = B.unpack . B.map (\x -> if x == '-' then '_' else x) . B.takeWhile (/= ' ') writeDate (date,rows) = B.writeFile (dataDir++fmt date) (B.unlines rows) A better solution would be to begin output before the the whole input is read, thus making things more lazy. This can be done the following way: from the input, construct a lazy list of (date,line) pairs. Then, let foldM thread a map from dates to corresponding output file pointers through the list and, at the same time, use the file pointers to output the line in question via appendFile. This way, every line consumed is immediately dispatched to its corresponding output file and things should only require memory for the different dates, besides buffering. I tried this approach previously and it seems to be unacceptably slow. I thought the slowness was just due to the fact that file operations are slow, but I'll include my code here (cleaned up to take some of your previous comments into account) just in case there is something subtle I'm missing which is slowing down the code (B, M, and myRead are as above): dates' file nRows = do (cols, rows) <- myRead file dateIx <- M.lookup cols $ B.pack "\"Date\"" let writeDate row = B.appendFile (dataDir++fmt date) row where date = (B.split ',' row)!!dateIx fmt = B.unpack . B.map (\x -> if x == '-' then '_' else x) . B.takeWhile (/= ' ') oldFiles <- getDirectoryContents dataDir mapM_ (\f -> catch (removeFile $ dataDir++f) $ const $ return ()) oldFiles mapM_ writeDate $ take nRows rows This code takes over 20 minutes to process 100MB on my machine. In a setting without IO, the task corresponds to the "Optimization Problem" discussed at length in September on this list. The problem here is that writeFile currently cannot be interleaved lazily, this has to be simulated with appendFile. We can read files lazily but we cannot output them lazily. Can this be remedied? Can there be a version of writeFile which is, in a sense, dual to getContents? Wouldn't this require blocking IO? thanks for your help, Jeff ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Gtk2Hs and Glade Tutorial
Hello, There is a very nice tutorial on the Glade Interface Designer at http://www.writelinux.com/glade and I've been using it with 'An Introduction to Gtk2Hs, A Haskell GUI Library', by Kenneth Hoste at http://haskell.org/~shae/memory.pdf to learn Gtk2Hs and Glade. The 'Gtk2Hs Introduction', however, is not really aimed at the absolute beginner and so I've written down my experience as an 'appendix' to the glade tutorial, which really is for getting started. I've just asked the author, Eddy Ahmed, for permission to 'translate' his tutorial to Haskell Gtk2Hs, but meanwhile I'd like to submit my separate concept (attached) for comments. Thanks, Hans van Thiel Gtk2HsGlade.odt Description: application/vnd.oasis.opendocument.text ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: tail-recursing through an associative list
Malcolm Wallace wrote: > Seth Gordon <[EMAIL PROTECTED]> wrote: > >> almost-entirely-functional code ... that in its first draft, took >> about three seconds to process 2,000 rows, eight minutes to process >> 20,000 rows, and overflowed a 1-MB stack when processing 200,000 rows. >> Oops. > > Which just goes to show that your algorithm is non-linear in the size of > the input. I doubt that your posted snippet is the cause of the > problem, since it is certainly linear in the AList it is given. The linear time in the length of AList may well be the problem :) You seem to use AList as a key-value mapping (I know the word "associative" only as mathematical property, please correct me). The Key acts as key for the grouping you mentioned and the [MetaThingie] is the actual group of MetaThingie, is that so? That means that with each call to myFunction, the AList roughly grows by one element. For logarithmic access times, you should use a binary search tree like Data.Map or similar. The problem in your case could be that matchKeys is only approximate and your keys cannot be ordered in suitable fasion. Then you need a clever algorithm which somehow exploits extra structure of the keys (perhaps they are intervals, then you can use interval trees etc.). The C++ code is likely to do some magic along these lines. In such case, stunning functional power may come from Finger Trees: A Simple General-purpose Data Structure Ralf Hinze and Ross Paterson. in Journal of Functional Programming16:2 (2006), pages 197-217 http://www.soi.city.ac.uk/~ross/papers/FingerTree.pdf >> I profiled the code and >> observed that the most-called-upon function in the program (200 >> million entries for those 20,000 rows) > > By optimisation, you can only make this function a constant factor > faster. You need to work out how to call it less often in the first > place. Almost true. I think that recursive calls are counted as proper call, so that each top level call to myFunction will result a count of calls to myFunction linear in the length of AList. Thus "in first place" alias top level is not enough. >> type AList = [(Key, [MetaThingies])] >> >> myFunction :: AList -> Thingie -> AList >> myFunction [] x = [(key x, [meta x])] >> myFunction ((k, v):tail) x | matchKeys k (key x) = >>case tryJoining v x of >>Nothing -> (k, v) : (myFunction tail x) >>Just v' -> (k', v') : tail >>where v' = bestKey k (key x) should be (?) where k' = bestKey k (key x) >>| otherwise = (k, v) : (myFunction tail x) Regards, apfelmus ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] tail-recursing through an associative list
Seth Gordon <[EMAIL PROTECTED]> wrote: > almost-entirely-functional code ... that in its first draft, took > about three seconds to process 2,000 rows, eight minutes to process > 20,000 rows, and overflowed a 1-MB stack when processing 200,000 rows. > Oops. Which just goes to show that your algorithm is non-linear in the size of the input. I doubt that your posted snippet is the cause of the problem, since it is certainly linear in the AList it is given. > I profiled the code and > observed that the most-called-upon function in the program (200 > million entries for those 20,000 rows) By optimisation, you can only make this function a constant factor faster. You need to work out how to call it less often in the first place. > type AList = [(Key, [MetaThingies])] > > myFunction :: AList -> Thingie -> AList > myFunction [] x = [(key x, [meta x])] > myFunction ((k, v):tail) x | matchKeys k (key x) = >case tryJoining v x of >Nothing -> (k, v) : (myFunction tail x) >Just v' -> (k', v') : tail >where v' = bestKey k (key x) >| otherwise = (k, v) : (myFunction tail x) myFunction is clearly rather like a map (except that occasionally it stops before traversing the whole list). There is nothing wrong with its recursion pattern or otherwise. Regards, Malcolm ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Is Haskell a Keynesian language?
This is certainly proof that you can abuse economics in any context! ;) Or perhaps that economics can be used to abuse anything... - Johan Tibell On 10/12/06, Henning Thielemann <[EMAIL PROTECTED]> wrote: Here is another approach of questionable classification of languages. :-) A lazy functional program is demand driven, an imperative program is supply driven. That is, if I request some information by calling a function in GHCi or Hugs, the interpreter develops a plan a how to produce the information I need and then executes the necessary steps. In contrast to that, an imperative program executes what's next on the schedule, whether it is need or not. So is Haskell a Keynesian language and C++ a Say language? ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Is Haskell a Keynesian language?
Here is another approach of questionable classification of languages. :-) A lazy functional program is demand driven, an imperative program is supply driven. That is, if I request some information by calling a function in GHCi or Hugs, the interpreter develops a plan a how to produce the information I need and then executes the necessary steps. In contrast to that, an imperative program executes what's next on the schedule, whether it is need or not. So is Haskell a Keynesian language and C++ a Say language? ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] tail-recursing through an associative list
In my first posting, I mentioned that I was going to try to translate some of our code to Haskell and see how it worked. Well, I don't have a stunning demonstration of the power of pure functional programming, but I do have an interesting problem. I chose to port a program that we use in our build system that takes a table of geographic data and groups the rows in the table according to [REDACTED]. The description of what the existing program does takes up only a few paragraphs, but the source code is eight pages of dense C++ that has obviously been optimized up the wazoo (and beyond the point where a mortal like myself can understand what's going on). On one of our servers, it can process 200,000 rows in about three minutes. My Haskell translation is three and a half pages of gorgeous lucid almost-entirely-functional code ... that in its first draft, took about three seconds to process 2,000 rows, eight minutes to process 20,000 rows, and overflowed a 1-MB stack when processing 200,000 rows. Oops. After banging together a version that took input from stdin instead of the database (it was easier to do that then to learn enough about Cabal to get HSQL recompiled with profiling), I profiled the code and observed that the most-called-upon function in the program (200 million entries for those 20,000 rows) was structured like this: type AList = [(Key, [MetaThingies])] myFunction :: AList -> Thingie -> AList myFunction [] x = [(key x, [meta x])] myFunction ((k, v):tail) x | matchKeys k (key x) = case tryJoining v x of Nothing -> (k, v) : (myFunction tail x) Just v' -> (k', v') : tail where v' = bestKey k (key x) | otherwise = (k, v) : (myFunction tail x) I'm wondering if the slowness of the program can be attributed to that case statement in the middle--perhaps unwrapping the Maybe type returned by tryJoining is preventing the function from being properly tail-recursive, or something. (I tried making the list construction strict by replacing "(k, v) : (myFunction tail x)" et al. with "(:) (k, v) $! (myFunction tail x)", and it actually slowed the program down, so either I'm not understanding how to improve things with strictness or laziness isn't the problem here.) Any advice from the gurus? ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: optimization help
> I have been trying to do some CSV-style processing. My code works > fine for small input (up to 10MB), but performs poorly for moderate to > large input (it can't seem to finish 100MB of input with 700MB heap > space). I have gone through several optimization passes with profiler > help, and now I am hoping someone else can point out some other > approaches to improving the code's performance (both space and time). > > The code breaks a large file into smaller files all of whose entries > have the same date. First of all, for this problem and 100 MB input, you have to think carefully about what you do. I'll point out three quirks in your code and afterwards discuss how a better solution looks like. > module Main where > > import Debug.Trace > import Control.Monad > import Data.List > import qualified Data.ByteString.Lazy.Char8 as B > import qualified Data.Map as M > import System.Environment (getArgs) > > dataDir = "dataH/" > > myRead file = do > v <- B.readFile file > let (cols' : rows) = map (B.split ',') $ B.lines v > let cols = foldl' (\mp (k,v) -> M.insert k v mp) M.empty (zip cols' [0 ..]) > return (cols, rows) > > getColId cols col = M.lookup col cols > > getCol cols col row = do > i <- getColId cols col > return $! row!!i > > > dates file nRows = do > (cols, rows) <- myRead file > let addDate mp row | mp `seq` row `seq` False = undefined >| otherwise = do When using addDate in foldM like below, you certainly don't want to search the cols for the string "Date" again and again everytime addDate is called. The index of the "Date" field is a number determined when parsing the header. That and only that number has to be plugged in here. Thus the next line should read let date = row !! datefieldindex instead of > date <- getCol cols (B.pack "\"Date\"") row > let old = M.findWithDefault [] date mp > return $ M.insert date (row:old) mp The main thing in the code that makes me feel very very ill is the fact that the code is quite "impure" (many many dos). The next line promptly bites back: > res <- foldM addDate M.empty $ take nRows rows Did you notice this appeal to addDate makes its callee getCol live in the IO-Monad? From the use of M.lookup in getColId, I think you intended to have getCol :: _ -> Maybe _, do you? M.lookup recently got the more general type M.lookup :: Monad m => _ -> m a, so it happily lives in IO. I strongly suggest that you restructure your code and restrict IO to one place only: main = do .. input <- B.readFile file let outs = busywork input mapM_ [writeFile name contents | (name,contents) <- outs] where busywork does the work and is purely functional. > mapM_ writeDate $ M.toList res > where > fmt = B.unpack . B.map (\x -> if x == '-' then '_' else x) . > B.takeWhile (/= ' ') > writeDate (date,rows) = > B.writeFile (dataDir++fmt date) The following line does unnecessary work: myRead splits a row to get access to the date, but now you join it without having changed any field. It would be wiser to split for the date but to keep an intact copy of the line so that you can pass it here without join. This will reduce memory footprint. > (B.unlines $ map (B.join (B.pack ",")) rows) > > main = do > args <- getArgs > case args of > ["dates",file,nRows] -> dates file (read nRows) To summarize, the code is not very clean and several things slipped in, just as one would expect from an imperative style. The key is to separate concerns, which means here: IO will just do very dumb in and output, fetching the index of the "Date" from the header is handled separately, grouping the lines by date is to be separated from the to-be-output-contents of the lines. Now, we'll think about how to solve the task in reasonable time and space. Your current solutions reads the input and "calculates" all output files before writing them to disk in a final step. This means that the contents of the output files has to be kept in memory. Thus you need least a constant * 100MB of memory. I don't know how ByteString interacts with garbage collection, but it may well be that by keeping the first line (you "cols") in memory, the entire input file contents is also kept which means an additional constant * 100 MB. It is likely that both can be shared if one resolves the code quirks mentioned above. A better solution would be to begin output before the the whole input is read, thus making things more lazy. This can be done the following way: from the input, construct a lazy list of (date,line) pairs. Then, let foldM thread a map from dates to corresponding output file pointers through the list and, at the same time, use the file pointers to output the line in question via appendFile. This way, every line consumed is immediately dispatched to its corresponding output file and things should only require memory for the different dates, besides buffering.
Re: [Haskell-cafe] [off-topic / administrative] List Reply-to
Udo Stenzel <[EMAIL PROTECTED]> writes: >> However, I don't recall problems with multiple copies of emails. > I did get your mail twice, which I don't consider a huge problem. And for people who do, perhaps they can set up procmail to deal with this? E.g., http://www.greatcircle.com/lists/majordomo-users/mhonarc/majordomo-users.199808/msg00408.html -k -- If I haven't seen further, it is by standing in the footprints of giants ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] do ghci
On Wed, Oct 11, 2006 at 05:13:13PM -0700, Greg Fitzgerald wrote: > To: Haskell Cafe > From: Greg Fitzgerald <[EMAIL PROTECTED]> > Date: Wed, 11 Oct 2006 17:13:13 -0700 > Subject: [Haskell-cafe] do ghci > > Just curious, why does ghci run in the context of a 'do'? > > This tripped me up when I first started learning Haskell. It's fine once > you know what's going on, but why the restriction? Why can't I write the > code below without 'let' and ':module'? > > >two = 1 + 1 > >import Data.List > >cols = transpose [[1,2,3], [4,5,6]] If you interact with a command line, it is natural to have IO side effects and evaluate the command lines in the order of appearance. ghci is sort of a command line, right? It is straight-forward enough to make up something ad hoc without using the concept of monadic IO and memory modification, but why? matthias signature.asc Description: Digital signature ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe