Send Beginners mailing list submissions to beginners@haskell.org To subscribe or unsubscribe via the World Wide Web, visit http://www.haskell.org/mailman/listinfo/beginners or, via email, send a message with subject or body 'help' to beginners-requ...@haskell.org
You can reach the person managing the list at beginners-ow...@haskell.org When replying, please edit your Subject line so it is more specific than "Re: Contents of Beginners digest..." Today's Topics: 1. Re: I have created an ugly Haskell program.. (Heinrich Apfelmus) 2. Does System.Directory work on Windows XP? (Patrick Larrivee-Woods) 3. Re: Does System.Directory work on Windows XP? (Jason Dusek) 4. Re: Does System.Directory work on Windows XP? (Patrick Larrivee-Woods) 5. Re: Does System.Directory work on Windows XP? (Daniel Fischer) 6. Lazy file IO & Space leaks/waste (Aleksandar Dimitrov) 7. if ands (Nathan M. Holden) 8. Re: if ands (Joe Fredette) 9. Re: if ands (Keith Sheppard) ---------------------------------------------------------------------- Message: 1 Date: Wed, 04 Nov 2009 18:21:17 +0100 From: Heinrich Apfelmus <apfel...@quantentunnel.de> Subject: [Haskell-beginners] Re: I have created an ugly Haskell program.. To: beginners@haskell.org Message-ID: <hcsd6d$vv...@ger.gmane.org> Content-Type: text/plain; charset=ISO-8859-1 Brent Yorgey wrote: > Ask yourself: What Would Conal Do (WWCD)? Conal Elliott is always > trying to get people to think about the semantic essence of their > problems, so let's try it. > > What are we REALLY trying to do here? What are those lists of tuples, > REALLY? Well, it seems to me that the lists of tuples are really just > representing *functions* on some totally ordered domain. > [...] > > So, let's try converting these lists of pairs to actual functions: > > > asFunc :: (Ord a) => [(a,b)] -> (a -> Maybe b) > asFunc is a = fmap snd . listToMaybe . reverse . takeWhile ((<=a) . fst) $ > is > > [...] > > Now, you might object that this is much more inefficient than the > other solutions put forth. That is very true. [...] > > However, I still find it very helpful to think about the essence > of the problem like this: elegant yet inefficient code is a much > better starting place than the other way around! [...] > > You can also try to optimize, taking advantage of the fact that we > always call the functions built by asFunc with a sequence of strictly > increasing inputs. I am with Brent and Conal here. Now, to continue, ask yourself: What Would Conal Do Next (WWCDN)? What are we really trying to do here? What is this function, really, considering that we are only evaluating it at a strictly increasing sequence of inputs? Well, it seems to me that it is some special kind of function, best captured as an *abstract data type*. In particular, the function is something which I will call a "time series". In other words, the input is to be thought of as time. data Time t = Moment t | Infinity deriving (Eq,Ord,Show) The inclusion of infinity will turn out to be very convenient. Now, the time series is a function that has a value x1 in the distant past, until a time t1 where it begins to have the value x2 , again until a time t2 where it switches to x3 and so on, until a value xn that is kept until infinity. In Haskell, this looks like this function t | -Infinity <= t && t < t1 = x1 | t1 <= t && t < t2 = x2 | t2 <= t && t < t3 = x3 | ... | t1 <= t && t < Infinity = xn and pictorially, something like this: ____ xn _____ ____ x2 ____ | | |____ x3 ____ ... | _____ x1 ____| -Inf t1 t2 ... tn Inf Of course, we can implement this abstract data type with a list of pairs (tk,xk) newtype TimeSeries t a = TS { unTS :: [(a,Time t)] } deriving (Show) and our goal is to equip this data type with a few natural operations that can be used to implement Philip's zip-like function. The first two operations are progenitor :: TimeSeries t a -> a progenitor = fst . head . unTS which returns the value from the distant past and beginning :: TimeSeries t a -> Time t beginning = snd . head . unTS which returns the first point in time when the function changes its value. These correspond to the operation head on lists. The next operation is called `forgetTo` t and will throw away all values and changes before and including a given time t . forgetTo :: Ord t => TimeSeries t a -> Time t -> TimeSeries t a forgetTo (TS xs) Infinity = TS [last xs] forgetTo (TS xs) t = TS $ dropWhile ((<= t) . snd) xs This roughly corresponds to tail , but takes advantage of the time being continuous. Last but not least, we need a way to create a time series forever :: a -> TimeSeries t a forever x = TS [(x,Infinity)] and we need to add values to a time series, which can be done with an operation called prepend that adds a new beginning and replaces the progenitor . -- We assume that t < beginning xs prepend :: a -> Time t -> TimeSeries t a -> TimeSeries t a prepend x Infinity _ = TS [(x,Infinity)] prepend x t (TS xs) = TS $ (x,t) : xs These operations correspond to [] and (:) for lists. The key about these operations is that they have a description / intuition that is *independent* of the implementation of times series. At no point do we need to know how exactly TimeSeries is implemented to understand what these five operations do. Now, Philip's desired zip-like function is straightforward to implement: zipSeries :: Ord t => TimeSeries t a -> TimeSeries t b -> TimeSeries t (a,b) zipSeries xs ys = prepend (progenitor xs, progenitor ys) t $ zipSeries (xs `forgetTo` t) (ys `forgetTo` t) where t = min (beginning xs) (beginning ys) and you may want to convince yourself of its correctness by appealing to the intuition behind time series. Regards, apfelmus -- http://apfelmus.nfshost.com ------------------------------ Message: 2 Date: Wed, 04 Nov 2009 16:02:47 -0800 From: Patrick Larrivee-Woods <plarr...@sfu.ca> Subject: [Haskell-beginners] Does System.Directory work on Windows XP? To: beginners@haskell.org Message-ID: <4af21627.5030...@sfu.ca> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Hi, I"m having trouble getting the System.Directory module in ghci on a Windows XP machine. Whenever I call a function like getCurrentDirectory or getDirectoryContents I get no results back. I was wondering if anyone knows why this is happening, or if someone can point me to a module that works on Windows. Here's an example of what I get: ___ ___ _ / _ \ /\ /\/ __(_) / /_\// /_/ / / | | GHC Interactive, version 6.4.2, for Haskell 98. / /_\\/ __ / /___| | http://www.haskell.org/ghc/ \____/\/ /_/\____/|_| Type :? for help. Loading package base-1.0 ... linking ... done. Prelude> :module System.Directory Prelude System.Directory> getCurrentDirectory Prelude System.Directory> getDirectoryContents "." Prelude System.Directory> getDirectoryContents "c:\\perl" Prelude System.Directory> ------------------------------ Message: 3 Date: Wed, 4 Nov 2009 16:02:49 -0800 From: Jason Dusek <jason.du...@gmail.com> Subject: Re: [Haskell-beginners] Does System.Directory work on Windows XP? To: Patrick Larrivee-Woods <plarr...@sfu.ca> Cc: beginners@haskell.org Message-ID: <42784f260911041602h32aa484cg391e8ad3f583a...@mail.gmail.com> Content-Type: text/plain; charset=UTF-8 Could you try a recent version of GHC and let us know if you still have trouble? -- Jason Dusek ------------------------------ Message: 4 Date: Wed, 04 Nov 2009 16:14:59 -0800 From: Patrick Larrivee-Woods <plarr...@sfu.ca> Subject: Re: [Haskell-beginners] Does System.Directory work on Windows XP? To: Jason Dusek <jason.du...@gmail.com> Cc: beginners@haskell.org Message-ID: <4af21903.3070...@sfu.ca> Content-Type: text/plain; charset=UTF-8; format=flowed With 6.10.4 it works fine. Thanks for your help, I hadn't even realized I was using an older version. Cheers, Patrick Jason Dusek wrote: > Could you try a recent version of GHC and let us know if you > still have trouble? > > -- > Jason Dusek > ------------------------------ Message: 5 Date: Thu, 5 Nov 2009 01:59:12 +0100 From: Daniel Fischer <daniel.is.fisc...@web.de> Subject: Re: [Haskell-beginners] Does System.Directory work on Windows XP? To: beginners@haskell.org Message-ID: <200911050159.12432.daniel.is.fisc...@web.de> Content-Type: text/plain; charset="utf-8" Am Donnerstag 05 November 2009 01:14:59 schrieb Patrick Larrivee-Woods: > With 6.10.4 it works fine. Thanks for your help, I hadn't even realized > I was using an older version. > > Cheers, > Patrick Just to explain: getDirectoryContents "." is an IO action producing a String. In newer GHC releases (I don't remember whther that change came with 6.6 or 6.8), when you invoke an IO action from the ghci-prompt, if the value returned from the action has not type (), by default it is bound to the variable 'it' and printed out (since it's not unconditionally a good thing to do that - consider readFile "HUGEFile.txt" - it can be disabled via ghci> :set -fno-print-bind-result - re-enable with -fprint-bind-result). Before, the action was simply run and its result not printed out, to use the result of an IO action, you had to use ghci> res <- getDirectoryContents "." ghci> print res or ghci> getDirectoryContents "." >>= mapM_ putStrLn or whatever. If you still have 6.4.2 installed, you can try it out (but do your real work with the newer, it produces better code). > > Jason Dusek wrote: > > Could you try a recent version of GHC and let us know if you > > still have trouble? > > > > -- > > Jason Dusek ------------------------------ Message: 6 Date: Thu, 5 Nov 2009 12:01:11 +0100 From: Aleksandar Dimitrov <aleks.dimit...@googlemail.com> Subject: [Haskell-beginners] Lazy file IO & Space leaks/waste To: beginners@haskell.org Message-ID: <20091105110110.ga7...@bylha.uni-tuebingen.de> Content-Type: text/plain; charset="us-ascii" Hello list, I'm currently writing a small linguistic corpus analyzer. My input file is only 25MB, but profiling shows that the overall amount of allocation over the program's runtime is several GB. That's a little too much - adding to that is the fact that the program is abysmally slow, so I'm suspecting a space leak somewhere. I'm using the ByteString.Lazy.Char8 class in order to work efficiently with lazy IO and I must admit that I'm very inexperienced with predicting runtime and space behaviour of lazy IO :-( It worked well in the past, but I'm stuck now. The program can be found here: http://hpaste.org/fastcgi/hpaste.fcgi/view?id=11863#a11863 The important bits are as follows: > mf :: [C.ByteString] -> StdWord > mf [] = Word [] C.empty > mf s = Word (tail s) (head s) > > f' = mf . reverse . C.words > main :: IO () > main = do > corpus_name <- liftM head getArgs > corpus <- liftM (Corpus . (map f') . C.lines) $ C.readFile corpus_name > print $ length (content corpus) > let interesting = filterForInterestingTags interestingTags corpus > print $ show (freqMap interesting) freqMap also uses foldr' to fold the corpus it is given into a Data.Map, but according to the profiling output, that's not where the bad stuff happens. Here's the interesting parts of the profiling output: > analyse +RTS -p -RTS corpus/bnc-samp.tt > > total time = 30.46 secs (1523 ticks @ 20 ms) > total alloc = 14,871,619,904 bytes (excludes profiling overheads) > > COST CENTRE MODULE no. entries %time %alloc %time %alloc > > MAIN MAIN 1 0 0.0 0.0 100.0 100.0 > main Main 221 0 0.0 0.0 0.0 0.0 > CAF Main 206 14 0.1 0.1 100.0 100.0 > showsPrec_aSF Main 223 1 0.0 0.0 0.0 0.0 > interestingTags Main 218 1 0.0 0.0 0.0 0.0 > f' Main 216 1 0.0 0.0 0.0 0.0 > mf Main 217 1 0.0 0.0 0.0 0.0 > main Main 212 1 2.4 4.4 99.9 99.9 > f' Main 219 0 89.0 91.5 90.2 92.7 > mf Main 220 2427450 1.2 1.2 1.2 1.2 > filterForInterestingTags Main 215 1 5.4 0.1 5.4 0.1 > freqMap Main 214 1 0.6 0.5 2.0 2.7 > compare_aRL Main 222 1141996 1.4 2.2 1.4 2.2 Obviously the main cost centre seems to be f' (which I've factored out in order to view it separately as a cost centre.) It's not the call to reverse that makes it slow (removing it doesn't affect the run time.) Interestingly, it prints out the first statement (length) very quickly, then takes a lot of time. I've removed the call to freqMap, and it seems that GHC is smart enough to drop the call to f' completely, because only the length ever gets evaluated. But the freqMap also needs the processing from f', and that's where the bad stuff starts happening. Should I look into DeepSeq? I've tried adding strictness to the functions by hand, and also to the data structures, but that didn't seem to help so far. So I'm looking for solutions to make this faster. I'm guessing that a smart `seq` somewhere might help, but I don't know where :-\ As a side note: how can I find out how the run-time data structures look like in memory? I.e. I want to know *what* exactly stays around as thunks in memory so that I can focus better on where to add strictness or redirect the program flow. Ideally, only a very smart part of the file should ever be in memory, with processing happening incrementally! Thanks for any pointers! Aleks -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 198 bytes Desc: not available Url : http://www.haskell.org/pipermail/beginners/attachments/20091105/5917de71/attachment-0001.bin ------------------------------ Message: 7 Date: Thu, 5 Nov 2009 21:05:22 -0500 From: "Nathan M. Holden" <nathanmhol...@gmail.com> Subject: [Haskell-beginners] if ands To: beginners@haskell.org Message-ID: <200911052105.22124.nathanmhol...@gmail.com> Content-Type: Text/Plain; charset="us-ascii" If you have an if statement like if (a&&b) then fun else fun' and a is false, does GHC actually bother to check b? ------------------------------ Message: 8 Date: Thu, 5 Nov 2009 21:09:13 -0500 From: Joe Fredette <jfred...@gmail.com> Subject: Re: [Haskell-beginners] if ands To: Nathan M.Holden <nathanmhol...@gmail.com> Cc: beginners@haskell.org Message-ID: <0c6944e9-12d2-4e97-8358-46f3caab1...@gmail.com> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes No, consider the definition of (&&) -- I hope this is the def from the prelude. If it's not, then it's probably isomorphic... (&&) :: Bool -> Bool -> Bool True && x = x False && _ = False Since (&&) ignores it's second argument if the first is false, then it will "Short circuit" (like most `&` operators in other languages) due to lazy evaluation. /Joe On Nov 5, 2009, at 9:05 PM, Nathan M. Holden wrote: > If you have an if statement like > > if (a&&b) then fun else fun' > > and a is false, does GHC actually bother to check b? > _______________________________________________ > Beginners mailing list > Beginners@haskell.org > http://www.haskell.org/mailman/listinfo/beginners ------------------------------ Message: 9 Date: Thu, 5 Nov 2009 21:12:49 -0500 From: Keith Sheppard <keiths...@gmail.com> Subject: Re: [Haskell-beginners] if ands To: "Nathan M. Holden" <nathanmhol...@gmail.com>, beginners@haskell.org Message-ID: <92e42b740911051812j2042cf78pb6f587b604905...@mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1 Also, an nice way to check how evaluation works in ghci is to do something like: > if False && error "error here" then "it's true" else "it's false" This expression will evaluate as "it's false" without any "error here" error message appearing On Thu, Nov 5, 2009 at 9:05 PM, Nathan M. Holden <nathanmhol...@gmail.com> wrote: > If you have an if statement like > > if (a&&b) then fun else fun' > > and a is false, does GHC actually bother to check b? > _______________________________________________ > Beginners mailing list > Beginners@haskell.org > http://www.haskell.org/mailman/listinfo/beginners > -- keithsheppard.name ------------------------------ _______________________________________________ Beginners mailing list Beginners@haskell.org http://www.haskell.org/mailman/listinfo/beginners End of Beginners Digest, Vol 17, Issue 5 ****************************************