[Haskell-cafe] = definition for list monad in ghc
Usually in monad tutorials, the = operator for the list monad is defined as: m = k = concat (map k m) -- or concatMap k m but in the GHC sources it's defined as: m = k = foldr ((++) . k) [] m As far as I can tell, this definition is equivalent to the previous one (correct me if I'm wrong), so I was wondering why this definition was chosen instead of the other one. Does anybody know? Thanks in advance, Mike ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] = definition for list monad in ghc
On 16 May 2011 19:07, Michael Vanier mvanie...@gmail.com wrote: Usually in monad tutorials, the = operator for the list monad is defined as: m = k = concat (map k m) -- or concatMap k m but in the GHC sources it's defined as: m = k = foldr ((++) . k) [] m As far as I can tell, this definition is equivalent to the previous one (correct me if I'm wrong), so I was wondering why this definition was chosen instead of the other one. Does anybody know? My guess is to aid the foldr fusion RULEs... -- Ivan Lazar Miljenovic ivan.miljeno...@gmail.com IvanMiljenovic.wordpress.com ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] = definition for list monad in ghc
On Monday 16 May 2011 11:07:15, Michael Vanier wrote: Usually in monad tutorials, the = operator for the list monad is defined as: m = k = concat (map k m) -- or concatMap k m but in the GHC sources it's defined as: m = k = foldr ((++) . k) [] m As far as I can tell, this definition is equivalent to the previous one It is indeed, otherwise at least one of them would be wrong. (correct me if I'm wrong), so I was wondering why this definition was chosen instead of the other one. Does anybody know? I don't *know*, but I suspect it's for efficiency, writing concat (map k m) might not be unfolded enough to make foldr/build fusion fire in cases where it applies. I'm just guessing, though. Thanks in advance, Mike ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] = definition for list monad in ghc
On 16/05/2011 10:07 AM, Michael Vanier wrote: Usually in monad tutorials, the = operator for the list monad is defined as: m = k = concat (map k m) -- or concatMap k m but in the GHC sources it's defined as: m = k = foldr ((++) . k) [] m As far as I can tell, this definition is equivalent to the previous one (correct me if I'm wrong), so I was wondering why this definition was chosen instead of the other one. Does anybody know? Any time you see a more convoluted definition which ought to be equivilent to a simpler one, the answer is usually because this way makes some important compiler optimisation fire. It's even possible that the optimisation in question would fire anyway now, but way back when the code was written, the compiler wasn't as smart. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] = definition for list monad in ghc
Looking at the Core for an utterly trivial example (test x = concatMap k x where k i = [i..i*2]), the foldr definition seems to cause a little extra optimization rules to fire, but the result seems pretty big. The definition using concatMap results in core like this: main_go2 = \ (ds_aqV :: [Int]) - case ds_aqV of _ { [] - [] @ Int; : y_ar0 ys_ar1 - case y_ar0 of _ { I# x_arj - let { y1_ase [Dmd=Just L] :: Int# y1_ase = *# x_arj 2 } in let { n_sRv :: [Int] n_sRv = main_go2 ys_ar1 } in case # x_arj y1_ase of _ { False - letrec { go_sRx [Occ=LoopBreaker] :: Int# - [Int] go_sRx = \ (x1_asi :: Int#) - : @ Int (I# x1_asi) (case ==# x1_asi y1_ase of _ { False - go_sRx (+# x1_asi 1); True - n_sRv }); } in go_sRx x_arj; True - n_sRv } } } But with the foldr definition, we get: Main.main_go [Occ=LoopBreaker] :: GHC.Prim.Int# - [GHC.Types.Int] [GblId, Arity=1, Caf=NoCafRefs, Str=DmdType L] Main.main_go = \ (x_asu :: GHC.Prim.Int#) - GHC.Base.++ @ GHC.Types.Int (GHC.Enum.eftInt x_asu (GHC.Prim.*# x_asu 2)) (case x_asu of wild_B1 { __DEFAULT - Main.main_go (GHC.Prim.+# wild_B1 1); 10 - GHC.Types.[] @ GHC.Types.Int }) end Rec } There seems to be a binding for my 'test' example, but it seems to be abandoned in the final core for some reason (there are no references too it, but it's not eliminated as an unused binding?) Main simply calls the simplified/inlined version above. As you can see, with the foldr definition, GHC is able to fuse the iteration of the input list with the generation of the result - note the 'GHC.Base.++' call with the first argument being a list from [x..x*2], and the second list to append being a recursive call. So it simplifies the code quite a bit - it doesn't really end up traversing a list, but instead generating a list only in this case, as it stores current iteration in a Int# and has the upper limit inlined into the case statement. I don't know why GHC doesn't have this rule by default, though. We can at least rig it with a RULES pragma, however: $ cat concatmap.hs module Main where {-# RULES concatMap/foldr forall x k. concatMap k x = foldr ((++) . k) [] x #-} test :: [Int] - [Int] --test x = foldr ((++) . k) [] x test x = concatMap k x where k i = [i..i*2] main :: IO () main = do print $ test [1..10] $ ghc -fforce-recomp -O2 -ddump-simpl concatmap.hs 1 ↵ [1 of 1] Compiling Main ( concatmap.hs, concatmap.o ) Tidy Core Rec { Main.main_go [Occ=LoopBreaker] :: GHC.Prim.Int# - [GHC.Types.Int] [GblId, Arity=1, Caf=NoCafRefs, Str=DmdType L] Main.main_go = \ (x_ato :: GHC.Prim.Int#) - GHC.Base.++ @ GHC.Types.Int (GHC.Enum.eftInt x_ato (GHC.Prim.*# x_ato 2)) (case x_ato of wild_B1 { __DEFAULT - Main.main_go (GHC.Prim.+# wild_B1 1); 10 - GHC.Types.[] @ GHC.Types.Int }) end Rec } ... ... ... -- Local rules for imported ids concatMap/foldr [ALWAYS] forall {@ b_aq7 @ a_aq8 x_abH :: [a_aq8] k_abI :: a_aq8 - [b_aq7]} GHC.List.concatMap @ a_aq8 @ b_aq7 k_abI x_abH = GHC.Base.foldr @ a_aq8 @ [b_aq7] (GHC.Base.. @ [b_aq7] @ ([b_aq7] - [b_aq7]) @ a_aq8 (GHC.Base.++ @ b_aq7) k_abI) (GHC.Types.[] @ b_aq7) x_abH Linking concatmap ... $ Maybe it should be added to the base libraries? On Mon, May 16, 2011 at 1:03 PM, Andrew Coppin andrewcop...@btinternet.com wrote: On 16/05/2011 10:07 AM, Michael Vanier wrote: Usually in monad tutorials, the = operator for the list monad is defined as: m = k = concat (map k m) -- or concatMap k m but in the GHC sources it's defined as: m = k = foldr ((++) . k) [] m As far as I can tell, this definition is equivalent to the previous one (correct me if I'm wrong), so I was wondering why this definition was chosen instead of the other one. Does anybody know? Any time you see a more convoluted definition which ought to be equivilent to a simpler one, the answer is usually because this way makes some important compiler optimisation fire. It's even possible that the optimisation in question would fire anyway now, but way back when the code was written, the compiler wasn't as smart. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe -- Regards, Austin ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org
Re: [Haskell-cafe] = definition for list monad in ghc
On Mon, May 16, 2011 at 3:49 PM, austin seipp a...@hacks.yi.org wrote: I don't know why GHC doesn't have this rule by default, though. We can at least rig it with a RULES pragma, however: $ cat concatmap.hs module Main where {-# RULES concatMap/foldr forall x k. concatMap k x = foldr ((++) . k) [] x #-} Well, that's the definition of concatMap :) [1], at least since as long as Hackage can go [2]. So the problem seems something else. Cheers, [1] http://hackage.haskell.org/packages/archive/base/4.3.1.0/doc/html/src/GHC-List.html#concatMap [2] http://hackage.haskell.org/packages/archive/base/3.0.3.1/doc/html/src/GHC-List.html#concatMap -- Felipe. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] = definition for list monad in ghc
On Monday 16 May 2011 20:49:35, austin seipp wrote: Looking at the Core for an utterly trivial example (test x = concatMap k x where k i = [i..i*2]), the foldr definition seems to cause a little extra optimization rules to fire, but the result seems pretty big. The definition using concatMap results in core like this: Hmm, something seems to be amiss, writing test :: [Int] - [Int] test x = concat (map k x) where k :: Int - [Int] k i = [i .. 2*i] the core I get is Rec { ConcatMap.test_go [Occ=LoopBreaker] :: [GHC.Types.Int] - [GHC.Types.Int] [GblId, Arity=1, Caf=NoCafRefs, Str=DmdType S] ConcatMap.test_go = \ (ds_aoS :: [GHC.Types.Int]) - case ds_aoS of _ { [] - GHC.Types.[] @ GHC.Types.Int; : y_aoX ys_aoY - case y_aoX of _ { GHC.Types.I# x_aom - GHC.Base.++ @ GHC.Types.Int (GHC.Enum.eftInt x_aom (GHC.Prim.*# 2 x_aom)) (ConcatMap.test_go ys_aoY) } } end Rec } which is identical to the core I get for foldr ((++) . k) []. Writing concatMap, I get the larger core (I'm not sure which one's better, the concatMap core uses only (:) to build the result, that might make up for the larger code). But, as Felipe noted, concatMap is defined as -- | Map a function over a list and concatenate the results. concatMap :: (a - [b]) - [a] - [b] concatMap f = foldr ((++) . f) [] in GHC.List. There are no RULES or other pragmas involving concatMap either there or in Data.List. In the absence of such pragmas, I would expect concatMap to be inlined and thus to yield exactly the same core as foldr ((++) . f) [] ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] = definition for list monad in ghc
You're both right indeed - I didn't look for the definition of concatMap in GHC.List. I thought it could be some behavior with the new inliner, so I defined concatMap in terms of foldr, put it in a seperate module, and then included it and used it in my test: Concatmap2.hs: module Concatmap2 (concatMap2) where concatMap2 :: (a - [b]) - [a] - [b] concatMap2 f x = foldr ((++) . f) [] x And concatmap.hs: module Main where import Concatmap2 test :: [Int] - [Int] test x = concatMap2 k x where k i = [i..i*2] main :: IO () main = do print $ test [1..10] Initially I thought it might be something to do with the new inliner heuristics (something about only inlining if call sites are 'fully saturated' with the amount of arguments they explicitly take,) but defining concatMap as a partial function or in terms of 'x' didn't make a difference - both resulted in generating the longer version of core. Attaching an INLINEABLE pragma to the definition of concatMap2 however, causes its definition in the interface file (Concatmap2.hi) to change, and it results in the core turning into the small version. Compiling with the pragma causes the persisted version of concatMap2 in the iface file to change from: 8d333e8d08e5926fd304bc7152eb286d concatMap2 :: forall a b. (a - [b]) - [a] - [b] {- Arity: 2, HasNoCafRefs, Strictness: LS, Unfolding: (\ @ a @ b f :: a - [b] x :: [a] - letrec { go :: [a] - [b] {- Arity: 1, Strictness: S -} = \ ds :: [a] - case @ [b] ds of wild { [] - GHC.Types.[] @ b : y ys - GHC.Base.++ @ b (f y) (go ys) } } in go x) -} To: 075ec6b9bcabc12777955494312f5e61 concatMap2 :: forall a b. (a - [b]) - [a] - [b] {- Arity: 2, HasNoCafRefs, Strictness: LS, Inline: INLINABLE[ALWAYS], Unfolding: stable (\ @ a @ b f :: a - [b] x :: [a] - GHC.Base.foldr @ a @ [b] (\ x1 :: a - GHC.Base.++ @ b (f x1)) (GHC.Types.[] @ b) x) -} Which I assume exposes the needed code (namely the foldr) for additional RULES to fire later, resulting in the small code. So perhaps we should mark concatMap INLINEABLE, instead? On Mon, May 16, 2011 at 2:46 PM, Daniel Fischer daniel.is.fisc...@googlemail.com wrote: On Monday 16 May 2011 20:49:35, austin seipp wrote: Looking at the Core for an utterly trivial example (test x = concatMap k x where k i = [i..i*2]), the foldr definition seems to cause a little extra optimization rules to fire, but the result seems pretty big. The definition using concatMap results in core like this: Hmm, something seems to be amiss, writing test :: [Int] - [Int] test x = concat (map k x) where k :: Int - [Int] k i = [i .. 2*i] the core I get is Rec { ConcatMap.test_go [Occ=LoopBreaker] :: [GHC.Types.Int] - [GHC.Types.Int] [GblId, Arity=1, Caf=NoCafRefs, Str=DmdType S] ConcatMap.test_go = \ (ds_aoS :: [GHC.Types.Int]) - case ds_aoS of _ { [] - GHC.Types.[] @ GHC.Types.Int; : y_aoX ys_aoY - case y_aoX of _ { GHC.Types.I# x_aom - GHC.Base.++ @ GHC.Types.Int (GHC.Enum.eftInt x_aom (GHC.Prim.*# 2 x_aom)) (ConcatMap.test_go ys_aoY) } } end Rec } which is identical to the core I get for foldr ((++) . k) []. Writing concatMap, I get the larger core (I'm not sure which one's better, the concatMap core uses only (:) to build the result, that might make up for the larger code). But, as Felipe noted, concatMap is defined as -- | Map a function over a list and concatenate the results. concatMap :: (a - [b]) - [a] - [b] concatMap f = foldr ((++) . f) [] in GHC.List. There are no RULES or other pragmas involving concatMap either there or in Data.List. In the absence of such pragmas, I would expect concatMap to be inlined and thus to yield exactly the same core as foldr ((++) . f) [] -- Regards, Austin ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] = definition for list monad in ghc
On Monday 16 May 2011 20:49:35, austin seipp wrote: As you can see, with the foldr definition, GHC is able to fuse the iteration of the input list with the generation of the result - note the 'GHC.Base.++' call with the first argument being a list from [x..x*2], and the second list to append being a recursive call. So it simplifies the code quite a bit - it doesn't really end up traversing a list, but instead generating a list only in this case, as it stores current iteration in a Int# and has the upper limit inlined into the case statement. I don't know why GHC doesn't have this rule by default, though. Probably because it's not good. The core it generates for concatMap is better. With the driver module Main (main) where import ConcatMap import System.Environment (getArgs) main :: IO () main = do args - getArgs let n = case args of (a:_) - read a _ - 12 print $ sum $ test [1 .. n] and different definitions of test, I get: ./useFoldr 1 +RTS -s 1933803664 3,415,002,336 bytes allocated in the heap 232,459,348 bytes copied during GC MUT time1.55s ( 1.55s elapsed) GCtime0.80s ( 0.80s elapsed) Total time2.36s ( 2.35s elapsed) (the same figures for concat (map k x) and for (x = k)) and ./useConcatMap 1 +RTS -s 1933803664 2,203,739,388 bytes allocated in the heap 256,508 bytes copied during GC MUT time0.88s ( 0.88s elapsed) GCtime0.03s ( 0.03s elapsed) Total time0.91s ( 0.91s elapsed) I'm still totally at a loss, why GHC generates different code for concatMap and foldr ((++) . k) [], though. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] = definition for list monad in ghc
On Monday 16 May 2011 22:26:18, I wrote: On Monday 16 May 2011 20:49:35, austin seipp wrote: As you can see, with the foldr definition, GHC is able to fuse the iteration of the input list with the generation of the result - note the 'GHC.Base.++' call with the first argument being a list from [x..x*2], and the second list to append being a recursive call. So it simplifies the code quite a bit - it doesn't really end up traversing a list, but instead generating a list only in this case, as it stores current iteration in a Int# and has the upper limit inlined into the case statement. I don't know why GHC doesn't have this rule by default, though. Probably because it's not good. The core it generates for concatMap is better. ... Aw, but that's some black magic which only works under special circumstances. You need nice types and a nice k for that to work out so neatly. Give it less to work on (a less simple k, for example) and you get the same for concatMap k, concat . map k and for foldr ((++) . k) []. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe