[Haskell-cafe] = definition for list monad in ghc

2011-05-16 Thread Michael Vanier
Usually in monad tutorials, the = operator for the list monad is 
defined as:


m = k = concat (map k m)  -- or concatMap k m

but in the GHC sources it's defined as:

m  =  k  =  foldr  ((++)  .  k)  []  m

As far as I can tell, this definition is equivalent to the previous one 
(correct me if I'm wrong), so I was wondering why this definition was chosen 
instead of the other one.  Does anybody know?

Thanks in advance,

Mike






___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] = definition for list monad in ghc

2011-05-16 Thread Ivan Lazar Miljenovic
On 16 May 2011 19:07, Michael Vanier mvanie...@gmail.com wrote:
 Usually in monad tutorials, the = operator for the list monad is defined
 as:

 m = k = concat (map k m)  -- or concatMap k m

 but in the GHC sources it's defined as:

 m  =  k  =  foldr  ((++)  .  k)  []  m

 As far as I can tell, this definition is equivalent to the previous one
 (correct me if I'm wrong), so I was wondering why this definition was chosen
 instead of the other one.  Does anybody know?

My guess is to aid the foldr fusion RULEs...

-- 
Ivan Lazar Miljenovic
ivan.miljeno...@gmail.com
IvanMiljenovic.wordpress.com

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] = definition for list monad in ghc

2011-05-16 Thread Daniel Fischer
On Monday 16 May 2011 11:07:15, Michael Vanier wrote:
 Usually in monad tutorials, the = operator for the list monad is
 defined as:
 
 m = k = concat (map k m)  -- or concatMap k m
 
 but in the GHC sources it's defined as:
 
 m  =  k  =  foldr  ((++)  .  k)  []  m
 
 As far as I can tell, this definition is equivalent to the previous one

It is indeed, otherwise at least one of them would be wrong.

 (correct me if I'm wrong), so I was wondering why this definition was
 chosen instead of the other one.  Does anybody know?

I don't *know*, but I suspect it's for efficiency, writing

concat (map k m)

might not be unfolded enough to make foldr/build fusion fire in cases where 
it applies.
I'm just guessing, though.

 
 Thanks in advance,
 
 Mike

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] = definition for list monad in ghc

2011-05-16 Thread Andrew Coppin

On 16/05/2011 10:07 AM, Michael Vanier wrote:

Usually in monad tutorials, the = operator for the list monad is
defined as:

m = k = concat (map k m) -- or concatMap k m

but in the GHC sources it's defined as:

m = k = foldr ((++) . k) [] m

As far as I can tell, this definition is equivalent to the previous one
(correct me if I'm wrong), so I was wondering why this definition was
chosen instead of the other one. Does anybody know?


Any time you see a more convoluted definition which ought to be 
equivilent to a simpler one, the answer is usually because this way 
makes some important compiler optimisation fire. It's even possible 
that the optimisation in question would fire anyway now, but way back 
when the code was written, the compiler wasn't as smart.


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] = definition for list monad in ghc

2011-05-16 Thread austin seipp
Looking at the Core for an utterly trivial example (test x = concatMap
k x where k i = [i..i*2]), the foldr definition seems to cause a
little extra optimization rules to fire, but the result seems pretty
big. The definition using concatMap results in core like this:

main_go2 =
  \ (ds_aqV :: [Int]) -
case ds_aqV of _ {
  [] - [] @ Int;
  : y_ar0 ys_ar1 -
case y_ar0 of _ { I# x_arj -
let {
  y1_ase [Dmd=Just L] :: Int#

  y1_ase = *# x_arj 2 } in
let {
  n_sRv :: [Int]

  n_sRv = main_go2 ys_ar1 } in
case # x_arj y1_ase of _ {
  False -
letrec {
  go_sRx [Occ=LoopBreaker] :: Int# - [Int]

  go_sRx =
\ (x1_asi :: Int#) -
  :
@ Int
(I# x1_asi)
(case ==# x1_asi y1_ase of _ {
   False - go_sRx (+# x1_asi 1);
   True - n_sRv
 }); } in
go_sRx x_arj;
  True - n_sRv
}
}
}

But with the foldr definition, we get:

Main.main_go [Occ=LoopBreaker] :: GHC.Prim.Int# - [GHC.Types.Int]
[GblId, Arity=1, Caf=NoCafRefs, Str=DmdType L]
Main.main_go =
  \ (x_asu :: GHC.Prim.Int#) -
GHC.Base.++
  @ GHC.Types.Int
  (GHC.Enum.eftInt x_asu (GHC.Prim.*# x_asu 2))
  (case x_asu of wild_B1 {
 __DEFAULT - Main.main_go (GHC.Prim.+# wild_B1 1);
 10 - GHC.Types.[] @ GHC.Types.Int
   })
end Rec }

There seems to be a binding for my 'test' example, but it seems to be
abandoned in the final core for some reason (there are no references
too it, but it's not eliminated as an unused binding?) Main simply
calls the simplified/inlined version above.

As you can see, with the foldr definition, GHC is able to fuse the
iteration of the input list with the generation of the result - note
the 'GHC.Base.++' call with the first argument being a list from
[x..x*2], and the second list to append being a recursive call. So it
simplifies the code quite a bit - it doesn't really end up traversing
a list, but instead generating a list only in this case, as it stores
current iteration in a Int# and has the upper limit inlined into the
case statement.

I don't know why GHC doesn't have this rule by default, though. We can
at least rig it with a RULES pragma, however:

$ cat concatmap.hs
module Main where

{-# RULES
concatMap/foldr forall x k. concatMap k x = foldr ((++) . k) [] x
  #-}

test :: [Int] - [Int]
--test x = foldr ((++) . k) [] x
test x = concatMap k x
  where k i = [i..i*2]

main :: IO ()
main = do
  print $ test [1..10]

$ ghc -fforce-recomp -O2 -ddump-simpl concatmap.hs

   1 ↵
[1 of 1] Compiling Main ( concatmap.hs, concatmap.o )

 Tidy Core 
Rec {
Main.main_go [Occ=LoopBreaker] :: GHC.Prim.Int# - [GHC.Types.Int]
[GblId, Arity=1, Caf=NoCafRefs, Str=DmdType L]
Main.main_go =
  \ (x_ato :: GHC.Prim.Int#) -
GHC.Base.++
  @ GHC.Types.Int
  (GHC.Enum.eftInt x_ato (GHC.Prim.*# x_ato 2))
  (case x_ato of wild_B1 {
 __DEFAULT - Main.main_go (GHC.Prim.+# wild_B1 1);
 10 - GHC.Types.[] @ GHC.Types.Int
   })
end Rec }
...
...
...
-- Local rules for imported ids 
concatMap/foldr [ALWAYS]
forall {@ b_aq7 @ a_aq8 x_abH :: [a_aq8] k_abI :: a_aq8 - [b_aq7]}
  GHC.List.concatMap @ a_aq8 @ b_aq7 k_abI x_abH
  = GHC.Base.foldr
  @ a_aq8
  @ [b_aq7]
  (GHC.Base..
 @ [b_aq7]
 @ ([b_aq7] - [b_aq7])
 @ a_aq8
 (GHC.Base.++ @ b_aq7)
 k_abI)
  (GHC.Types.[] @ b_aq7)
  x_abH


Linking concatmap ...
$

Maybe it should be added to the base libraries?

On Mon, May 16, 2011 at 1:03 PM, Andrew Coppin
andrewcop...@btinternet.com wrote:
 On 16/05/2011 10:07 AM, Michael Vanier wrote:

 Usually in monad tutorials, the = operator for the list monad is
 defined as:

 m = k = concat (map k m) -- or concatMap k m

 but in the GHC sources it's defined as:

 m = k = foldr ((++) . k) [] m

 As far as I can tell, this definition is equivalent to the previous one
 (correct me if I'm wrong), so I was wondering why this definition was
 chosen instead of the other one. Does anybody know?

 Any time you see a more convoluted definition which ought to be equivilent
 to a simpler one, the answer is usually because this way makes some
 important compiler optimisation fire. It's even possible that the
 optimisation in question would fire anyway now, but way back when the code
 was written, the compiler wasn't as smart.

 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe




-- 
Regards,
Austin

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org

Re: [Haskell-cafe] = definition for list monad in ghc

2011-05-16 Thread Felipe Almeida Lessa
On Mon, May 16, 2011 at 3:49 PM, austin seipp a...@hacks.yi.org wrote:
 I don't know why GHC doesn't have this rule by default, though. We can
 at least rig it with a RULES pragma, however:

 $ cat concatmap.hs
 module Main where

 {-# RULES
 concatMap/foldr forall x k. concatMap k x = foldr ((++) . k) [] x
  #-}

Well, that's the definition of concatMap :) [1], at least since as
long as Hackage can go [2].  So the problem seems something else.

Cheers,

[1] 
http://hackage.haskell.org/packages/archive/base/4.3.1.0/doc/html/src/GHC-List.html#concatMap
[2] 
http://hackage.haskell.org/packages/archive/base/3.0.3.1/doc/html/src/GHC-List.html#concatMap

-- 
Felipe.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] = definition for list monad in ghc

2011-05-16 Thread Daniel Fischer
On Monday 16 May 2011 20:49:35, austin seipp wrote:
 Looking at the Core for an utterly trivial example (test x = concatMap
 k x where k i = [i..i*2]), the foldr definition seems to cause a
 little extra optimization rules to fire, but the result seems pretty
 big. The definition using concatMap results in core like this:
 

Hmm, something seems to be amiss, writing

test :: [Int] - [Int]
test x = concat (map k x)
  where
k :: Int - [Int]
k i = [i .. 2*i]

the core I get is

Rec {
ConcatMap.test_go [Occ=LoopBreaker]
  :: [GHC.Types.Int] - [GHC.Types.Int]
[GblId, Arity=1, Caf=NoCafRefs, Str=DmdType S]
ConcatMap.test_go =
  \ (ds_aoS :: [GHC.Types.Int]) -
case ds_aoS of _ {
  [] - GHC.Types.[] @ GHC.Types.Int;
  : y_aoX ys_aoY -
case y_aoX of _ { GHC.Types.I# x_aom -
GHC.Base.++
  @ GHC.Types.Int
  (GHC.Enum.eftInt x_aom (GHC.Prim.*# 2 x_aom))
  (ConcatMap.test_go ys_aoY)
}
}
end Rec }

which is identical to the core I get for foldr ((++) . k) [].
Writing concatMap, I get the larger core (I'm not sure which one's better, 
the concatMap core uses only (:) to build the result, that might make up 
for the larger code).

But, as Felipe noted, concatMap is defined as

-- | Map a function over a list and concatenate the results.
concatMap   :: (a - [b]) - [a] - [b]
concatMap f =  foldr ((++) . f) []

in GHC.List. There are no RULES or other pragmas involving concatMap either 
there or in Data.List. In the absence of such pragmas, I would expect 
concatMap to be inlined and thus to yield exactly the same core as
foldr ((++) . f) []

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] = definition for list monad in ghc

2011-05-16 Thread austin seipp
You're both right indeed - I didn't look for the definition of
concatMap in GHC.List.

I thought it could be some behavior with the new inliner, so I defined
concatMap in terms of foldr, put it in a seperate module, and then
included it and used it in my test:

Concatmap2.hs:
module Concatmap2 (concatMap2) where

concatMap2 :: (a - [b]) - [a] - [b]
concatMap2 f x = foldr ((++) . f) [] x

And concatmap.hs:

module Main where
import Concatmap2

test :: [Int] - [Int]
test x = concatMap2 k x
  where k i = [i..i*2]

main :: IO ()
main = do
  print $ test [1..10]

Initially I thought it might be something to do with the new inliner
heuristics (something about only inlining if call sites are 'fully
saturated' with the amount of arguments they explicitly take,) but
defining concatMap as a partial function or in terms of 'x' didn't
make a difference - both resulted in generating the longer version of
core.

Attaching an INLINEABLE pragma to the definition of concatMap2
however, causes its definition in the interface file (Concatmap2.hi)
to change, and it results in the core turning into the small version.
Compiling with the pragma causes the persisted version of concatMap2
in the iface file to change from:

8d333e8d08e5926fd304bc7152eb286d
  concatMap2 :: forall a b. (a - [b]) - [a] - [b]
{- Arity: 2, HasNoCafRefs, Strictness: LS,
   Unfolding: (\ @ a @ b f :: a - [b] x :: [a] -
   letrec {
 go :: [a] - [b] {- Arity: 1, Strictness: S -}
 = \ ds :: [a] -
   case @ [b] ds of wild {
 [] - GHC.Types.[] @ b : y ys - GHC.Base.++
@ b (f y) (go ys) }
   } in
   go x) -}

To:

075ec6b9bcabc12777955494312f5e61
  concatMap2 :: forall a b. (a - [b]) - [a] - [b]
{- Arity: 2, HasNoCafRefs, Strictness: LS,
   Inline: INLINABLE[ALWAYS],
   Unfolding: stable (\ @ a @ b f :: a - [b] x :: [a] -
GHC.Base.foldr
  @ a
  @ [b]
  (\ x1 :: a - GHC.Base.++ @ b (f x1))
  (GHC.Types.[] @ b)
  x) -}

Which I assume exposes the needed code (namely the foldr) for
additional RULES to fire later, resulting in the small code.

So perhaps we should mark concatMap INLINEABLE, instead?

On Mon, May 16, 2011 at 2:46 PM, Daniel Fischer
daniel.is.fisc...@googlemail.com wrote:
 On Monday 16 May 2011 20:49:35, austin seipp wrote:
 Looking at the Core for an utterly trivial example (test x = concatMap
 k x where k i = [i..i*2]), the foldr definition seems to cause a
 little extra optimization rules to fire, but the result seems pretty
 big. The definition using concatMap results in core like this:


 Hmm, something seems to be amiss, writing

 test :: [Int] - [Int]
 test x = concat (map k x)
  where
    k :: Int - [Int]
    k i = [i .. 2*i]

 the core I get is

 Rec {
 ConcatMap.test_go [Occ=LoopBreaker]
  :: [GHC.Types.Int] - [GHC.Types.Int]
 [GblId, Arity=1, Caf=NoCafRefs, Str=DmdType S]
 ConcatMap.test_go =
  \ (ds_aoS :: [GHC.Types.Int]) -
    case ds_aoS of _ {
      [] - GHC.Types.[] @ GHC.Types.Int;
      : y_aoX ys_aoY -
        case y_aoX of _ { GHC.Types.I# x_aom -
        GHC.Base.++
          @ GHC.Types.Int
          (GHC.Enum.eftInt x_aom (GHC.Prim.*# 2 x_aom))
          (ConcatMap.test_go ys_aoY)
        }
    }
 end Rec }

 which is identical to the core I get for foldr ((++) . k) [].
 Writing concatMap, I get the larger core (I'm not sure which one's better,
 the concatMap core uses only (:) to build the result, that might make up
 for the larger code).

 But, as Felipe noted, concatMap is defined as

 -- | Map a function over a list and concatenate the results.
 concatMap               :: (a - [b]) - [a] - [b]
 concatMap f             =  foldr ((++) . f) []

 in GHC.List. There are no RULES or other pragmas involving concatMap either
 there or in Data.List. In the absence of such pragmas, I would expect
 concatMap to be inlined and thus to yield exactly the same core as
 foldr ((++) . f) []




-- 
Regards,
Austin

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] = definition for list monad in ghc

2011-05-16 Thread Daniel Fischer
On Monday 16 May 2011 20:49:35, austin seipp wrote:
 As you can see, with the foldr definition, GHC is able to fuse the
 iteration of the input list with the generation of the result - note
 the 'GHC.Base.++' call with the first argument being a list from
 [x..x*2], and the second list to append being a recursive call. So it
 simplifies the code quite a bit - it doesn't really end up traversing
 a list, but instead generating a list only in this case, as it stores
 current iteration in a Int# and has the upper limit inlined into the
 case statement.
 
 I don't know why GHC doesn't have this rule by default, though.

Probably because it's not good. The core it generates for concatMap is 
better. With the driver

module Main (main) where

import ConcatMap
import System.Environment (getArgs)

main :: IO ()
main = do
args - getArgs
let n = case args of
  (a:_) - read a
  _ - 12
print $ sum $ test [1 .. n]

and different definitions of test, I get:

./useFoldr 1 +RTS -s 
1933803664
   3,415,002,336 bytes allocated in the heap
 232,459,348 bytes copied during GC

  MUT   time1.55s  (  1.55s elapsed)
  GCtime0.80s  (  0.80s elapsed)
  Total time2.36s  (  2.35s elapsed)

(the same figures for concat (map k x) and for (x = k)) and

./useConcatMap 1 +RTS -s 
1933803664
   2,203,739,388 bytes allocated in the heap
 256,508 bytes copied during GC

  MUT   time0.88s  (  0.88s elapsed)
  GCtime0.03s  (  0.03s elapsed)
  Total time0.91s  (  0.91s elapsed)


I'm still totally at a loss, why GHC generates different code for concatMap 
and foldr ((++) . k) [], though.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] = definition for list monad in ghc

2011-05-16 Thread Daniel Fischer
On Monday 16 May 2011 22:26:18, I wrote:
 On Monday 16 May 2011 20:49:35, austin seipp wrote:
  As you can see, with the foldr definition, GHC is able to fuse the
  iteration of the input list with the generation of the result - note
  the 'GHC.Base.++' call with the first argument being a list from
  [x..x*2], and the second list to append being a recursive call. So it
  simplifies the code quite a bit - it doesn't really end up traversing
  a list, but instead generating a list only in this case, as it stores
  current iteration in a Int# and has the upper limit inlined into the
  case statement.
  
  I don't know why GHC doesn't have this rule by default, though.
 
 Probably because it's not good. The core it generates for concatMap is
 better. ...

Aw, but that's some black magic which only works under special 
circumstances. You need nice types and a nice k for that to work out so 
neatly. Give it less to work on (a less simple k, for example) and you get 
the same for concatMap k, concat . map k and for foldr ((++) . k) [].

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe