Hi Clinton,

On 25/07/2022 05:15, Clinton Mead wrote:
I'm presuming (correct me if I'm wrong) that when I have an overloaded string literal in my program, let's say it's overloaded to Text, the conversion from [Char] -> Text happens at runtime, instead of what would be ideal being that the compiler somehow does that conversion at compile time and puts the raw bytes that represent the Text object in the executable?

GHC can actually do a little bit better than this, because a literal is stored in the binary as a CString# rather than a list of Char. Thus the conversion doesn't need to follow a linked list, but it is still O(n).


If it does work like that, that's not a big deal. But let's say I have:

f :: Text -> Text -> Text
f x y = Text.concat ["Hello", x, "World", y]

This is desugared like so:

f :: Text -> Text -> Text
f x y = Text.concat [fromString "Hello", x, fromString "World", y]

Would there be a [Char] -> Text conversion from ['H', 'e', 'l', 'l', 'o'] -> Text "Hello" everytime 'f' is called, or is 'f' rewritten something like this:

f :: Text -> Text -> Text
f x y = Text.concat [hello, x, world, y]

hello :: Text
hello = fromString "Hello"

world :: Text
world = fromString "World"

So the [Char] -> Text conversion is only done once and memomised?

This transformation is known as "let floating" or "full-laziness" and is enabled with -O1. The easiest way to see this is to put this definition in a module and compile it, like this:

$ ghc-9.2.2 M.hs -ddump-simpl -dno-typeable-binds -dsuppress-all -O0

The Core output takes a bit of decoding, but if you squint you can see the definition of f is essentially unchanged:

f = \ x_arM y_arN ->
      concat
        (: (($fIsStringText `cast` <Co:2>) (unpackCString# "Hello"#))
           (: x_arM
              (: (($fIsStringText `cast` <Co:2>) (unpackCString# "World"#))
                 (: y_arN []))))

If you compile with -O1 instead you get:

f4 = "Hello"#
f3 = unpackCString# f4
f2 = "World"#
f1 = unpackCString# f2
f = \ x_arI y_arJ -> concat (: f3 (: x_arI (: f1 (: y_arJ []))))

which is pretty much what you wanted.


I guess this brings up the more general question, if I have this:

f x = h x (g (42 :: Int))

Can I rely on GHC rewriting it like this?

f x = h x y
y = g (42 :: Int)

Or is this a transformation I should do explicitly if I want to rely on it?

In this case, a similar experiment shows that GHC will float out the function call (here I've used NOINLINE to prevent the function applications being optimised out):

f x = h x (g 42)

{-# NOINLINE g #-}
g :: Int -> Int
g x = x

{-# NOINLINE h #-}
h :: Int -> Int -> Int
h x y = x * y

Compiling the above gives this Core:

f2 = I# 42#
f1 = g f2
f = \ x_atQ -> h x_atQ f1

Whether you can rely on this is a slightly different question though. There are lots of heuristics in GHC's optimiser which mean that in more complicated situations it will not always apply full-laziness as much as possible. Moreover, full-laziness can introduce space leaks. My colleague Edsko wrote a nice blog post a few years ago that explores this in more detail: https://www.well-typed.com/blog/2016/09/sharing-conduit/

Hope this helps,

Adam


--
Adam Gundry, Haskell Consultant
Well-Typed LLP, https://www.well-typed.com/

Registered in England & Wales, OC335890
27 Old Gloucester Street, London WC1N 3AX, England
_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/glasgow-haskell-users

Reply via email to