Re: Compilation of Overloaded strings and saving constant results

Adam Gundry Mon, 25 Jul 2022 12:54:22 -0700

Hi Clinton,


On 25/07/2022 05:15, Clinton Mead wrote:

I'm presuming (correct me if I'm wrong) that when I have an overloadedstring literal in my program, let's say it's overloaded to Text, theconversion from [Char] -> Text happens at runtime, instead of what wouldbe ideal being that the compiler somehow does that conversion at compiletime and puts the raw bytes that represent the Text object in theexecutable?

GHC can actually do a little bit better than this, because a literal isstored in the binary as a CString# rather than a list of Char. Thus theconversion doesn't need to follow a linked list, but it is still O(n).

If it does work like that, that's not a big deal. But let's say I have:

f :: Text -> Text -> Text
f x y = Text.concat ["Hello", x, "World", y]

This is desugared like so:

f :: Text -> Text -> Text
f x y = Text.concat [fromString "Hello", x, fromString "World", y]

Would there be a [Char] -> Text conversion from ['H', 'e', 'l', 'l','o'] -> Text "Hello" everytime 'f' is called, or is 'f'rewritten something like this:


f :: Text -> Text -> Text
f x y = Text.concat [hello, x, world, y]

hello :: Text
hello = fromString "Hello"

world :: Text
world = fromString "World"

So the [Char] -> Text conversion is only done once and memomised?

This transformation is known as "let floating" or "full-laziness" and isenabled with -O1. The easiest way to see this is to put this definitionin a module and compile it, like this:


$ ghc-9.2.2 M.hs -ddump-simpl -dno-typeable-binds -dsuppress-all -O0

The Core output takes a bit of decoding, but if you squint you can seethe definition of f is essentially unchanged:


f = \ x_arM y_arN ->
      concat
        (: (($fIsStringText `cast` <Co:2>) (unpackCString# "Hello"#))
           (: x_arM
              (: (($fIsStringText `cast` <Co:2>) (unpackCString# "World"#))
                 (: y_arN []))))

If you compile with -O1 instead you get:

f4 = "Hello"#
f3 = unpackCString# f4
f2 = "World"#
f1 = unpackCString# f2
f = \ x_arI y_arJ -> concat (: f3 (: x_arI (: f1 (: y_arJ []))))

which is pretty much what you wanted.

I guess this brings up the more general question, if I have this:

f x = h x (g (42 :: Int))

Can I rely on GHC rewriting it like this?

f x = h x y
y = g (42 :: Int)

Or is this a transformation I should do explicitly if I want to rely on it?

In this case, a similar experiment shows that GHC will float out thefunction call (here I've used NOINLINE to prevent the functionapplications being optimised out):


f x = h x (g 42)

{-# NOINLINE g #-}
g :: Int -> Int
g x = x

{-# NOINLINE h #-}
h :: Int -> Int -> Int
h x y = x * y

Compiling the above gives this Core:

f2 = I# 42#
f1 = g f2
f = \ x_atQ -> h x_atQ f1

Whether you can rely on this is a slightly different question though.There are lots of heuristics in GHC's optimiser which mean that in morecomplicated situations it will not always apply full-laziness as much aspossible. Moreover, full-laziness can introduce space leaks. Mycolleague Edsko wrote a nice blog post a few years ago that exploresthis in more detail:https://www.well-typed.com/blog/2016/09/sharing-conduit/


Hope this helps,

Adam


--
Adam Gundry, Haskell Consultant
Well-Typed LLP, https://www.well-typed.com/

Registered in England & Wales, OC335890
27 Old Gloucester Street, London WC1N 3AX, England
_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/glasgow-haskell-users

Re: Compilation of Overloaded strings and saving constant results

Reply via email to