Hi Clinton,
On 25/07/2022 05:15, Clinton Mead wrote:
I'm presuming (correct me if I'm wrong) that when I have an overloaded
string literal in my program, let's say it's overloaded to Text, the
conversion from [Char] -> Text happens at runtime, instead of what would
be ideal being that the compiler somehow does that conversion at compile
time and puts the raw bytes that represent the Text object in the
executable?
GHC can actually do a little bit better than this, because a literal is
stored in the binary as a CString# rather than a list of Char. Thus the
conversion doesn't need to follow a linked list, but it is still O(n).
If it does work like that, that's not a big deal. But let's say I have:
f :: Text -> Text -> Text
f x y = Text.concat ["Hello", x, "World", y]
This is desugared like so:
f :: Text -> Text -> Text
f x y = Text.concat [fromString "Hello", x, fromString "World", y]
Would there be a [Char] -> Text conversion from ['H', 'e', 'l', 'l',
'o'] -> Text "Hello" everytime 'f' is called, or is 'f'
rewritten something like this:
f :: Text -> Text -> Text
f x y = Text.concat [hello, x, world, y]
hello :: Text
hello = fromString "Hello"
world :: Text
world = fromString "World"
So the [Char] -> Text conversion is only done once and memomised?
This transformation is known as "let floating" or "full-laziness" and is
enabled with -O1. The easiest way to see this is to put this definition
in a module and compile it, like this:
$ ghc-9.2.2 M.hs -ddump-simpl -dno-typeable-binds -dsuppress-all -O0
The Core output takes a bit of decoding, but if you squint you can see
the definition of f is essentially unchanged:
f = \ x_arM y_arN ->
concat
(: (($fIsStringText `cast` <Co:2>) (unpackCString# "Hello"#))
(: x_arM
(: (($fIsStringText `cast` <Co:2>) (unpackCString# "World"#))
(: y_arN []))))
If you compile with -O1 instead you get:
f4 = "Hello"#
f3 = unpackCString# f4
f2 = "World"#
f1 = unpackCString# f2
f = \ x_arI y_arJ -> concat (: f3 (: x_arI (: f1 (: y_arJ []))))
which is pretty much what you wanted.
I guess this brings up the more general question, if I have this:
f x = h x (g (42 :: Int))
Can I rely on GHC rewriting it like this?
f x = h x y
y = g (42 :: Int)
Or is this a transformation I should do explicitly if I want to rely on it?
In this case, a similar experiment shows that GHC will float out the
function call (here I've used NOINLINE to prevent the function
applications being optimised out):
f x = h x (g 42)
{-# NOINLINE g #-}
g :: Int -> Int
g x = x
{-# NOINLINE h #-}
h :: Int -> Int -> Int
h x y = x * y
Compiling the above gives this Core:
f2 = I# 42#
f1 = g f2
f = \ x_atQ -> h x_atQ f1
Whether you can rely on this is a slightly different question though.
There are lots of heuristics in GHC's optimiser which mean that in more
complicated situations it will not always apply full-laziness as much as
possible. Moreover, full-laziness can introduce space leaks. My
colleague Edsko wrote a nice blog post a few years ago that explores
this in more detail:
https://www.well-typed.com/blog/2016/09/sharing-conduit/
Hope this helps,
Adam
--
Adam Gundry, Haskell Consultant
Well-Typed LLP, https://www.well-typed.com/
Registered in England & Wales, OC335890
27 Old Gloucester Street, London WC1N 3AX, England
_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/glasgow-haskell-users