Hi!

while trying to make sure cross compilation with Template Haskell works properly
with 8.4, I ran into the following situation:

When serializing data types, e.g. `Name OccName NameFlavour` in the transmission
of Template Haskell Splice results from a 32bit arm device to the x86_64 host 
ghc.

We expect to see:
```
  .- 0 (first constructor)         .- 1  .- 4  (fifth constructor)             
.- 4                                 .- 3
  v                                v     v                                     
v                                    v
\NUL \NUL\NUL\NUL\NUL\NUL\NUL\NUL\SOHf \EOT \NUL 
\NUL\NUL\NUL\NUL\NUL\NUL\NUL\EOTmain \NUL\NUL\NUL\NUL\NUL\NUL\NUL\ETXTmp
                                            '--' 
'----------------------------------' '---------------------------------'
                                            NameSpace       PkgName             
                  ModName
     '-------------------------------' 
'--------------------------------------------------------------------------------'
                  OccName                       NameG NameSpace PkgName ModName 
:: NameFlavour
'-----------------------------------------------------------------------------------------------------------------------'
                                Name OccName NameFlavour :: Name
```

However, the `NameSpace` on the 32bit arm ends up being 8 bytes. Even though 
the full
`Namespace` data type can be fully serialized in a single byte.

The `binary` package tries to compute the size it needs for a generic data 
type, using
the following logic (from binary/src/Data/Binary/Generic.hs):

```
class SumSize f where
  sumSize :: Tagged f Word64

newtype Tagged (s :: * -> *) b = Tagged {unTagged :: b}

instance (SumSize a, SumSize b) => SumSize (a :+: b) where

sumSize = Tagged $ unTagged (sumSize :: Tagged a Word64) +
                   unTagged (sumSize :: Tagged b Word64)

instance SumSize (C1 c a) where
  sumSize = Tagged 1
```


Thus for a simple sum type `data X = A | B` we should get a `sumSize` of 2.
The arm32 device however ends up getting 2^33, because `sumSize :: Tagged a 
Word64`
and `sumSize :: Tagged b Word64` each end up being 2^32.

With some help from the nice folks in #ghc, I was able to conjure up the 
following
condensed test case:

```
{-# LANGUAGE DeriveGeneric, KindSignatures, PolyKinds, CPP,
    ScopedTypeVariables, TypeOperators, TypeSynonymInstances,
    FlexibleInstances  #-}
{-# OPTIONS_GHC -O2 #-}

import GHC.Generics

import Data.Word
import Debug.Trace

data X = A | B deriving (Show, Generic)

main :: IO ()
main = print (sumSize :: Tagged (Rep X))

-- like traceShowId, but allows us to prepend a message.
t :: Show a => String -> a -> a
#if TRACE
t msg x = traceShow (msg ++ show x) x
#else
t _ = id
#endif

class SumSize f where
    sumSize :: Tagged f

newtype Tagged (s :: * -> *) = Tagged {unTagged :: WORD} deriving Show

instance (SumSize a, SumSize b) => SumSize (a :+: b) where
    sumSize = t "SumSize (a :+: b): " $ Tagged $ unTagged (t "a :+: b => 
sumSize :: Tagged a: " $ sumSize :: Tagged a) +
                                                 unTagged (t "a :+: b => 
sumSize :: Tagged b: " $ sumSize :: Tagged b)

instance SumSize (C1 c a) where
    sumSize = t "SumSize (C1 c a): " $ Tagged 1

instance SumSize a => SumSize (M1 D c a) where
    sumSize = t "SumSize (M1 D c a): " $ Tagged . unTagged $ (sumSize :: Tagged 
a)

```

compiling this with `-DWORD=Word32 -DTRACE=1` yields the correct result (=2), 
with `-DWORD=Word64 -DTRACE=0` as well.
With `-DWORD=Word64 -DTRACE=2` the wrong result (=2^33)

Optimization flags seem not to play any role when everything is in a single 
module (as the test case).

As such I have attached the `-ddump-simple -dsuppress-all` files for the Word64 
and Word32 with TRACE=1, as well
as the diff between the Word32 and Word64 dump.

The output with WORD=Word32, TRACE=1 is:
```
"SumSize (C1 c a): Tagged {unTagged = 1}"
"a :+: b => sumSize :: Tagged a: Tagged {unTagged = 4294967296}"
"a :+: b => sumSize :: Tagged b: Tagged {unTagged = 4294967296}"
"SumSize (a :+: b): Tagged {unTagged = 8589934592}"
"SumSize (M1 D c a): Tagged {unTagged = 8589934592}"
Tagged {unTagged = 8589934592}
```

with WORD=Word64, TRACE=1 is:
```
"SumSize (C1 c a): Tagged {unTagged = 1}"
"a :+: b => sumSize :: Tagged a: Tagged {unTagged = 1}"
"a :+: b => sumSize :: Tagged b: Tagged {unTagged = 1}"
"SumSize (a :+: b): Tagged {unTagged = 2}"
"SumSize (M1 D c a): Tagged {unTagged = 2}"
Tagged {unTagged = 2}
```

Any help with this would be greatly appreciated!

Cheers,
 Moritz

PS: I'm not absolutely sure, but this might also be related to 
https://ghc.haskell.org/trac/ghc/ticket/13513

Attachment: Main.word32.dump-simpl
Description: Binary data

Attachment: Main.word64.dump-simpl
Description: Binary data

Attachment: Main.dump-simpl.diff
Description: Binary data

_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Reply via email to