Re: Pointer-or-Int 63-bit representations for Integer

Sylvain Henry Mon, 08 Mar 2021 10:33:26 -0800

Hi Chris,

It has been considered in the past. There are some traces in the wiki: https://gitlab.haskell.org/ghc/ghc/-/wikis/replacing-gmp-notes

>> The suggestion discussed by John Meacham <http://www.haskell.org/pipermail/glasgow-haskell-users/2006-August/010670.html>, Lennart Augustsson <http://www.haskell.org/pipermail/glasgow-haskell-users/2006-August/010664.html>, Simon Marlow <http://www.haskell.org/pipermail/glasgow-haskell-users/2006-August/010677.html> and Bulat Ziganshin <http://www.haskell.org/pipermail/glasgow-haskell-users/2006-August/010687.html> was to change the representation of Integer so the Int# does the work of S# and J#: the Int# could be either a pointer to the Bignum library array of limbs or, if the number of significant digits could fit into say, 31 bits, to use the extra bit as an indicator of that fact and hold the entire value in the Int#, thereby saving the memory from S# and J#.

It's not trivial because it requires a new runtime representation in which a value is either boxed or unboxed, and which of the two is only known dynamically.
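
As a rough illustration of the tagging scheme discussed above, here is a minimal sketch in ordinary Haskell. It only models the bit manipulation on a plain Int. It does not touch the runtime or the GC. The names Tagged, encodeSmall, decodeSmall and addSmall are hypothetical, and the sketch assumes a 64-bit Int with the low bit set to 1 for an immediate value (an aligned pointer would have it set to 0).

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))

-- Hypothetical model of a tagged machine word (assumes a 64-bit Int):
-- low bit 1 = immediate 63-bit integer, low bit 0 = pointer to a bignum.
newtype Tagged = Tagged Int

-- Encode a small value, or Nothing when it does not fit in 63 bits
-- (in a real implementation this is where a bignum would be allocated).
encodeSmall :: Int -> Maybe Tagged
encodeSmall n
  | (w `shiftR` 1) == n = Just (Tagged w)
  | otherwise           = Nothing
  where
    w = (n `shiftL` 1) .|. 1

-- True when the word holds an immediate integer rather than a pointer.
isSmall :: Tagged -> Bool
isSmall (Tagged w) = w .&. 1 == 1

-- Recover the value; shiftR on Int is an arithmetic shift, so the sign is kept.
decodeSmall :: Tagged -> Int
decodeSmall (Tagged w) = w `shiftR` 1

-- Fast-path addition: two 63-bit values cannot overflow a 64-bit Int,
-- so only the re-encoding step can fail (Nothing = fall back to bignum).
addSmall :: Tagged -> Tagged -> Maybe Tagged
addSmall a b = encodeSmall (decodeSmall a + decodeSmall b)
```

The hard part mentioned above is not this arithmetic but teaching the code generator and the GC that such a word may or may not be a heap pointer.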

> An unboxed sum might be an improvement? e.g. (# Int# | ByteArray# #) -- would this "kind of" approximate the approach described? I don't have a good intuition of what the memory layout would be like.

After the unariser pass, the unboxed sum becomes an unboxed tuple: (# Int# {-tag-}, Int#, ByteArray# #)

The two fields don't overlap because they don't have the same slot type: Int# goes into a word slot, while ByteArray# is a heap pointer and goes into a pointer slot.
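
For reference, a minimal sketch of how such an unboxed sum can be written and consumed in source Haskell. The alias IntegerRep and the helper functions are hypothetical names for illustration only; this is not the ghc-bignum representation.

```haskell
{-# LANGUAGE MagicHash, UnboxedSums #-}
import GHC.Exts (ByteArray#, Int (I#), Int#)

-- Hypothetical alias for the encoding under discussion.
type IntegerRep = (# Int# | ByteArray# #)

-- Build the "small" alternative from a boxed Int.
smallRep :: Int -> IntegerRep
smallRep (I# i) = (# i | #)

-- Matching on the alternative compiles down to a test on the tag field.
isSmallRep :: IntegerRep -> Bool
isSmallRep (# _ | #) = True
isSmallRep (# | _ #) = False

-- Extract the value when the small alternative is present.
smallValue :: IntegerRep -> Maybe Int
smallValue (# i | #) = Just (I# i)
smallValue (# | _ #) = Nothing
```

After unarisation, a function like isSmallRep takes the three fields described above (tag, Int# slot, ByteArray# slot) as separate arguments, so nothing is saved compared to carrying both slots around.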

In my early experiments before implementing ghc-bignum, performance got worse in some cases with this encoding iirc. It may be worth checking again if someone has time to do it :). Nowadays it should be easier as we can define pattern synonyms with INLINE pragmas to replace Integer's constructors.
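
A possible shape for such pattern synonyms, as a sketch only. MyInteger, MkInteger, Small and Big are hypothetical names chosen to avoid confusion with the real Integer constructors, and the INLINE pragmas on pattern synonyms assume a GHC version that accepts them (9.2 or later, to my knowledge).

```haskell
{-# LANGUAGE MagicHash, PatternSynonyms, UnboxedSums #-}
import GHC.Exts (ByteArray#, Int (I#), Int#)

-- Hypothetical Integer-like type with a single constructor
-- wrapping the unboxed sum.
data MyInteger = MkInteger (# Int# | ByteArray# #)

-- Pattern synonyms standing in for the two former constructors.
pattern Small :: Int# -> MyInteger
pattern Small i = MkInteger (# i | #)
{-# INLINE Small #-}

pattern Big :: ByteArray# -> MyInteger
pattern Big ba = MkInteger (# | ba #)
{-# INLINE Big #-}

{-# COMPLETE Small, Big #-}

-- Client code can keep matching as if Small and Big were constructors.
isSmall :: MyInteger -> Bool
isSmall (Small _) = True
isSmall (Big _)   = False

fromInt :: Int -> MyInteger
fromInt (I# i) = Small i
```

With this, existing code that pattern matches on the constructors would not need to change, which is what makes re-running the experiment cheaper than before.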

Another issue we have with Integer/Natural is that we have to mark most operations NOINLINE to support constant-folding. To be fair benchmarks should take this into account.
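
To illustrate why NOINLINE matters here, a user-level analogy (GHC's own constant folding for Integer uses built-in rules, not a RULES pragma like this one). The function myAdd and the rule name are hypothetical. The point is only that a rewrite rule can match calls to myAdd as long as myAdd has not been inlined away first.

```haskell
-- Hypothetical stand-in for an Integer primitive.
myAdd :: Integer -> Integer -> Integer
myAdd x y = x + y
-- Kept out of line so that calls to myAdd stay visible to rewrite rules.
{-# NOINLINE myAdd #-}

-- A simplification that is keyed on the name myAdd.
{-# RULES "myAdd/zero" forall x. myAdd x 0 = x #-}

-- Whether or not the rule fires, the result is the same value.
example :: Integer -> Integer
example n = myAdd n 0
```

The cost is that every call that is not folded remains a real out-of-line call, which is the overhead a fair benchmark should account for.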


Cheers,
Sylvain


On 08/03/2021 18:13, Chris Done wrote:

Hi all,
In OCaml's implementation, they use a well-known 63-bit representation of ints to distinguish whether a given machine word is either a pointer or to be interpreted as an integer.
I was wondering whether anyone had considered the performance benefits of doing this for the venerable Integer type in Haskell? I.e. if the Int fits in 63 bits, just shift it and do regular arithmetic. If the result ever exceeds 63 bits, allocate a GMP/integer-simple integer and return a pointer to it. This way, for most applications (in my experience) integers don't really ever exceed 64 bits, so you would (possibly) pay a smaller cost than the pointer chasing involved in bignum arithmetic. Assumption: it's cheaper to do more CPU instructions than to allocate or wait for main memory.
This would need assistance from the GC to be able to recognize said bit flag.
As I understand the current implementation of integer-gmp, they also try to use an Int64 where possible using a constructor (https://hackage.haskell.org/package/integer-gmp-1.0.3.0/docs/src/GHC.Integer.Type.html#Integer), but I believe that the compiled code will still pointer chase through the constructor. Simple addition or subtraction, for example, is 24 times slower in Integer than in Int for 1000000 iterations:
https://github.com/haskell-perf/numbers#addition
An unboxed sum might be an improvement? e.g. (# Int# | ByteArray# #) -- would this "kind of" approximate the approach described? I don't have a good intuition of what the memory layout would be like.
Just pondering.

Cheers,

Chris

_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
