Re: WordX/IntX wrap Word#/Int#?

2017-06-15 Thread Simon Marlow
On 11 June 2017 at 22:44, Joachim Breitner wrote:

> Hi,
>
> On Sunday, 11.06.2017, at 10:44 -0400, Ben Gamari wrote:
> > This is certainly one consideration. Another is that you would also
> > need to teach the garbage collector to understand closures with sub-
> > word-size fields. Currently we can encode whether each field of a
> > closure is a pointer or not with a simple bitmap. If we naively
> > allowed smaller fields we would need to increase the granularity of
> > this representation to encode bytes.
> >
> > Of course, one way to work around this would be to impose an
> > invariant that guarantees that pointers are always word-aligned. Then
> > we would probably want to shuffle sub-word sized fields, allowing two
> > Word16s to inhabit a single word.
>
> that is not an issue; we already sort fields into pointers first, and
> non-pointers later. So all pointers are at the beginning and nicely
> aligned, and all the non-pointer data can follow in whatever weird
> format. The GC only needs to know how many words in total are used by
> the non-pointer data.
>

But the compiler has no support for sub-word-sized fields yet.  I made a
partial patch to support it a while ago: https://phabricator.haskell.org/D38


Cheers
Simon


> Greetings,
> Joachim
> --
> Joachim “nomeata” Breitner
>   m...@joachim-breitner.de • https://www.joachim-breitner.de/
>   XMPP: nome...@joachim-breitner.de • OpenPGP-Key: 0xF0FBF51F
>   Debian Developer: nome...@debian.org
> ___
> ghc-devs mailing list
> ghc-devs@haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: WordX/IntX wrap Word#/Int#?

2017-06-13 Thread Michal Terepeta
Just for the record, I've opened:
https://ghc.haskell.org/trac/ghc/ticket/13825
to track this.

Cheers,
Michal

On Mon, Jun 12, 2017 at 8:45 PM Michal Terepeta wrote:

> Thanks a lot for the replies & links!
>
> I'll try to finish Simon's diff (and probably ask silly questions if I get
> stuck ;)
>
> Cheers,
> Michal
>
>


Re: WordX/IntX wrap Word#/Int#?

2017-06-12 Thread Michal Terepeta
Thanks a lot for the replies & links!

I'll try to finish Simon's diff (and probably ask silly questions if I get
stuck ;)

Cheers,
Michal


Re: WordX/IntX wrap Word#/Int#?

2017-06-11 Thread Ben Gamari
Joachim Breitner writes:

> Hi,
>
> On Sunday, 11.06.2017, at 10:44 -0400, Ben Gamari wrote:
>> This is certainly one consideration. Another is that you would also
>> need to teach the garbage collector to understand closures with sub-
>> word-size fields. Currently we can encode whether each field of a
>> closure is a pointer or not with a simple bitmap. If we naively
>> allowed smaller fields we would need to increase the granularity of
>> this representation to encode bytes.
>> 
>> Of course, one way to work around this would be to impose an
>> invariant that guarantees that pointers are always word-aligned. Then
>> we would probably want to shuffle sub-word sized fields, allowing two
>> Word16s to inhabit a single word.
>
> that is not an issue; we already sort fields into pointers first, and
> non-pointers later. So all pointers are at the beginning and nicely
> aligned, and all the non-pointer data can follow in whatever weird
> format. The GC only needs to know how many words in total are used by
> the non-pointer data.
>
Ahh, great point. I stand corrected.

Cheers,

- Ben





Re: WordX/IntX wrap Word#/Int#?

2017-06-11 Thread Joachim Breitner
Hi,

On Sunday, 11.06.2017, at 10:44 -0400, Ben Gamari wrote:
> This is certainly one consideration. Another is that you would also
> need to teach the garbage collector to understand closures with sub-
> word-size fields. Currently we can encode whether each field of a
> closure is a pointer or not with a simple bitmap. If we naively
> allowed smaller fields we would need to increase the granularity of
> this representation to encode bytes.
> 
> Of course, one way to work around this would be to impose an
> invariant that guarantees that pointers are always word-aligned. Then
> we would probably want to shuffle sub-word sized fields, allowing two
> Word16s to inhabit a single word.

that is not an issue; we already sort fields into pointers first, and
non-pointers later. So all pointers are at the beginning and nicely
aligned, and all the non-pointer data can follow in whatever weird
format. The GC only needs to know how many words in total are used by
the non-pointer data.
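
A toy model of that layout rule may help. It is only a sketch: the `Field` type, `wordBytes`, and `layout` are hypothetical names, not GHC's actual code, and it assumes a 64-bit target and ignores any alignment padding between non-pointer fields.

```haskell
import Data.List (partition)

-- Hypothetical field description: a pointer, or a non-pointer of a given
-- size in bytes.
data Field = Ptr | NonPtr Int
  deriving (Eq, Show)

-- Assumption: 64-bit target, so one word is 8 bytes.
wordBytes :: Int
wordBytes = 8

-- | Sort pointers first, then report what the GC would need to know:
-- (number of pointer words, number of non-pointer words).
layout :: [Field] -> (Int, Int)
layout fields = (length ptrs, nonPtrWords)
  where
    (ptrs, nonPtrs) = partition (== Ptr) fields
    nonPtrBytes     = sum [n | NonPtr n <- nonPtrs]
    -- Round the packed non-pointer bytes up to whole words.
    nonPtrWords     = (nonPtrBytes + wordBytes - 1) `div` wordBytes
```

In this model, `layout [NonPtr 2, NonPtr 2]` gives `(0, 1)`: two 16-bit fields share one non-pointer word, and the collector never has to look inside it.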


Greetings,
Joachim
-- 
Joachim “nomeata” Breitner
  m...@joachim-breitner.de • https://www.joachim-breitner.de/
  XMPP: nome...@joachim-breitner.de • OpenPGP-Key: 0xF0FBF51F
  Debian Developer: nome...@debian.org



Re: WordX/IntX wrap Word#/Int#?

2017-06-11 Thread Ben Gamari


On June 11, 2017 8:03:10 AM EDT, Michal Terepeta wrote:
>Hi all,
>
>I've just noticed that all `WordX` (and `IntX`) data types are
>actually implemented as wrappers around `Word#` (and `Int#`). This
>probably doesn't matter much if it's stored on the heap (due to
>pointer indirection and heap alignment), but it also means that:
>```
>data Foo = Foo {-# UNPACK #-} !Word8 {-# UNPACK #-} !Int8
>```
>will actually take *a lot* of space: on 64 bit we'd need 8 bytes for
>header, 8 bytes for `Word8`, 8 bytes for `Int8`.
>
>Is there any reason for this? The only thing I can see is that this
>avoids having to add things like `Word8#` primitives into the
>compiler. (also the codegen would need to emit zero-extend moves when
>loading from memory, like `movzb{l,q}`)
>
This is certainly one consideration. Another is that you would also need to 
teach the garbage collector to understand closures with sub-word-size fields. 
Currently we can encode whether each field of a closure is a pointer or not 
with a simple bitmap. If we naively allowed smaller fields we would need to 
increase the granularity of this representation to encode bytes.

Of course, one way to work around this would be to impose an invariant that 
guarantees that pointers are always word-aligned. Then we would probably want 
to shuffle sub-word sized fields, allowing two Word16s to inhabit a single word.

As you mention, this would no doubt require a bit of engineering. In
particular, while x86 has robust support for sub-word-size operations, I don't
believe all the platforms we support do. In these cases we would need to
perform, for instance, aligned word-sized loads and stores and mask as
appropriate. I may be wrong, however.
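
As a rough illustration of "load a whole word and mask", here is a value-level sketch. The names `loadByte` and `storeByte` are hypothetical, byte 0 is taken to be the least significant byte, and this says nothing about what any GHC backend actually emits.

```haskell
import Data.Bits (complement, shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word64, Word8)

-- | Extract byte i (0 = least significant) from a full 64-bit word,
-- as one would after an aligned word-sized load.
loadByte :: Int -> Word64 -> Word8
loadByte i w = fromIntegral ((w `shiftR` (8 * i)) .&. 0xFF)

-- | Replace byte i of a 64-bit word, leaving the other bytes untouched,
-- as one would before an aligned word-sized store.
storeByte :: Int -> Word8 -> Word64 -> Word64
storeByte i b w = cleared .|. (fromIntegral b `shiftL` (8 * i))
  where
    -- Zero out the target byte so the new value can be OR-ed in.
    cleared = w .&. complement (0xFF `shiftL` (8 * i))
```

A store becomes a read-modify-write of the whole word, which is part of the extra engineering on platforms without native sub-word stores.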

Another consideration is that the byte code interpreter would need to learn to 
understand these closures.

Regardless, Simon Marlow began some work in this direction a few years ago. 
There is a mostly complete patch in D38. All it needs is rebasing, fixing of 
the byte code interpreter, and then perhaps introduction of Word8# and friends. 
I think it would be great if we could make our heap representation a bit more 
space-conscious. Perhaps you could open a ticket so we can collect these tidbits?

Another somewhat related issue that would be good to think about in parallel
with this one is the treatment of the word-size dependence of Word. See #11953.

Cheers,

- Ben




WordX/IntX wrap Word#/Int#?

2017-06-11 Thread Michal Terepeta
Hi all,

I've just noticed that all `WordX` (and `IntX`) data types are
actually implemented as wrappers around `Word#` (and `Int#`). This
probably doesn't matter much if it's stored on the heap (due to
pointer indirection and heap alignment), but it also means that:
```
data Foo = Foo {-# UNPACK #-} !Word8 {-# UNPACK #-} !Int8
```
will actually take *a lot* of space: on 64 bit we'd need 8 bytes for
header, 8 bytes for `Word8`, 8 bytes for `Int8`.
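
The arithmetic can be sketched as follows. This is a toy model, not GHC's layout code: `wordBytes`, `sizeWordPerField`, and `sizePacked` are hypothetical names, a 64-bit target is assumed, and the packed variant ignores alignment padding between fields.

```haskell
-- Assumption: 64-bit target, so header and word are 8 bytes each.
wordBytes :: Int
wordBytes = 8

-- | Closure size in bytes when every unpacked field takes a full word.
-- The argument lists the logical size of each field in bytes.
sizeWordPerField :: [Int] -> Int
sizeWordPerField fieldBytes = wordBytes * (1 + length fieldBytes)

-- | Closure size in bytes if the fields were packed together and the
-- payload rounded up to a whole number of words.
sizePacked :: [Int] -> Int
sizePacked fieldBytes = wordBytes + roundUp (sum fieldBytes)
  where
    roundUp n = ((n + wordBytes - 1) `div` wordBytes) * wordBytes
```

For `Foo`, `sizeWordPerField [1, 1]` is 24 bytes, while `sizePacked [1, 1]` would be 16.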

Is there any reason for this? The only thing I can see is that this
avoids having to add things like `Word8#` primitives into the
compiler. (also the codegen would need to emit zero-extend moves when
loading from memory, like `movzb{l,q}`)
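
At the source level, the difference between zero-extending and sign-extending loads corresponds to how `fromIntegral` widens the unsigned and signed types. A small sketch (the function names are made up for illustration, and this shows semantics only, not the instructions GHC generates):

```haskell
import Data.Int (Int64, Int8)
import Data.Word (Word64, Word8)

-- | Widening an unsigned byte fills the upper bits with zeros
-- (what a zero-extending move such as movzb does).
widenWord8 :: Word8 -> Word64
widenWord8 = fromIntegral

-- | Widening a signed byte copies the sign bit into the upper bits
-- (sign extension), so negative values stay negative.
widenInt8 :: Int8 -> Int64
widenInt8 = fromIntegral
```

So `widenWord8 0xFF` is 255, whereas `widenInt8 (-1)` is still -1.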

If we had things like `Word8#` we could also consider changing `Bool`
to just wrap it (with the obvious encoding), which would allow us both
to UNPACK `Bool` *and* to save space within the struct. (Alternatively,
one could imagine a `Bool#` that would be just a byte.)
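
A sketch of that "obvious encoding" at the library level, using the existing `Word8` rather than a `Word8#` primitive. The type `ByteBool` and its conversion functions are hypothetical, and treating any non-zero byte as true is an assumption.

```haskell
import Data.Word (Word8)

-- Hypothetical byte-sized boolean: 0 encodes False, 1 encodes True.
newtype ByteBool = ByteBool Word8
  deriving (Eq, Show)

toByteBool :: Bool -> ByteBool
toByteBool False = ByteBool 0
toByteBool True  = ByteBool 1

-- | Decode; any non-zero byte is treated as True (an assumption).
fromByteBool :: ByteBool -> Bool
fromByteBool (ByteBool b) = b /= 0
```

With the current representation a `ByteBool` field would still occupy a full word when unpacked; the saving only appears once sub-word fields are supported.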

I couldn't find any discussion about this, so any pointers would be
welcome. :)

Thanks,
Michal

PS.  I've had a look at it after reading about the recent
implementation of struct field reordering optimization in rustc:
http://camlorn.net/posts/April%202017/rust-struct-field-reordering.html