Re: Confused about PAP object layout

2020-02-27 Thread Simon Marlow
On Wed, 26 Feb 2020 at 18:48, Ömer Sinan Ağacan 
wrote:

> So the key points from this thread are:
>
> - PAP payloads are scavenged using the function's bitmap. Because a PAPs
> payload
>   will have less number of closures than the function's arity the bitmap
> will
>   always have enough bits.
>
> - A bit in a function bitmap is NOT for liveness (e.g. does not indicate
> whether
>   an argument used or not), but for pointers vs. non-pointers. Function
> bitmaps
>   are called "liveness bits" in the code generator which is misleading.
>

I think of all bitmaps as representing "liveness" (or equivalently
"pointerhood") for the purposes of GC. There's no difference from the GC's
perspective between a non-pointer and a pointer that it doesn't need to
follow.

In fact there's nothing to prevent us using the function bitmap to indicate
dead arguments too - it would require zero changes in the RTS, the compiler
would only need to mark unused pointer arguments as non-pointers in the
bitmap. Probably wouldn't be worth very much overall, but I do recall one
space leak that would have been cured by this.

- In a function bitmap (small or large), 0 means pointer, 1 means
> non-pointer.
>

This is true of bitmaps generally I think, not just function bitmaps.

  This is really what confused me in my last email above. For some reason I
>   intuitively expected 1 to mean pointer, not 0. Simon M also got this
> wrong
>

Oops :)

I think there may originally have been a good reason to have it this way
around: before eval/apply, we used bitmaps to describe stack frames, but we
didn't need to encode a size in the bitmap because the default was for the
stack contents to be pointers unless there was something to tell us
otherwise. So a zero suffix of a bitmap just meant "the rest is just normal
stack". This changed with eval/apply, but we kept the convention that zero
meant pointer in a bitmap.


>   ("So a 0 in the bitmap always means non-pointer.") so maybe this is
> confusing
>   to others too.
>
> - For functions with known argument patterns we don't use the function's
> bitmap.
>   These function's type are greater than ARG_BCO (2), and for those we use
> the
>   stg_arg_bitmaps array to get the bitmap.
>
>   For example, the bitmap for ARG_PPP (function with 3 pointer arguments)
> is at
>   index 23 in this array, which is 0b11. For ARG_PNN it's 0b11011. The
> least
>   significant 6 bits are for the size (3), the remaining 0b110 means the
> first
>   argument is a pointer, rest of the two are non-pointers.
>

Actually I think documentation on this is missing in the wiki, I guess I
never got around to updating it when we implemented eval/apply. This page
should really describe function info tables:
https://gitlab.haskell.org/ghc/ghc/wikis/commentary/rts/storage/heap-objects#info-tables

If you want to add documentation that would be a good place.

Cheers
Simon

I still don't understand why this assertion
>
> ASSERT(BITMAP_SIZE(bitmap) >= size);
>
> I added to scavenge_small_bitmap in !2727 is failing though.
>
> Ömer
>
> Simon Peyton Jones , 24 Şub 2020 Pzt, 13:45
> tarihinde şunu yazdı:
> >
> > I’m not following this in detail, but do please make sure that the
> results of this discussion end up in a suitable Note.  Obviously it’s not
> transparently clear as-is, and I can see clarity emerging
> >
> >
> >
> > Thanks!
> >
> >
> > Simon
> >
> >
> >
> > From: ghc-devs  On Behalf Of Simon Marlow
> > Sent: 24 February 2020 08:22
> > To: Ömer Sinan Ağacan 
> > Cc: ghc-devs 
> > Subject: Re: Confused about PAP object layout
> >
> >
> >
> > On Thu, 20 Feb 2020 at 09:21, Ömer Sinan Ağacan 
> wrote:
> >
> > > I'm not sure what you mean by "garbage". The bitmap merely determines
> whether
> > > a field is a pointer,
> >
> > I think the bitmap is for liveness, not for whether a field is pointer
> or not.
> > Relevant code for building an info table for a function:
> >
> > mk_pieces (Fun arity (ArgGen arg_bits)) srt_label
> >   = do { (liveness_lit, liveness_data) <- mkLivenessBits dflags
> arg_bits
> >; let fun_type | null liveness_data = aRG_GEN
> >   | otherwise  = aRG_GEN_BIG
> >  extra_bits = [ packIntsCLit dflags fun_type arity ]
> >++ (if inlineSRT dflags then [] else [
> srt_lit ])
> >++ [ liveness_lit, slow_entry ]
> >; return (Nothing, Nothing, extra_bits, liveness_data) }
> >
> > This uses the word "liveness&q

Re: Confused about PAP object layout

2020-02-26 Thread Ömer Sinan Ağacan
> I still don't understand why this assertion
>
> ASSERT(BITMAP_SIZE(bitmap) >= size);
>
> I added to scavenge_small_bitmap in !2727 is failing though.

Ahh, this is becuase in the call sites we do a bit shift only pass the contents
of the bitmap, without the size:

bitmap = BITMAP_BITS(stg_arg_bitmaps[fun_info->f.fun_type]);
p = scavenge_small_bitmap(p, size, bitmap);

BITMAP_BITS is the macro that does this.

Ömer

Ömer Sinan Ağacan , 26 Şub 2020 Çar, 21:48
tarihinde şunu yazdı:
>
> So the key points from this thread are:
>
> - PAP payloads are scavenged using the function's bitmap. Because a PAPs 
> payload
>   will have less number of closures than the function's arity the bitmap will
>   always have enough bits.
>
> - A bit in a function bitmap is NOT for liveness (e.g. does not indicate 
> whether
>   an argument used or not), but for pointers vs. non-pointers. Function 
> bitmaps
>   are called "liveness bits" in the code generator which is misleading.
>
> - In a function bitmap (small or large), 0 means pointer, 1 means non-pointer.
>
>   This is really what confused me in my last email above. For some reason I
>   intuitively expected 1 to mean pointer, not 0. Simon M also got this wrong
>   ("So a 0 in the bitmap always means non-pointer.") so maybe this is 
> confusing
>   to others too.
>
> - For functions with known argument patterns we don't use the function's 
> bitmap.
>   These function's type are greater than ARG_BCO (2), and for those we use the
>   stg_arg_bitmaps array to get the bitmap.
>
>   For example, the bitmap for ARG_PPP (function with 3 pointer arguments) is 
> at
>   index 23 in this array, which is 0b11. For ARG_PNN it's 0b11011. The 
> least
>   significant 6 bits are for the size (3), the remaining 0b110 means the first
>   argument is a pointer, rest of the two are non-pointers.
>
> I still don't understand why this assertion
>
> ASSERT(BITMAP_SIZE(bitmap) >= size);
>
> I added to scavenge_small_bitmap in !2727 is failing though.
>
> Ömer
>
> Simon Peyton Jones , 24 Şub 2020 Pzt, 13:45
> tarihinde şunu yazdı:
> >
> > I’m not following this in detail, but do please make sure that the results 
> > of this discussion end up in a suitable Note.  Obviously it’s not 
> > transparently clear as-is, and I can see clarity emerging
> >
> >
> >
> > Thanks!
> >
> >
> > Simon
> >
> >
> >
> > From: ghc-devs  On Behalf Of Simon Marlow
> > Sent: 24 February 2020 08:22
> > To: Ömer Sinan Ağacan 
> > Cc: ghc-devs 
> > Subject: Re: Confused about PAP object layout
> >
> >
> >
> > On Thu, 20 Feb 2020 at 09:21, Ömer Sinan Ağacan  
> > wrote:
> >
> > > I'm not sure what you mean by "garbage". The bitmap merely determines 
> > > whether
> > > a field is a pointer,
> >
> > I think the bitmap is for liveness, not for whether a field is pointer or 
> > not.
> > Relevant code for building an info table for a function:
> >
> > mk_pieces (Fun arity (ArgGen arg_bits)) srt_label
> >   = do { (liveness_lit, liveness_data) <- mkLivenessBits dflags arg_bits
> >; let fun_type | null liveness_data = aRG_GEN
> >   | otherwise  = aRG_GEN_BIG
> >  extra_bits = [ packIntsCLit dflags fun_type arity ]
> >++ (if inlineSRT dflags then [] else [ srt_lit ])
> >++ [ liveness_lit, slow_entry ]
> >; return (Nothing, Nothing, extra_bits, liveness_data) }
> >
> > This uses the word "liveness" rather than "pointers".
> >
> > However I just realized that the word "garbage" is still not the best way to
> > describe what I'm trying to say. In the example
> >
> > [pap_info, x, y, z]
> >
> > If the function's bitmap is [1, 0, 1], then `y` may be a dead (an unused
> > argument, or "garbage" as I describe in my previous email) OR it may be a
> > non-pointer, but used (i.e. not a garbage).
> >
> >
> >
> > I don't think we ever put a zero in the bitmap for a pointer-but-not-used 
> > argument. We don't do liveness analysis for function arguments, as far as 
> > I'm aware. So a 0 in the bitmap always means "non-pointer".
> >
> >
> >
> > The only reaosn the code uses the terminology "liveness" here is that it's 
> > sharing code with the code that handles bitmaps for stack frames, which do 
> > deal with liveness.
> >
> >
> >
&g

Re: Confused about PAP object layout

2020-02-26 Thread Ömer Sinan Ağacan
So the key points from this thread are:

- PAP payloads are scavenged using the function's bitmap. Because a PAPs payload
  will have less number of closures than the function's arity the bitmap will
  always have enough bits.

- A bit in a function bitmap is NOT for liveness (e.g. does not indicate whether
  an argument used or not), but for pointers vs. non-pointers. Function bitmaps
  are called "liveness bits" in the code generator which is misleading.

- In a function bitmap (small or large), 0 means pointer, 1 means non-pointer.

  This is really what confused me in my last email above. For some reason I
  intuitively expected 1 to mean pointer, not 0. Simon M also got this wrong
  ("So a 0 in the bitmap always means non-pointer.") so maybe this is confusing
  to others too.

- For functions with known argument patterns we don't use the function's bitmap.
  These function's type are greater than ARG_BCO (2), and for those we use the
  stg_arg_bitmaps array to get the bitmap.

  For example, the bitmap for ARG_PPP (function with 3 pointer arguments) is at
  index 23 in this array, which is 0b11. For ARG_PNN it's 0b11011. The least
  significant 6 bits are for the size (3), the remaining 0b110 means the first
  argument is a pointer, rest of the two are non-pointers.

I still don't understand why this assertion

ASSERT(BITMAP_SIZE(bitmap) >= size);

I added to scavenge_small_bitmap in !2727 is failing though.

Ömer

Simon Peyton Jones , 24 Şub 2020 Pzt, 13:45
tarihinde şunu yazdı:
>
> I’m not following this in detail, but do please make sure that the results of 
> this discussion end up in a suitable Note.  Obviously it’s not transparently 
> clear as-is, and I can see clarity emerging
>
>
>
> Thanks!
>
>
> Simon
>
>
>
> From: ghc-devs  On Behalf Of Simon Marlow
> Sent: 24 February 2020 08:22
> To: Ömer Sinan Ağacan 
> Cc: ghc-devs 
> Subject: Re: Confused about PAP object layout
>
>
>
> On Thu, 20 Feb 2020 at 09:21, Ömer Sinan Ağacan  wrote:
>
> > I'm not sure what you mean by "garbage". The bitmap merely determines 
> > whether
> > a field is a pointer,
>
> I think the bitmap is for liveness, not for whether a field is pointer or not.
> Relevant code for building an info table for a function:
>
> mk_pieces (Fun arity (ArgGen arg_bits)) srt_label
>   = do { (liveness_lit, liveness_data) <- mkLivenessBits dflags arg_bits
>; let fun_type | null liveness_data = aRG_GEN
>   | otherwise  = aRG_GEN_BIG
>  extra_bits = [ packIntsCLit dflags fun_type arity ]
>++ (if inlineSRT dflags then [] else [ srt_lit ])
>++ [ liveness_lit, slow_entry ]
>; return (Nothing, Nothing, extra_bits, liveness_data) }
>
> This uses the word "liveness" rather than "pointers".
>
> However I just realized that the word "garbage" is still not the best way to
> describe what I'm trying to say. In the example
>
> [pap_info, x, y, z]
>
> If the function's bitmap is [1, 0, 1], then `y` may be a dead (an unused
> argument, or "garbage" as I describe in my previous email) OR it may be a
> non-pointer, but used (i.e. not a garbage).
>
>
>
> I don't think we ever put a zero in the bitmap for a pointer-but-not-used 
> argument. We don't do liveness analysis for function arguments, as far as I'm 
> aware. So a 0 in the bitmap always means "non-pointer".
>
>
>
> The only reaosn the code uses the terminology "liveness" here is that it's 
> sharing code with the code that handles bitmaps for stack frames, which do 
> deal with liveness.
>
>
>
> So maybe "liveness" is also not the best way to describe this bitmap, as 0 
> does
> not mean dead but rather "don't follow in GC".
>
>
> On my quest to understand and document this code better I have one more
> question. When generating info tables for functions with know argument 
> patterns
> (ArgSpec) we initialize the bitmap as 0. Relevant code:
>
> mk_pieces (Fun arity (ArgSpec fun_type)) srt_label
>   = do { let extra_bits = packIntsCLit dflags fun_type arity : srt_label
>; return (Nothing, Nothing,  extra_bits, []) }
>
> Here the last return value is for the liveness data. I don't understand how 
> can
> this be correct, because when we use this function in a PAP this will cause 
> NOT
> scavenging the PAP payload. Relevant code (simplified):
>
> STATIC_INLINE GNUC_ATTR_HOT StgPtr
> scavenge_PAP_payload (StgClosure *fun, StgClosure **payload, StgWord size)
> {
> const StgFunInfoTable *fun_info =
>   

RE: Confused about PAP object layout

2020-02-24 Thread Simon Peyton Jones via ghc-devs
I’m not following this in detail, but do please make sure that the results of 
this discussion end up in a suitable Note.  Obviously it’s not transparently 
clear as-is, and I can see clarity emerging

Thanks!

Simon

From: ghc-devs  On Behalf Of Simon Marlow
Sent: 24 February 2020 08:22
To: Ömer Sinan Ağacan 
Cc: ghc-devs 
Subject: Re: Confused about PAP object layout

On Thu, 20 Feb 2020 at 09:21, Ömer Sinan Ağacan 
mailto:omeraga...@gmail.com>> wrote:
> I'm not sure what you mean by "garbage". The bitmap merely determines whether
> a field is a pointer,

I think the bitmap is for liveness, not for whether a field is pointer or not.
Relevant code for building an info table for a function:

mk_pieces (Fun arity (ArgGen arg_bits)) srt_label
  = do { (liveness_lit, liveness_data) <- mkLivenessBits dflags arg_bits
   ; let fun_type | null liveness_data = aRG_GEN
  | otherwise  = aRG_GEN_BIG
 extra_bits = [ packIntsCLit dflags fun_type arity ]
   ++ (if inlineSRT dflags then [] else [ srt_lit ])
   ++ [ liveness_lit, slow_entry ]
   ; return (Nothing, Nothing, extra_bits, liveness_data) }

This uses the word "liveness" rather than "pointers".

However I just realized that the word "garbage" is still not the best way to
describe what I'm trying to say. In the example

[pap_info, x, y, z]

If the function's bitmap is [1, 0, 1], then `y` may be a dead (an unused
argument, or "garbage" as I describe in my previous email) OR it may be a
non-pointer, but used (i.e. not a garbage).

I don't think we ever put a zero in the bitmap for a pointer-but-not-used 
argument. We don't do liveness analysis for function arguments, as far as I'm 
aware. So a 0 in the bitmap always means "non-pointer".

The only reaosn the code uses the terminology "liveness" here is that it's 
sharing code with the code that handles bitmaps for stack frames, which do deal 
with liveness.

So maybe "liveness" is also not the best way to describe this bitmap, as 0 does
not mean dead but rather "don't follow in GC".

On my quest to understand and document this code better I have one more
question. When generating info tables for functions with know argument patterns
(ArgSpec) we initialize the bitmap as 0. Relevant code:

mk_pieces (Fun arity (ArgSpec fun_type)) srt_label
  = do { let extra_bits = packIntsCLit dflags fun_type arity : srt_label
   ; return (Nothing, Nothing,  extra_bits, []) }

Here the last return value is for the liveness data. I don't understand how can
this be correct, because when we use this function in a PAP this will cause NOT
scavenging the PAP payload. Relevant code (simplified):

STATIC_INLINE GNUC_ATTR_HOT StgPtr
scavenge_PAP_payload (StgClosure *fun, StgClosure **payload, StgWord size)
{
const StgFunInfoTable *fun_info =
get_fun_itbl(UNTAG_CONST_CLOSURE(fun));

StgPtr p = (StgPtr)payload;

StgWord bitmap;
switch (fun_info->f.fun_type) {
...
default:
bitmap = BITMAP_BITS(stg_arg_bitmaps[fun_info->f.fun_type]);
small_bitmap:
p = scavenge_small_bitmap(p, size, bitmap);
break;
}
return p;
}

Here if I have a function with three pointer args (ARG_PPP) the shown branch
that will be taken, but because the bitmap is 0 (as shown in the mk_pieces code
above) nothing in the PAPs payload will be scavenged.

It gets the bitmap from stg_arg_bitmaps[fun_info->f.fun_type], not from the 
info table.  Hope this helps.

Cheers
Simon



Here's an example from a debugging session:

>>> print pap
$10 = (StgPAP *) 0x42001fe030

>>> print *pap
$11 = {
  header = {
info = 0x7fbdd1f06640 
  },
  arity = 2,
  n_args = 1,
  fun = 0x7fbdd2d23ffb,
  payload = 0x42001fe048
}

So this PAP is applied one argument, which is a boxed object (a FUN_2_0):

>>> print *get_itbl(UNTAG_CLOSURE(pap->payload[0]))
$20 = {
  layout = {
payload = {
  ptrs = 2,
  nptrs = 0
},
bitmap = 2,
large_bitmap_offset = 2,
__pad_large_bitmap_offset = 2,
selector_offset = 2
  },
  type = 11,
  srt = 1914488,
  code = 0x7fbdd2b509c0 "H\215E\370L9\370r[I\203\304 M;\245X\003"
}

However if I look at the function of this PAP:

>>> print *get_fun_itbl(UNTAG_CLOSURE(pap->fun))
$21 = {
  f = {
slow_apply_offset = 16,
__pad_slow_apply_offset = 3135120895,
b = {
  bitmap = 74900193017889,
  bitmap_offset = 258342945,
  __pad_bitmap_offset = 258342945
},
fun_type = 23,
arity = 3
  },
  i = {
layout = {
 

Re: Confused about PAP object layout

2020-02-24 Thread Simon Marlow
On Thu, 20 Feb 2020 at 09:21, Ömer Sinan Ağacan 
wrote:

> > I'm not sure what you mean by "garbage". The bitmap merely determines
> whether
> > a field is a pointer,
>
> I think the bitmap is for liveness, not for whether a field is pointer or
> not.
> Relevant code for building an info table for a function:
>
> mk_pieces (Fun arity (ArgGen arg_bits)) srt_label
>   = do { (liveness_lit, liveness_data) <- mkLivenessBits dflags
> arg_bits
>; let fun_type | null liveness_data = aRG_GEN
>   | otherwise  = aRG_GEN_BIG
>  extra_bits = [ packIntsCLit dflags fun_type arity ]
>++ (if inlineSRT dflags then [] else [ srt_lit
> ])
>++ [ liveness_lit, slow_entry ]
>; return (Nothing, Nothing, extra_bits, liveness_data) }
>
> This uses the word "liveness" rather than "pointers".
>
> However I just realized that the word "garbage" is still not the best way
> to
> describe what I'm trying to say. In the example
>
> [pap_info, x, y, z]
>
> If the function's bitmap is [1, 0, 1], then `y` may be a dead (an unused
> argument, or "garbage" as I describe in my previous email) OR it may be a
> non-pointer, but used (i.e. not a garbage).
>

I don't think we ever put a zero in the bitmap for a pointer-but-not-used
argument. We don't do liveness analysis for function arguments, as far as
I'm aware. So a 0 in the bitmap always means "non-pointer".

The only reaosn the code uses the terminology "liveness" here is that it's
sharing code with the code that handles bitmaps for stack frames, which do
deal with liveness.


> So maybe "liveness" is also not the best way to describe this bitmap, as 0
> does
> not mean dead but rather "don't follow in GC".
>

> On my quest to understand and document this code better I have one more
> question. When generating info tables for functions with know argument
> patterns
> (ArgSpec) we initialize the bitmap as 0. Relevant code:
>
> mk_pieces (Fun arity (ArgSpec fun_type)) srt_label
>   = do { let extra_bits = packIntsCLit dflags fun_type arity :
> srt_label
>; return (Nothing, Nothing,  extra_bits, []) }
>
> Here the last return value is for the liveness data. I don't understand
> how can
> this be correct, because when we use this function in a PAP this will
> cause NOT
> scavenging the PAP payload. Relevant code (simplified):
>
> STATIC_INLINE GNUC_ATTR_HOT StgPtr
> scavenge_PAP_payload (StgClosure *fun, StgClosure **payload, StgWord
> size)
> {
> const StgFunInfoTable *fun_info =
> get_fun_itbl(UNTAG_CONST_CLOSURE(fun));
>
> StgPtr p = (StgPtr)payload;
>
> StgWord bitmap;
> switch (fun_info->f.fun_type) {
> ...
>
default:
> bitmap = BITMAP_BITS(stg_arg_bitmaps[fun_info->f.fun_type]);
> small_bitmap:
> p = scavenge_small_bitmap(p, size, bitmap);
> break;
> }
> return p;
> }
>

> Here if I have a function with three pointer args (ARG_PPP) the shown
> branch
> that will be taken, but because the bitmap is 0 (as shown in the mk_pieces
> code
> above) nothing in the PAPs payload will be scavenged.
>

It gets the bitmap from stg_arg_bitmaps[fun_info->f.fun_type], not from the
info table.  Hope this helps.

Cheers
Simon



>
> Here's an example from a debugging session:
>
> >>> print pap
> $10 = (StgPAP *) 0x42001fe030
>
> >>> print *pap
> $11 = {
>   header = {
> info = 0x7fbdd1f06640 
>   },
>   arity = 2,
>   n_args = 1,
>   fun = 0x7fbdd2d23ffb,
>   payload = 0x42001fe048
> }
>
> So this PAP is applied one argument, which is a boxed object (a FUN_2_0):
>
> >>> print *get_itbl(UNTAG_CLOSURE(pap->payload[0]))
> $20 = {
>   layout = {
> payload = {
>   ptrs = 2,
>   nptrs = 0
> },
> bitmap = 2,
> large_bitmap_offset = 2,
> __pad_large_bitmap_offset = 2,
> selector_offset = 2
>   },
>   type = 11,
>   srt = 1914488,
>   code = 0x7fbdd2b509c0 "H\215E\370L9\370r[I\203\304 M;\245X\003"
> }
>
> However if I look at the function of this PAP:
>
> >>> print *get_fun_itbl(UNTAG_CLOSURE(pap->fun))
> $21 = {
>   f = {
> slow_apply_offset = 16,
> __pad_slow_apply_offset = 3135120895,
> b = {
>   bitmap = 74900193017889,
>   bitmap_offset = 258342945,
>   __pad_bitmap_offset = 258342945
> },
> fun_type = 23,
> arity = 3
>   },
>   i = {
> layout = {
>   payload = {
> ptrs = 0,
> nptrs = 0
>   },
>   bitmap = 0,
>   large_bitmap_offset = 0,
>   __pad_large_bitmap_offset = 0,
>   selector_offset = 0
> },
> type = 14,
> srt = 1916288,
> code = 0x7fbdd2b50260 
> 

Re: Confused about PAP object layout

2020-02-20 Thread Ben Gamari
Ömer Sinan Ağacan  writes:

>> I'm not sure what you mean by "garbage". The bitmap merely determines whether
>> a field is a pointer,
>
> I think the bitmap is for liveness, not for whether a field is pointer or not.
> Relevant code for building an info table for a function:
>
> mk_pieces (Fun arity (ArgGen arg_bits)) srt_label
>   = do { (liveness_lit, liveness_data) <- mkLivenessBits dflags arg_bits
>; let fun_type | null liveness_data = aRG_GEN
>   | otherwise  = aRG_GEN_BIG
>  extra_bits = [ packIntsCLit dflags fun_type arity ]
>++ (if inlineSRT dflags then [] else [ srt_lit ])
>++ [ liveness_lit, slow_entry ]
>; return (Nothing, Nothing, extra_bits, liveness_data) }
>
> This uses the word "liveness" rather than "pointers".
>
> However I just realized that the word "garbage" is still not the best way to
> describe what I'm trying to say. In the example
>
> [pap_info, x, y, z]
>
> If the function's bitmap is [1, 0, 1], then `y` may be a dead (an unused
> argument, or "garbage" as I describe in my previous email) OR it may be a
> non-pointer, but used (i.e. not a garbage).
>
> So maybe "liveness" is also not the best way to describe this bitmap, as 0 
> does
> not mean dead but rather "don't follow in GC".
>
This is indeed my understanding; "not live" in this context really just
means "not a pointer traced by the GC". I agree that "liveness" is a
poor word for it. This is briefly described in a comment in
includes/rts/storage/InfoTables.h.

> On my quest to understand and document this code better I have one more
> question. When generating info tables for functions with known
> argument patterns (ArgSpec) we initialize the bitmap as 0. Relevant
> code:
>
> mk_pieces (Fun arity (ArgSpec fun_type)) srt_label
>   = do { let extra_bits = packIntsCLit dflags fun_type arity : srt_label
>; return (Nothing, Nothing,  extra_bits, []) }
>
> Here the last return value is for the liveness data. I don't understand how 
> can
> this be correct, because when we use this function in a PAP this will cause 
> NOT
> scavenging the PAP payload. Relevant code (simplified):
>
> STATIC_INLINE GNUC_ATTR_HOT StgPtr
> scavenge_PAP_payload (StgClosure *fun, StgClosure **payload, StgWord size)
> {
> const StgFunInfoTable *fun_info =
> get_fun_itbl(UNTAG_CONST_CLOSURE(fun));
>
> StgPtr p = (StgPtr)payload;
>
> StgWord bitmap;
> switch (fun_info->f.fun_type) {
> ...
> default:
> bitmap = BITMAP_BITS(stg_arg_bitmaps[fun_info->f.fun_type]);
> small_bitmap:
> p = scavenge_small_bitmap(p, size, bitmap);
> break;
> }
> return p;
> }
>
> Here if I have a function with three pointer args (ARG_PPP) the shown branch
> that will be taken, but because the bitmap is 0 (as shown in the mk_pieces 
> code
> above) nothing in the PAPs payload will be scavenged.
>
> Here's an example from a debugging session:
>
> >>> print pap
> $10 = (StgPAP *) 0x42001fe030
>
> >>> print *pap
> $11 = {
>   header = {
> info = 0x7fbdd1f06640 
>   },
>   arity = 2,
>   n_args = 1,
>   fun = 0x7fbdd2d23ffb,
>   payload = 0x42001fe048
> }
>
> So this PAP is applied one argument, which is a boxed object (a FUN_2_0):
>
> >>> print *get_itbl(UNTAG_CLOSURE(pap->payload[0]))
> $20 = {
>   layout = {
> payload = {
>   ptrs = 2,
>   nptrs = 0
> },
> bitmap = 2,
> large_bitmap_offset = 2,
> __pad_large_bitmap_offset = 2,
> selector_offset = 2
>   },
>   type = 11,
>   srt = 1914488,
>   code = 0x7fbdd2b509c0 "H\215E\370L9\370r[I\203\304 M;\245X\003"
> }
>
> However if I look at the function of this PAP:
>
> >>> print *get_fun_itbl(UNTAG_CLOSURE(pap->fun))
> $21 = {
>   f = {
> slow_apply_offset = 16,
> __pad_slow_apply_offset = 3135120895,
> b = {
>   bitmap = 74900193017889,
>   bitmap_offset = 258342945,
>   __pad_bitmap_offset = 258342945
> },
> fun_type = 23,
> arity = 3
>   },
>   i = {
> layout = {
>   payload = {
> ptrs = 0,
> nptrs = 0
>   },
>   bitmap = 0,
>   large_bitmap_offset = 0,
>   __pad_large_bitmap_offset = 0,
>   selector_offset = 0
> },
> type = 14,
> srt = 1916288,
> code = 0x7fbdd2b50260 
> "I\203\304(M;\245X\003"
>   }
> }
>
> It has arity 3. Since the first argument is a boxed object and this function 
> has
> arity 3, if the argument is actually live in the function (i.e. not an unused
> argument), then the bitmap should have a 1 for this. But because the argument
> pattern is known (ARG_PPP) we initialized 

Re: Confused about PAP object layout

2020-02-20 Thread Ömer Sinan Ağacan
> I'm not sure what you mean by "garbage". The bitmap merely determines whether
> a field is a pointer,

I think the bitmap is for liveness, not for whether a field is pointer or not.
Relevant code for building an info table for a function:

mk_pieces (Fun arity (ArgGen arg_bits)) srt_label
  = do { (liveness_lit, liveness_data) <- mkLivenessBits dflags arg_bits
   ; let fun_type | null liveness_data = aRG_GEN
  | otherwise  = aRG_GEN_BIG
 extra_bits = [ packIntsCLit dflags fun_type arity ]
   ++ (if inlineSRT dflags then [] else [ srt_lit ])
   ++ [ liveness_lit, slow_entry ]
   ; return (Nothing, Nothing, extra_bits, liveness_data) }

This uses the word "liveness" rather than "pointers".

However I just realized that the word "garbage" is still not the best way to
describe what I'm trying to say. In the example

[pap_info, x, y, z]

If the function's bitmap is [1, 0, 1], then `y` may be a dead (an unused
argument, or "garbage" as I describe in my previous email) OR it may be a
non-pointer, but used (i.e. not a garbage).

So maybe "liveness" is also not the best way to describe this bitmap, as 0 does
not mean dead but rather "don't follow in GC".

On my quest to understand and document this code better I have one more
question. When generating info tables for functions with know argument patterns
(ArgSpec) we initialize the bitmap as 0. Relevant code:

mk_pieces (Fun arity (ArgSpec fun_type)) srt_label
  = do { let extra_bits = packIntsCLit dflags fun_type arity : srt_label
   ; return (Nothing, Nothing,  extra_bits, []) }

Here the last return value is for the liveness data. I don't understand how can
this be correct, because when we use this function in a PAP this will cause NOT
scavenging the PAP payload. Relevant code (simplified):

STATIC_INLINE GNUC_ATTR_HOT StgPtr
scavenge_PAP_payload (StgClosure *fun, StgClosure **payload, StgWord size)
{
const StgFunInfoTable *fun_info =
get_fun_itbl(UNTAG_CONST_CLOSURE(fun));

StgPtr p = (StgPtr)payload;

StgWord bitmap;
switch (fun_info->f.fun_type) {
...
default:
bitmap = BITMAP_BITS(stg_arg_bitmaps[fun_info->f.fun_type]);
small_bitmap:
p = scavenge_small_bitmap(p, size, bitmap);
break;
}
return p;
}

Here if I have a function with three pointer args (ARG_PPP) the shown branch
that will be taken, but because the bitmap is 0 (as shown in the mk_pieces code
above) nothing in the PAPs payload will be scavenged.

Here's an example from a debugging session:

>>> print pap
$10 = (StgPAP *) 0x42001fe030

>>> print *pap
$11 = {
  header = {
info = 0x7fbdd1f06640 
  },
  arity = 2,
  n_args = 1,
  fun = 0x7fbdd2d23ffb,
  payload = 0x42001fe048
}

So this PAP is applied one argument, which is a boxed object (a FUN_2_0):

>>> print *get_itbl(UNTAG_CLOSURE(pap->payload[0]))
$20 = {
  layout = {
payload = {
  ptrs = 2,
  nptrs = 0
},
bitmap = 2,
large_bitmap_offset = 2,
__pad_large_bitmap_offset = 2,
selector_offset = 2
  },
  type = 11,
  srt = 1914488,
  code = 0x7fbdd2b509c0 "H\215E\370L9\370r[I\203\304 M;\245X\003"
}

However if I look at the function of this PAP:

>>> print *get_fun_itbl(UNTAG_CLOSURE(pap->fun))
$21 = {
  f = {
slow_apply_offset = 16,
__pad_slow_apply_offset = 3135120895,
b = {
  bitmap = 74900193017889,
  bitmap_offset = 258342945,
  __pad_bitmap_offset = 258342945
},
fun_type = 23,
arity = 3
  },
  i = {
layout = {
  payload = {
ptrs = 0,
nptrs = 0
  },
  bitmap = 0,
  large_bitmap_offset = 0,
  __pad_large_bitmap_offset = 0,
  selector_offset = 0
},
type = 14,
srt = 1916288,
code = 0x7fbdd2b50260 
"I\203\304(M;\245X\003"
  }
}

It has arity 3. Since the first argument is a boxed object and this function has
arity 3, if the argument is actually live in the function (i.e. not an unused
argument), then the bitmap should have a 1 for this. But because the argument
pattern is known (ARG_PPP) we initialized the bitmap as 0! Not sure how this
can work.

What am I missing?

Thanks,

Ömer

Ben Gamari , 14 Şub 2020 Cum, 20:25 tarihinde şunu yazdı:
>
> Ömer Sinan Ağacan  writes:
>
> > I think that makes sense, with the invariant that n_args <= bitmap_size. We
> > evacuate the arguments used by the function but not others. Thanks.
> >
> > It's somewhat weird to see an object with useful stuff, then garbage, then
> > useful stuff again in the heap, but that's not an issue by itself. For 
> > example
> > if I have something like
> >
> >  

Re: Confused about PAP object layout

2020-02-14 Thread Ben Gamari
Ömer Sinan Ağacan  writes:

> I think that makes sense, with the invariant that n_args <= bitmap_size. We
> evacuate the arguments used by the function but not others. Thanks.
>
> It's somewhat weird to see an object with useful stuff, then garbage, then
> useful stuff again in the heap, but that's not an issue by itself. For example
> if I have something like
>
> [pap_info, x, y, z]
>
> and according to the function `y` is dead, then after evacuating I get
>
> [pap_info, x, , z]
>
> This "garbage" is evacuated again and again every time we evacuate this PAP.
>
I'm not sure what you mean by "garbage". The bitmap merely determines
whether a field is a pointer, not whether it is copied during
evacuation. A field's bitmap bit not being set merely means that we won't
evacuate the value of that field during scavenging.

Nevertheless, this all deserves a comment in scavenge_PAP.

Cheers,

- Ben



signature.asc
Description: PGP signature
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Confused about PAP object layout

2020-02-14 Thread Ömer Sinan Ağacan
I think that makes sense, with the invariant that n_args <= bitmap_size. We
evacuate the arguments used by the function but not others. Thanks.

It's somewhat weird to see an object with useful stuff, then garbage, then
useful stuff again in the heap, but that's not an issue by itself. For example
if I have something like

[pap_info, x, y, z]

and according to the function `y` is dead, then after evacuating I get

[pap_info, x, , z]

This "garbage" is evacuated again and again every time we evacuate this PAP.

Ömer

Alexis King , 14 Şub 2020 Cum, 18:43 tarihinde
şunu yazdı:

>
> Disclaimer: I am not an expert. But I happened to have been looking at this 
> code just yesterday, so I’ll try to answer to check my understanding. :)
>
> Fundamentally, a PAP is not fully-saturated, so the number of arguments in 
> its payload may be smaller than the information contained in the function’s 
> bitmap. scavenge_large_bitmap calls walk_large_bitmap, which uses the bitmap 
> as a “ruler” to guide the traversal, lining up each element in the payload to 
> information in the bitmap. But the traversal only actually walks a payload of 
> the specified size, so if there’s less information in the payload than there 
> is information in the bitmap, the traversal will just terminate early.
>
> > On Feb 14, 2020, at 09:30, Ömer Sinan Ağacan  wrote:
> >
> > Right, I think that's the problem. We then pass the same "size" to
> > scavenge_large_bitmap as the size of the bitmap. So we assume size of the 
> > bitmap
> > is pap->n_args.
> >
> > So the call stack is
> >
> > - scavenge_PAP, calls scavenge_PAP_payload with pap->n_args as "size"
> > - scavenge_PAP_payload, calls scavenge_large_bitmap with "size" (== 
> > pap->n_args)
> >  as the bitmap's size
> >
> > Is this expected?
> >
> > Ömer
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Confused about PAP object layout

2020-02-14 Thread Ben Gamari
Ömer Sinan Ağacan  writes:

> Right, I think that's the problem. We then pass the same "size" to
> scavenge_large_bitmap as the size of the bitmap. So we assume size of the 
> bitmap
> is pap->n_args.
>
> So the call stack is
>
> - scavenge_PAP, calls scavenge_PAP_payload with pap->n_args as "size"
> - scavenge_PAP_payload, calls scavenge_large_bitmap with "size" (== 
> pap->n_args)
>   as the bitmap's size
>
> Is this expected?
>
Omer and I discussed this via IRC.

I believe that the intent here is that scavenge_PAP_payload scavenges
precisely the number of arguments that the PAP's stack fragment
includes, using the function's bitmap to do so (since we must know which
of this arguments are pointers).

Moreover, we know that pap->n_args is less or equal to than the size of
the bitmap (since otherwise this wouldn't be a partial application).
Consequently this is safe and correct.

Cheers,

- Ben


signature.asc
Description: PGP signature
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Confused about PAP object layout

2020-02-14 Thread Alexis King
Disclaimer: I am not an expert. But I happened to have been looking at this 
code just yesterday, so I’ll try to answer to check my understanding. :)

Fundamentally, a PAP is not fully-saturated, so the number of arguments in its 
payload may be smaller than the information contained in the function’s bitmap. 
scavenge_large_bitmap calls walk_large_bitmap, which uses the bitmap as a 
“ruler” to guide the traversal, lining up each element in the payload to 
information in the bitmap. But the traversal only actually walks a payload of 
the specified size, so if there’s less information in the payload than there is 
information in the bitmap, the traversal will just terminate early.

> On Feb 14, 2020, at 09:30, Ömer Sinan Ağacan  wrote:
> 
> Right, I think that's the problem. We then pass the same "size" to
> scavenge_large_bitmap as the size of the bitmap. So we assume size of the 
> bitmap
> is pap->n_args.
> 
> So the call stack is
> 
> - scavenge_PAP, calls scavenge_PAP_payload with pap->n_args as "size"
> - scavenge_PAP_payload, calls scavenge_large_bitmap with "size" (== 
> pap->n_args)
>  as the bitmap's size
> 
> Is this expected?
> 
> Ömer
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Confused about PAP object layout

2020-02-14 Thread Ömer Sinan Ağacan
Right, I think that's the problem. We then pass the same "size" to
scavenge_large_bitmap as the size of the bitmap. So we assume size of the bitmap
is pap->n_args.

So the call stack is

- scavenge_PAP, calls scavenge_PAP_payload with pap->n_args as "size"
- scavenge_PAP_payload, calls scavenge_large_bitmap with "size" (== pap->n_args)
  as the bitmap's size

Is this expected?

Ömer

Simon Marlow , 14 Şub 2020 Cum, 18:08 tarihinde şunu yazdı:
>
> On Fri, 14 Feb 2020 at 11:49, Ömer Sinan Ağacan  wrote:
>>
>> Hi Simon,
>>
>> In this code: (slightly simplified)
>>
>> StgPtr
>> scavenge_PAP (StgPAP *pap)
>> {
>> evacuate(>fun);
>> return scavenge_PAP_payload (pap->fun, pap->payload, pap->n_args);
>> }
>>
>> StgPtr
>> scavenge_PAP_payload (StgClosure *fun, StgClosure **payload, StgWord 
>> size)
>> {
>> const StgFunInfoTable *fun_info =
>> get_fun_itbl(UNTAG_CONST_CLOSURE(fun));
>> StgPtr p = (StgPtr)payload;
>>
>> switch (fun_info->f.fun_type) {
>> case ARG_GEN_BIG:
>> scavenge_large_bitmap(p, GET_FUN_LARGE_BITMAP(fun_info), size);
>> p += size;
>> break;
>> ...
>> }
>> return p;
>> }
>>
>> Here the `size` argument in `scavenge_PAP_payload` is the number of arguments
>> applied to the function in `pap->fun`. But when scavenging the function's 
>> bitmap
>> we're using this number as the size of the bitmap which doesn't make sense to
>> me, because I think size of the function's bitmap and size of the PAP's 
>> payload
>> may be different.
>
>
> "size" is an argument to scavenge_PAP_payload(), and when we call it we pass 
> pap->n_args as the value, not the bitmap's size.
>
> Does that help?
>
> Cheers
> Simon
>
>
>>
>>
>> Or in other words I may have the same function used in many PAPs with 
>> different
>> n_args, but that'd be buggy if this code is correct.
>>
>> I haven't checked every single place where we build a PAP but for example the
>> `NEW_PAP` macro uses the argument's (another PAP) function directly, without
>> making any bitmap-related changes, but bumps n_args by one. If the code 
>> above is
>> right, then this new PAP will be scavenged incorrectly.
>>
>> Am I missing anything?
>>
>> Thanks,
>>
>> Ömer
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Confused about PAP object layout

2020-02-14 Thread Simon Marlow
On Fri, 14 Feb 2020 at 11:49, Ömer Sinan Ağacan 
wrote:

> Hi Simon,
>
> In this code: (slightly simplified)
>
> StgPtr
> scavenge_PAP (StgPAP *pap)
> {
> evacuate(>fun);
> return scavenge_PAP_payload (pap->fun, pap->payload, pap->n_args);
> }
>
> StgPtr
> scavenge_PAP_payload (StgClosure *fun, StgClosure **payload, StgWord
> size)
> {
> const StgFunInfoTable *fun_info =
> get_fun_itbl(UNTAG_CONST_CLOSURE(fun));
> StgPtr p = (StgPtr)payload;
>
> switch (fun_info->f.fun_type) {
> case ARG_GEN_BIG:
> scavenge_large_bitmap(p, GET_FUN_LARGE_BITMAP(fun_info), size);
> p += size;
> break;
> ...
> }
> return p;
> }
>
> Here the `size` argument in `scavenge_PAP_payload` is the number of
> arguments
> applied to the function in `pap->fun`. But when scavenging the function's
> bitmap
> we're using this number as the size of the bitmap which doesn't make sense
> to
> me, because I think size of the function's bitmap and size of the PAP's
> payload
> may be different.
>

"size" is an argument to scavenge_PAP_payload(), and when we call it we
pass pap->n_args as the value, not the bitmap's size.

Does that help?

Cheers
Simon



>
> Or in other words I may have the same function used in many PAPs with
> different
> n_args, but that'd be buggy if this code is correct.
>
> I haven't checked every single place where we build a PAP but for example
> the
> `NEW_PAP` macro uses the argument's (another PAP) function directly,
> without
> making any bitmap-related changes, but bumps n_args by one. If the code
> above is
> right, then this new PAP will be scavenged incorrectly.
>
> Am I missing anything?
>
> Thanks,
>
> Ömer
>
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Confused about PAP object layout

2020-02-14 Thread Ömer Sinan Ağacan
Hi Simon,

In this code: (slightly simplified)

StgPtr
scavenge_PAP (StgPAP *pap)
{
evacuate(>fun);
return scavenge_PAP_payload (pap->fun, pap->payload, pap->n_args);
}

StgPtr
scavenge_PAP_payload (StgClosure *fun, StgClosure **payload, StgWord size)
{
const StgFunInfoTable *fun_info =
get_fun_itbl(UNTAG_CONST_CLOSURE(fun));
StgPtr p = (StgPtr)payload;

switch (fun_info->f.fun_type) {
case ARG_GEN_BIG:
scavenge_large_bitmap(p, GET_FUN_LARGE_BITMAP(fun_info), size);
p += size;
break;
...
}
return p;
}

Here the `size` argument in `scavenge_PAP_payload` is the number of arguments
applied to the function in `pap->fun`. But when scavenging the function's bitmap
we're using this number as the size of the bitmap which doesn't make sense to
me, because I think size of the function's bitmap and size of the PAP's payload
may be different.

Or in other words I may have the same function used in many PAPs with different
n_args, but that'd be buggy if this code is correct.

I haven't checked every single place where we build a PAP but for example the
`NEW_PAP` macro uses the argument's (another PAP) function directly, without
making any bitmap-related changes, but bumps n_args by one. If the code above is
right, then this new PAP will be scavenged incorrectly.

Am I missing anything?

Thanks,

Ömer
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs