So the key points from this thread are: - PAP payloads are scavenged using the function's bitmap. Because a PAPs payload will have less number of closures than the function's arity the bitmap will always have enough bits.
- A bit in a function bitmap is NOT for liveness (e.g. does not indicate whether an argument used or not), but for pointers vs. non-pointers. Function bitmaps are called "liveness bits" in the code generator which is misleading. - In a function bitmap (small or large), 0 means pointer, 1 means non-pointer. This is really what confused me in my last email above. For some reason I intuitively expected 1 to mean pointer, not 0. Simon M also got this wrong ("So a 0 in the bitmap always means non-pointer.") so maybe this is confusing to others too. - For functions with known argument patterns we don't use the function's bitmap. These function's type are greater than ARG_BCO (2), and for those we use the stg_arg_bitmaps array to get the bitmap. For example, the bitmap for ARG_PPP (function with 3 pointer arguments) is at index 23 in this array, which is 0b11. For ARG_PNN it's 0b110000011. The least significant 6 bits are for the size (3), the remaining 0b110 means the first argument is a pointer, rest of the two are non-pointers. I still don't understand why this assertion ASSERT(BITMAP_SIZE(bitmap) >= size); I added to scavenge_small_bitmap in !2727 is failing though. Ömer Simon Peyton Jones <simo...@microsoft.com>, 24 Şub 2020 Pzt, 13:45 tarihinde şunu yazdı: > > I’m not following this in detail, but do please make sure that the results of > this discussion end up in a suitable Note. Obviously it’s not transparently > clear as-is, and I can see clarity emerging > > > > Thanks! > > > Simon > > > > From: ghc-devs <ghc-devs-boun...@haskell.org> On Behalf Of Simon Marlow > Sent: 24 February 2020 08:22 > To: Ömer Sinan Ağacan <omeraga...@gmail.com> > Cc: ghc-devs <ghc-devs@haskell.org> > Subject: Re: Confused about PAP object layout > > > > On Thu, 20 Feb 2020 at 09:21, Ömer Sinan Ağacan <omeraga...@gmail.com> wrote: > > > I'm not sure what you mean by "garbage". The bitmap merely determines > > whether > > a field is a pointer, > > I think the bitmap is for liveness, not for whether a field is pointer or not. > Relevant code for building an info table for a function: > > mk_pieces (Fun arity (ArgGen arg_bits)) srt_label > = do { (liveness_lit, liveness_data) <- mkLivenessBits dflags arg_bits > ; let fun_type | null liveness_data = aRG_GEN > | otherwise = aRG_GEN_BIG > extra_bits = [ packIntsCLit dflags fun_type arity ] > ++ (if inlineSRT dflags then [] else [ srt_lit ]) > ++ [ liveness_lit, slow_entry ] > ; return (Nothing, Nothing, extra_bits, liveness_data) } > > This uses the word "liveness" rather than "pointers". > > However I just realized that the word "garbage" is still not the best way to > describe what I'm trying to say. In the example > > [pap_info, x, y, z] > > If the function's bitmap is [1, 0, 1], then `y` may be a dead (an unused > argument, or "garbage" as I describe in my previous email) OR it may be a > non-pointer, but used (i.e. not a garbage). > > > > I don't think we ever put a zero in the bitmap for a pointer-but-not-used > argument. We don't do liveness analysis for function arguments, as far as I'm > aware. So a 0 in the bitmap always means "non-pointer". > > > > The only reaosn the code uses the terminology "liveness" here is that it's > sharing code with the code that handles bitmaps for stack frames, which do > deal with liveness. > > > > So maybe "liveness" is also not the best way to describe this bitmap, as 0 > does > not mean dead but rather "don't follow in GC". > > > On my quest to understand and document this code better I have one more > question. When generating info tables for functions with know argument > patterns > (ArgSpec) we initialize the bitmap as 0. Relevant code: > > mk_pieces (Fun arity (ArgSpec fun_type)) srt_label > = do { let extra_bits = packIntsCLit dflags fun_type arity : srt_label > ; return (Nothing, Nothing, extra_bits, []) } > > Here the last return value is for the liveness data. I don't understand how > can > this be correct, because when we use this function in a PAP this will cause > NOT > scavenging the PAP payload. Relevant code (simplified): > > STATIC_INLINE GNUC_ATTR_HOT StgPtr > scavenge_PAP_payload (StgClosure *fun, StgClosure **payload, StgWord size) > { > const StgFunInfoTable *fun_info = > get_fun_itbl(UNTAG_CONST_CLOSURE(fun)); > > StgPtr p = (StgPtr)payload; > > StgWord bitmap; > switch (fun_info->f.fun_type) { > ... > > default: > bitmap = BITMAP_BITS(stg_arg_bitmaps[fun_info->f.fun_type]); > small_bitmap: > p = scavenge_small_bitmap(p, size, bitmap); > break; > } > return p; > } > > > Here if I have a function with three pointer args (ARG_PPP) the shown branch > that will be taken, but because the bitmap is 0 (as shown in the mk_pieces > code > above) nothing in the PAPs payload will be scavenged. > > > > It gets the bitmap from stg_arg_bitmaps[fun_info->f.fun_type], not from the > info table. Hope this helps. > > > > Cheers > > Simon > > > > > > > Here's an example from a debugging session: > > >>> print pap > $10 = (StgPAP *) 0x42001fe030 > > >>> print *pap > $11 = { > header = { > info = 0x7fbdd1f06640 <stg_PAP_info> > }, > arity = 2, > n_args = 1, > fun = 0x7fbdd2d23ffb, > payload = 0x42001fe048 > } > > So this PAP is applied one argument, which is a boxed object (a FUN_2_0): > > >>> print *get_itbl(UNTAG_CLOSURE(pap->payload[0])) > $20 = { > layout = { > payload = { > ptrs = 2, > nptrs = 0 > }, > bitmap = 2, > large_bitmap_offset = 2, > __pad_large_bitmap_offset = 2, > selector_offset = 2 > }, > type = 11, > srt = 1914488, > code = 0x7fbdd2b509c0 "H\215E\370L9\370r[I\203\304 M;\245X\003" > } > > However if I look at the function of this PAP: > > >>> print *get_fun_itbl(UNTAG_CLOSURE(pap->fun)) > $21 = { > f = { > slow_apply_offset = 16, > __pad_slow_apply_offset = 3135120895, > b = { > bitmap = 74900193017889, > bitmap_offset = 258342945, > __pad_bitmap_offset = 258342945 > }, > fun_type = 23, > arity = 3 > }, > i = { > layout = { > payload = { > ptrs = 0, > nptrs = 0 > }, > bitmap = 0, > large_bitmap_offset = 0, > __pad_large_bitmap_offset = 0, > selector_offset = 0 > }, > type = 14, > srt = 1916288, > code = 0x7fbdd2b50260 <base_GHCziRead_list3_info> > "I\203\304(M;\245X\003" > } > } > > It has arity 3. Since the first argument is a boxed object and this function > has > arity 3, if the argument is actually live in the function (i.e. not an unused > argument), then the bitmap should have a 1 for this. But because the argument > pattern is known (ARG_PPP) we initialized the bitmap as 0! Not sure how this > can work. > > What am I missing? > > Thanks, > > Ömer > > Ben Gamari <b...@smart-cactus.org>, 14 Şub 2020 Cum, 20:25 tarihinde şunu > yazdı: > > > > Ömer Sinan Ağacan <omeraga...@gmail.com> writes: > > > > > I think that makes sense, with the invariant that n_args <= bitmap_size. > > > We > > > evacuate the arguments used by the function but not others. Thanks. > > > > > > It's somewhat weird to see an object with useful stuff, then garbage, then > > > useful stuff again in the heap, but that's not an issue by itself. For > > > example > > > if I have something like > > > > > > [pap_info, x, y, z] > > > > > > and according to the function `y` is dead, then after evacuating I get > > > > > > [pap_info, x, <garbage>, z] > > > > > > This "garbage" is evacuated again and again every time we evacuate this > > > PAP. > > > > > I'm not sure what you mean by "garbage". The bitmap merely determines > > whether a field is a pointer, not whether it is copied during > > evacuation. A field's bitmap bit not being set merely means that we won't > > evacuate the value of that field during scavenging. > > > > Nevertheless, this all deserves a comment in scavenge_PAP. > > > > Cheers, > > > > - Ben > > _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs