On Tue, Oct 15, 2013 at 10:53 PM, Jonathan S. Shapiro <[email protected]> wrote:

> On Mon, Oct 14, 2013 at 10:09 PM, Bennie Kloosteman <[email protected]> wrote:
>
>> Yep, and if 50% of objects are value types and 50% of the ref
>> objects/actions are in enclosed regions, there should only be 25% of the
>> ref counts ... the 12% basic interlocked ref count system cost becomes 3%.
>
> A value type is simply a type that is pass-by-value rather than
> pass-by-reference. I have the sense that you are using the term "value
> type" here to mean "a type that does not have outbound references". I have
> been referring to these as "reference-free types".

I meant that objects passed by value (in C#) are not heap objects (unless
boxed, as you mention later) and hence would not have a reference count to
them, e.g. Point[2000] in C# vs Java.

> A value type can live on the heap, and when it does, it will be "wrapped"
> by a conventional object header. If you like, you can imagine that for
> every value type V there is a corresponding reference type V_ref, and the
> two are assignment compatible by dispensation.

Yes, but that would be poor design in a ref counted language: you would
often put many in a single object, try to use regions and the stack more,
etc.

> The proportion of reference-free types in the heap is unknown (at least to
> me), so it's hard to make sensible predictions based on that. Even
> reference-free types can have virtual functions and run-time type
> information, and *those* require something very like a reference. So the
> real beauty of a reference-free type is that none of its references are
> *mutable*. This means that no "store reference to slot" operation will
> occur on these types, and in consequence a reference type never needs to
> be logged by the reference counting scheme.

Agreed, immutability is important; also note the string implementation as
an array reference vs an embedded char[]. Ref counting has huge
implications on design; it is going to be hard getting a standard library
that works well with both ref counting and a GC.

> Admitting unboxed types in the language/runtime has several consequences:
>
> 1. Objects, on average, tend to be slightly larger. The actual impact on
> object size demographics is something I don't know.
> 2. The number of roots on the stack goes up, because (a) locals are
> bigger, and (b) procedure calls make copies of value types, some of which
> contain references. The larger number of roots *may* be offset by having
> fewer objects to mark in the heap (because, in effect, one layer of
> indirection has been unboxed).
> 3. I suspect that the average size of a logged object grows (more
> importantly: the average number of references per logged object).
>
> Statement 3 is a belief that I haven't seen examined in the literature. My
> theory is that larger objects have more semantic "weight" in the program,
> and are therefore more likely to be the focus of mutations. This could be
> totally wrong. What *is* clearly true is that the efficiency of logging
> for deferred reference counting is a function of the weighted average
> number of references per logged object. It seems likely that unboxed types
> will tend to make that number go up.

True, and it is hard to see. I think if you didn't have value types,
larger objects would be worse, but with good value type support the total
will be less. I have done a lot of work with Domain Driven Design
recently, which tends to have larger opaque objects (Aggregates); it is
very interesting, but those are large objects and would not decrease the
ref count.
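As a rough illustration of the Point[2000] point above (in C++ rather than
C#, with shared_ptr standing in for a per-object counted reference, so
purely a sketch):

    #include <memory>
    #include <vector>

    struct Point { int x, y; };   // plain value type: no header, no count

    // Value-type layout: one contiguous block, no per-element header and
    // no per-element reference count to maintain.
    std::vector<Point> points_by_value() {
        return std::vector<Point>(2000);
    }

    // Reference layout: 2000 separate heap objects, each dragging its own
    // count along (shared_ptr standing in), plus 2000 references for the
    // counter / collector to track.
    std::vector<std::shared_ptr<Point>> points_by_ref() {
        std::vector<std::shared_ptr<Point>> v;
        v.reserve(2000);
        for (int i = 0; i < 2000; ++i)
            v.push_back(std::make_shared<Point>());
        return v;
    }

Roughly 2000 * 8 bytes in the first case, vs 2000 headers/counts plus 2000
references in the second.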
A lot of managed architecture is driven by the GC; we need to be careful we
don't fall into a trap like micro kernels running monolithic kernel apps.
Designs will change over time and the stdlib should allow that to happen.
Another example: C# value types are avoided because they get boxed; a
better implementation would not have this issue.

>> When I wrote that I was thinking a reference could be an 11-12 byte
>> structure with the pointer (or possibly masked high bits in a 64 bit
>> pointer, or a 32 bit pointer as an option on a 64 bit machine) and some
>> flags (freeze / release immutable) and a counter, so you don't have to
>> have a header.
>
> Adding size to the reference is far worse than having an object header.
> Unless you can play mapping games to implement lazy masking, masking is
> also expensive. On some machines you can use the VM system for masking,
> because some virtual caches aren't physically anti-aliased.

I'm not convinced of that: ref counting in C++ does exactly that, and the
basic C++ implementation is much faster than Java using bits in the header
(let alone a new header field). I think the cost of the object header is
greater than we think (well, than I thought anyway), especially after
Shahriyar measured the cost of adding one field. It may well be that GCs
are faster, but we lose a lot in having the header (though obviously you
can't have one without the other). I'm aware masking is expensive (though
it can be reduced), but I don't think we really know which of these is
faster:

- add 1 byte to the pointer
- mask 64 bit pointers
- 32 bit pointer in a 64 bit reference on a 64 bit machine
- header
- VM masking

I'm pretty sure the fastest will be the 32 bit pointer on a 64 bit machine,
but it only handles say half the cases. I don't know what is faster after
that (I can only make some guesses), but from C++ I don't think it's the
header.

As for memory cost, the header adds it to every object: if you have 1M
4 byte objects, with an 8 byte header you have 12M instead of 4M (or 24M on
the CLR), whereas fat references add it only to objects with references.
(So it comes down to the % of reference-free types, which we also don't
know; we do know small objects are very common, so I would not be surprised
to see a 20% heap reduction.) On modern HW, reducing memory and adding some
CPU is normally a good trade-off. Small strings/objects, and creating them,
are pretty common; I would say that's a significant %.

The paper showed that adding one dummy 32 bit field to the header, with no
other changes, decreased *Java* performance by 3-4% ("Down for the Count?
Getting Reference Counting Back in the Ring", end of section 4.1). So what
is the cost of two 32 bit fields plus the header overhead itself: 7-10%?
Purely speculating, you may find the header costs 6%, adding the mask 2%,
and adding a field to 32 bit pointers 7%. In which case dropping the header
would give 4% on 64 bit systems fairly cheaply, well worth the 1% drop on
32 bit systems. Kind of important to know and to have some concrete
figures... We could get some idea by taking C++ benchmarks, adding a field
to every object, and measuring the cost vs the fat pointer mechanisms, and
maybe doing manual ref counts. You can also see the memory overhead.

> Masked pointers are *sometimes* viable on 64-bit machines, but they are
> problematic on 32-bit machines, and you still would need an object header
> for forwarding.

Not for URC, and there should be other ways to do forwarding.

Ben
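P.S. A minimal C++ sketch of the fat-reference layout I was describing,
just to make the sizes concrete. The field widths and flag bits are guesses
for illustration, and keeping the counter coherent across copies of a
reference is a separate problem I'm not addressing here:

    #include <cstdint>

    #pragma pack(push, 1)
    struct FatRef {
        std::uint64_t ptr;     // object address; could instead be a 32 bit
                               // pointer on a 64 bit machine, or carry the
                               // flags in masked high bits
        std::uint16_t flags;   // e.g. freeze / release-immutable bits
        std::uint16_t count;   // counter carried with the reference, so the
                               // object itself needs no header
    };
    #pragma pack(pop)

    static_assert(sizeof(FatRef) == 12,
                  "roughly the 11-12 byte reference discussed above");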
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
