The latest specification has replaced (unbox T) with (deref T) and
(box T) with (ref T). I will be using the new notation.

An implementation issue has come up in the situation where a
parameterized type is instantiated to an unboxed reference type, or is
directly unboxed within a type declaration. This is purely an issue of
pragmatics, not an issue of semantics.


Suppose I define the type "list 'a" in the usual way in BitC:

  (define (list 'a) (cons 'a (list 'a)) nil)

and I am then handed an expression like:

  (cons (deref (tuple #\c 5)) nil)

As we originally envisioned this (back when "deref" was called "unbox"),
this would have meant that the tuple cell consisting of

+-----+----+
| #\c |  5 |
+-----+----+

would have been "inlined" into the containing union type to give

+----------+-----+---+---+
| cons-tag | #\c | 5 | * |
+----------+-----+---+-|-+
                       |
                       v
                       +---------+
                       | nil-tag |
                       +---------+

    [Yes, before someone asks, I am aware of Luca Cardelli's
     representation optimization to get rid of the tag in this 
     example in ML. It doesn't work in BitC, which is what I'm 
     trying to get at here.]

However, this is potentially unfortunate, because code like

(lambda (e : (list 'a)
  (case e
    (cons hd tl) ( ... expression using hd and tl ... )
    ...))

is now very hard to compile, because the offset of 'tl' is not known
unless we can determine the size of the type 'a. Since there may not be
any single concrete type for 'a, we are forced into variant compilation
here, which in turn forces us to inject a hidden "type parameter" into
the lambda arguments. This is a very well known problem with size-based
polymorphic specialization.

What we noticed today is that there are actually three separable
statements being made when one writes "(deref T)"

1. A statement about copy and reference semantics. If the (deref T) is
contained inside some container object, it is copied exactly when the
container is copied, and no escaping reference to the member can be
permitted to be captured.

2. A statement about storage contiguity. When we write

  (tuple (deref 'a) 'b)

we are saying that the storage for the tuple pair, including any
necessary storage for the first member, will be allocated as a
contiguous block in memory. This becomes particularly important when the
example is extended to

  (deref (tuple (deref 'a) 'b))

where the contiguity rule has the effect of ensuring that the first
element gets stack allocated exactly if the containing tuple gets stack
allocated.

3. A statement about elimination of indirection. Namely, that if you
already have a pointer to the outer tuple, no  further pointer
dereferences will be required to access a top level element of any
contained unboxed object.


Ignoring issues of precise structure representation, we could
conceivably choose to satisfy the "contiguity" rule but NOT eliminate
the indirection. This would not cause any semantically observable
difference in behavior. In this representation, the CONS cell above
would look like:

             +-------+
             |       |
             |       v
+----------+-|-+---+-----+---+
| cons-tag | * | * | #\c | 5 |
+----------+---+-|-+-----+---+
                 |
                 v
                 +---------+
                 | nil-tag |
                 +---------+

That is, the representation state of unboxed elements is *appended* to
the original object, and the pointers are left in place.


This must surely seem like lunacy, but note that the compiler is faced
with a set of bad choices:

  1. Eliminate the pointer indirection, at the cost of requiring that
     procedure code generation becomes complicated because of the need
     for runtime case analysis of offsets, or

  2. Eliminate the case analysis by leaving the pointers intact, at
     the cost of greater indirection overhead.

If we somehow knew by magic that the unboxed element was only going to
have one concrete type, the decision would be easy. Unfortunately, this
is possible only in two situations:

  1. If the unboxed type is a concrete type,
  2. If, under whole program compilation, it is later discovered that
     an unboxed alpha type is only instantiated one way.

Given these two statements, it seems very clear that concrete types can
be inlined without any complexity cost, and therefore should be. The
question now becomes: what should the representation be for unboxed
alpha types?

It appears that the right decision is usage dependent. In some cases the
performance cost of indirection will be the primary concern. In other
cases the lost instruction cache density resulting from procedure
specialization will be more important.

If separate compilation is a goal, inlined unboxed alpha types can only
be implemented by adding a (potentially large number) of hidden size
parameters at procedure interfaces. We strongly suspect that the cost of
these additional parameters very much outweighs the cost of a pointer
indirection to appended data that will often turn out to be cache-hot.

Further, we observe that concerns about exact representation control is
kind of silly when you aren't in a position to say what the concrete
types are in the first place. The cases where representation control
appears to be important all involve concrete types.

Given which, we are contemplating the following pragmatics rules for
handling of "unboxed" types:

1. Given a member element of type (deref T), where T is some concrete
type, the representation state of the member element will be directly
inlined into its containing object.

2. Given a member element of type (deref T) where T is some
parameterized type, the representation state of the member element will
be allocated contiguously with its containing object, but the
representation will NOT be inlined into the container. This preserves
the essential objective of controlling heap vs. stack allocation without
introducing unexpected and/or unpredictable polyinstantiation of
polymorphic procedures.


A final, pleasant aside to this is that if we adopt this rule it is once
again possible to use Cardelli's representation optimization to ensure
that the union tag can be dropped for types like (list 'a) and
(option 'a).


Reactions?


shap

_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev

Reply via email to