Re: [bitc-dev] Interior objects - round 2

Jonathan S. Shapiro Mon, 04 Nov 2013 10:19:39 -0800

Bennie is asking how frequently interior pointers are likely to be used.
It's a good question. It's hard to answer, because we don't actually have
the feature available, so nobody can code with it. What we *can* say is
that it's a pattern that is used a lot in systems codes.

I wouldn't say it's common, exactly, but there are a lot of patterns in
systems codes that go "well, I know that this thing is a field of that
thing, and it *only* lives in that thing, and at a known offset". Use
cases for this arise from legacy data structures, but more commonly from
linked lists, queues (which are really linked lists), and custom allocators.

*Embedded Links*

The example from the Coyotos kernel would be in the
ObjectHeader<http://dev.eros-os.com/hg/coyotos/trunk/file/tip/src/sys/kerninc/ObjectHeader.h>structure.
It is the first element of the
MemHeader<http://dev.eros-os.com/hg/coyotos/trunk/file/tip/src/sys/kerninc/ObjectHeader.h#l149>and
Process<http://dev.eros-os.com/hg/coyotos/trunk/file/tip/src/sys/kerninc/Process.h>structures.
MemHeader is used in bare form as the metadata structure for
pages. It is also the leading substructure of the
GPT<http://dev.eros-os.com/hg/coyotos/trunk/file/tip/src/sys/kerninc/GPT.h>data
structure. The first member of ObjectHeader is
Ageable<http://dev.eros-os.com/hg/coyotos/trunk/file/tip/src/sys/kerninc/AgeList.h>,
which in turn has a
Link<http://dev.eros-os.com/hg/coyotos/trunk/file/tip/src/sys/kerninc/Link.h>as
it's first member, by which every object ends up on exactly one age
list
at all times.

If an object needs to live on just *one* linked list, you might imagine
that you could move it to the front of the object, use single inheritance,
and have done. That's what we did with the Ageable structure in Coyotos.
The age lists in Coyotos are partitioned by type: Process structures are on
a different age queue than GPT structures. That lets us link the objects as
if they were of type Link, but remove them as type Process (or whatever).
Which gives us shared code but weak typing. We could hypothetically
preserve that shared code by introducing a new kind of pointer type (I'm
making up a syntax):

P.age

Meaning "for some reference type P, the type of an interior pointer to the
age field of P". So this is a *statically typed* interior pointer. Those
are a very minor nuisance for GC, but they are *only* a minor nuisance as
long as they are statically typed. , Note that this type is a subtype of
Ageable. In fact, we could extend this in the obvious way to:

P.age.link

and arrive at a reference that is of union type (P | typeof(P.age) <
Ageable | typeof(P.age.link) < Link) and apply the appropriate offsets
during cast operations. I'm actually not sure if we want to think of this
as a union type or as a new and different sort of thing (containment
typing?).

So if we could express this type, we could say "well, I want a linked list
running through the age field of the process structure, so I'll make the
link structure parametric over subtypes of Link and then instantiate one
over this P.age type". And we could do the same thing for the
queue_link<http://dev.eros-os.com/hg/coyotos/trunk/file/tip/src/sys/kerninc/Process.h#l234>field
of the Process structure. The two link structures can't both live at
the front of the object simultaneously, but with static typing that doesn't
matter.

Not so fast. In both cases, the linked lists are circular. Which is fine,
except that the link element at the head of the age list isn't stored in a
Process, so the linked list elements actually *are* heterogeneously typed -
because the head element is an exceptional case. It's important, in this
data structure, that the head element is unique, because that's what lets
us get away with treating it exceptionally. And I don't know how to
statically type that.

In these two cases, however, we can redesign the data structure. The
objects could be linked circularly, and the link head could point at the
first member of the list. This works because given the object, we know how
to find the list. If we do that, then the entire typing problem goes away.
In fact, the only reason containment typing is of interest at that point is
to avoid some constant offset computations at run time. Well, except for
the fact that they might by us some implementation-level code reuse.

So how useful is this to general programs? I really don't know, though I
think it might reduce the storage required for certain kinds of
collections. It's a common idiom in graphics toolkits. How important is it
in things like kernels and systems programs? Really important. Read on to
get a sense of how constrained we are for space in a kernel. In brief, *every
word counts*.

*Custom Allocators*

Bennie asked whether building a custom allocator from an array was
important. The answer is "yes and no". First, let me give a sense of just
how tight for space we are in a kernel to give you a sense of the problem.

In Coyotos, we use a bare MemHeader structure for every real page frame. On
x86, the Coyotos kernel gets 0.5G of virtual space (total), but the machine
can have up to 64G of real physical memory. Pages are 4KB, so if the
MemHeader structure is 64 bytes, and we need one per frame, that occupies
(2^36 / 2^12) * 2^6 = 2^30 = 1G of memory. Our actual structure is around
48 bytes, which lets us support up to 16G of physical ram in that kernel
configuration. And looking at the data structure I see that we have
something out of position and I should move a field.

Right now, MemHeader structures are allocated from an array. If we switched
to an array of* references* to MemHeaders, we would effectively add a word
to the cost of every MemHeader. We simply don't *have* an additional word
for these structures.

What we *could *do, I suppose, is move the array notion into the allocator
and deal with it there as an unsafe operation. In a kernel that would be
okay, because it's very well localized. We need some unsafe operations down
there in any case, because that's the layer where we are converting raw
memory to object storage.

The thing to note here, I suppose, is that that containment typing trick
doesn't work in this case. We aren't dealing with a particular, known
element.

shap

_______________________________________________
bitc-dev mailing list
bitc-dev@coyotos.org
http://www.coyotos.org/mailman/listinfo/bitc-dev

Re: [bitc-dev] Interior objects - round 2

Reply via email to