On Tue, 2006-08-01 at 12:41 +1000, skaller wrote:
> What are the desired semantics of a variable length array?

(What, no comments? Hmm)

This problem turns out to be deeper. At the moment, the collector
shape object provides one polymorphic method for objects:
a destructor.

The collector also supports mobility: some objects can be moved,
if they're allocated in an arena. The requires 

(a) physically copying the object from address A to B
(b) adjusting every pointer to A to point to B

The collector can manage these steps by itself ..
ON THE ASSUMPTION objects can be moved (the client has
to flag them immobile if they can't be: the RTS supports
this, but there's no language feature to state it because
Felix objects are intrinsically mobile, provided their
contained primitives are.

The simplest way to move a C++ object is to copy it,
then destroy the source location.

A diversion: C++ supports default initialisation,
copy assignment, copy construction, and destruction.
These are core polymorphic operations: in traditional
FPL's pointers are used, and so a single set of these
operations on pointers suffices for all objects:
this is boxing.

Now back: there are two ways to copy most objects at the moment:

(a) bitwise copies
(b) just don't!

Some objects -- namely stack frames -- also support a clone()
method.

So right now, in principle, the arena allocator is unsound,
because it just bitwise copies things.

In fact .. this is the wrong idea, since it actually needs
to MOVE objects, which is not the same thing at all,
and certainly not always the same as copy/destroy. Some objects
can be moved by NOT copied, for example in a binary tree with
parent pointers you can move the root node provided the child
node's parent pointers are adjusted .. but you can't COPY
the root node (because children only have one parent).

So as well as the usual C++ operations, we also need MOVE,
which some people in the C++ community also want -- in fact
the current 'autoptr' has move semantics (but there is no 
sane syntax for it).

Now .. for arrays where we are reallocating the linear
store required, we're going to have to move objects.
However it isn't a usual kind of move! The realloc()
call already physically moves the objects, so we're
not moving from A to B as such -- the concrete object
is already moved, we simply have to adjust the object
in place (and things referring to it!).

This is not a MOVE operation, but a REMAP which you can
also effect with mmap() and friends.

Note the problem here isn't so much moving Felix objects
and adjusting Felix pointers .. the gc can do that.
The problem is C/C++ primitives, where the pointers
are hidden.

Another example: a string. Although you can move a string
by copying it and destroying the source .. it's very
inefficient. A typical string class can just be bitwise
copied, leaving the actual char array where it was.

So what's the point? The problem is variable length arrays.
To extend them, we have to either

(a) make a new store, and copy the objects to it,
then either (1) forget it, but leave the gc to reap
the original objects, or (2) destroy the original objects

(b) realloc the store, and if necessary 'remap' the objects.

Furthermore, unless we initialise variable length arrays to
length zero .. we may have to initialise the elements.

So .. actually we seem to need *as polymorphic routines*

(a) default init
(b) copy
(c) assign
(d) move
(e) remap
(f) destroy

whereas at present the GC shape objects only support
asynchronous destructors (finalisers).

These finalisers are 'pragmatic' because each one 
defaults to either 0 (none) or to ~T(), the C++
destructor. This works, even for most composite Felix
types, since they're all modelled with C++: if a type
has special needs, it can be defined in the C++ destructor.

Felix simply wraps up the destructor in a C function.

Carrying around a function for each primitive is no big deal,
though it may bloat the source a bit defining them.

Specifying them for C types where the C++ operation
doesn't make sense is currently not supported.

But even harder, for aggregate types, it is quite nasty
for the Felix compiler to actually *generate* composite
operations, if it a POD with a user supplied C routine
is used as the operator (rather than the C++ routines).

basically, that would be asking Felix to do the work
of defining things which C++ does automatically ..
except C++ does NOT do it very well for some types,
such as arrays. (STL vector, of course, does all
these operations).

Further note that Felix 'move' operator, as well
as 'destruct' operator, is supported by a map
of where the pointers are. In fact the 'pointer
chasing code' -- basically the sweep of the collector --
requires a fully flattened array of offsets to work.

For example a struct just has the offsets of all
the components of all the substructs joined up:
the run time representation gives no clue the
type is a struct, and has certain components.

In turn, the representation cannot handle primitives
which might actually contain Felix pointers.
So there is a hint, to make the 'pointer scanning'
algorithm use an OO style, where the current method
is an instance of a higher level abstraction, which
could be provided for complex C++ objects to locate
heap pointers.

The same argument more or less applies to the 'copy'
methods -- they need to be 'polymorphic' rather than
fixed routines (such as bitwise copy everything).

For example, for structures, you'd like to use
a recursive descent and an array of offsets of
the top level components, calling the scanner for
each of these objects to find pointers, implying
that we also need to associate a shape with each offset.

Array shapes then have just another 'class' of scanner,
rather than the current situation where a special
encoding (an element count) is used.. and turns out
to be not enough for real variable length array
operations.

So it looks like there's a need to revamp the collector
and RTTI, at least to contain a 'vtable' of operators,
not just 'finaliser', and possibly allow for recursive
analysis instead of having to flatten everything
(even though C++ supports this automatically for copy
operations etc it doesn't support either 'move' or
'remap' or pointer chasing).

Whew.

Bottom line: arrays are much harder than I thought.
It requires additional RTTI extensions and possibly
a change in the encoding.

-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys -- and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
_______________________________________________
Felix-language mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/felix-language

Reply via email to