On Mon, 2006-07-10 at 00:59 -0700, Erick Tryzelaar wrote:

Here are some thoughts on additional requirements/priorities:

1. Variable length Array support. 
*********************************

Felix currently supports
fixed length arrays as a special kind of tuple.

A C++ style 'vector' raises several issues:

(a) it can't be done without changes to the garbage collector.
The reason is, there is a difference between allocated storage
and used storage. The collector must only scan used storage,
but the allocator must manage the lot.

(b) Thread safety.

(c) Extension. C++ vector is a nasty compromise: you can
reserve storage and grow up to the reserved size, but after
that you need a new allocation and copying the objects.
Since reallocation is the 'default' and only avoided by 
carefully reserving enough storage at the right times,
vector is only 'safe' to use by indexing, not 
with iterators/pointers.

Felix currently supports movable objects though!

In addition, we can add a new method 'reserve_address_space()',
which is like reserve(), but it doesn't reserve memory, only
address space. We can then extend the array with mmap, without
invalidating any iterators or requiring a global collector
scan to move objects.

2. Better performing representations of variants.
*************************************************

At present, a general variant is represented by a _uctor_
which is like this:

struct _uctor_ {
  int variant;
  void *data;
};

The data must point to a heap allocated object. The sharing
semantics is NOT defined. Note that a new variant constructor
currently always copies the data, however the copy is only
shallow, and therefore doesn't prevent, for example, sharing
of the tails of a list.

If a variant has no argument (or only a unit), then data is 
set to NULL.

In the special case all the variants have no arguments (or only
unit argument), the variant type is called an enumeration.
In this case, a plain 'int' is used as the representation.

This situation can be improved considerably. For example,
suppose heap pointers are allocated on 16 byte boundaries,
then the 4 low bits of the heap pointer will always be zero.
Thus, we can pack the variant code and the object address
into a single machine word (of pointer size) provided the
number of variants is less than or equal to 16.

In addition, since Felix is a whole program analyser,
it can discover which variant cases are actually used,
and eliminated those that are not, reducing the count.
It can also throw out pattern matches for those cases.
There needs to be a way to inhibit this, at least for
modelling C compatible variants and where a particular
layout is required .. for example for svc calls, the
driver variant codes must match those in the library.

In addition, for some small arguments, instead of a
variant code and a pointer, we could put the data
in place of the pointer: for example even a 64 bit
integer will fit in the space of a pointer so we could
use

        struct _int_ctor_ {
                int variant;
                long data;
        }

instead of a _uctor_. Indeed, we can probably do
even better .. since int is 4 bytes and long is 8,
there's a chance the variant is 8 byte aligned anyhow,
and we're wasting 4 bytes. 

Of course, again we might be able to pack the variant
code and data.


3. Reduced GC overhead.
***********************

Felix gc currently uses 48 bytes of overhead per heap
object on an AMD64. for a linked list with optimised
variant as in (2), only 1 word (8 bytes) is required
for the user data.

Some way to reduce the overhead is needed. Boehm GC
only uses 1 bit, Ocaml uses between 0 bits and 1 word.

It may also be interesting to consider using a trie
to find pointers. Basically an 8 byte pointer can
be represented by 7 levels of page tables of 256 entries,
plus an array of 256 bits (less with alignment taken
into account).

Apart from asking 'is this a heap allocated pointer',
we can change the 256 bits to 256 pointers to 
memory descriptors .. and thus have zero overhead
on the actual heap.

If we organise 'like' objects to be in contiguous
storage, we can get away with one descriptor for
all objects of the same type (the address will
tell the type).

Allocation is roughly like Knuth's buddy system.
The page table stuff can also be limited to a fixed
extent with overflow as normal. This is basically
an advanced version of the arena allocator.

4. Both more relaxed and more fine grained typing
*************************************************

Generators -- work if you use indirection, directly
storing state is prevented by cloning function variables,
which is done to ensure they're re-entrant. Of course
generators are deliberatly NOT re-entrant, we want
to re-enter them with state preserved. Note a generator
is a kind of control inverted fthread/procedure.

Const pointers. Felix doesn't understand C/C++ notion
of const (except rather naively). Make interfacing
a bit of a pain.

Also, we want an attribute 'immutable' which is like
const, however doesn't support conversions to non-immutable
objects (this ensures an immutable object accessed thru
a pointer can be cached).

We should look at Cyclone, which has a more complete taxonomy
of pointer types.

Side effects. Well there's really no reason Felix functions
should not allow side effects. We can already write such functions.
If the compiler could recognize them, it could avoid surprising
optimisations. There is an issue of avoiding loss of referential
transparency, which is mainly syntactic [we want to see when
an expression is pure and when it has side effects].

In general, the typing scheme and evaluation semantics
need to revisited and clarified.

> 1.2: olabl-like labels/keywords

I really want to see a proposal for the syntax for this.
Remember Felix functions currently support overloading,
so the proposal must do more than specify the syntax
of declaration and calling protocol .. it also has
to specify how lookup is handled.

[I'm in favour of the idea .. I just don't know how to do it]

> 1.3: classes

Require inheritance and subtyping, and thus possibly
interfaces .. this is a big job. It has to be done
but I'm in favour of tackling something easier for a while :)

> 1.3: multimethods

I would generalise this topic to 'dynamic dispatch'. The issue
is more general than just class methods: how do we integrate
a nice dynamic type system with a nice static type system
in a nice way .. ?

The ideal first example of this is a lowly variable
length array. The general topic is dependent typing.

Another interesting case is 'pointer polymorphism'.
Consider:

fun swap[U,V] (x: &U, y: &V): &V * &U => y,x;

This routine accepts a tuple of two pointers and returns
a tuple with the pointers swapped. Currently Felix uses
extensional polymorphism:

        swap (1,2)
        swap (1.0,2.0)

will create two 'physical' routines, one for each
of the two used argument types.

But we can do better in C! We just use void* and casts!
We only need ONE routine, because all pointers have
the same representation.

In some FPL, all, or almost all functions have this
property: because all types are represented by pointers,
which is called boxing, all operations which are limited
to initialisation/copying/assignment are universal.

Because they're universal, polymorphic routines only
need one instance: this is called intensional polymorphism.

Generic operations performed this way on boxed values
not only require only one encoding, thus saving on 
code bloat .. they're also almost always 
FASTER than extensionally polymorphic routines,
because pointers are small and fit nicely into registers.

What's more, even non-polymorphic routines  -- which have
to dereference the pointers -- are not as slow as you might
think .. although it looks like an extra indirection is
required, the machine instruction for doing that has
zero cost. All the costs are in loading the cache with
the extra word. In a loop .. the value may already be
cached, so the cost is merely a slight bloating of
the cache size (and no cost in memory accesses).

Thus, compared to copying objects around, boxing can
be faster, as fast, or almost as fast, in quite a significant
number of cases.

The main problem with boxing is semantic: it is transparent
for immutable values. It is demanded for 'object oriented
objects' also, as a matter of semantics. In both cases,
significant memory is saved and performance is good:
the downside is that mutation is not transparent for
shared objects.

An obvious goal here is simply for the compiler to recognize
when one or more arguments are 'boxable', and remove that
argument from the list of 'instantiable types' used to
construct the function, instead, we make the typing
instantiable at the call point -- by way of a cast.
An improved way of doing this may be to use template
wrapper for the typing.

BTW: in case you're wondering "what has this got to 
do with multi-methods??" ... don't forget a pointer
can point to a function too! And don't forget classes
in Felix are always boxed!

So the idea is roughly .. NO! I do NOT want to make
any multi-methods. I want the USER to be able to implement
their own dispatch algorithms .. preferably in Felix.

To do this, we need just the right kinds of polymorphism.
So the issue is to find more polymorphism.

One 'obvious' thing to implement is something based
on C++ dynamic casts.

> 2.0: type classes
> 2.0: ocaml-like module interfaces for info hiding

Along with class subtyping and interfaces .. this is hard!

> compiler:
> 1.1.3: detecting missing patterns when matching

messy..

> 1.1.3: __LINENO__, __FILE__ macros

Done

> 2.0: partial recompilation

Separate compilation .. :)

In particular, some way for Felix DLL's to talk to each other.
It can be done now but:

(a) the types have to be C types
(b) it's messy to organise

-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net



-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
Felix-language mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/felix-language

Reply via email to