Re: [Felix-language] Sundry questions

skaller Sat, 10 Feb 2007 19:22:28 -0800

On Sat, 2007-02-10 at 18:58 -0500, Chris King wrote:
> Hi,
> 
> I've been slowly working my way through the tutorial and reference
> manual and I've come up with a long list of questions and comments
> from a language-design point of view.
> 
> 
> - Redundancy -


Lots of that ..

> It seems Felix has a few redundant syntactic constructs, which is to
> be expected since it is heavily under development.  But I was
> wondering which of the following constructs are "preferred" (if they
> are indeed identical), and if any are deprecated:

The problem is partly the high level of experimental features, and
partly the plain old lack of users to provide feedback.

But there is a hidden agenda here too: Felix uses a LOT of 
syntactic sugar (it's the only way to get the compiler to work
at all, or there would be too many terms to handle).

The aim is to introduce a lot deliberately with the idea
that forces the developers to look for ways to move it out
of the compiler and into the library!

In other words, introduce enough power that you can create
DSSLs -- Domain Specific Sub Languages.

The two heaviest DSSL's supported are lexing and parsing 
constructions.

The available extraction techniques .. apart from the usual
ability to define types and functions in a library --
include 

(0) Higher order functions with syntactic sugar
like { .. } allows things like
        
        foreach { i = 0; } { i < 10 } { ++i; } { .. };

(which is a HOF in the library, not a builtin statement).

(1) ability to bind to C++ .. that's actually a 
meta-programming technique

(2) user defined syntax using for example

#statement
#infix

pre-processor directives. The statement

        while e do .. done;

is actually defined in the library.

(3) Syntax macros (which are used heavily by feature (2))

(4) I recently introduced token macros (process streams
of tokens) though they don't seem that useful,
they might help with 'user defined literals' somehow.

Token stream filtering allows 'long int' to name
the Felix type 'int' .. the parser cannot handle 
the former .. however this is done with a hard
coded feature at the moment not token macros.

(5) Arbitrary sequences of special characters can be
tokens, allowing 'any old' symbol to be a user defined
operator -- much to the annoyance of some people this
create a mess with /* .. */ and // style comments.

None of this stuff is as powerful as full scale code
generation (printing code out) but it is better integrated
and more tightly constrained. Staged programming, for
example as in MetaOcaml is another approach: it has the
advantage of symmetry across levels and strong typing
but can be messy to write .. and also requires well
integrated dynamic loading which native code Ocaml compilers
do not support. Bootstrapping Felix might solve this problem,
however Felix dynamic loading uses binary shared libraries
and is very hard to type-check -- and we'd still need
to rewrite the compiler in Felix to effect this.


> * At a first glance namespaces seem to be a superset of modules... is
> there a reason they coexist?  E.g. is the lack of extendability of
> modules considered beneficial in some cases?

Yes, modules are deliberately closed. Namespaces were added only
a few weeks ago. In fact namespaces are syntactic sugar ..
the compiler collates components into a module anyhow :)

> * Do jump and loop always perform the same optimizations?  I.e. do
> they coexist solely for documentation purposes?

No. In fact jump and loop are syntactic sugar: they're both
replaced by a call followed by a return. Then the compiler
reintroduces them when it sees a call followed by a return.

Loop is a restricted case of a jump. You can jump to any
callable procedure, it's just a tail call. Loops can
only call yourself or a parent, and should reduce to
a self-tail-rec call which might reuse the stack frame,
suggesting equivalent to a goto (i.e. *fast*).

BTW: there are sometimes constructions with a semantic that
is not actually checked when it should be.

> * Can first-order polymorphism accomplish anything that metatyping
> can't?  It seems that polymorphism could be implemented as sugar for a
> simple type function.

I'm not quite sure what you mean. The syntactic form:

        type x[t] = ...

is short for

        typedef x[t:TYPE] = ...

so the meta-type TYPE, the category of all types, is simply
defaulted. You can also write the isomorhic

        typedef fun x(t:TYPE):TYPE =  ...

However these aren't syntactic sugar, although they probably
should be (I'm guessing that's the issue).

Felix originally had ONLY the latter form, with
functions in Haskell style:

        fun f(t:TYPE) (x:t) = ...

instead of

        fun f[t:TYPE] (x:t) = ..

BUT there are problems: the latter would allow mixing types and
values, and so allows expression of higher order polymorphism
which isn't supported -- not necessarily a bad thing, but a bit
nasty to check. But the real problem was the call syntax:

        f (int) (1)

The thing is, Felix can't tell if the expression (int) is a
type or a value .. if it could be a type, then so could (1) be:
and indeed (1) really IS a type in Felix at the moment.

The main reason this is difficult is overloading: if you could
just lookup 'f' function you could see if the first argument
was a type or not .. but you can't do that. You have to calculate
the types of the arguments FIRST, to choose which 'f' to use
from an overload set.

So it got too hard and I introduced the simplified syntax

        fun f[t:TYPE] (x:t) = ..

which is called like

        f[int] (1)

In fact you can call it like:

        f 1

and the 'int' is deduced in this case (as in C++).

> * Is there a difference between objects and classes?

Yes. Objects are a really old construction that probably should
be removed: they use the Lisp closure trick where certain
functions nested in another function are stored into a record
and returned: the outer function acts as a constructor,
the nested functions as methods, and the record entries
are already bound by lexical scoping to the constructor
function stack frame. Functional abstraction automatically
hides representation details, that is, variables of the
stack frame. Such objects are constructed dynamically and
the 'obj' syntax is just syntactic sugar.

Classes are full scale constructions invading the whole
architecture of the compiler.


> * val vs. :=, enum vs. union vs. C-union...?

Yup, x := y; is syntactic sugar for val x = y; the former
form won't admit a type annotation whereas the latter does:

        val x : int = 1;

val can also be used in functions:

        fun f(val x : int, y: int)

where it is the default (so y is a val here too).
The completeness is needed in the ugly syntax

        def val x, y, var z = 1,2,3;

This is equivalent to:

        val x = 1; y = 2; var z = 3;

that is, x and z are new variables, whereas z is
an existing variable.

        
> 
> - Syntax -
> 
> These questions are probably answered somewhere in the tutorial but I
> can't seem to find them:
> 
> * Is there a way to alias patterns? E.g. (?x, ?y as ?xy)?

Yes, like:

        (?x, ?y as xy)

should work. However there is actually an issue here
for types, so be warned: 'as' has two distinct meanings
in type patterns, since type patterns are now fully
generalised and part of the type system, that is,
a type pattern is actually any type with 'hole's in it.
(More precisely typematch variables and wildcards are
terms in the meta-typing syntax so that

        val x : ?t, int = 1,2;

parses correctly, and probably should even work, although
I doubt that it does. The problem is that 'as' has
another meaning: type recursion:

        typedef list = unit + int * list;

can be written in closed form:

        typedef list = unit + int * x as x;

which means:

        list = mu x. unit + int * x

Jacques Carette actually suggested changing to something
like this to resolve the syntactic conflict. It's purely
an "I can't think of another keyword" problem.

Note this only applies to *typematch* .. but then
the type system has the full power of Jay's pattern
calculus now (it uses typematch and type lambda terms
instead of Jay's patterns but I believe they're equivalent).


> * Is there a way to write a literal of type char?

No, there isn't. That's an outstanding gripe.
I can't think of a way to do it without wasting a valuable
and currently unused punctuator, namely backtick, or giving
up on the current Python string syntax which allows both
the ' and " characters as delimiters (which avoids a LOT
of horrid escape sequences).

Note you can *construct* a character from a string literal
or integer .. 

        char "H"

makes the character 'H' .. but it isn't really the same
as a character literal.

> * How can one write a value of type a^0 or a^1?

The type a^0 is equal to 0 and has no values so there
is no way to write a value of that type -- none exist :)

The type a^1 is equal to a.

Erick and I tried to make a^k more like arrays, especially
as they're actually CALLED that:

        typedef array[a,k] = a ^ k;

is actually defined in the library. However a ^ k is just
a tuple of k components of type a: and a tuple of one
value is not only isomorphic to that value, it is equal 
to it.

This choice plus the definition of a^k as sugar for 
tuples of repeated values explains why a^1 isn't
distinct from a.

That choice was not arbitrary, but also not necessary.
It was partly made -- at cost of quite considerable
extra complexity in the compiler I might add --
to simplify the calling of and syntax for functions,
especially C functions: when you write

        fun f: int * int -> int = "f($1, $2)";

it is really a total hack: f actually requires two
distinct arguments. But you can still call like:

        val ab = 1,2;
        val r = f ab;

with a single argument. This would be very hard to support
when you consider:

        fun g: int -> int = "g($1)";


WOOPS? What does $1 mean here? Is the argument an int,
or is it a tuple containing a single int? If the latter,
how would you denote the type?

Anyhow: I made to choice to identify a tuple of one element
with element .. the choice is very much open to question.

> - Bugs -
> 
> * The example for the loop statement in section 2.40 of the tutorial
> works with jump, but not with loop.

Ok, I'll have a look.

> * f"%s" expects a C_hack::ptr[char], but it should probably expect a
> C_hack::cptr[char] since it doesn't modify its argument. 

There are bound to be quite a few such messes because Felix
doesn't support the concept of 'const' at all. Sure,
there is cptr[t], ptr[t] in the library but they're quite
distinct abstract types.

>  On a similar
> note, is there a format specifier for a string so one can avoid
> cstr()?

There should be .. however the implementation actually uses
a wrapper for C vsnprintf function. I think I looked at this
already and there was some reason it wasn't a trivial mod
to the code.

It could be a distinct %code should be used.

> * It seems that vals don't have to be initialized... is this a bug?
> (Especially since there is no way to assign anything to them
> afterward...)

Yes, it's a bug. It exists because the parser must allow it
in some circumstances, for example we may support

        typeclass X[t] { virtual val d: t; }
        instance X[int] { val d : int = 1; }

although this doesn't work at the moment, in other words,
the parser allows it because it is useful in interfaces
and the parser doesn't have enough context to exclude it
where it shouldn't be allowed.

OTOH, by the time we want to do checking, the initialisation
has already been split up into a declaration and an assignment,
so there's no way to tell past that point, short of data flow
analysis.

I can probably fix it, and it does result from a feature --
ML style module functors -- which I once attempted to implement,
failed, and dropped.


> - Documentation -
> 
> I found a few inconsistencies in the docs, if you like I could fix
> them myself and submit a patch:

Note you cannot modify generated documentation .. and almost all
of it is generated from interscript sources.

Patches are horrible .. I hate them.

I would much prefer if you just become a developer and
commit directly to the repository. If you bugger something
up, well someone will just fix it, either me or you or
another developer... I bugger things up all the time.

Of course it is good try avoid committing something that
breaks everything and going on holiday, but there are
times I do commit broken code, particularly when I have
to reboot into Windows to check the Win32 build,
the central repository is the only way I have to share
the mods (I'm too lazy to get Samba working even though
I have two dual boot AMD64's on my desk .. :)

> * The docs don't make it clear that types & variables share the same 
> namespace.

It is true though. All names are in a unified namespace.

> * $ is never formally introduced (although it is used a lot).

I'll bet there are quite a few more such things .. I keep
finding features I forgot I introduced :)

> * The docs say that "unlike functions, procedures may have side
> effects"... does this actually mean that procedures are guaranteed to
> be evaluated for side effects, whereas functions are not?

Procedures are sure to be called if they do something and
control actually flows through the call.

For functions, well they're not ALLOWED to have side-effects.
This allows the compiler to elide, delay, or duplicate calls
to functions as it sees fit. Roughly, function calling
in Felix is either Lazy or Eager or a combination of both
at the whim of the compiler.

There are now also things called generators which look like
functions, but which cannot be duplicated and can have
side effects. The canonical examples are malloc() and random().

Applications of generators are lifted out of expressions
and replaced by a variable which is initialised by the generator
application before the expression is evaluated .. for example:

        r := random() + random();

is implemented by

        var tmp1 = random();
        var tmp2 = random();
        r := tmp1 + tmp2;

and these 'vars' are not 're-substituted' back into the
expression (at least, they're not supposed to be .. :)

This ensures a definite though unknown sequencing of
side effects, and in particular exactly one invocation per
function call actually written.

[But I think there was and may still be a bug, if the
function returns unit the call might be elided]

> * Section 2.29 of the tutorial says that procedures produced from HOFs
> are not called when the () is elided when in fact they are.

Well actually, this only appears to be the case with top
level statements. You can write:

        endl;

and it prints an end line, but this is a parser hack:
the parser actually replaces this with

        endl ();

In that sense the tutorial is actually correct; more precisely

        some-expression;

is NOT allowed in Felix. There are no expression statements.
Note that assignment:

        a = b;

is NOT an expression. So actually when the parser sees
an expression statement .. it fixes it up, by changing
it into a call by adding () to the end .. :)

This is actually messy. If you write:

        f a;

it will not be 'fixed up' to

        f a ();

because it already looks like a call, only things that
don't already look like calls are fixed up by the parser.
[But the type checker may do something too, I don't
remember]

But you're probably right: the tutorial is probably wrong
from the readers viewpoint and should be fixed.

> - Theory -
> 
> * I like that Felix has lightweight anonymous structs; using objects
> in O'Caml for that purpose always seems like overkill to me.  On the
> flip side, does Felix have anonymous unions?  I.e. something
> equivalent to O'Caml's polymorphic variants?

Felix does have algebraic structurally typed sums: eg

        typedef unit = 1;
        typedef bool = 1 + 1;

That + is for a sum type. The constructors are written

        (case 0 of (type expression)) (arguments if any)
        
and they can be pattern matched (without needing to specify
the type expression since that can be deduced from the
match argument).

In particular

        macro val false = case 0 of 2;
        macro val true = case 1 of 2;

The zero origin case numbering is very annoying from an English
viewpoint but consistent with zero indexing in other places.

There is no dual of records (anonymous structs) which would
be like Ocaml polymorphic variants .. it could be done, though
we'd have to think how to do it.

For structs, the components are always present at the point
of use so the structural types is determinate.

For polymorphic variants, they're all cases of a universal
type. There are two ways to establish that: by whole program
analysis (which is what Ocaml does for exceptions), or
just by defining the univeral type into existence.

In the latter case we need a way to check the tags at
run time. C++ Dynamic cast is a possibility.

Note Felix supports overloading type constructors, not
just functions .. and this includes union variant tags.

This would probably have to be abandoned for anonymous
variants (otherwise we'd have to encode the argument
type signature into the mangled name).

Also, it is unlikely a naive implementation would actually
support covariance, which Ocaml polymorphic variants do.
In other words they woudn't be as useful as in Ocaml
for recursive types (even though in Ocaml they're useless
if there are more than couple of parameters, since you have
to use open recursion trick to make covariant extension
possible).


> * It seems to me that in Felix neither objects nor classes support
> subtyping.  Am I missing something?  Or is the plan to outright
> replace objects and classes with typeclasses (which would be awesome)?

Felix does not support implicit subtyping with one exception:
there's a HACK to support lvalues. Lvalues are a hack and
should not be in the type system .. but they are to make
interfacing with C primitives easier. In particular
without them you can't do general address taking.

Lvalue[t] can decay implicitly to t, as in C++,
that is a C function accepting an rvalue argument
can also accept an lvalue.

Subtyping is missing for two reasons:

(a) it is HARD to get right!
(b) it is even HARDER to get right in the presence of overloading

Note that Felix does support specialisation subtyping, that
is, it does support overload resolution in the presence
of more and less specialised polymorphic signatures:
given

        fun f[t,u] : t * u -> int
        fun f[t] : t * t -> int
        fun f: int * int -> int

the most specialised function is always used on a call.


> Sorry for the long e-mail... I think Felix is a fascinating language
> and has great potential, or else I wouldn't nitpick so much :)

Well your punishment is that my reply is even longer .. :)

-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net

-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier.
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
Felix-language mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/felix-language

Re: [Felix-language] Sundry questions

Reply via email to