I have been thinking how to sort out the representation of projections. This is a very long explanation and you have to read all of it to get an idea of the problem!
The problem ========== At present, Felix has a combinator BEXPR_get_n (idx,aggregate) which does this. Originally, the idx was an integer constant. This was the internal model of the tuple projection (1,"hello",3.4) . 1 // --> "hello" but was extended to make idx an expression so this would work too: (1, 2, 3) . (1 + 1) // -> 3 which is an array projection with index an expression. Recall we try to model an array as a tuple with "all values the same type". Recall also an important symmetry of Felix: in principle there is no assignment. Instead we have pointers and we can store at a pointer: val p = new (1,2,3); p <- (4,5,6); The <- operator is, in principle, just a library function: proc _set[T] : &T * T = "*$1 = $2"; I will note here that every compiler writer and language designers dream is to *minimise* the number of terms the system has to deal with and move everything possible into the library. Now, here we have another way to get a pointer: var x = (1,2,3); val y = &x; by "taking the address of" a variable. In C, you can address any lvalue expression and in C++ there is a similar rule although there it is type dependent. The concept of lvalues is evil and must be eliminated! So in Felix, we have the rule instead: you can only take the address of a variable. Again I must explain, that you are NOT applying a real operator to the variable. Rather the variable *IS* the address. The actual name of the variable above is &x and it is an address, the plain symbol x actually means * (&x) we just munge the notation to make it easier for C programmers to get a grasp of Felix. So actually "&" is not an operator, it just saves dereferencing the real variable &x everywhere it is used. OK, so now we must observe this leads to a problem: how do we implement this one: var x = (1,2,3); x . 1 = 42; assert x == (1,42,3); This can certainly be done as so: x = (x.0, 42, x.2); i.e. by replacing the whole value of x with a new value constructed to be the same as the old one except for the modified component. This mechanism has a name, its called "functional update" and it isn't well represented in Felix for arrays or tuples. It is well represented for records though: you can write: var r = (a=1, b=2, c=3); r = extend r with (b=42) end; Anyhow, that's an aside! We want to write to the array component *directly* because we want to be as fast as C and creating a new array just to change one component of a variable is a tad slow. So we do MAGIC: we create a high power symmetry. The idea is simple. We have defined projections for arrays and tuples. Now we just extend the definition to pointers to these things too. var x = (1,2,3); var px = &x; (&px) . 1 <- 42; What this says is that if x is an array of T, then &x is an array of &T. This is very similar to what C does, but the way C does it is seriously broken. The way Felix does this, there is no "decay" of an array to a pointer to the first element. Instead we have a properly typed model for pointers to arrays. It really important to understand why this operation is vastly superior to C. We have dispensed with the notion of lvalues entirely. We have eliminated assignment from the language as well. All we have is pointers, and a store function, the projection function which fetches values from an array is extended to calculate addresses. Why is this SO totally cool? Because, apart from the store function in the library, all the rest of it is PURELY FUNCTIONAL. Lets note a difference to C then: in C you can do this: int a[] = { 1,2,3 }; int *pa = & (a[1]); This is really BAD because you are addressing an lvalue a[1] = *(a + 1). In Felix you cannot do this because there are no lvalues. Instead you just do val pa = (&a) . 1; and you aren't "taking the address of" anything at all. Remember that &a isn't taking the address of a variable, its just a name for the address that the variable IS. Its very important to understand why "get and set" methods do NOT work. Particularly as you will actually see these in the library :) The reason, trivially, is that for a matrix there is no way to involute the set method, you can set a row of the matrix, and get one, but to change one element you have to get the row, set an element of the row, and then store the whole row back. With pointer calculations, you can calculate the element address directly (as you can in C, for the same reason). For example: var m = ( (1,2,3), (4,5,6), (7,8,9)); (&m) . 1 . 1 <- 42; // replaces 5 This works because not only is m an array of arrays of integers, but &m is an array of pointers to arrays of integers. And then (&m) . 1 is a pointer to an array of integers so ( (&m) . 1 ) . 1 is a pointer to a single integer. There is a very cool commutative diagram here, that is a symmetry: assert *( (&m) . 1) . 1) == (*(&m . 1)) . 1; and this is the crux of the Felix object model. The same law exists in C, but it is NOT reflected in the notation correctly (the concept of lvalues and the decay of array types gets in the way). So far I haven't explained the problem, all the above is background! Its really important to see how extending array projections to pointer to array projections *defines* the notion of object, in a purely functional way. Of course we want to do the same for tuples and records and structs! We certainly want to be able to do this: struct X { int a; int b; }; struct Y { X x; int y; }; var y = Y ( X (1,2), 3); y . x . a = 42; well this is nonsense, we have no assignment so, (&y) . x . a <- 42; And this should work, by a similar extension to that used for arrays: we have y.x as the x projection of y being an X, so (&y) . x is a projection of a pointer to a Y, which would then be a pointer to an X. It all works nicely! But now we have a glimpse of "the problem"! For arrays, we were happy with integers as indices. That was cool because integer *expressions* could be used. However, this does NOT work for tuple or records. We know from C and C++ that there is an expression index for structs, its called a pointer to member in C++, which is a (rather badly) typed version of an offsetof() offset and a bunch of casts in C. For tuples in Felix we used integer constants. But the only way to get expressions would be if all the tuple elements had the same type. The FIX for this is to note that the array projections are WRONG and repair them. More precisely, they don't generalise! Its very important to note again the array projections we studied have a vital symmetry: they extend naturally to pointers allow us to get rid of lvalues. We must not lose this valuable property! For a tuple we KNOW how to do a projection with an expression index: suppose we have a tuple (1, "hello", 4.3) which has type int * string * double then the i'th component has type: int OR string OR double and in Felix that is just a sum type: int + string + double For a struct, the result is similar: for struct X { int a; double b; }; the result is of type: union X' { int a'; double b'; }; where this is a FELIX union (NOT a C union!). Unfortunately Felix does NOT have a type for an offset at the moment. So lets go back to tuples! Lets *change* the meaning of indexing a tuple: var result : int + double = (1, 2.3) . true; The result is clearly (case 1 of (int + double)) 2.3 We can extract this by: match (1, 2.3) . true with | case 0 (?i) => "integer " + i . str | case 1 (?d) => "double " + d . str endmatch That's how we process unions: with switches (match is a glorified switch). It's important to note here I chose the index as type typedef bool = 2; because there are two components in the tuple. We should do this for arrays too of course. So now an array index must be a value of type N, where N is the length of the array. Having done that, we cannot require any array bounds check. But NOW we have a glimpse of the real problem. If an array is a tuple, then why do tuple projections (accepting expressions as indicies instead of constants) return a sum type, whereas arrays don't?? The answer is obvious: the tuple case is the most general, arrays are a special case, so array indexing is just plain wrong. What we need is a special operator I am going to call "smash". Smash throws away the match otherwise required, and only works on the dual of an array type: a sum of all the same type. We need to look at the theory here. We know an array is given by T^N that is, a type T *raised to the power of N* where N is a unit sum. We know this is isomorphic to: T * T * T * .... * T // N copies of T and indeed in Felix (at the moment) it isn't just isomorphic, its equal. The rule for sum types is similar: T * N is a sum type with N options, each of type T, which is isomorphic to T + T + T .. + T // N options for T Of course there are just your usual indexing laws, which is entirely the point! In fact, you will notice that: T * N looks like a tuple with two components, a T and an N. And indeed that is the usual representation of a sum component: a pair consisting of the value and a discriminant saying which component it is. So now, we have a more general model of a projection: gproj : N -> A -> ~A where ~A means "the dual of A" and A is either a tuple or array. and we have smash: T + T + T + ... + T -> T * N which translates the internal representation of a sum to an actual pair (since this isomorphism isn't necessarily an equality at the representation level). In particular we can now recover our old array projections: fun proj[N,T] (ix:N) (a:T^N): = (gproj x a). smash . 0 ; So we have problem with the concrete syntax, because a . i is expected to return an array element, but in theory it should return a sum which needs smashing: if a is a tuple you cannot smash it and instead have to write out the match long hand. But the REAL point of all this is that I want to get rid of BEXE_get_n! Because now, it returns the theoretically wrong type. The constraint that the index must be an integer literal for tuples but any integer expression for arrays, is crap. In both cases the index should be a unit sum of the required length, that is, a modulo N number, and the result is a sum type. If used for an array the smash has to be applied. Instead of BEXE_get_n I want to just use BEXE_apply because after all, a projection is just a function. And the problem is that I AM ALREADY DOING THIS in some places in the compiler and the result is extremely confusing. I want to show another issue here. Felix has what I have called compact linear types. What there are are basically just tuples of small values packed into an integer. Unlike C, which allows bitfields, Felix goes a step further: compact linear types are compact, meaning they use optimal packing, using a variable base number system: a * N * M + b * N + c where c: N, b : N * M. If N, M etc are powers of two this reduces to the usual bitfields. The importance of this representation is two fold. The "compact linear" bit ensures you can use ordinary integer increments to range through all the values. This is how Felix does polyadic arrays (array operations such as folds which are independent of the number of dimensions). The other really super cool advantage is that: var x = true, true, false, true; has type bool * bool * bool * bool but wait, bool is just 2, so that's 2 * 2 * 2 * 2 but wait all the components have the same type so thats 2 ^ 4 Yes, its an array of bool. So what you say! But by specification arrays are tuples and a tuple of small units sums is a compact linear type. And so the representation of this array should AUTOMATICALLY be compact, that is, it will be a single machine word with the components stored as single bits. C++ eat your heart out! The point: bitmaps, with NO SPECIAL NOTATION. Nothing special. You get a bit map for an array of bits at least 8 times more compact than anything you would get in C or C++. The problem? Clearly a single machine word cannot be used to address these bits, but they HAVE to be addressable or we cannot set them. Either that or the automatic handling of compact linear types has to be removed and replaced by manual pack/unpack. If we want to store these things, we have to have a pointer that looks like: base address divisor multiplier where the divisor and multipiler aren't just constants. Felix can already do all this when the indices are constants! But it CHEATS, it uses assignment to set bits, instead of pointers. -- john skaller skal...@users.sourceforge.net http://felix-lang.org ------------------------------------------------------------------------------ How ServiceNow helps IT people transform IT departments: 1. Consolidate legacy IT systems to a single system of record for IT 2. Standardize and globalize service processes across IT 3. Implement zero-touch automation to replace manual, redundant tasks http://pubads.g.doubleclick.net/gampad/clk?id=51271111&iu=/4140/ostg.clktrk _______________________________________________ Felix-language mailing list Felix-language@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/felix-language