I'm still stuck on the order of indices :) Given an array of arrays:
T ^ I ^ J the indexing order is fixed by the data structure: a . j . i Note this is the reverse of the index order in the type. This is not arbitrary: you have to select the subarray of the top level array first, then select the element within it second. This means in j index moves fastest, that is ++j moves sizeof T bytes. Note that using math notation: i (j a) reverses the ordering. However postfix notation is easier for me to grok for array subscripts. Originally I just said that you can coerce the array to T ^ (I * J) so with tuple multi-index the indexing is "invariant" in postfix notation: a . (i,j) [however I screwed this up in one place in the compiler] Note the j index still moves fastest. This is the natural lexicographical sort ordering: think of strings: AA AB AC ...BA BB BC ... the LHS index is the slowest moving in a linear order. however in math notation we get: (i,j) a that is, the ordering in the tuple cannot just swap over: its the same for both notations. With this convention, we get the compaction formula: index = i * len J + j because i is jumping over subarrays of length J. With this notation, a new index for the inner array T --> U ^ N U ^ N ^ I ^ J goes on the end: u . j . i . n whereas if you put the whole thing as an element of a new array: T ^ I ^ J ^ K u . k . j . i HOWEVER, whilst (n,i,j,k) is natural lexicographical ordering, it is also "big endian" like a number system. The biggest digit goes first. Little endian is nicer in many ways. All of this is derived from the arbitrary decision: T ^ I ^ J -> T ^ (I * J) i.e. the order of "indices" is preserved in the type transformation instead of being reversed. It's not clear how sane this is given that with T ^ J you have to read right to left: array length J of array length I of T If we used -> as an exponential instead the whole argument breaks down: a:J -> I -> T Note -> is right associative. Both ^ and -> are exponentials with reversed order of operands. Now the natural order of indexing is invariant: a j i // is a T Note here the array is like a function. It is applied to the first index, which has to be j, to select the inner array which gives another function. That's applied to the second index to give the final element. Now the "curried" form of this function would normally be: a': J * I -> T a (j,i) which is the opposite of what the previous analysis dictates. Of couse it is a simple problem: do we preserve the order of type indicies in the ^ notation or the -> notation? We cannot have both, because reversing the operand order: D -> C == C ^ D where D suggests domain and C codomain, this naturally reverses the indexing order too. Changing the definition of currying seems out of the question! Quite simply the -> matches up the argument order with the type whereas ^ swaps it. So the question is: did I get the rule right, or should it be instead: T ^ I ^ J => T ^ (J * I) and how does this impact the other natural laws of algebra: A * (B + C) => A * B + A * C A ^ (B + C) => A ^ B * A ^ C etc? All of these have meaning. however with types operations like + and * are only commutative and associative up to isomorphism: the types: int * double * long int * (long * double) are isomorphic but not equal. Here the isomorphism implies that you can define a converter from generic types which is fully parametric: convert: T * U * V -> T * ( V * U) meaning that the function doesn't care what the types are. However some isomorphism are more "natural" than others. For example associativity is more natural than commutativity because if you look at the representations in C: struct { int; double; long; } struct { int; struct { double; long; } } for example, the elements are in the same order and the structures are "layout compatible". This means we can use a reinterpret cast on a reference or pointer to these types "safely" in the sense that we don't need to actually move anything around or generate any code (such casts do break strict aliasing rules of ISO C). Exactly which isomorphisms are natural is determined by the representation. In particular an isomorphism in the abstract category of types TYPE may not be an isomorphism in the types of the representation category (in this case C types) in the sense of identity or in the sense of layout compatibility. Indeed the representation encoding is a morphism from Felix types to C types, and that is naturally regarded as a functor. At a lower level its a natural transformation. Which means it preserves isomorphisms. The critical point here is that there are at least three kinds of isomorphism: (1) most general: requires moving stuff about (2) layout compatible: requires a cast only (3) identical. Nothing to do, they're the same and which ones the language designer chooses DICTATES the properties of the language. Currently Felix choses t his one at the abstract level: int * int * int = int ^ 3 as an equality. These two types are identical. At the C level they're not! Tuples are C structs likeL struct { T0; T1; T2; } whereas an array is struct { T data[N]; } These are layout compatible. To "fix" this the Felix compiler translates all tuples of the same type into arrays. However it cannot do this fully until AFTER monomorphisation. So you cannot apply array operations to a generic tuple EVEN IF the types end up being the same. This avoids "dependent name lookup" which is precisely the premature application of operations to a type based on the fact that "it might" end up converting (eg in C++ you can add U and V types. No problem! Provided they end up as numbers! in Felix this is banned: you have to use a type class constraint to allow it] But there's a problem: you cannot prevent say using an expression to index a tuple (1,2,3,4) . i because the type of the LHS is int ^ 4. The isomorphism is applied automagically, which means arrays must be a strict "subtype" of tuples: any operation on tuples applies to arrays too. But the above is a typo: I meant to write: (1,2,3,4) . 1 :-) It may be better to make the user apply the isomorphism: [< 1,2,3,4 >] or perhaps: array_of_tuple (1,2,3,4) Which would also simplify the compiler. In particular instead of magically applying casts in the generated code, the user would have had to write them. Why does all this matter? The tuple/array thing is quite nice. It works reasonably well. But now we're looking at a much larger set of isomorphisms! The choices of which ones are identities and which ones are casts etc are now critical. Here's an example. A compact linear type is one like: 3 + 3 * (6 * 7) The representation is a single integer. This means for a polymorphic type we cannot calculate the representation. This is already true for arrays vs tuples, but in that case the two are always layout compatible. Compact linear types are integers and are clearly NOT layout compatible with tuples or sums. Introducing the representation as magical (as I am doing) makes the compiler very complex. Pointers requires three words instead of one (you have to divide and modulo the integer to extract a component, the same as shift and mask for bitfields). -- john skaller skal...@users.sourceforge.net http://felix-lang.org ------------------------------------------------------------------------------ Master Visual Studio, SharePoint, SQL, ASP.NET, C# 2012, HTML5, CSS, MVC, Windows 8 Apps, JavaScript and much more. Keep your skills current with LearnDevNow - 3,200 step-by-step video tutorials by Microsoft MVPs and experts. ON SALE this month only -- learn more at: http://p.sf.net/sfu/learnmore_123012 _______________________________________________ Felix-language mailing list Felix-language@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/felix-language