I am writing this to explain compact linear types to myself so I can get the code right.
First, the word compact just means there are "no holes", and linear means totally ordered. A subrange of int, 0..n-1, is a compact linear type denoted n; it has n values.

Consider a pair:

  3 * 5

That's an integer in the set {0,1,2} followed by one in the set {0,1,2,3,4}. If we represent this in two successive bytes, the resulting 16 bit word is not compact. But here is a compact representation:

  x * 5 + y   // where x:3, y:5

It's clear there are 3 * 5 = 15 values, represented by integers in the range 0..14, which is a compact linear type. More generally this is just a standard positional number system with an arbitrary radix in each position. For the type

  3 * 5 * 7

the encoding is just

  x * 5 * 7 + y * 7 + z

If the radices are all the same, for example

  2 * 2 * 2

then we just get the usual formula:

  x * 2 * 2 + y * 2 + z = x * 2^2 + y * 2^1 + z * 2^0

Note the exponent is equal to the digit position counting from the right, but tuple components are numbered left to right.

So that's how products are encoded. What about sums? Consider the sum type

  3 + 5

What does this mean? It means that in the first case we have a value in {0,1,2}, OR, in the second case, we have a value in {0,1,2,3,4}. The obvious encoding scheme is:

  x + 5   // first case, x:3
  y       // second case, y:5

So the values 0,1,2 of the first case map to the values 5,6,7, and the values of the second case stay as they are. More generally, to a value of a given case we just add the sum of the sizes of all the cases after it. Unsurprisingly we have 3 + 5 = 8 values.

Now, to decode a sum value K, we add up the sizes of the rightmost N cases until the total exceeds K; then K belongs to the (N-1)'th case counting from the right, starting at zero. For example, for K=6 in 3 + 5 we have 5 <= 6 < 8 = 3 + 5, so N=2, which means case number 1 from the right, which is case number 0 from the left; the value within that case is 6 - 5 = 1.

To get the i'th component out of a product type, we first divide by the product of all the sizes to the right of position i, then take the remainder modulo the size of the i'th component.
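To convince myself the arithmetic above is right, here is a small sketch in Python (not Felix); the function names are mine, and radices/case sizes are listed left to right, big endian as described:

```python
def encode_product(digits, radices):
    """Encode a tuple of digits as one compact linear value (big endian)."""
    value = 0
    for d, r in zip(digits, radices):
        assert 0 <= d < r
        value = value * r + d
    return value

def decode_product(value, radices):
    """Recover the tuple: peel digits off from the right with mod/div."""
    digits = []
    for r in reversed(radices):
        digits.append(value % r)
        value //= r
    return tuple(reversed(digits))

def encode_sum(case, x, sizes):
    """Add x to the total size of all the cases after it."""
    assert 0 <= x < sizes[case]
    return x + sum(sizes[case + 1:])

def decode_sum(k, sizes):
    """Find the case by accumulating case sizes from the right."""
    total = 0
    for case in reversed(range(len(sizes))):
        if k < total + sizes[case]:
            return case, k - total
        total += sizes[case]
    raise ValueError("value out of range")

# The worked examples from the text:
print(encode_product((2, 4), (3, 5)))  # 2 * 5 + 4 = 14
print(decode_product(14, (3, 5)))      # (2, 4)
print(encode_sum(0, 1, (3, 5)))        # value 1 of the first case -> 6
print(decode_sum(6, (3, 5)))           # (0, 1): case 0 from the left
```

Note the decoders only need integer division and remainder, which is exactly why the caveat later about fitting into an integer type matters.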
So for the value 14 of type 3 * 5: to get the first component we divide by 5, which yields 2, then take the remainder modulo 3, also leaving 2. To get the second component, we divide by 1 and take the remainder modulo 5, leaving 4. And indeed 2 * 5 + 4 = 14.

Note that for both products and sums I have chosen a BIG ENDIAN encoding, which means a component's weight is determined by its position counting from the right, even though tuple and sum components are numbered from the left. This is basically because everyone is sure to believe that the binary number 10 represents 2: we read numbers from high to low, left to right, so the low order digit comes last. So the tuple (1,0) : 2 * 2 is encoded as the value 2. The encoding is counter-intuitive in one sense, but it ensures that if we sequence through the values 0,1,2,3 we get the tuples (0,0), (0,1), (1,0), (1,1).

Well, that's it. It's that simple really, except we have to throw in recursion. If we note that the type 3 is actually a sum

  3 = 1 + 1 + 1

representing 3 choices, where 1 is the type of the empty tuple (), which has exactly 1 value, then a compact linear type is any combination of sums and products of nothing:

  1 is a compact linear type
  any sum of compact linear types is compact linear
  any product of compact linear types is compact linear

That's it (actually we could throw in 0 too).

Why are compact linear types useful? Obviously, they provide a guaranteed minimal representation. But the linearity is useful too: by ranging from 0 through to the size of the type minus 1, you exhaust all possible values. In particular, if we have, say, an array of length 3 of arrays of length 5:

  (int ^ 5) ^ 3

there's an associated matrix:

  int ^ (3 * 5)

whose indices are values of the tuple type 3 * 5. With our compact linear representation i = x * 5 + y we can range i from 0 to 14 and cover all the elements of the array of arrays in the order they're stored.

Note that the order of the dimensions reverses going from

  (int ^ 5) ^ 3 --> int ^ (3 * 5)

This seems weird, but note:

  a . x . y --> m . (x,y)

The x index is applied first in both cases, then the y index. y moves fastest, so adjacent y values mean adjacent storage. One has to note that in the expression (int ^ 5) ^ 3 you proceed outside in, so you have to select an x:3 first, then a y:5.

When you convert your array of arrays of int with two linear indices to an array (matrix) of int with a single pair index, AND you then encode that pair as a compact linear type, you have got back to a single linear array with a single linear index. And that is how polyadic array addressing works: do both conversions, and you can scan all the elements independently of how many dimensions the array originally had.

I will note finally the strange fact that with compact linear indexing, this is actually an array:

  ( (0,1,2), (3,4,5,6,7) )

The type seems to be

  (int ^ 3) * (int ^ 5)

which is surely just a tuple, not an array. But by the same magic, is this not also

  int ^ (3 + 5)

which is an array with a single index of sum type? And recall the compact linear encoding of 3 + 5 is 0..7. When you think about what the indices mean, it means:

  first choose either the left or the right subarray
  if you chose the left one, pick one of its 3 elements
  else you chose the right one, so pick one of its 5 elements

and our theory just says that's the same as picking one of 8 elements.

OK, so there is one more "fun" thing to note at the moment. Suppose we have a complicated array type like

  int ^ (3 * 5 + 2)

where the index type is compact linear. But remembering bool = 2, what about

  2 ^ (3 * 5 + 2)

This is an array with a compact linear index, but the base type is also compact linear. And an array is a tuple! So in this case *the array itself is compact linear*.

CAVEAT: in practice we have to limit the size of a compact linear type to one that fits into some integer type, because in general we have to decode it with integer operations like division and remainder.
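Both the polyadic scan and the tuple-as-array trick can be demonstrated concretely. Again a Python sketch under my own names, not Felix code:

```python
# Polyadic scan: an array of length 3 of arrays of length 5 shares its
# storage order with the matrix int ^ (3 * 5), so one linear index
# i in 0..14 visits every element, regardless of the original rank.
a = [[10 * x + y for y in range(5)] for x in range(3)]

flat = []
for i in range(3 * 5):
    x, y = i // 5, i % 5   # decode i as the pair (x, y), y fastest
    flat.append(a[x][y])

print(flat)  # elements in storage order: adjacent y, then next x

# Tuple as array: ((0,1,2), (3,4,5,6,7)) viewed as int ^ (3 + 5),
# indexed by a single sum-typed value k in 0..7.
pair = ((0, 1, 2), (3, 4, 5, 6, 7))

def get(k):
    # big endian sum decoding of k : 3 + 5 -- the first case's
    # values were shifted up to 5,6,7, the second case's stay 0..4
    if k >= 5:
        return pair[0][k - 5]
    return pair[1][k]

print([get(k) for k in range(8)])  # all 8 elements, none missed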
The NATURAL use of such types is then array indices, because arrays are always limited by available memory, and thus in practice the index will always fit in a machine word. Even for a bit array, on a 64 bit machine you only need to ensure your arrays occupy no more than about an eighth of the address space.

-- john skaller skal...@users.sourceforge.net http://felix-lang.org