Optionally strongly typed array indexes

bearophile via Digitalmars-d Tue, 03 Jun 2014 14:21:13 -0700

This language feature is absent in D, and it's present in Ada and partially present in ObjectPascal. I think it's significant.

All the following ideas are preliminary (uncooked, primitive), they are just ideas (and quite probably there are smart people that can invent better things).

This is an example of wrong code, it allocates and initializes a matrix with three rows and two columns:


void main() {
    import std.stdio: writeln;
    auto mat = new int[][](3, 2);
    writeln(mat);

    foreach (immutable i; 0 .. 2)
        foreach (immutable j; 0 .. 3)
            mat[i][j] = i * 10 + j; // Wrong.
    writeln(mat);
}

The out-of-bounds array access bug is found at run-time with this error followed by a stack trace:


[email protected](8): Range violation


The Ada language allows you to spot that bug at compile time.

Note: in D you usually avoid that bug by writing the code like this. But this is not enough in more complex situations:


    foreach (immutable i, row; mat)
        foreach (immutable j, ref m; row)
            m = i * 10 + j;

If you give strong types to the array indexes, the code becomes more self-documenting, and the compiler can catch some more of your mistakes. In D the associative array type syntax already gives a type to the indexes, so I have to tell apart the case of an associative array from the case of a normal dynamic/fixed-size array with a strongly typed index. This will make the syntax worse.


This is a first possible syntax (it looks ugly):


void main() {
    auto M = new int[TR = @typed][TC = @typed](3, 2);

    foreach (immutable TR i; 0 .. 2)
        foreach (immutable TC j; 0 .. 3)
            M[i][j] = i * 10 + j; // Wrong.
}

@typed used inside the [] means that the array has a strongly typed (size_t) index. "TR = @typed" means that TR is the aliased name of such a type (so you can think of it as "[alias TR = @typed]").

Now such a program gives two compile-time errors (type mismatch on the j and i indexes).
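
To make the idea a bit more concrete, here is a rough approximation that works in current D using wrapper structs. This is only a sketch under my own assumptions: the names Index, TypedArray, Row, Matrix and makeMatrix are invented for illustration and are not part of Phobos or of the syntax proposed here. It is also much noisier than a built-in feature would be, because every index has to be wrapped by hand.

```d
// Hypothetical library-level emulation of strongly typed indexes.
// Index and TypedArray are illustrative names, not an existing API.
struct Index(string tag) {
    size_t value;
}

struct TypedArray(T, I) {
    T[] data;

    // Only an index of type I is accepted; a bare size_t or an index
    // with a different tag does not compile.
    ref inout(T) opIndex(I i) inout { return data[i.value]; }

    size_t length() const { return data.length; }
}

alias TR = Index!"row";
alias TC = Index!"column";

// Rows are indexed by TR, the items of each row by TC.
alias Row = TypedArray!(int, TC);
alias Matrix = TypedArray!(Row, TR);

Matrix makeMatrix(size_t nRows, size_t nCols) {
    auto rows = new Row[nRows];
    foreach (ref row; rows)
        row = Row(new int[nCols]);
    return Matrix(rows);
}

unittest {
    auto m = makeMatrix(3, 2);
    foreach (size_t i; 0 .. 3)
        foreach (size_t j; 0 .. 2)
            m[TR(i)][TC(j)] = cast(int)(i * 10 + j);
    assert(m[TR(2)][TC(1)] == 21);

    // Swapping the two index types is rejected at compile time.
    static assert(!__traits(compiles, m[TC(0)][TR(0)]));
    // So is an untyped index.
    static assert(!__traits(compiles, m[0]));
}
```

Note that this only catches a mix-up of the index types. It does not check the loop ranges, so a wrong upper bound is still found only at run-time.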

Other examples of allocation of 1D arrays with strongly typed indexes:


auto v1 = new int[TV = @typed](10);
int[TV = @typed 10] v2; // fixed-size.

A usage example in a simple function that performs a 2D matrix transposition:



T[][] transpose(T)(in T[TC = @typed][TR = @typed] m)
pure nothrow @safe {
    auto r = new T[@typed(TR)][@typed(TC)](m[0].length, m.length);
    foreach (immutable nr, const row; m)
        foreach (immutable nc, immutable c; row)
            r[nc][nr] = c;
    return r;
}

"@typed(TR)" means that for this array I am using an already defined index type named TR.

In theory you can also infer the type of the index with the template, but I am not sure how useful this is:


void foo(TI)(int[@typed(TI)] data) {}

In that transpose function you can also see that you can assign a matrix with typed indexes to one without index types:


int[TI = @typed 10] a;
int[10] b = a; // OK.
a = b;         // OK.

This transpose returns a matrix with untyped indexes to simplify a little the use of the resulting matrix. But if you iterate with a foreach on such a result, the index types are inferred, so it's not a big problem.

Probably there is also some need for a trait to get the type of the index of an array (it could return size_t if it's untyped):


static assert(is(__traits(index_type, a) == TI));
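
For comparison, something similar can already be written as a library template for the array types D has today. This is just a sketch: IndexType is a hypothetical name I made up, and it only knows about built-in arrays (where the index is always size_t) and associative arrays (where it is the key type). It uses isArray, isAssociativeArray and KeyType from std.traits.

```d
import std.traits : isArray, isAssociativeArray, KeyType;

// Hypothetical helper: the type used to index an array type A.
template IndexType(A) {
    static if (isAssociativeArray!A)
        alias IndexType = KeyType!A;   // the key type of the AA
    else static if (isArray!A)
        alias IndexType = size_t;      // dynamic and fixed-size arrays
    else
        static assert(false, A.stringof ~ " is not an array type");
}

unittest {
    static assert(is(IndexType!(int[]) == size_t));
    static assert(is(IndexType!(int[10]) == size_t));
    static assert(is(IndexType!(double[string]) == string));
}
```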

(In Ada you can also get the range of an array, so such interval types also keep some other compile-time information. But I think this is not essential for D, so I have not included this information.)


--------------------------

I have used both dynamic languages (like Python) and strongly typed languages (like D, Haskell). I have seen that both have some advantages and disadvantages. In the Haskell world a lot of people use types to guide their coding, but when I am not using Haskell I prefer dynamic typing in small programs or when I devise a complex algorithm, and strong static typing in larger programs or when I already have the basic code written and I want to be more sure of its correctness. So I like a language like D that gives me the freedom to use more precise types or less precise types according to my current needs. Optional strong types for arrays are meant for situations where I want to be more sure of the code correctness, or in larger programs, or when the complexity of a data structure or the intricacy of a piece of code requires me to put down a lot of precise types to avoid losing control of what I am doing.

Strong index types allow you to avoid mixing up index variables by mistake when you have more than one array, or when you have 2D or 3D matrices and you need to not mix rows with columns. In my D programming I sometimes mix up the indexes, and when I am lucky I find the bug at run-time (but if your 2D matrix is square, the bug is harder to spot at run-time).
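
A small example of the square matrix case, in today's D. The function name fillSwapped is made up for this illustration. The two indexes are swapped by mistake, but since the matrix is square no range violation happens and you just get a silently transposed result:

```d
// Fills a rows x cols matrix, but with the two index variables
// swapped by mistake in the assignment.
int[][] fillSwapped(size_t rows, size_t cols) {
    auto mat = new int[][](rows, cols);
    foreach (immutable i; 0 .. rows)
        foreach (immutable j; 0 .. cols)
            mat[j][i] = cast(int)(i * 10 + j); // Wrong: should be mat[i][j].
    return mat;
}

unittest {
    // Intended result: [[0, 1], [10, 11]].
    // Actual result: the transposed matrix, with no run-time error.
    assert(fillSwapped(2, 2) == [[0, 10], [1, 11]]);
}
```

With a non-square size like (3, 2) and bounds checking enabled, the same function fails with a range violation at run-time, so there the bug is at least noticed.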


Bye,
bearophile
