A Value Range Propagation usage example, and more

bearophile via Digitalmars-d Wed, 08 Oct 2014 05:21:15 -0700

This is the first part of a function to convert to base 58 (some letters are missing, like the upper case "I") used in the Bitcoin protocol:


alias Address = ubyte[1 + 4 + RIPEMD160_digest_len];

char[] toBase58(ref Address a) pure nothrow @safe {
    static immutable symbols = "123456789" ~
                               "ABCDEFGHJKLMNPQRSTUVWXYZ" ~
                               "abcdefghijkmnopqrstuvwxyz";
    static assert(symbols.length == 58);

    auto result = new typeof(return)(34);
    foreach_reverse (ref ri; result) {
        uint c = 0;
        foreach (ref ai; a) {
            c = c * 256 + ai;
            ai = cast(ubyte)(c / symbols.length);
            c %= symbols.length;
        }
        ri = symbols[c];
    }
    ...
}

The D type system isn't smart enough to see that "ai" always fits in a ubyte, so I have had to use a cast(ubyte). But casts are dangerous and their usage should be minimized, and to!ubyte is slow and makes the function not nothrow. So I've rewritten the code like this with a bit of algebraic rewriting:



char[] toBase58(ref Address a) pure nothrow @safe {
    static immutable symbols = "123456789" ~
                               "ABCDEFGHJKLMNPQRSTUVWXYZ" ~
                               "abcdefghijkmnopqrstuvwxyz";
    static assert(symbols.length == 58);

    auto result = new typeof(return)(34);
    foreach_reverse (ref ri; result) {
        uint c = 0;
        foreach (ref ai; a) {
            immutable d = (c % symbols.length) * 256 + ai;
            ai = d / symbols.length;
            c = d;
        }
        ri = symbols[c % symbols.length];
    }
    ...
}

Now it can be a little slower because the integer division and the modulus are no longer computed together on the same value (the modulus is now taken on the next iteration), so perhaps they can't be implemented with little more than a single division, as before (I have not compared the assembly), but for the purposes of this code the performance difference is not a problem. Now the D type system is able to see that "ai" always fits in a ubyte, and there's no need for a cast. The compiler inserts a safe implicit cast. This is awesome.
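To make the range reasoning concrete, here is a minimal sketch of one inner-loop step in isolation. The helper name quotientStep is hypothetical, and it uses the literal 58 instead of symbols.length; the arithmetic is otherwise the same as in the rewritten function. It also contrasts the two alternatives mentioned earlier, cast(ubyte) and to!ubyte, on an out-of-range value.

```d
import std.conv : to, ConvOverflowException;
import std.exception : assertThrown;

// Hypothetical helper (not part of the original function): one step of
// the inner loop, written so that value range propagation can prove
// that the result fits in a ubyte.
ubyte quotientStep(uint c, ubyte ai) pure nothrow @safe {
    // (c % 58) is in [0, 57], so (c % 58) * 256 + ai is in [0, 14_847],
    // and dividing by 58 gives a value in [0, 255]: no cast is needed.
    return ((c % 58) * 256 + ai) / 58;
}

void main() {
    assert(quotientStep(12_345, 200) == 219);
    assert(quotientStep(57, 255) == 255); // largest possible result

    // Without the modulus, c * 256 can be any uint, so the implicit
    // conversion to ubyte is rejected at compile time.
    static assert(!__traits(compiles, {
        uint c; ubyte ai;
        ubyte q = (c * 256 + ai) / 58;
    }));

    uint big = 300;
    assert(cast(ubyte) big == 44);                    // cast: silently truncates
    assertThrown!ConvOverflowException(big.to!ubyte); // to!: checked, but throws
}
```

The helper stays pure nothrow @safe without a cast, which is the property the rewritten toBase58 relies on.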


- - - - - - - - - - - - - -

But of course you often want more :-)

This is another case where the current D type system allows you to avoid a cast:


void main() {
    char['z' - 'a' + 1] arr;

    foreach (immutable i, ref c; arr)
        c = 'a' + i;
}

But if you want to use ranges and functional UFCS chains you currently need the cast:



void main() {
    import std.range, std.algorithm, std.array;

    char[26] arr = 26
                   .iota
                   .map!(i => cast(char)('a' + i))
                   .array;
}

In theory this program has the same compile-time information as the foreach case. In practice foreach is a built-in that enjoys more semantics than an iota+map.

Currently iota(26) loses the compile-time information about the range, so you can't do (note: the "max" attribute doesn't exist):


void main() {
    import std.range: iota;
    auto r = iota(26);
    enum ubyte m = r.max;
}

Currently the only way to keep that compile-time information is to use a template argument:


void main() {
    import std.range: tIota; // hypothetical: Phobos has no tIota
    auto r = tIota!26;
    enum ubyte m = r.max; // OK
}
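Since tIota is not in Phobos, here is a minimal sketch of what such a range could look like. The names TIota and tIota and the max member are assumptions made for illustration; the only point is that the bound travels as a template argument, so it stays a compile-time constant.

```d
// Hypothetical range: the upper bound n is a template argument,
// so it remains available at compile time.
struct TIota(uint n) if (n > 0) {
    enum uint max = n - 1; // largest value the range yields

    private uint i = 0;

    @property bool empty() const { return i >= n; }
    @property uint front() const { return i; }
    void popFront() { ++i; }
}

// Hypothetical convenience function, so that tIota!26 reads like iota(26).
auto tIota(uint n)() { return TIota!n(); }

void main() {
    auto r = tIota!26;
    enum ubyte m = r.max; // OK: 25 is a compile-time constant that fits a ubyte
    static assert(m == 25);

    // It still works as an ordinary input range at run time.
    uint count = 0;
    foreach (i; r)
        ++count;
    assert(count == 26);
}
```

Note that every distinct bound instantiates a separate TIota type, which is the template bloat concern in a small form.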

But even if you write such a tIota range, the map!() will lose the compile-time value range information. And even if you manage to write a map!() able to do it with template arguments, you have template bloat.

So there's a desire to manage the compile-time information (like value range information) of (number) literals without causing template bloat and without the need to use explicit template arguments.


Bye,
bearophile
