Re: C++Now! 2012 slides

Peter Alexander Thu, 07 Jun 2012 16:53:49 -0700

On Thursday, 7 June 2012 at 22:29:09 UTC, Andrei Alexandrescu wrote:

Great points, example could be a lot better. Maybe it's time you do get started on find(). Destroy or be destroyed.


Ok...


This overload:

R find(alias pred = "a == b", R, E)(R haystack, E needle)
if (isInputRange!R &&
        is(typeof(binaryFun!pred(haystack.front, needle)) : bool))
{
    for (; !haystack.empty; haystack.popFront())
    {
        if (binaryFun!pred(haystack.front, needle)) break;
    }
    return haystack;
}

Is fine. In fact it's practically perfect. It's the obvious solution to a simple problem. The biggest problem is the "is typeof" syntax, but that's not your fault.


Second overload:

R1 find(alias pred = "a == b", R1, R2)(R1 haystack, R2 needle)
if (isForwardRange!R1 && isForwardRange!R2
        && is(typeof(binaryFun!pred(haystack.front, needle.front)) : bool)
        && !isRandomAccessRange!R1)
{
    static if (is(typeof(pred == "a == b")) && pred == "a == b"
            && isSomeString!R1 && isSomeString!R2
            && haystack[0].sizeof == needle[0].sizeof)
    {
        //return cast(R1) find(representation(haystack), representation(needle));
        // Specialization for simple string search
        alias Select!(haystack[0].sizeof == 1, ubyte[],
            Select!(haystack[0].sizeof == 2, ushort[], uint[]))
            Representation;
        // Will use the array specialization
        return cast(R1) .find!(pred, Representation, Representation)
            (cast(Representation) haystack, cast(Representation) needle);
    }
    else
    {
        return simpleMindedFind!pred(haystack, needle);
    }
}

As far as I can tell, this is a proxy for various substring search implementations.

Problem 1: Why is find overloaded to do substring search? Why does it do substring and not subset or subsequence? Substring search is a completely different algorithm from linear search and even has a different asymptotic running time. This is needlessly overloaded; it adds nothing.

Problem 2: The attempted specialisation is atrocious. It compares the predicate string with "a == b". I just did a quick check, and this means that these two calls DO NOT use the same algorithm:


string a = ..., b = ...;
auto fast = find!"a == b"(a, b);
auto slow = find!"a==b"(a, b);

I have to be honest, I have only just noticed this, but that's really sloppy.

It's also a direct symptom of over-complicated code. As a user, I would 100% expect these calls to be the same. If I accidentally used the second version, I would have no warning: my code would just run slower and I'd be none the wiser. Only upon careful inspection of the source could you discover what this "simple" code does, and you would be horrified like I am now.

The next two overloads of find appear to implement a couple of reasonably fast substring searches. Again, it would take me a minute or so to figure out exactly what algorithm was being used.

After that there's a multi-range find. Seems simple enough, but it's yet another overload to consider and I'm not sure it's commonly used enough to even warrant existence. I'd hate to think what would happen if I wanted the single-range search but accidentally added an extra parameter.

In summary: the specialisation detection is shocking, which leads me to question what other static ifs are accidentally firing or failing. If the code were simpler and less overloaded, it would be easy to reason about all this, but it's not. The various find implementations span a few hundred lines, and all have numerous compile time branches and checks. The cyclomatic complexity must be huge.
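To make the mismatch concrete, here is a minimal sketch that isolates the string comparison from the static if quoted above. The template name takesFastPath is hypothetical and exists only for this illustration; it is not part of Phobos. It shows that the check depends on the spelling of the predicate string, not on what the predicate means.

```d
// Hypothetical helper (not in Phobos): reports which branch a
// string-comparing static if would pick for a given predicate.
template takesFastPath(alias pred)
{
    // Same test as in the overload quoted above: pred must be comparable
    // to a string and must match "a == b" character for character.
    static if (is(typeof(pred == "a == b")) && pred == "a == b")
        enum takesFastPath = true;
    else
        enum takesFastPath = false;
}

static assert( takesFastPath!"a == b");
// Same meaning, different spelling: falls through to the slow branch.
static assert(!takesFastPath!"a==b");
// A lambda cannot be compared to a string, so it never matches either.
static assert(!takesFastPath!((a, b) => a == b));
```

All three checks happen at compile time, so nothing at the call site tells the user which branch was chosen.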


How I'd do things:

- The first version of find would be the only one. No overloads, no specialisation.

- substring would be a separate, non-predicated function. It would be specialised for strings. I'm too tired now to think about the variety of range specialisations, but I'm sure there's a more elegant way to handle the combinations.

- Drop the variadic find altogether.

- Get rid of BoyerMooreFinder + find overload, replace with a boyerMoore function.
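A rough sketch of the first two points, under stated assumptions: find is the single overload quoted at the top of this message, and findSubstring is a hypothetical name and signature for the separate, non-predicated function. The body of findSubstring is a naive O(n*m) scan chosen for brevity; it is not the string specialisation mentioned above and not any existing Phobos implementation. The sketch deliberately does not import std.algorithm, so the local find does not clash with the library one.

```d
import std.functional : binaryFun;
import std.range; // empty, front, popFront, save, isInputRange, isForwardRange

// The only find: linear search for a single element.
R find(alias pred = "a == b", R, E)(R haystack, E needle)
if (isInputRange!R &&
        is(typeof(binaryFun!pred(haystack.front, needle)) : bool))
{
    for (; !haystack.empty; haystack.popFront())
    {
        if (binaryFun!pred(haystack.front, needle)) break;
    }
    return haystack;
}

// Hypothetical: substring search as its own function, with no predicate.
// Returns haystack advanced to the first match, or an empty range.
R1 findSubstring(R1, R2)(R1 haystack, R2 needle)
if (isForwardRange!R1 && isForwardRange!R2 &&
        is(typeof(haystack.front == needle.front) : bool))
{
    for (;; haystack.popFront())
    {
        auto h = haystack.save;
        auto n = needle.save;
        // Walk both ranges while they agree.
        while (!n.empty && !h.empty && h.front == n.front)
        {
            h.popFront();
            n.popFront();
        }
        if (n.empty) return haystack; // whole needle matched at this position
        if (h.empty) return h;        // haystack exhausted: no match possible
    }
}

unittest
{
    assert(find([1, 2, 3, 4], 3) == [3, 4]);
    assert(findSubstring("hello world", "o w") == "o world");
    assert(findSubstring("abc", "xyz").empty);
}
```

With the two operations under different names, a reader can tell from the call site which algorithm and which running time to expect. The checks in the unittest block run when compiling with dmd -main -unittest.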
