On Wednesday, 23 March 2016 at 15:23:38 UTC, Simen Kjaeraas wrote:
Without a bit more detail, it's a bit hard to help.

std.algorithm.splitter has an overload that takes a function instead of a separator:

    import std.algorithm;
    auto a = "a,b;c";
    auto b = a.splitter!(e => e == ';' || e == ',');
    assert(equal(b, ["a", "b", "c"]));

However, not only are the separators lost in the process, it only allows single-element separators. This might be good enough given the information you've divulged, but I'll hazard a guess it isn't.
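For the narrower case of one fixed separator that is longer than a single element, the range overload of splitter may already be enough. A minimal sketch, assuming the input is a plain string; splitOn is a made-up helper name, and the separators are still dropped from the result:

```d
import std.algorithm.iteration : splitter;
import std.array : array;

/// Hypothetical helper: splits `s` on every occurrence of the whole string `sep`.
string[] splitOn(string s, string sep)
{
    // The range overload treats `sep` as one multi-element separator.
    return s.splitter(sep).array;
}

unittest
{
    assert(splitOn("a--b--c", "--") == ["a", "b", "c"]);
}
```

This does not help when several different separators are in play, which is what the predicate overload above covers.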

My next stop is std.algorithm.chunkBy:

    import std.algorithm;
    import std.typecons : tuple;

    auto a = ["a", "b", "c", "d", "e"];
    auto b = a.chunkBy!(e => e == "a" || e == "d");
    auto result = [
        tuple(true, ["a"]), tuple(false, ["b", "c"]),
        tuple(true, ["d"]), tuple(false, ["e"])
        ];

No assert here, since the ranges in the tuples are not arrays. My immediate concern is that two consecutive tokens with no intervening values will mess it up. Also, the result looks a bit messy. A little more involved, and according to the documentation not guaranteed to work:

    bool isToken(string s) {
        return s == "a" || s == "d";
    }

    bool tokenCounter(string s) {
        static string oldToken;
        static bool counter = true;
        if (s.isToken && s != oldToken) {
            oldToken = s;
            counter = !counter;
        }
        return counter;
    }

    unittest {
        import std.algorithm;
        import std.stdio;
        import std.typecons;
        import std.array;

        auto a = ["a", "b", "c", "d", "e", "a", "d"];
        auto b = a.chunkBy!tokenCounter.map!(e => e[1]);
        auto result = [
            ["a", "b", "c"],
            ["d", "e"],
            ["a"],
            ["d"]
            ];
        writeln(b);
        writeln(result);
    }

Again no assert, but b and result have basically the same contents. Also handles consecutive tokens neatly (but consecutive identical tokens will be grouped together).
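If the static state is a worry, a plain loop is one alternative. This is only a sketch under the same assumptions as above (the input is a string[] and "a" and "d" are the tokens); splitBeforeTokens is a made-up name, and isToken is repeated so the snippet stands alone:

```d
/// Same predicate as above, repeated so the sketch is self-contained.
bool isToken(string s)
{
    return s == "a" || s == "d";
}

/// Hypothetical helper: starts a new group at every token.
/// Each group is a slice of `items`; only the outer array is allocated.
string[][] splitBeforeTokens(string[] items)
{
    string[][] groups;
    size_t start = 0;
    foreach (i, item; items)
    {
        // Close the current group whenever a token begins a new one.
        if (i > start && item.isToken)
        {
            groups ~= items[start .. i];
            start = i;
        }
    }
    if (start < items.length)
        groups ~= items[start .. $];
    return groups;
}

unittest
{
    auto a = ["a", "b", "c", "d", "e", "a", "d"];
    assert(splitBeforeTokens(a) == [["a", "b", "c"], ["d", "e"], ["a"], ["d"]]);
}
```

Because it keeps no state between calls, consecutive identical tokens end up in separate groups here. Anything before the first token lands in a leading group of its own.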

Hope this helps.

--
  Simen

Thanks Simen,
your tokenCounter is inspirational; for the rest I'll need some time for testing.

But some additional thoughts from my side:
I get all the lines of the file into one range. Calling array on it should give me an array, but how would I use find to get an index into this array? With the indices I could slice up the array into four slices, no allocation required. If there is no easy way to just get an index instead of a range, I would try to use something like the tokenCounter to find all the indices.
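
A minimal sketch of that index idea, assuming std.algorithm.searching.countUntil is the right tool: it returns the index of the first match (or -1 if there is none), where find returns the remaining range. fromToken is just a name made up for illustration, and the sample lines stand in for the real file contents:

```d
import std.algorithm.searching : countUntil;

/// Hypothetical helper: the part of `lines` starting at the first `token`.
/// The result is a slice of `lines`, so nothing is copied.
string[] fromToken(string[] lines, string token)
{
    // countUntil returns the index of the first match, or -1 if none.
    immutable idx = lines.countUntil(token);
    return idx < 0 ? null : lines[idx .. $];
}

unittest
{
    auto lines = ["a", "b", "c", "d", "e"];
    immutable idx = lines.countUntil("d");
    assert(idx == 3);
    assert(lines[0 .. idx] == ["a", "b", "c"]);
    assert(fromToken(lines, "d") == ["d", "e"]);
    // The slice points into the original array rather than a copy.
    assert(fromToken(lines, "d").ptr == lines.ptr + 3);
}
```

Repeating countUntil on the remaining slice would give the later indices, but each result is relative to the slice it was called on, so the offsets have to be added up.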


