On Wednesday, 23 March 2016 at 11:57:49 UTC, ParticlePeter wrote:
I need to parse an ASCII file containing multiple tokens. The tokens can be seen as keys; after every token there is a bunch of lines belonging to that token, the values.
The order of tokens is unknown.

I would like to read the file in as a whole string, and split the string with:
splitter(fileString, [token1, token2, ... tokenN]);

And would like to get a range of strings each starting with tokenX and ending before the next token.

Does something like this exist?

I know how to parse the string line by line, creating new strings and appending the appropriate lines, but I don't know how to do this with a lazy result range and without new allocations.

Without more detail, it's a bit hard to help.

std.algorithm.splitter has an overload that takes a function instead of a separator:

    import std.algorithm;
    auto a = "a,b;c";
    auto b = a.splitter!(e => e == ';' || e == ',');
    assert(equal(b, ["a", "b", "c"]));

However, not only are the separators lost in the process, but it also only allows single-element separators. This might be good enough given the information you've divulged, but I'll hazard a guess it isn't.
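For comparison, the split-but-keep-the-tokens behaviour you describe is easy to mock up with a regex that has a capturing group. A rough Python sketch (not D; the token names and input are made up for illustration):

```python
import re

# Hypothetical tokens and input, for illustration only.
tokens = ["token1", "token2"]
text = "token1 line1 line2 token2 line3 token1 line4"

# A capturing group makes re.split keep the matched tokens in the output.
pattern = "(" + "|".join(map(re.escape, tokens)) + ")"
pieces = re.split(pattern, text)

# Skip anything before the first token, then glue each token to its body.
chunks = [pieces[i] + pieces[i + 1] for i in range(1, len(pieces) - 1, 2)]
# chunks == ['token1 line1 line2 ', 'token2 line3 ', 'token1 line4']
```

This is eager rather than lazy, but it shows the shape of the result you're after: one chunk per token, each starting with the token itself.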

My next stop is std.algorithm.chunkBy:

    import std.algorithm;
    import std.typecons;

    auto a = ["a", "b", "c", "d", "e"];
    auto b = a.chunkBy!(e => e == "a" || e == "d");
    // b has essentially these contents:
    auto result = [
        tuple(true, ["a"]), tuple(false, ["b", "c"]),
        tuple(true, ["d"]), tuple(false, ["e"])
    ];

No assert here, since the ranges in the tuples are not arrays. My immediate concern is that two consecutive tokens with no intervening values will mess it up, and the result looks a bit messy anyway. Here is something a little more involved, which according to the documentation is not guaranteed to work (the predicate below keeps state in static variables, and chunkBy makes no promises about impure predicates):
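As a point of comparison, Python's itertools.groupby does roughly what chunkBy with a unary predicate does here, including producing (key, group) pairs. A sketch (not D):

```python
from itertools import groupby

# Rough analogue of the chunkBy call above: group consecutive elements
# by whether they are a token ("a" or "d" in this toy example).
a = ["a", "b", "c", "d", "e"]
b = [(k, list(g)) for k, g in groupby(a, key=lambda e: e in ("a", "d"))]
# b == [(True, ['a']), (False, ['b', 'c']), (True, ['d']), (False, ['e'])]
```

It has the same weakness: a token and the values after it land in separate groups, and you still have to stitch them back together.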

bool isToken(string s) {
    return s == "a" || s == "d";
}

bool tokenCounter(string s) {
    // Flips between true and false each time a new token shows up, so a
    // token and the values that follow it share the same key.
    static string oldToken;
    static bool counter = true;
    if (s.isToken && s != oldToken) {
        oldToken = s;
        counter = !counter;
    }
    return counter;
}

unittest {
    import std.algorithm;
    import std.stdio;
    import std.typecons;
    import std.array;

    auto a = ["a","b","c", "d", "e", "a", "d"];
    auto b = a.chunkBy!tokenCounter.map!(e=>e[1]);
    auto result = [
        ["a", "b", "c"],
        ["d", "e"],
        ["a"],
        ["d"]
        ];
    writeln(b);
    writeln(result);
}

Again no assert, but b and result have essentially the same contents. This version also handles consecutive tokens neatly (though consecutive identical tokens will be grouped together).
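The same stateful-predicate trick can be sketched in Python, with a closure standing in for the D function's static variables (names are my own):

```python
from itertools import groupby

def is_token(s):
    return s in ("a", "d")

def make_token_counter():
    # Closure state plays the role of the static variables in the D version.
    state = {"old": None, "flag": True}
    def key(s):
        if is_token(s) and s != state["old"]:
            state["old"] = s
            state["flag"] = not state["flag"]
        return state["flag"]
    return key

a = ["a", "b", "c", "d", "e", "a", "d"]
groups = [list(g) for _, g in groupby(a, key=make_token_counter())]
# groups == [['a', 'b', 'c'], ['d', 'e'], ['a'], ['d']]
```

As in the D version, each group starts with its token, consecutive different tokens are kept apart, and consecutive identical tokens would be merged.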

Hope this helps.

--
  Simen
