On Wednesday, 23 March 2016 at 15:23:38 UTC, Simen Kjaeraas wrote:
Without a bit more detail, it's a bit hard to help.
std.algorithm.splitter has an overload that takes a function
instead of a separator:
    import std.algorithm;

    auto a = "a,b;c";
    auto b = a.splitter!(e => e == ';' || e == ',');
    assert(equal(b, ["a", "b", "c"]));
However, not only are the separators lost in the process, it
only allows single-element separators. This might be good
enough given the information you've divulged, but I'll hazard a
guess it isn't.
My next stop is std.algorithm.chunkBy:
    import std.algorithm, std.typecons;

    auto a = ["a", "b", "c", "d", "e"];
    auto b = a.chunkBy!(e => e == "a" || e == "d");
    // Expected shape of b: (key, subrange) pairs.
    auto result = [
        tuple(true, ["a"]), tuple(false, ["b", "c"]),
        tuple(true, ["d"]), tuple(false, ["e"])
    ];
No assert here, since the ranges in the tuples are not arrays.
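If an assert is wanted here, one option is to copy each subrange into an array first. A minimal sketch, assuming the extra allocations are acceptable; chunksAsArrays is a hypothetical helper name, not something from Phobos:

```d
import std.algorithm : chunkBy, map;
import std.array : array;
import std.typecons : Tuple, tuple;

// Hypothetical helper: materializes every chunk so the result can be
// compared with ==.
Tuple!(bool, string[])[] chunksAsArrays(string[] a)
{
    return a.chunkBy!(e => e == "a" || e == "d")
            // c[0] is the key, c[1] is the lazy subrange
            .map!(c => tuple(c[0], c[1].array))
            .array;
}

void main()
{
    auto result = [
        tuple(true, ["a"]), tuple(false, ["b", "c"]),
        tuple(true, ["d"]), tuple(false, ["e"])
    ];
    assert(chunksAsArrays(["a", "b", "c", "d", "e"]) == result);
}
```

This gives up the laziness of chunkBy, so it is probably only worth it for tests or small inputs.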
My immediate concern is that two consecutive tokens with no
intervening values will mess it up. Also, the result looks a
bit messy. A little more involved, and according to
documentation not guaranteed to work:
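    bool isToken(string s) {
        return s == "a" || s == "d";
    }

    // Flips its result whenever a token different from the previous one
    // is seen. Keeps static state between calls, so the result depends
    // on the order in which chunkBy calls it.
    bool tokenCounter(string s) {
        static string oldToken;
        static bool counter = true;
        if (s.isToken && s != oldToken) {
            oldToken = s;
            counter = !counter;
        }
        return counter;
    }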
    unittest {
        import std.algorithm;
        import std.stdio;
        import std.typecons;
        import std.array;

        auto a = ["a", "b", "c", "d", "e", "a", "d"];
        auto b = a.chunkBy!tokenCounter.map!(e => e[1]);
        auto result = [
            ["a", "b", "c"],
            ["d", "e"],
            ["a"],
            ["d"]
        ];
        writeln(b);
        writeln(result);
    }
Again no assert, but b and result have basically the same
contents. It also handles consecutive tokens neatly (but
consecutive identical tokens will be grouped together).
Hope this helps.
--
Simen
Thanks Simen,
your tokenCounter is inspirational, for the rest I'll take some
time for testing.
But some additional thoughts from my side:
I get all the lines of the file into one range. Calling array on
it should give me an array, but how would I use find to get an
index into this array?
With the indices I could slice up the array into four slices, no
allocation required. If there is no easy way to just get an index
instead of a range, I would try to use something like the
tokenCounter to find all the indices.
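
For reference, std.algorithm.searching.countUntil returns an index rather than a range (-1 if nothing matches), which may be what is needed here. A minimal sketch, assuming the lines are already in an array; the sample data and the helper name indexOfLine are made up for illustration:

```d
import std.algorithm.searching : countUntil, find;

// Hypothetical helper: index of the first line equal to token, or -1.
ptrdiff_t indexOfLine(string[] lines, string token)
{
    return lines.countUntil!(l => l == token);
}

void main()
{
    auto lines = ["a", "b", "c", "d", "e"]; // made-up data
    auto i = indexOfLine(lines, "d");
    assert(i == 3);

    // Slicing does not allocate; both slices refer to the original array.
    auto head = lines[0 .. i];
    auto tail = lines[i .. $];
    assert(head == ["a", "b", "c"]);
    assert(tail == ["d", "e"]);

    // find returns the rest of the range, so the index can also be
    // recovered from the two lengths.
    assert(lines.length - lines.find("d").length == 3);
}
```

The -1 case should be checked before slicing, since a negative index would not be a valid slice bound.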