Derek Parnell wrote:
> On Thu, 19 Feb 2009 07:01:56 -0800, Andrei Alexandrescu wrote:
>> These all put the regex before the string, something many people would
>> find unsavory.
> I don't. To me the regex is what you are looking for so it's like saying
> "find this pattern in that string".
Yah, but to most others it's "match this string against that pattern".
Again, regexes have a long history behind them. So probably we need to
have both "find" and "match", with different argument orders.
Anyway, std.algorithm defines find() like this:
find(haystack, needle)
In the least structured case, the haystack is a range and the needle is
either an element or another range. But then we can say, hey, we can
get more efficient finds by using a more structured haystack and/or a
more structured needle. So then:
string a = "conoco", b = "co";
// linear find
auto r1 = find(a, b[0]);
// quadratic find
auto r2 = find(a, b);
// organize b in a Boyer-Moore structure; sublinear find
auto r3 = find(a, boyerMoore(b));
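For comparison, here is a minimal sketch of the three searches written against std.algorithm.searching in current Phobos. It rests on a few assumptions: threeFinds is a hypothetical wrapper that exists only to make the example testable, and boyerMooreFinder is the Phobos name for the preprocessing step. Note that boyerMooreFinder wraps the needle, since Boyer-Moore builds its skip tables from the pattern being searched for.

```d
import std.algorithm.searching : boyerMooreFinder, find;
import std.stdio : writeln;

// Hypothetical wrapper (not from the original post): runs the three
// searches and returns what is left of the haystack from the first hit on.
string[] threeFinds(string a, string b)
{
    return [
        find(a, b[0]),                // needle is a single element: linear scan
        find(a, b),                   // needle is a plain range, no preprocessing
        find(a, boyerMooreFinder(b)), // needle preprocessed into skip tables
    ];
}

void main()
{
    writeln(threeFinds("conoco", "co"));
}
```

All three calls share the find(haystack, needle) shape; only the amount of structure attached to the needle changes.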
I'll actually implement the above, it's pretty nice. Now the question
is, what's the haystack and what's the needle in a regex find?
auto r4 = find("conoco", regex("c[a-z]"));
or
auto r4 = find(regex("c[a-z]"), "conoco");
?
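To make the two spellings concrete, here is a rough sketch that supports both argument orders on top of std.regex. findMatch is a hypothetical name chosen to avoid clashing with std.algorithm's find, and the return convention (the rest of the string from the hit onward, empty if nothing matches) is an assumption borrowed from find, not something settled in this thread.

```d
import std.regex : Regex, matchFirst, regex;
import std.stdio : writeln;

// Hypothetical: string as haystack, regex as needle.
string findMatch(string haystack, Regex!char needle)
{
    auto m = matchFirst(haystack, needle);
    // Mirror find(): return the haystack from the hit onward,
    // or an empty slice at the end when nothing matches.
    return m.empty ? haystack[$ .. $] : haystack[m.pre.length .. $];
}

// Hypothetical: same search with the regex written first.
string findMatch(Regex!char pattern, string text)
{
    return findMatch(text, pattern);
}

void main()
{
    auto re = regex("c[a-z]");
    writeln(findMatch("conoco", re));
    writeln(findMatch(re, "conoco"));
}
```

Since the two parameter types differ, overloading can accept either order; the open question is which order the documentation should call the haystack.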
The argument could go both ways:
"Organize the set of 2-char strings that start with 'c' and end with a
letter from 'a' to 'z' into a structured haystack, then look for
substrings of 'conoco' in that haystack."
versus
"Given the unstructured haystack 'conoco', look in it for a structured
needle that is any 2-char string starting with 'c' and ending with a
letter from 'a' to 'z'."
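The two readings can be sketched side by side. This is only an illustration under stated assumptions: structuredHaystack and structuredNeedle are hypothetical names, the input is assumed to be ASCII so that byte slicing is safe, and the first reading is modelled naively by testing each substring for membership in the regex's language, which is far slower than a real matcher.

```d
import std.regex : matchFirst, regex;
import std.stdio : writeln;

// Reading 1 (hypothetical): the regex's language is the haystack, and
// substrings of the text are the needles probed against it.
string structuredHaystack(string text, string pattern)
{
    // Anchor the pattern so a candidate must match as a whole.
    auto whole = regex("^(?:" ~ pattern ~ ")$");
    foreach (start; 0 .. text.length)
        foreach (end; start + 1 .. text.length + 1)
        {
            auto candidate = text[start .. end];
            if (!matchFirst(candidate, whole).empty)
                return candidate;
        }
    return null;
}

// Reading 2 (hypothetical): the text is the haystack, and the regex is
// a structured needle scanned for from left to right.
string structuredNeedle(string text, string pattern)
{
    auto m = matchFirst(text, regex(pattern));
    return m.empty ? null : m.hit;
}

void main()
{
    writeln(structuredHaystack("conoco", "c[a-z]"));
    writeln(structuredNeedle("conoco", "c[a-z]"));
}
```

For this pattern both functions pick out the same substring. For patterns that can match at several lengths they may disagree, because the first returns the shortest candidate at the leftmost position while the second follows the regex engine's own matching rules.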
What is the most natural way?
Andrei