On Tuesday, 12 March 2013 at 09:41:08 UTC, Dmitry Olshansky wrote:
*Spoiler*: let's slowly deprecate the "g" option in std.regex
over the next few years or, with any luck, a bit faster. A better
replacement is proposed.
For better or worse the current API has retained a (high) level
of compatibility with the old API. That means I've missed the
chance to fix it when I could, and here is the prime problem
(the hardest) I have with it:
    foreach(m; match("bleh-blah", "bl[ea]h"))
    {
        writeln(m.hit);
    }
The "quiz" is - how many lines will this print?
The current answer is 1. And the right way to get all
matches is:
    foreach(m; match("bleh-blah", regex("bl[ea]h","g")))
    {
        writeln(m.hit);
    }
Which not only looks unsightly but also confuses an operation
option (find _all_ vs find _first_) with a property of the
pattern (like case-insensitivity is). And if the regex pattern is
defined elsewhere it could easily introduce a bug (albeit one
that's easy to track, "usually").
To underline the point: std.regex.splitter doesn't take the "g"
flag into account at all (it makes no sense there).
I've pondered a couple of solutions in a bug report by
bearophile:
http://d.puremagic.com/issues/show_bug.cgi?id=7260
After all of these ideas were born and discarded, here is what I
believe is the way forward out of this mess:
Make "g" indicate only the intended _default_ search mode of
this pattern (global vs. first match).
The user is free to override this default explicitly and is in
fact encouraged to do so. The idea of a default search mode
attached to the regex pattern is marked as discouraged.
I nearly always forget to include "g" so I welcome any changes
that make "g" go away. match.first/match.all/etc. is easy
to read and the intent is right up front which I prefer over
tacking a flag argument on the end. matchFirst/matchAll/etc. is
fine too but not nearly as cool :).
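
To make the comparison concrete, here is a rough sketch of how the
matchFirst/matchAll spelling might read at the call site. It assumes
std.regex exposes functions with those names that take the input and
a compiled pattern; firstHit and allHits are hypothetical helpers
made up for this example, not part of any proposal.

```d
import std.algorithm : map;
import std.array : array;
import std.regex : matchAll, matchFirst, regex;
import std.stdio : writeln;

// Hypothetical helper: text of the first match, or "" if nothing matched.
// Assumes a matchFirst(input, pattern) that returns a single set of captures.
string firstHit(string input, string pattern)
{
    auto c = matchFirst(input, regex(pattern));
    return c.empty ? "" : c.hit;
}

// Hypothetical helper: text of every match, in order.
// Assumes a matchAll(input, pattern) that returns a range of captures.
string[] allHits(string input, string pattern)
{
    return matchAll(input, regex(pattern)).map!(m => m.hit).array;
}

void main()
{
    // The pattern carries no "g" flag; the call site states the intent.
    writeln(firstHit("bleh-blah", "bl[ea]h")); // bleh
    writeln(allHits("bleh-blah", "bl[ea]h"));  // ["bleh", "blah"]
}
```

The point is only that "first" vs "all" is visible where the search
happens, so a pattern defined elsewhere can't silently change how
many results come back.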