18-Dec-2013 22:33, Andrej Mitrovic writes:
I'm reading through http://www.regular-expressions.info, and there's a
feature that's missing from std.regex,
quoted:

-----
All the characters between the \Q and the \E are interpreted as
literal characters. E.g. \Q*\d+*\E matches the literal text *\d+*. The
\E may be omitted at the end of the regex, so \Q*\d+* is the same as
\Q*\d+*\E.

[snip]
Should this feature be added? I guess there's probably more regex
features missing (I just began reading the page), I'm not sure how
Dmitry feels about adding X number of features though.
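
For reference, the \Q...\E behaviour quoted above can be approximated in an engine that lacks it by escaping the literal span before the pattern is compiled. Below is a minimal sketch in Python rather than D, so it says nothing about how std.regex would or should implement this. The helper name `expand_quoted_spans` is hypothetical, and the sketch assumes that `\Q` never appears as part of an escaped backslash sequence such as `\\Q`.

```python
import re


def expand_quoted_spans(pattern: str) -> str:
    """Rewrite every \\Q...\\E span as an escaped literal (hypothetical helper).

    Simplification: a backslash-escaped backslash followed by Q is not
    detected, so this is an illustration and not a full parser.
    """
    out = []
    pos = 0
    while True:
        start = pattern.find(r"\Q", pos)
        if start == -1:
            # No more quoted spans: copy the remainder unchanged.
            out.append(pattern[pos:])
            break
        out.append(pattern[pos:start])
        end = pattern.find(r"\E", start + 2)
        if end == -1:
            # \E may be omitted at the end of the regex.
            out.append(re.escape(pattern[start + 2:]))
            break
        out.append(re.escape(pattern[start + 2:end]))
        pos = end + 2
    return "".join(out)


if __name__ == "__main__":
    # The quoted span matches the literal text *\d+* and nothing else.
    quoted = expand_quoted_spans(r"\Q*\d+*\E")
    assert re.fullmatch(quoted, r"*\d+*") is not None
    assert re.fullmatch(quoted, "*123*") is None
```

Doing the rewrite as a preprocessing step keeps the matching engine itself unchanged, which is one reason the feature tends to be cheap to support.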

All in all I wanted to be principled about what set of features to support. The initial design was:
1. Choose a syntax flavor (ECMAScript)
2. Add some powerful stuff (e.g. unlimited lookbehind, full Unicode support)
3. Add some convenient stuff that is popular enough/easy to implement (named captures).
4. Avoid extensions that complicate the engine and preclude optimizations, or heavily depend on implementation. (So no recursion and similar madness)

In that light, 'missing' might be on purpose. For instance, std.regex doesn't provide 'atomic' (possessive) groups simply because they are a kludge invented to work around the poor performance of backtracking engines.

At the end of the day, any feature is interesting as long as we carefully weigh:

- how useful a feature is
- how widespread the syntax/how many precedents in other libraries

against

- how difficult to implement
- does it affect backwards compatibility
- any other hidden costs

I'd be glad to implement well-motivated enhancement requests.

P.S. This reminds me to put together a roadmap of sorts for where std.regex is going and what to expect.

--
Dmitry Olshansky
