Re: std.regex literal syntax (the \Q…\E escape sequence)

Dmitry Olshansky Wed, 18 Dec 2013 11:25:55 -0800

18-Dec-2013 22:33, Andrej Mitrovic wrote:

I'm reading through http://www.regular-expressions.info, and there's a
feature that's missing from std.regex,
quoted:


-----
All the characters between the \Q and the \E are interpreted as
literal characters. E.g. \Q*\d+*\E matches the literal text *\d+*. The
\E may be omitted at the end of the regex, so \Q*\d+* is the same as
\Q*\d+*\E.


[snip]

Should this feature be added? I guess there's probably more regex
features missing (I just began reading the page), I'm not sure how
Dmitry feels about adding X number of features though.
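
For illustration, the effect of \Q…\E can be approximated without engine support by escaping the literal text before the pattern is compiled. The sketch below assumes a Phobos release that provides std.regex.escaper (a later addition, not available in std.regex at the time of this thread); quoteLiteral is a hypothetical helper name, not part of the library.

```d
import std.array : appender;
import std.regex : escaper, matchFirst, regex;

/// Hypothetical helper: backslash-escapes regex metacharacters so that
/// `text` is matched literally, roughly what \Q...\E would do.
string quoteLiteral(string text)
{
    auto buf = appender!string();
    // escaper lazily yields the input with metacharacters escaped.
    foreach (c; escaper(text))
        buf.put(c);
    return buf.data;
}

void main()
{
    // Same literal as in the quoted example: *\d+*
    auto re = regex(quoteLiteral(`*\d+*`));

    // The pattern matches the literal text itself...
    assert(!matchFirst(`see *\d+* here`, re).empty);
    // ...and not a run of digits, since \d is no longer a character class.
    assert(matchFirst("see 123 here", re).empty);
}
```

This keeps the quoting outside the engine, so it says nothing about whether the \Q…\E syntax itself should be supported.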

All in all I wanted to be principled about what set of features to support. The initial design was:

1. Choose a syntax flavor (ECMAScript)
2. Add some powerful stuff (e.g. unlimited lookbehind, full Unicode support)
3. Add some convenient stuff that is popular enough/easy to implement (named captures).
4. Avoid extensions that complicate the engine and preclude optimizations, or heavily depend on implementation. (So no recursion and similar madness)

In that light 'missing' might be on purpose. For instance std.regex doesn't provide 'atomic' (possessive) groups simply because it's a kludge invented for poor (performance of) backtracking engines.


At the end of the day any feature is interesting as long as we carefully weigh:

- how useful a feature is
- how widespread the syntax/how many precedents in other libraries

against

- how difficult to implement
- does it affect backwards compatibility
- any other hidden costs

I'd be glad to implement well-motivated enhancement requests.

P.S. This reminds me to put a roadmap of sorts on where std.regex is going and what to expect.


--
Dmitry Olshansky
