On 09/27/2013 08:47 PM, Alex Rousskov wrote:
> On 09/27/2013 09:39 AM, Amos Jeffries wrote:
>> On 28/09/2013 3:18 a.m., Tsantilas Christos wrote:
>>> On 09/27/2013 08:23 AM, Alex Rousskov wrote:
>>>> Using approach (2) with flexible RE delimiter, we could write
>>>>
>>>>     acl foo url_regex /ends[) (]/
>>>> or
>>>>     acl foo url_regex {ends[) (]}
>>>> or
>>>>     acl foo url_regex @ends[) (]@
>>>>
>>>> and it will all work without double escaping.
>>>
>>>
>>> Alex, in the "Revised approach to fixing configuration syntax" mail
>>> thread you are proposing to use "regex::" prefix for regular
>>> expressions. This is required for grammar consistency.
>>> This means that the regex should look like:
>>>
>>>     acl foo url_regex regex::/ends[) (]/
>>> or
>>>     acl foo url_regex regex::{ends[) (]}
>>> or
>>>     acl foo url_regex regex::@ends[) (]@
>
> Yes, IF that syntax is adopted.
>
>
>> Okay Alex I think we can agree on that flexible-delimiter syntax to
>> avoid escaping.
>>
>> I also agree with that regex:: prefix.
>>
>> Is there anything else we have been disagreeing on?
>
>
> As far as REs are concerned, we need to decide
>
> 1) Whether we want to support the new regex:: syntax at all or keep
> using spaceless REs as before (at least for now) while reserving the
> regex:: prefix.
>
>
> 2) If we want to support the new regex:: syntax:
>
> 2a) What characters do we allow as RE delimiters? Perl allows virtually
> any non-whitespace character, even #, but we probably want to be more
> restrictive.

I think any non-whitespace character is a good choice. Failing that,
any non-alphanumeric, non-whitespace character.

>
> 2b) Do we add support for escaping sequences? As discussed a few emails
> back, that support is necessary if we want to support arbitrary REs,
> which is somewhat important for automated config generators. It is also
> needed for (2c).

Escaping is important. The user will pick the delimiter that requires
the least escaping, but may not be able to avoid escaping entirely.
For example, one would select this:

    regex::#A/test/with/one\#and/many/#

instead of this:

    regex::/A\/test\/with\/one#and\/many\//

>
> 2c) Do we add support for character sequences so that one can add
> special characters and such? This also requires a form of escaping. For
> example, here are some of the sequences supported by Perl (we do not
> support all of them immediately, of course, but we need to reserve
> \-escape if we want them in the future):
>
>> Sequence     Description
>> \t           tab                              (HT, TAB)
>> \n           newline                          (NL)
>> \r           return                           (CR)
>> \f           form feed                        (FF)
>> \b           backspace                        (BS)
>> \a           alarm (bell)                     (BEL)
>> \e           escape                           (ESC)
>> \x{263A}     hex char                         (example: SMILEY)
>> \x1b         restricted range hex char        (example: ESC)
>> \N{name}     named Unicode character or character sequence
>> \N{U+263D}   Unicode character                (example: FIRST QUARTER MOON)
>> \c[          control char                     (example: chr(27))
>> \o{23072}    octal char                       (example: SMILEY)
>> \033         restricted range octal char      (example: ESC)
>
> We could also try to abuse existing character class [[:class:]] syntax
> for those. For example, we can find and replace [[:squid::octal(32):]]
> sequences with a space character.

Supporting the Perl syntax looks like a good idea to me ...

>
>
> Note that (2c) applies to "strings" as well, IMO.
>
>
> Alex.
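
To make the delimiter and escaping discussion concrete, here is a minimal Python sketch of how a regex:: token could be split into its delimiter and RE body. It is only an illustration, not the Squid parser: the function name parse_regex_token and the PAIRED table are hypothetical, and it assumes the rules floated above (delimiter is any non-alphanumeric, non-whitespace character; a backslash before the delimiter makes it literal). Treating "{" as closed by "}" is a further assumption borrowed from Perl and from the {ends[) (]} example in this thread.

```python
# Hypothetical bracket pairing (assumption, Perl-like): an opening bracket
# used as the delimiter is closed by its matching bracket.
PAIRED = {"{": "}", "(": ")", "[": "]", "<": ">"}
PREFIX = "regex::"


def parse_regex_token(token: str) -> str:
    """Return the RE body of a regex::<delim>body<delim> token."""
    if not token.startswith(PREFIX):
        raise ValueError("missing regex:: prefix")
    rest = token[len(PREFIX):]
    if len(rest) < 2:
        raise ValueError("missing delimiters")

    opener = rest[0]
    # Assumed restriction (2a): non-alphanumeric, non-whitespace delimiter.
    # Backslash is excluded because it is reserved for escaping.
    if opener.isalnum() or opener.isspace() or opener == "\\":
        raise ValueError(f"bad delimiter {opener!r}")
    closer = PAIRED.get(opener, opener)

    body = []
    i = 1
    while i < len(rest):
        ch = rest[i]
        if ch == "\\" and i + 1 < len(rest):
            nxt = rest[i + 1]
            if nxt in (opener, closer):
                body.append(nxt)       # escaped delimiter becomes literal
            else:
                body.append(ch + nxt)  # leave other escapes for the RE engine
            i += 2
            continue
        if ch == closer:
            if i != len(rest) - 1:
                raise ValueError("characters after closing delimiter")
            return "".join(body)
        body.append(ch)
        i += 1
    raise ValueError("unterminated regex")


if __name__ == "__main__":
    print(parse_regex_token(r"regex::#A/test/with/one\#and/many/#"))
    print(parse_regex_token(r"regex::/A\/test\/with\/one#and\/many\//"))
    print(parse_regex_token("regex::{ends[) (]}"))
```

With these rules, the two spellings from my example above yield the same RE body, and the {ends[) (]} form needs no escaping at all. Escapes other than an escaped delimiter are passed through untouched, which keeps the \-escape reserved for the sequences in (2c).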