On 07/30/2013 08:03 PM, Alex Rousskov wrote:
> On 07/30/2013 10:04 AM, Amos Jeffries wrote:
>> On 31/07/2013 3:03 a.m., Tsantilas Christos wrote:
>>> 2) If configuration_includes_quoted_values is set to "on" (new style
>>> enabled) and token ends on a ( character, consider it as function name.
>>> If the token is "parameters" return FunctionParameters type else return
>>> FunctionNameUnknown type and parsing fails
>
>
>> This above logic still means that we are requiring all regex patterns to
>> be quoted strings whenever they contain () brackets around the final
>> segment of pattern.
>
> To be more accurate with regard to intent, the above implies that folks
> should not use new syntax (configuration_includes_quoted_values on) with
> regular expressions for now [until we add proper RE support].
>
> Quoting REs using regular strings is too painful and error prone. It is
> not recommended.
>
>
>> => If we are going to make any regex require special quoting, then I
>> think it worthwhile being consistent and making them all require quoting.
>> Are we agreed that making regex "-quoted is a good idea?
>
> We need special syntax for RE IMO. I like what Perl does because it is
> m/simple/ and m@flexible@, but perhaps there is a better way.
>
>
>> The '\' escape is already used internally by regex and layering multiple
>> levels of \-escaping gets nasty quite quickly.

This is true... Maybe we can limit escaping to a set of characters. For
example, if single quotes are used, escape only the ' character, and if
double quotes are used, allow escaping the " % and $ characters.

>
> Agreed.
>
>
>
>>> 3) If configuration_includes_quoted_values is set to "off" (new style
>>> enabled) and token ends on a (, if the token is "parameters" return
>>> FunctionParameters type, else do not end the token on '(' but to the
>>> next whitespace character.
>>> For example for the string "test(test) more" will return as token the
>>> "test(test)"
>>
>> config set to "off" is legacy parser. Typo in your description?

Yes!

>>
>> If quoted-values is OFF. I would be expecting to get the token...
>> "test(test) ... notice the missing end-quote since start-quote should
>> have been ignored and treated as part of the single word token.
>
> I agree with Amos here. Legacy parsers should continue function as
> before quoted strings changes if possible. We do not even need to (and
> perhaps should not) support parameters() syntax in a legacy parser. Is
> it possible to give folks true legacy behavior?

In my example the quotes should not exist. What I want to say in my
example is that for the following line:
   test(test) more
NextToken will return test(test).

However, Amos refers to another case. For the following line:
   "Simple Tokens"
we may want to retrieve the token "Simple
Do we have any example where this is required? (Not for regex; for regex
we have an exception...)

>
>
>>> For the new style if someone want to use '(' character on a regex for
>>> example, he should use quotes:
>>> 'test(.*/)'
>>> "test(.*/)"
>
> Double quotes do not work well because REs use backslashes of their own.
> It becomes too messy as Amos noted above. I think we should not support
> REs in a new parser until we add Perl-like RE quoting (or better).

OK. This is something we can do.
We can add a method ConfigParser::NextWord which will return a raw word
consisting of any non-whitespace characters. Whitespace can be escaped.
Or we can use an On/Off global variable which enables/disables this
behaviour for the next token....

>
>
>>> If someone does not want interpret macros he should use single quotes:
>>> 'test(.*)\/$'
>
>
>> I was going to suggest removing the single-quoted strings support as
>> well to avoid letting people quote randomly with either. But with the
>> regex problems I think we should leave that in as the method to prevent
>> \-unescaping and document that regex patterns be single-quoted strings
>> while other tokens be double-quoted.
>
> I am OK with leaving single-quoted string support in provided we agree
> on how backslashes are handled inside single quotes. If they are ignored
> (as all other special characters), then please note that REs (and other
> single-quoted tokens) will not be able to contain single quotes. If they
> are not ignored, we have a problem with REs as with double quotes...
>

What if we limit the escaping to some special characters? Can this work?

>
> Cheers,
>
> Alex.
>
>
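
To make the limited-escaping idea concrete, here is a minimal sketch. It
is only an illustration: UnescapeQuoted and EscapableChars are
hypothetical names, not existing ConfigParser methods, and the sketch
assumes the tokenizer has already isolated the text between the opening
and closing quotes. Inside single quotes only \' is treated as an escape;
inside double quotes only \" \% and \$ are. Every other backslash is
copied through untouched, so a regex fragment such as \/ keeps its
backslash.

```cpp
#include <string>

// Hypothetical helper (not part of Squid): the characters a backslash
// may escape inside a token delimited by the given quote character.
static std::string EscapableChars(char quote)
{
    return quote == '\'' ? "'" : "\"%$";
}

// Hypothetical helper (not part of Squid): unescapes the text found
// between a pair of quotes. Only backslashes followed by an escapable
// character are consumed; all other backslashes are kept as-is.
std::string UnescapeQuoted(const std::string &body, char quote)
{
    const std::string escapable = EscapableChars(quote);
    std::string out;
    for (std::string::size_type i = 0; i < body.size(); ++i) {
        if (body[i] == '\\' && i + 1 < body.size() &&
                escapable.find(body[i + 1]) != std::string::npos) {
            out += body[i + 1]; // drop the backslash, keep the escaped char
            ++i;
        } else {
            out += body[i];     // e.g. the backslash in \/ passes through
        }
    }
    return out;
}
```

With this rule the single-quoted pattern 'test(.*)\/$' reaches the regex
compiler unchanged, while a single-quoted token can still contain a
quote character by writing \'.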