On 31/07/2013 5:46 a.m., Tsantilas Christos wrote:
On 07/30/2013 08:03 PM, Alex Rousskov wrote:
On 07/30/2013 10:04 AM, Amos Jeffries wrote:
On 31/07/2013 3:03 a.m., Tsantilas Christos wrote:
2) If configuration_includes_quoted_values is set to "on" (new style
enabled) and token ends on a ( character, consider it as function name.
If the token is "parameters" return FunctionParameters type else return
FunctionNameUnknown type and parsing fails
This above logic still means that we are requiring all regex patterns to
be quoted strings whenever they contain () brackets around the final
segment of pattern.
To be more accurate with regard to intent, the above implies that folks
should not use new syntax (configuration_includes_quoted_values on) with
regular expressions for now [until we add proper RE support].
Quoting REs using regular strings is too painful and error prone. It is
not recommended.
=> If we are going to make any regex require special quoting, then I
think it worthwhile being consistent and making them all require quoting.
Are we agreed that making regex "-quoted is a good idea?
We need special syntax for RE IMO. I like what Perl does because it is
m/simple/ and m@flexible@, but perhaps there is a better way.
The '\' escape is already used internally by regex and layering multiple
levels of \-escaping gets nasty quite quickly.
This is true...
Maybe we can limit escaping to a set of characters.
For example if single quotes used, escape only the ' character and if
double quotes used allow escaping " % and $ characters
Agreed.
3) If configuration_includes_quoted_values is set to "off" (new style
enabled) and token ends on a (, if the token is "parameters" return
FunctionParameters type, else do not end the token on '(' but to the
next whitespace character.
For example for the string "test(test) more" will return as token the
"test(test)"
config set to "off" is legacy parser. Typo in your description?
Yes!
If quoted-values is OFF. I would be expecting to get the token...
"test(test) ... notice the missing end-quote since start-quote should
have been ignored and treated as part of the single word token.
I agree with Amos here. Legacy parsers should continue function as
before quoted strings changes if possible. We do not even need to (and
perhaps should not) support parameters() syntax in a legacy parser. Is
it possible to give folks true legacy behavior?
In my example the quotes should not exist. What I want to say in my
example is that for the following line:
test(test) more
The NextToken will return test(test).
However Amos refers to an other case. For the following line:
"Simple Tokens"
we may want to retrieve the token "Simple
Do we have any example where this is required? (Not for regex, for regex
we have an exception...)
In most places where we accept header values in the legacy parser.
I have seen a few people who have things like "quoted string" for auth
realm, and since the old squid.conf sent the quotes out as-is they ended
up with helpers and browsers security records using the quoted form.
There might be any amount of unknown pain ahead if we suddenly start
stripping quotes off strings in the backward compatible parse mode.
For the new style if someone want to use '(' character on a regex for
example, he should use quotes:
'test(.*/)'
"test(.*/)"
Double quotes do not work well because REs use backslashes of their own.
It becomes too messy as Amos noted above. I think we should not support
REs in a new parser until we add Perl-like RE quoting (or better).
OK. This is something we can do it.
We can add a method ConfigParser::NextWord which will return a raw word
which will consist by any non whitespace character. Whitespaces can be
escaped.
Or use an On/Off global variable which enables/disables for the next
token this behaviour....
Hmm. Might work.
I have in mind to shuffle the ConfigParser methods into groups. One for
the parser mechanics and one group for the type-specific GetX() methods.
Some of those methods are provided in ConfigParser already, the rest are
mixed up with src/Parsing.cc.
A ConfigParser::RegexPattern() method should be perfectly fine to add.
Amos