I would concur with Alex that (4) is preferable: it does not break old
configurations, reuses existing mechanisms, and can be applied only
when/where required. I have one more option for your consideration:
escaping with a backtick (e.g., `n) instead of a backslash. This
approach is used, e.g., in PowerShell.
5a. Recognize just `n escape sequence in squid.conf regexes.
5b. Recognize all '`'-based escape sequences in squid.conf regexes.
Pros: Easier upgrade: backticks are rare in regular expressions (compared
to '%' or '/'), so old regexes probably need no conversion at all.
Pros: Simplicity: no double-escaping is required (as in (1b)).
Cons: Though it should be straightforward to specify common escape
sequences, such as `n, `r or `t, we still need to devise a way of
specifying an arbitrary character (i.e., by its code).
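To make 5b a bit more concrete, here is a minimal Python sketch of what such an expansion step might look like. The function name expand_backtick_escapes and the lookup table are hypothetical, not existing Squid code; only `n, `r and `t come from the text above, and rejecting unknown sequences is just my assumption about how typo detection could work.

```python
# Hypothetical escape table for option 5b; only `n, `r and `t are
# taken from the discussion, everything else here is illustrative.
BACKTICK_ESCAPES = {"n": "\n", "r": "\r", "t": "\t"}


def expand_backtick_escapes(pattern: str) -> str:
    """Expand `n, `r and `t in pattern; reject any other backtick sequence."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch != "`":
            # Everything else, including regex backslashes, is copied as is.
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(pattern):
            raise ValueError("dangling backtick at end of pattern")
        code = pattern[i + 1]
        if code not in BACKTICK_ESCAPES:
            # Assumed behavior: unknown sequences are configuration errors.
            raise ValueError("unsupported escape sequence: `" + code)
        out.append(BACKTICK_ESCAPES[code])
        i += 2
    return "".join(out)
```

Regex backslashes (e.g., \s or \.) pass through untouched, which is the "no double-escaping" point. The sketch deliberately does not address literal backticks or arbitrary character codes.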
HTH,
Eduard.
On 20.01.2022 00:32, Alex Rousskov wrote:
Here is a fairly representative sample:
1a. Recognize just \n escape sequence in squid.conf regexes
Pros: Simple.
Cons: Converting old regexes[1] requires careful checking[2].
Cons: Cannot detect typos in escape sequences. \r is accepted.
Cons: Cannot address other, similar use cases (e.g., ASCII CR).
1b. Recognize all C escape sequences in squid.conf regexes
Pros: Can detect typos -- unsupported escape sequences.
Cons: Poor readability: Double-escaping of all for-regex backslashes!
Cons: Converting old regexes requires non-trivial automation.
2a. Recognize %byte{n} logformat-like sequence in squid.conf regexes
Pros: Simple.
Cons: Converting old regexes[1] requires careful checking[3].
Cons: Cannot detect typos in logformat-like sequences.
Cons: Does not support other advanced use cases (e.g., %tr).
2b. Recognize %byte{n} and logformat sequences in squid.conf regexes
Pros: Can detect typos -- unsupported logformat sequences.
Cons: The need to escape % in regexes will surprise admins.
Cons: Converting old regexes requires (simple) automation.
3. Use composition to combine regexes and some special strings:
    regex1 + "\n" + regex2
or
    regex1 + %byte{10} + regex2
Pros: Old regexes can be safely used without any conversions.
Cons: Requires new, complex composition expressions/syntax.
Cons: A bit difficult to read.
Cons: Requires a lot of development.
4. Use 2b but only when regex is given to a special function:
    substitute_logformat_codes(regex)
Pros: Old regexes can be safely used without any conversions.
Pros: New regexes do not need to escape % (by default).
Pros: Extendable to old regex configuration contexts.
Pros: Extendable to non-regex configuration contexts.
Pros: Reusing the existing parameters(...)-like call syntax.
Cons: A bit more difficult to read than 1a or 2a.
Cons: Duplicates "quoted string" approach in some directives[4].
Cons: Requires arguing about the new function name:-).
Given all the pros and cons, I think we should use option 4 above.
Do you see any better options?
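As a rough illustration of option 4, the following Python sketch expands %-sequences only when a regex is explicitly passed through the function, so regexes configured the old way never see this code. It is a sketch under assumptions, not Squid code: only the name substitute_logformat_codes and the %byte{n} syntax come from the list above; treating %% as a literal percent sign, rejecting every other %-sequence, limiting n to 0..255, and using Python's re module in place of Squid's regex library are all assumptions made for the example.

```python
import re

# Matches "%" optionally followed by "byte{N}" or a second "%".
# A bare "%" (neither group set) is treated as an unsupported sequence.
_TOKEN = re.compile(r"%(?:byte\{(?P<byte>\d+)\}|(?P<pct>%))?")


def substitute_logformat_codes(regex: str) -> str:
    """Expand %byte{n} and %% in regex; reject other %-sequences."""

    def expand(match):
        if match.group("byte") is not None:
            value = int(match.group("byte"))
            if value > 255:
                raise ValueError("byte value out of range: %d" % value)
            # Escape the character so that it is always matched literally,
            # even if it happens to be a regex metacharacter such as ".".
            return re.escape(chr(value))
        if match.group("pct") is not None:
            return "%"  # assumed: "%%" stands for a literal percent sign
        raise ValueError("unsupported %%-sequence at offset %d" % match.start())

    return _TOKEN.sub(expand, regex)


# Opt-in use: only this regex goes through the substitution step.
expanded = substitute_logformat_codes(r"^GET .*%byte{10}Host:")
```

A regex that is not wrapped in the call keeps its current meaning, so an existing pattern containing a bare % needs no conversion.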