Hi all,

I've been looking into how Augeas handles escapes within regexps. I think I've come across a significant set of problems, both with the bundled lenses and the way Augeas does its escaping and unescaping.

I first noticed the problem when attempting to use a slash within a bracket expression in a regexp:

    /[\/]/

This should match only the slash, but in Augeas it also matches a backslash. POSIX regular expressions place no significance on backslashes within bracket expressions, and since Augeas does not understand the \/ escape at all, both characters are added to the character class.

Further investigation showed that many of the bundled lenses assume that escape sequences like \. and \- will be unescaped. For instance, in Rx we have:

    let email_addr = /[A-Za-z0-9_\+\.-]+@[A-Za-z0-9_\.-]+/

This allows backslashes in email addresses, when the intent was clearly only to escape the regexp metacharacters.

I have split the fixes for all this into 6 patches.

Patches 1 to 3 fix the Cgconfig, Cron and FAI_DiskConfig lenses respectively. Here the use of backslashes was actually producing incorrect ranges in character classes. For example, in Cgconfig:

    let id = /[a-zA-Z0-9_\-\/\.]+/

contains a range from backslash to backslash, and the hyphen never even makes it into the character class.

Patch 4 goes through all the remaining cases of escaping inside bracket expressions, with the exception of \/ and \\. This fixes things like the email_addr regexp above.

Patch 5 fixes the escape() and unescape() functions in internal.c. The key here is that the C-style escapes:

    \a \b \t \n \v \f \r

are common to both strings and regexps, but other escapes are not. These "extra" escapes are passed to these functions via an extra parameter. For strings we allow the extra escapes:

    \" \\

as before. For regexps we use:

    \/ \\

Patch 6 removes \\ from this list of "extra" escapes for regexps. The idea is that this removes the need for a quadruple backslash in certain cases.
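
As a rough illustration of the scheme in patches 5 and 6, here is a small C sketch of an unescape routine that takes the list of "extra" escapes as a parameter. It is not the code from internal.c: the names sketch_unescape, sketch_check and sketch_demo, the simplified signature, and the choice to copy unrecognised escapes through unchanged are all assumptions made for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper for illustration only; not the unescape() from
 * internal.c. `extra` lists the characters that may follow a backslash
 * in addition to the C-style escapes. The caller frees the result. */
static char *sketch_unescape(const char *s, const char *extra)
{
    static const char names[] = "abtnvfr";
    static const char chars[] = "\a\b\t\n\v\f\r";
    char *result = malloc(strlen(s) + 1); /* output is never longer than input */
    char *t = result;

    if (result == NULL)
        return NULL;

    while (*s != '\0') {
        if (s[0] == '\\' && s[1] != '\0') {
            const char *n = strchr(names, s[1]);
            if (n != NULL) {
                *t++ = chars[n - names];  /* C-style escape: control character */
            } else if (strchr(extra, s[1]) != NULL) {
                *t++ = s[1];              /* "extra" escape: the character itself */
            } else {
                *t++ = s[0];              /* assumption of this sketch: unknown */
                *t++ = s[1];              /* escapes are copied through as-is   */
            }
            s += 2;
        } else {
            *t++ = *s++;
        }
    }
    *t = '\0';
    return result;
}

/* Unescapes `in` with the given extras and compares against `want`. */
static void sketch_check(const char *in, const char *extra, const char *want)
{
    char *got = sketch_unescape(in, extra);
    assert(got != NULL && strcmp(got, want) == 0);
    free(got);
}

/* The comments show the raw text, without C string-literal quoting. */
static void sketch_demo(void)
{
    sketch_check("a\\tb", "\"\\", "a\tb");           /* a\tb    -> a TAB b       */
    sketch_check("\\\"x\\\\", "\"\\", "\"x\\");      /* \"x\\   -> "x\  (string) */
    sketch_check("[^\\/]+", "/\\", "[^/]+");         /* [^\/]+  -> [^/]+         */
    sketch_check("\\\\\\\\.", "/\\", "\\\\.");       /* \\\\.   -> \\. (patch 5) */
    sketch_check("\\\\.", "/", "\\\\.");             /* \\.     -> \\. (patch 6) */
}
```

The last two checks show the point of patch 6 under these assumptions: with \\ among the extras, the regexp text has to carry four backslashes for two to survive unescaping, whereas without it the two-backslash form is left untouched.
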
To match "backslash followed by any character", for instance, one previously had to use:

    /\\\\./

Now it is sufficient to use:

    /\\./

However, this is a somewhat backward-incompatible change: lenses that use the quadruple backslash need to be updated. For this reason I have kept this patch separate, since I am not sure whether such a change would be acceptable.

I believe this set of patches greatly simplifies the way escapes work in Augeas. The rules can be summarized as follows:

* \a, \b, \t, \n, \v, \f, \r are always treated as C-style escapes, and are replaced with their respective control characters.
* In strings, \" and \\ can be used to represent " and \ respectively.
* In regexps, \/ can be used to represent /. If patch 6 is omitted, \\ can be used to represent \ as well.

Questions or comments regarding these patches would be greatly appreciated.

- Michael

Michael Chapman (6):
  Cgconfig: Fix parsing of group names
  Cron: Fix parsing of numeric fields
  FAI_DiskConfig: Fix invalid escape sequence \s
  Fix escape sequences in bracket expressions
  Fix regular expression escaping
  Don't require backslashes to be escaped in regexps

 lenses/aliases.aug               |  2 +-
 lenses/cgconfig.aug              |  2 +-
 lenses/cgrules.aug               |  2 +-
 lenses/cron.aug                  |  2 +-
 lenses/darkice.aug               |  2 +-
 lenses/debctrl.aug               |  6 +++---
 lenses/dhclient.aug              |  2 +-
 lenses/dhcpd.aug                 |  2 +-
 lenses/dnsmasq.aug               |  2 +-
 lenses/exports.aug               |  2 +-
 lenses/fai_diskconfig.aug        |  6 +++---
 lenses/gdm.aug                   |  2 +-
 lenses/grub.aug                  |  2 +-
 lenses/httpd.aug                 | 14 +++++++-------
 lenses/inetd.aug                 |  6 +++---
 lenses/inifile.aug               |  2 +-
 lenses/interfaces.aug            |  2 +-
 lenses/iptables.aug              |  2 +-
 lenses/keepalived.aug            |  2 +-
 lenses/modprobe.aug              |  6 +++---
 lenses/openvpn.aug               |  2 +-
 lenses/pg_hba.aug                |  2 +-
 lenses/phpvars.aug               |  2 +-
 lenses/properties.aug            |  2 +-
 lenses/rx.aug                    |  2 +-
 lenses/shellvars.aug             |  4 ++--
 lenses/shellvars_list.aug        |  4 ++--
 lenses/solaris_system.aug        |  2 +-
 lenses/spacevars.aug             |  2 +-
 lenses/sudoers.aug               | 22 +++++++++++-----------
 lenses/sysconfig.aug             |  2 +-
 lenses/syslog.aug                |  4 ++--
 lenses/tests/test_cgconfig.aug   |  4 ++--
 lenses/wine.aug                  | 10 +++++-----
 lenses/xml.aug                   |  4 ++--
 src/augeas.c                     |  2 +-
 src/get.c                        |  2 +-
 src/internal.c                   | 26 ++++++++++++++++++--------
 src/internal.h                   |  8 ++++++--
 src/lens.c                       | 14 +++++++-------
 src/lexer.l                      |  4 ++--
 src/regexp.c                     |  4 ++--
 tests/modules/pass_cont_line.aug |  2 +-
 43 files changed, 106 insertions(+), 92 deletions(-)

--
1.7.6.4