On 1 July 2012 22:56, Lionel Cons <lionelcons1...@googlemail.com> wrote:
> On 27 June 2012 19:24, Glenn Fowler <g...@research.att.com> wrote:
>>
>> On Wed, 27 Jun 2012 18:15:06 +0200 Roland Mainz wrote:
>>> On Wed, Jun 27, 2012 at 6:04 PM, Glenn Fowler <g...@research.att.com> wrote:
>>> > On Wed, 27 Jun 2012 17:43:06 +0200 Roland Mainz wrote:
>>> >> How can I quote '-' in a ~(Ex)-style pattern [...] that it exactly
>>> >> matches a '-' latter ?
>>> >> I've tried the following pattern but the result is wrong (it should
>>> >> match "hello-world" and "foo-bar"):
>>> >> -- snip --
>>> >> $ ~/bin/ksh -c 's="hello-world foo-bar" ;
>>> >> dummy="${s//~(Ex)([_\-[:alnum:]]+)/D}" ; print -v .sh.match'
>>> >> (
>>> >> (
>>> >> hello
>>> >> world
>>> >> foo
>>> >> bar
>>> >> )
>>> >> (
>>> >> hello
>>> >> world
>>> >> foo
>>> >> bar
>>> >> )
>>> >> )
>>> >> -- snip --
>>> >> I tried to quote the '\' with a 2nd '\' without success (e.g. we get
>>> >> the same wrong output/matches)
>>> >> -- snip --
>>> >> $ ~/bin/ksh -c 's="hello-world foo-bar" ;
>>> >> dummy="${s//~(Ex)([_\\-[:alnum:]]+)/D}" ; print -v .sh.match'
>>> >> ...
>>> >> -- snip --
>>> >
>>> >> Looking via dbx/gdb at the strings passed to the regex engine it looks
>>> >> like ksh93 is either passing no '\' to |_ast_regcomp()| (in the case
>>> >> of "~(Ex)([_\-[:alnum:]]+)") or it passes two '\' to |_ast_regcomp()|
>>> >> (in the case of "~(Ex)([_\\-[:alnum:]]+)") ... it looks like a bug in
>>> >> the ksh93 quoting mechanism for ~(E) patterns...
>>> >> ;-(
>>> >
>>> >> The only working workaround I found is to use \x<hex> to avoid having
>>> >> to use \ to quote the '-' (the output below is IMO the expected one
>>> >> for "${s//~(Ex)([_\-[:alnum:]]+)/D}"):
>>> >> -- snip --
>>> >> $ ~/bin/ksh -c 's="hello-world foo-bar" ;
>>> >> dummy="${s//~(Ex)([_\x2d[:alnum:]]+)/D}" ; print -v .sh.match'
>>> >> (
>>> >> (
>>> >> hello-world
>>> >> foo-bar
>>> >> )
>>> >> (
>>> >> hello-world
>>> >> foo-bar
>>> >> )
>>> >> )
>>> >> -- snip --
>>> >
>>> > its regex syntax and doesn't need a quote
>>> > at http://pubs.opengroup.org/onlinepubs/9699919799/ set 9.3.5 item 7
>>> > from that it looks like
>>> > * if you want literal ']' use one of
>>> > []...]
>>> > [^]...]
>>
>>> I know...
>>
>>> > * if you want literal '-' place it last
>>> > [...-]
>>
>>> ... I didn't know that... ;-/
>>> Thanks... :-)
>>
>>> ... but could you still check why ksh93 "swallows" the single '\' but
>>> passes two '\' as "\\" to |_ast_regcomp()|, please ? Is this intended
>>> or somehow a bug or sideeffect ?
>>
>> its a side effect or the conflict between ksh and regex quoting
>> if a side has to win it will be ksh in that context
>> dgk can give more detail on how tricky that part is because
>> ksh can't be expected to know all of the intricacies of each ~(...) RE syntax
>> at some point when an RE gets complex enough it will have to be placed in a var
>> then referencing it as $the_re is guaranteed to get sh and RE quoting right
>> (or at least pass whatever quoting is present down to regex)
>
> I don't think this is going to be useful. Either ksh can be expected
> to know all of the egrep syntax or knows nothing and passes the
> pattern through unscathed after user has provided sufficient \ escapes
> to prevent clashes with ksh syntax.
> The current situation of "guessing" which side - ksh or ere - will win
> is NOT acceptable.
>
> Try to see it from the point of a POSIX standardisation committee or a
> code generator which will generate ksh93 code. The POSIX committee
> won't accept a fuzzy situation as it is right now and a code generator
> can't be expected to do a trial&error procedure like it is required
> right now until a pattern fits the needs of ksh's guesswork.
>
> if the situation can't be improved then I'd suggest to remove the
> whole ~(E) feature. While I see the very usefulness the current
> implementation is completely unacceptable.

So what will be done here? If nothing can be done I'll post a patch to
wrap ~(E) support in SHOPT_EXPERIMENTAL_PATTERN_MATCHING so we can
disable this on production machines.

Lionel
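
For reference, the two suggestions quoted above (place the '-' last in the
bracket expression, and keep the RE in a variable) can be tried outside
ksh93. This is only a minimal sketch under stated assumptions, not the
ksh93 ~(E) code path: grep -E stands in as the ERE engine, the -o option
is a common extension rather than something POSIX requires, and the
variable names s, re and matches are illustrative.

```shell
# Sample input taken from the thread.
s='hello-world foo-bar'

# '-' is placed last in the bracket expression, so it is a literal
# hyphen and needs no backslash. Single quotes keep the shell from
# interpreting anything inside the pattern.
re='[_[:alnum:]-]+'

# Assumption: grep -E is used as a stand-in ERE engine; -o prints each
# matched substring on its own line (an extension, not POSIX).
matches=$(printf '%s\n' "$s" | grep -E -o "$re")

printf '%s\n' "$matches"
# prints:
# hello-world
# foo-bar
```

Because the pattern only reaches the regex engine through "$re", the
shell's own quoting rules never get a chance to rewrite it, which is the
point of the "place it in a var" advice.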