Steffen Nurpmeso wrote in
 <20240916150420.FRIEjRZF@steffen%sdaoden.eu>:
 ...
 |Having said that, Arnold Robbins of GNU awk quite enthusiastically
 |posted somewhere (i cannot find nowhere where he did, actually)
 |that he is about to change the implementation of its (regex of
 |GNUlib) regular expression engine (alternatively) to a newly
 |written one [2], ie, from Mike Haertel, famous for his
 |implementation of GNU grep.  He has opened an issue for itself in
 |order to support "non-greedy repetition operators" already [3].
 |
 |  [2] https://github.com/mikehaertel/minrx
 |  [3] https://github.com/mikehaertel/minrx/issues/12

By the way, in his ALGORITHM.txt he says

  ...
  There is one potential refinement in the translation scheme that has
  not yet been discussed that pertains to the associativity of
  concatenation.  Consider a regular expression of the form ABC, where
  A, B, and C are subpatterns all of which can match material of
  variable length.  POSIX requires finding the leftmost longest overall
  solution to ABC.  But if for a particular search string there are
  multiple solutions for the lengths of A, B, and C that yield the same
  leftmost-longest overall solution for ABC, which should be chosen?
  Either we can try to maximize AB at the expense of shortening C, or
  else we can try to maximize A at the expense of shortening BC.

  It turns out the POSIX standard is ambiguous about this situation.
  The grammar in the standard for concatenated regular expressions is a
  left-associative grammar.  However, there is an example in the
  rationale (not officially part of the standard...) that assumes
  concatenation is right-associative.

  The SNFA approach can implement either option for the associativity
  of concatenation.[.]
  ...
  MinRX currently implements left-associative concatenation, since it
  is cheaper at both runtime and compile time, and is also consistent
  with the behaviour of the repeated self-concatenation in duplication
  operators.  If there is popular demand for right-associative
  concatenation, or if a future edition of the standard explicitly
  specifies right-associative concatenation, that can be implemented in
  MinRX with a modest code change.

So there go the specialists, and they have a problem..

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)
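
P.S. To make the associativity question from the ALGORITHM.txt excerpt
concrete, here is a small Python sketch.  It is not MinRX code and not
taken from ALGORITHM.txt: the pattern (a|ab)(c|bcd)(d*), the input
"abcd", and the helper names splits, left_assoc and right_assoc are all
assumptions made for illustration.  It simply brute-forces every way to
divide an overall match among A, B and C, then picks one division under
each reading.  It also assumes the overall match is the whole input.

```python
import re


def splits(parts, s):
    """All ways to divide s into three pieces matching A, B and C.

    Brute force over both cut positions; re.fullmatch only checks
    whether a subpattern can match a given piece exactly.
    """
    a, b, c = parts
    n = len(s)
    return [
        (s[:i], s[i:j], s[j:])
        for i in range(n + 1)
        for j in range(i, n + 1)
        if re.fullmatch(a, s[:i])
        and re.fullmatch(b, s[i:j])
        and re.fullmatch(c, s[j:])
    ]


def left_assoc(cands):
    # (AB)C: make AB as long as possible first, then A within AB.
    return max(cands, key=lambda t: (len(t[0]) + len(t[1]), len(t[0])))


def right_assoc(cands):
    # A(BC): make A as long as possible first, then B within BC.
    return max(cands, key=lambda t: (len(t[0]), len(t[1])))


if __name__ == "__main__":
    cands = splits(("a|ab", "c|bcd", "d*"), "abcd")
    print("candidates:", cands)
    print("left-assoc :", left_assoc(cands))
    print("right-assoc:", right_assoc(cands))
```

With these assumed inputs the left-associative reading gives A="a",
B="bcd", C="", while the right-associative one gives A="ab", B="c",
C="d"; the overall match "abcd" is the same either way.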