Steffen Nurpmeso wrote, on 20 Sep 2024:
>
> Hello, for your possible interest.
>
> Steffen Nurpmeso via austin-group-l at The Open Group wrote in
>  <20240916152314.Do4nw6pA@steffen%sdaoden.eu>:
[...]
>  | It turns out the POSIX standard is ambiguous about this situation.
>  | The grammar in the standard for concatenated regular expressions is a
>  | left-associative grammar. However, there is an example in the rationale
>  | (not officially part of the standard...) that assumes concatenation is
>  | is right-associative.
>
> I was about to open a clarification issue, but could not find the
> quoted rationale, yet i got my hands on Mike Haertel's email
> address and thought i ask him.
> He was so nice to answer and he says that the above arose from
> memory from something read in the past, and that he is unable to
> find the exact quote now.
> So, he is about to search some more, and says he will revise the
> above paragraph in case he cannot find it.  Since i track his
> repository either a clarification issue will be opened, or not.

I assume the example Mike refers to above is this in A.9.1:

    For example, in the ERE "(a.*b)(a.*b)", the two identical
    subexpressions would match four and six characters, respectively,
    of accbaccccb.

I think this is required by the normative text (elsewhere than the
grammar), not assumed by the example as Mike says.  The relevant
text is in the definition of "matched" in 9.1:

    Consistent with the whole match being the longest of the leftmost
    matches, each subpattern, from left to right, shall match the
    longest possible string.

and it goes on to give an example:

    For example, matching the BRE "\(.*\).*" against "abcdef", the
    subexpression "(\1)" is "abcdef"

-- 
Geoff Clare <g.cl...@opengroup.org>
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England
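
For readers who want to check the two quoted examples by hand, here is a
minimal brute-force sketch of the rule quoted from 9.1.  It is only an
illustration: the function names candidates() and posix_style_match() are
made up for this note, Python's re module is used solely to test each
individual subpattern against a candidate substring (it is not itself a
leftmost-longest matcher), and the subpatterns are written in Python
syntax rather than BRE/ERE syntax.  It does not reflect how any regcomp()
implementation works.

```python
import re


def candidates(subpatterns, text, pos):
    """Yield every tuple of end offsets at which the subpatterns can
    match one after another, starting at offset pos."""
    if not subpatterns:
        yield ()
        return
    first = re.compile(subpatterns[0])
    for end in range(pos, len(text) + 1):
        # The subpattern must match exactly text[pos:end].
        if first.fullmatch(text, pos, end):
            for rest in candidates(subpatterns[1:], text, end):
                yield (end,) + rest


def posix_style_match(subpatterns, text):
    """Return the substring matched by each subpattern, or None.

    Selection order: leftmost start, then longest whole match, then
    the longest possible string for each subpattern, left to right.
    Expects at least one subpattern.
    """
    for start in range(len(text) + 1):
        found = list(candidates(subpatterns, text, start))
        if found:
            # Key: end of whole match first, then each subpattern's end.
            ends = max(found, key=lambda e: (e[-1],) + e)
            bounds = (start,) + ends
            return [text[a:b] for a, b in zip(bounds, bounds[1:])]
    return None


# ERE "(a.*b)(a.*b)" against accbaccccb, as in A.9.1
print(posix_style_match(["a.*b", "a.*b"], "accbaccccb"))  # ['accb', 'accccb']

# BRE "\(.*\).*" against abcdef, as in 9.1: \1 takes everything
print(posix_style_match([".*", ".*"], "abcdef"))  # ['abcdef', '']
```

In the A.9.1 example the only position where one subexpression can end
in "b" and the next begin with "a" is after the fourth character, so the
four/six split follows from the whole match having to cover all ten
characters.  The 9.1 example is the one where the left-to-right rule
actually decides the outcome, since any split of "abcdef" would
otherwise be possible.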