On Wed, Apr 23, 2014 at 06:15:44PM -0700, Asmus Freytag wrote:
> On 4/23/2014 4:41 PM, Ilya Zakharevich wrote:
> >>> GREED) Given any close-delimiter marked as “non-matching”, its
> >>> pre-context does not contain any open-delimiter which could
> >>> match it.
> >>>
> >>> Here pre-context of a position is a concatenation of substrings of
> >>> the initial string:
> >>>   • Take the most deeply nested “matched pair” containing the position
> >>>     (if none, the whole string);
> >>>   • take the part of the string inside this pair AND before the
> >>>     position;
> >>>   • remove all “matched” pairs completely contained inside this
> >>>     substring together with what they enclose.
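The three steps of the quoted pre-context definition can be sketched in Python. This is only an illustration under stated assumptions: the matched pairs are taken as already known and properly nested, the position is assumed not to belong to any pair, and the names `pre_context`, `greed_holds`, and `ASCII_PAIRS` are hypothetical. In particular, the ASCII table merely stands in for the MATCH rule, which is defined through Unicode properties.

```python
# Hypothetical stand-in for the MATCH rule: the real rule uses Unicode
# properties; this ASCII table is only an assumption for illustration.
ASCII_PAIRS = {")": "(", "]": "[", "}": "{"}


def pre_context(text, pairs, pos):
    """Return the pre-context of index `pos` in `text`.

    `pairs` lists (open_index, close_index) for every matched pair and is
    assumed to be properly nested; `pos` is assumed not to belong to a pair.
    """
    # With proper nesting, the innermost pair containing `pos` is the one
    # with the largest open index; with no such pair, start at index 0.
    start = max((o for o, c in pairs if o < pos < c), default=-1) + 1
    close_of = dict(pairs)
    kept = []
    i = start
    while i < pos:
        close = close_of.get(i)
        if close is not None and close < pos:
            i = close + 1  # drop a matched pair together with what it encloses
        else:
            kept.append(text[i])
            i += 1
    return "".join(kept)


def greed_holds(text, pairs, pos):
    """Check GREED for a close-delimiter at `pos` marked as non-matching."""
    opener = ASCII_PAIRS[text[pos]]
    return opener not in pre_context(text, pairs, pos)
```

For the string `{a(b)c)d}` with matched pairs at indices (0, 8) and (2, 4), the pre-context of the `)` at index 6 is `ac`, so GREED holds for it. For `(a)` with no pairs marked as matched, the pre-context of the `)` is `(a`, so marking that `)` as non-matching would violate GREED.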
> >>Can you explain why, if you make "pre-context" simply the part of the
> >>whole string that precedes the unmatched close-delimiter, the words
> >>"which could match it" are insufficient?
> >Aha, this means that my description is INCOMPLETE: you got a wrong
> >impression what “match” means! Everywhere, this word means exactly
> >the same as in the MATCH rule: that Unicode codepoints match following
> >Unicode properties.
> >This is non-recursive definition. All rules are independent.
> That explains why you repeat most of the other constraints in your
> pre-context.

Frankly speaking, I do not see any such repetition.

> For a static definition, would it have been simpler to break the
> definition into two - say a "tentative parsing" (all conditions but
> greed) and "selected parsing", which then could be defined as the
> parsing that starts closest to the left.

I do not see how: to know whether a closing delimiter may be matched
or not, it is not enough to know a “tentative” parsing of what precedes
it; one must know the **actual** parsing. Eventually, you would end up
with either a recursive definition, or a definition of a “process” of
parsing.

Anyway, I’ve written my share of definitions which combine “tentative”
stuff with a “best choice” among the tentative variants. One ends up
with monsters like
  http://perldoc.perl.org/perlre.html#Combining-RE-Pieces
(and, Eli, the fact that I wrote it does not imply that I must like it
:-[ ). In the case of Perl RExes, there is no alternative.

IMO, if there IS a way to define what a “standalone” GOOD THING is, it
is __much__ better than the “best of many” way.
Defining it as “the best of potentially good things” requires the
reader to imagine first ALL the potentially good things; only when this
(otherwise not very useful) universe has settled down in the reader’s
mind would they be able to pick out the best guy…

Ilya

