Moritz Lenz wrote:
On 27.04.2010 06:31, Stéphane Payrard wrote:
When analysing a sample parse tree, I noticed that it is
cluttered by optional subrules that reduce
to zero-length parse subtrees. That is, rules with a '?'
quantifier matching zero times.
Currently the ? quantifier is just syntactic sugar for the general
quantifier ** 0..1.
Would that behave the same?
Are you proposing that all quantifiers that match zero times should not
appear in the parse tree? Then <twigil>+ could either lead to no capture
at all, a single value or a list - not very nice.
Or rather special-casing the 0..1 quantifier?
There's also another problem with your approach: If you have
in your regex, and it matches the empty string, it is still a successful
match - yet with your proposal, it's impossible to distinguish a
successful zero-width match from an unsuccessful match (which can happen
Seems to me that it's possible to have your cake and eat it. If you look
at this as an optimization problem, and note that it's expensive to have
these array objects littered around, then perhaps the answer is to use a
new class (that still "does Array") whose length is guaranteed to be
either zero or one. That's probably cheaper than a full (zero-one-many)
array object, yet the user doesn't necessarily see any difference.
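To make that concrete, here is a rough sketch of such a container. It is written in Python rather than Perl 6 purely for illustration, and the names ZeroOrOne and _MISSING are made up for this example; nothing here reflects how any existing implementation stores its captures.

```python
from collections.abc import Sequence

# Sentinel meaning "nothing stored", so that None or "" can still be a value.
_MISSING = object()


class ZeroOrOne(Sequence):
    """Read-only sequence whose length is always 0 or 1 (hypothetical name)."""

    __slots__ = ("_item",)  # a single slot instead of a growable buffer

    def __init__(self, item=_MISSING):
        self._item = item

    def __len__(self):
        return 0 if self._item is _MISSING else 1

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slices fall back to an ordinary tuple; simple, not optimized.
            return tuple(self)[index]
        if self._item is not _MISSING and index in (0, -1):
            return self._item
        raise IndexError("ZeroOrOne index out of range")


absent = ZeroOrOne()        # the optional subrule did not match
present = ZeroOrOne("!")    # it matched once
print(len(absent), len(present), list(present))
```

Code that expects a list keeps working (len, indexing, iteration, truth testing), while the object itself never holds more than one slot. In this sketch a stored empty string still has length one, so an empty capture is not confused with an absent one.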
Yet, at the same time, Stéphane's usability issue can be addressed by
allowing that the node associated with the ? quantifier in the parse
tree does, in fact, have some additional capabilities beyond a basic
**0..1 quantifier. I see no reason to be dogmatically bound to the idea
that ? is nothing more than syntactic sugar.