Jonathan Scott Duff wrote:
On Tue, May 24, 2005 at 11:24:50PM -0400, Jeff 'japhy' Pinyan wrote:

I wish <!prop X> was allowed. I don't see why <!...> has to be confined to zero-width assertions.


I don't either actually. One thing that occurred to me while responding
to your original email was that <!foo> might have slightly wrong
huffmanization.  Is zero-width the common case?  If not, we could use
character doubling for emphasis:  <!foo> consumes, while <!!foo> is
zero-width.
But that's just a random rambling on my part. I trust @Larry has put
way more thought into it than I.  :-)

But what would a consuming <!...> mean?  As it can have no internal
backtracking points (it only has them if it fails), it would match (and
consume) the whole rest of the string, then if there were any more to
the pattern, would immediately backtrack back out left of itself.  Thus
it would be semantically identical to the zero-width version.  So
zero-width is really the only possibility for <!...>.

Now <prop X> is a character class just like <+digit>, and so under the
new character class syntax it would probably be written <+prop X> or,
if the white space is a problem, maybe <+prop:X> (or <+prop(X)>, as
Larry gets the colon :-), though this is a pretty adverbial case, so ':'
may be okay), with the complemented case being <-prop:X>.  Actually the
'prop' may be unnecessary altogether: we know we're in the character
class sub-language because we saw the '<+', '<-' or '<[', so we could
just define the various Unicode character property codes (i.e., Lu, Ll,
Zs, etc.) as pre-defined character class names just like 'digit' or
'letter'.
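
To make the "property codes as class names" idea concrete, here is a
rough sketch in Python (not Perl 6, and not anything Larry has
specified).  It treats a Unicode general category code such as Lu, Ll
or Zs as a class name, with a flag for the complemented form.  The
helper names in_class and match_class are made up for illustration.

```python
import unicodedata


def in_class(ch, name, complement=False):
    """Return True if ch is in the class named by a general category code.

    name is a two-letter Unicode general category such as 'Lu', 'Ll' or
    'Zs'.  With complement=True the test is inverted, roughly what
    <-Lu> would mean next to <+Lu> in the proposed syntax.
    """
    member = unicodedata.category(ch) == name
    return member != complement


def match_class(text, pos, name, complement=False):
    """Consume one character at pos if it is in the class.

    Returns the new position on success and None on failure.  Note that
    the complemented class still needs a character to consume, so it
    fails at the end of the string.
    """
    if pos < len(text) and in_class(text[pos], name, complement):
        return pos + 1
    return None
```

For instance, match_class("Ab", 0, "Lu") consumes the "A", while the
same call at position 1 fails because "b" is Ll, not Lu.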

BTW, as a matter of terminology, <-digit> should probably be called the
complement of <+digit> rather than the negation, so as not to confuse
it with the <!...> negative zero-width assertion case.
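
The complement/negation distinction shows up in any regex engine with
lookahead.  As an approximate analogy only, using Python's re module
rather than Perl 6 rules: \D stands in for <-digit> and (?!\d) for
<!digit>.  The helper end_of_match is a made-up name for illustration.

```python
import re

# Complemented class: must consume exactly one non-digit character.
complement = re.compile(r"\D")

# Negative assertion: succeeds without consuming if no digit is next.
negation = re.compile(r"(?!\d)")


def end_of_match(pattern, text):
    """Return where a match anchored at position 0 ends, or None."""
    m = pattern.match(text)
    return None if m is None else m.end()
```

On "a1" the complement ends at position 1 and the assertion at
position 0.  On the empty string the complement fails outright, since
there is nothing to consume, while the assertion still succeeds.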

--
[EMAIL PROTECTED]