Re: Perl 5's non-greedy matching can be TOO greedy!

2000-12-15 Jarkko Hietaniemi

Please give it a rest.  I think everybody got it by now.  Everybody
understands how the current implementation works and what the
semantics are, and you disagree with the current semantics.  I think
that's the end of the story, since changing the current default semantics
is simply not an option.  We can't break all the existing programs that
depend on the current stinginess semantics, period.  Now move on.

-- 
$jhi++; # http://www.iki.fi/jhi/
# There is this special biologist word we use for 'stable'.
# It is 'dead'. -- Jack Cohen



Re: Perl 5's non-greedy matching can be TOO greedy!

2000-12-15 Jarkko Hietaniemi

> More generally, it seems to me that you're hung up on the description
> of "*?" as "shortest possible match".  That's an ambiguous

Yup, that's a bit confusing.  It's really "start matching as soon as
possible, and stop matching as soon as possible".  (The usual greedy
one is, of course, "keep matching as long as possible".)  The initial
invariant part, "start as soon as possible", is the de facto and de
jure (at least POSIX 1003.2, but probably also Single Unix)
definition, and therefore rather non-negotiable.
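
To make the distinction concrete, here is a minimal Perl sketch.  The
input string and the variable names are made up for illustration; only
the quantifier behaviour comes from the discussion above.

```perl
use strict;
use warnings;

# Hypothetical input, chosen only to show the leftmost-start rule.
my $text = 'x<<a>y<b>z';

# Greedy: start at the first '<' and keep matching as long as possible.
my ($greedy) = $text =~ /(<.*>)/;

# Non-greedy: still start at the first '<', but stop at the first '>'.
# The result is '<<a>', not the shorter '<a>' that begins one character later.
my ($lazy) = $text =~ /(<.*?>)/;

# A different pattern (a negated character class) is what it takes to get
# the shortest bracketed run in this particular string.
my ($tight) = $text =~ /(<[^<>]*>)/;

print "greedy: $greedy\n";   # greedy: <<a>y<b>
print "lazy:   $lazy\n";     # lazy:   <<a>
print "tight:  $tight\n";    # tight:  <a>
```

The second result is the case the thread is arguing about: "*?" only
controls where the match stops, never where it starts.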

> simplification of what "*?" means.  It might better be described as
> "match until you find a match for the rest of the regex" ('d' in your




Re: XML/HTML-specific ? and ? operators? (was Re: RFC 145 (alternate approach))

2000-09-07 Jarkko Hietaniemi

On Thu, Sep 07, 2000 at 03:42:01PM -0400, Eric Roode wrote:
> Richard Proctor wrote:
> >
> > I think what is needed is something along the line of :
> >
> > $re = qz{ '(' \$re ')'
> >  | \$re \$re
> >  | [^()]+
> > };
> >
> > Where qz is some hypothetical new quoting syntax
>
> Well, we currently have qr{}, and ??{} does something like your \$re.
>
> Warning: radical ideas ahead.
>
> What would be useful, would be to leave REs the hell alone; they're
> great as-is, and are only getting hairier and hairier. What would be
> useful, would be to create a new non-regular pattern-matching/parsing
> "language" within Perl, that combines the best of Perl REs, lex,
> SNOBOL, Icon, state machines, and what have you.

Agreed.  "Yet another quoting construct", "yet another \construct",
"yet another (? construct".  Argh, please, no.  Make all of the above,
and all we've learned from Parse::RecDescent et alia, collide at light
speed and see what new cool particles spring forth.
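
For reference, the quoted remark that qr{} plus ??{} already does
something like the proposed \$re can be sketched roughly as below.  This
assumes that "??{}" refers to Perl 5's postponed-regex construct
(??{ code }); the helper name matches_grammar is hypothetical, and the
pattern is only one possible reading of the three-alternative grammar
quoted above.

```perl
use strict;
use warnings;

# $re is a package variable so the deferred block can look it up at match
# time; the name is simply borrowed from the quoted grammar.
our $re;
$re = qr{
    (?:
        (?> [^()]+ )        # a run of non-parenthesis characters
      |
        \( (??{ $re }) \)   # a parenthesised group, matched recursively
    )+                      # one or more such pieces in sequence
}x;

# Hypothetical helper: true if the whole string fits the grammar.
sub matches_grammar {
    my ($string) = @_;
    return $string =~ /\A$re\z/ ? 1 : 0;
}

print matches_grammar('a(b(c)d)e') ? "match\n" : "no match\n";   # match
print matches_grammar('a(b(c)d')   ? "match\n" : "no match\n";   # no match
```

Like the quoted grammar, this sketch has no empty alternative, so a bare
"()" is rejected.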





Re: XML/HTML-specific ? and ? operators? (was Re: RFC 145 (alternate approach))

2000-09-06 Jarkko Hietaniemi

On Wed, Sep 06, 2000 at 03:47:57PM -0700, Randal L. Schwartz wrote:
> "Mark-Jason" == Mark-Jason Dominus [EMAIL PROTECTED] writes:
>
> Mark-Jason I have some ideas about how to do this, and I will try to
> Mark-Jason write up an RFC this week.
>
> "You want Icon, you know where to find it..." :)

Hey, it's one of the few languages we haven't yet stolen a neat
feature or few from...  (I don't really count the few regex thingies
as full-fledged stealing, more like an experimental sleight-of-hand.)

> But yes, a way that allows programmatic backtracking sort of "inside out"
> from a regex would be nice.
