Re: RFC 164 (v2) Replace =~, !~, m//, s///, and tr// with match(), subst(), and trade()

2000-09-14 Thread Bart Lateur

On 30 Aug 2000 02:13:38 -, Perl6 RFC Librarian wrote:

> Replace =~, !~, m//, s///, and tr// with match(), subst(), and trade()

Why?

What's next, replace the regex syntax with something that more closely
resembles the rest of Perl?

Regexes are a language within the language. And not a tiny one.

So, if regexes are such a completely different sublanguage, I can see
the m// and s/// syntax as just a link between these two entirely
different worlds. I don't care that it has a weird syntax itself. That,
by itself, simply stresses the fact that regexes are indeed "different".

-- 
Bart.



Re: RFC 164 (v2) Replace =~, !~, m//, s///, and tr// with match(), subst(), and trade()

2000-09-14 Thread Bart Lateur

On Thu, 14 Sep 2000 08:47:24 -0700, Nathan Wiger wrote:

> One thing to remember is that regexes are already used in a number of
> functions:
>
>    @results = grep /^VAR=\w+/, @values;

You are being misled. You might just as well say that length() is being
used in other functions:

@results = grep length, @values;

This is not a regex as a parameter. It is an expression. Execution is
postponed, just as for length(). The code is equivalent to

@results = grep { /^VAR=\w+/ } @values;

a block containing an expression. Now, to be consistent, you should turn
this into:

@results = grep { match /^VAR=\w+/ } @values;

or

@results = grep match /^VAR=\w+/, @values;

Er... with the problem that you can no longer tell which function the
argument list @values belongs to, so you would need parentheses:

@results = grep match(/^VAR=\w+/), @values;

Is this really worth it?
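
To make the equivalence concrete, here is a small sketch in current Perl 5. The sample data in @values is made up for illustration. It only shows that the expression form and the block form of grep select the same elements, and that length can be used in exactly the same position as the pattern.

```perl
use strict;
use warnings;

# Made-up sample data for illustration.
my @values = ('VAR=foo', '', 'PATH=/bin', 'VAR=bar');

# Expression form: the pattern is matched against $_ for each element.
my @expr = grep /^VAR=\w+/, @values;

# Block form: the same test, written as a block.
my @block = grep { /^VAR=\w+/ } @values;

# Any other expression works the same way; length also defaults to $_.
my @nonempty = grep length, @values;

print "@expr\n";                 # VAR=foo VAR=bar
print "@block\n";                # VAR=foo VAR=bar
print scalar(@nonempty), "\n";   # 3
```

In both pattern forms the regex is just an expression evaluated once per element, with $_ as its target.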


>    @array = split /\s*:\s*/, $input;

You are correct here.

I'm opposed to an obligation to replace m// and s///. I won't mind the
ability to give a prototype of "regex" to functions, or even
*additional* functions, match and subst.

Bare regexes should stay.

As for tr///: this doesn't even use a regex. It looks a bit like one,
but it's an entirely different thing. And to me, the argument of tr///
is actually two arguments; a source character list, and a replacement
character list.
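
A short Perl 5 sketch of that point, with a made-up sample string: the two parts of tr/// are character lists, so characters that would be metacharacters in a regex are taken literally.

```perl
use strict;
use warnings;

# Made-up sample string containing regex metacharacters.
my $text = 'a.b+c*';

# Two character lists: each a-z is replaced by the corresponding A-Z.
(my $upper = $text) =~ tr/a-z/A-Z/;

# '.', '+' and '*' are plain characters here, not regex metacharacters.
# With an empty replacement list, tr counts matches and changes nothing.
my $punct = ($text =~ tr/.+*//);

print "$upper $punct\n";   # A.B+C* 3
```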

As for s///: same thing, two arguments, a regex, and a substitution
string or function. Your example function

   OLD: ($str = $_) =~ s/\d+/func/ge;
   NEW: $str = subst /\d+/func/ge;

should really be

$str = subst /\d+/g, \&func;

although I have the impression that the //g modifier is in the wrong
place.
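
For comparison, a minimal Perl 5 sketch of the two-argument view. The helper subst_all and the callback double are hypothetical names invented for this illustration; they are not the builtin the RFC proposes, and the argument order is an assumption.

```perl
use strict;
use warnings;

# Hypothetical callback standing in for "func" in the example above.
sub double { return 2 * $_[0] }

# Hypothetical helper: a compiled pattern plus a code reference,
# applied globally to a copy of the string.
sub subst_all {
    my ($re, $code, $str) = @_;
    $str =~ s/($re)/$code->($1)/ge;
    return $str;
}

my $in = 'a1 b22 c333';

# Perl 5 idiom: copy first, then substitute in place on the copy.
(my $copy = $in) =~ s/(\d+)/double($1)/ge;

# Same result with the regex and the function passed as two arguments.
my $via_helper = subst_all(qr/\d+/, \&double, $in);

print "$copy\n";         # a2 b44 c666
print "$via_helper\n";   # a2 b44 c666
```

Here the "global" flag lives inside the helper rather than on the pattern, which is one way around the question of where /g belongs.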
 
-- 
Bart.



Re: RFC 164 (v2) Replace =~, !~, m//, s///, and tr// with match(), subst(), and trade()

2000-09-14 Thread Nathan Wiger

> I'm opposed to an obligation to replace m// and s///. I won't mind the
> ability to give a prototype of "regex" to functions, or even
> *additional* functions, match and subst.

As the RFC basically proposes. The idea is that s///, tr///, and m//
would stay, seemingly unchanged. But they'd actually just be shortcuts
to the new builtins. These new builtins can act on lists, be
prototyped/overridden, be more easily chained together without
in-betweener variables. Basically, they get all the benefits normal
functions get, while still being 100% backwards compatible.
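
As a rough illustration of the list and chaining benefits, here is a sketch in Perl 5. The subs match and subst below are hypothetical user-defined stand-ins, not the builtins the RFC proposes, and their argument order is an assumption.

```perl
use strict;
use warnings;

# Hypothetical stand-in: return the elements of a list that match $re.
sub match {
    my ($re, @list) = @_;
    return grep { $_ =~ $re } @list;
}

# Hypothetical stand-in: return copies of the elements with every
# occurrence of $re replaced by the plain string $repl.
sub subst {
    my ($re, $repl, @list) = @_;
    return map { my $copy = $_; $copy =~ s/$re/$repl/g; $copy } @list;
}

# Made-up sample data for illustration.
my @values = ('VAR=foo', 'PATH=/bin', 'VAR=bar');

# Chained without an intermediate variable: filter, then strip the prefix.
my @names = subst(qr/^VAR=/, '', match(qr/^VAR=\w+/, @values));

print "@names\n";   # foo bar
```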

-Nate



negative variable-length lookbehind example

2000-09-14 Thread Hugo

In RFC 72, Peter Heslin gives this example:
:Imagine a very long input string containing data such as this:
:
:... GCAAGAATTGAACTGTAG ...
:
:If you want to match text that matches /GA+C/, but not when it
:follows /G+A+T+/, you cannot at present do so easily.

I haven't tried to work it out exactly, but I think you can
achieve this (and fairly efficiently) with something like:
  /
    (?: ^ |  # else we won't match at start
      (?: (?> G+ A+ T+) | (.) )*
      (?(1) | . )
    )
    G A+ C
  /x

This requires that the regexp engine reliably leaves $1 unset if
we took the G+A+T+ branch last time through the (...)*, which
has been an area of many bugs and no little discussion in perl5;
I'm not sure of the status of that in latest perls.

It isn't particularly relevant to this proposal since there are
other combinations that can't be resolved in this way; I thought
it might be of interest nonetheless.
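
For reference, the requirement in the RFC 72 example can also be checked in two steps in Perl 5, without a single regex. This is a separate workaround sketch, not the pattern above; the sub name unpreceded_matches and the second test string are made up for illustration.

```perl
use strict;
use warnings;

# Return the start offsets of every /GA+C/ match that is NOT
# immediately preceded by text matching /G+A+T+/.
sub unpreceded_matches {
    my ($dna) = @_;
    my @offsets;
    while ($dna =~ /GA+C/g) {
        my $start  = $-[0];                    # offset where this match begins
        my $before = substr($dna, 0, $start);  # everything before the match
        push @offsets, $start unless $before =~ /G+A+T+\z/;
    }
    return @offsets;
}

# The fragment from RFC 72: GAAC directly follows GAATT, so it is rejected.
my @rfc = unpreceded_matches('GCAAGAATTGAACTGTAG');

# Made-up string: GAAC at offset 2 is kept, GAC after GAT is rejected.
my @other = unpreceded_matches('TTGAACGATGAC');

print scalar(@rfc), "\n";   # 0
print "@other\n";           # 2
```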

Hugo