Re: RFC 164 (v2) Replace =~, !~, m//, s///, and tr// with match(), subst(), and trade()

2000-09-14 Thread Nathan Wiger

> I'm opposed to an obligation to replace m// and s///. I won't mind the
> ability to give a prototype of "regex" to functions, or even
> *additional* functions, match and subst.

As the RFC basically proposes. The idea is that s///, tr///, and m//
would stay, seemingly unchanged. But they'd actually just be shortcuts
to the new builtins. These new builtins can act on lists, be
prototyped/overridden, be more easily chained together without
in-betweener variables. Basically, they get all the benefits normal
functions get, while still being 100% backwards compatible.

-Nate



Re: RFC 164 (v2) Replace =~, !~, m//, s///, and tr// with match(), subst(), and trade()

2000-09-14 Thread Bart Lateur

On Thu, 14 Sep 2000 08:47:24 -0700, Nathan Wiger wrote:

>One thing to remember is that regex's are already used in a number of
>functions:
>
>   @results = grep /^VAR=\w+/, @values;

You are being misled. You might just as well say that length() is being
used in other functions:

@results = grep length, @values;

This is not a regex as a parameter. It is an expression. Execution is
postponed, just as for length(). The code is equivalent to

@results = grep { /^VAR=\w+/ } @values;

a block containing an expression. Now, to be consistent, you should turn
this into:

@results = grep { match /^VAR=\w+/ } @values;

or

@results = grep match /^VAR=\w+/, @values;

Er... with the problem that you no longer know which function the
argument list @values belongs to. So you would have to add parentheses:

@results = grep match(/^VAR=\w+/), @values;

Is this really worth it?
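
For reference, the equivalence claimed above can be checked in plain Perl 5. This is only a small sketch with made-up sample data (@values here is an assumption, not taken from the thread); it uses no proposed builtins.

```perl
use strict;
use warnings;

# Sample data, invented for illustration only.
my @values = ('VAR=foo', 'OTHER=bar', 'VAR=baz', '');

# EXPR form: the pattern is an expression evaluated per element against $_.
my @by_expr  = grep /^VAR=\w+/, @values;

# BLOCK form: the same test written as a block.
my @by_block = grep { /^VAR=\w+/ } @values;

# length is handled the same way: evaluated per element on $_.
my @non_empty = grep length, @values;

print "@by_expr\n";              # VAR=foo VAR=baz
print "@by_block\n";             # VAR=foo VAR=baz
print scalar(@non_empty), "\n";  # 3
```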


>   @array = split /\s*:\s*/, $input;

You are correct here.

I'm opposed to an obligation to replace m// and s///. I won't mind the
ability to give a prototype of "regex" to functions, or even
*additional* functions, match and subst.

Bare regexes should stay.

As for tr///: this doesn't even use a regex. It looks a bit like one,
but it's an entirely different thing. And to me, the argument of tr///
is actually two arguments; a source character list, and a replacement
character list.

As for s///: same thing, two arguments, a regex, and a substitution
string or function. Your example function

OLD: ($str = $_) =~ s/\d+/&func/ge;
NEW: $str = subst /\d+/&func/ge;

should really be

$str = subst /\d+/g, \&func;

although I have the impression that the //g modifier is in the wrong
place.
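
A rough Perl 5 approximation of that two-argument shape might look like the following. The helper name subst_with, the argument order, and the choice to pass the matched text to the callback are all assumptions made for illustration; the /g behaviour is simply hard-wired here rather than given as a modifier.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Perl or of the RFC: takes a compiled
# pattern, a code reference, and a string, and returns a modified copy.
sub subst_with {
    my ($re, $code, $str) = @_;
    # $str is a copy of the argument, so the caller's value is untouched.
    $str =~ s/($re)/$code->($1)/ge;
    return $str;
}

# Example callback: doubles the number it is given.
sub double { return 2 * $_[0] }

my $in  = 'a1 b22 c333';
my $out = subst_with(qr/\d+/, \&double, $in);

print "$in\n";    # a1 b22 c333
print "$out\n";   # a2 b44 c666
```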
 
-- 
Bart.



Re: RFC 164 (v2) Replace =~, !~, m//, s///, and tr// with match(), subst(), and trade()

2000-09-14 Thread Nathan Wiger

> What's next, replace the regex syntax with something that more closely
> resembles the rest of Perl?

No.
 
> Regexes are a language within the language. And not a tiny one.

I know... :-)

> So, if regexes are such a completely different sublanguage, I can see
> the m// and s/// syntax as just a link between these two entirely
> different worlds. I don't care that it has a weird syntax itself. That,
> by itself, simply stresses the fact that regexes are indeed "different".

One thing to remember is that regex's are already used in a number of
functions:

   @results = grep /^VAR=\w+/, @values;
   @array = split /\s*:\s*/, $input;

So I don't think there's that much difference just standardizing on this
as:

   @output = trade /$input/$output/, @strings;
   $num = match /\w+/, @these;

Plus you can now easily work with arrays, which you can't currently.

That being said, please read the whole RFC, since it provides a means
for 100% backwards-compatible syntax. That way, existing regex lovers
(myself included) can still use the "true Perl way". ;-) Plus, since =~
winds up being generalized (RFC 170) it is extensible to other
operations as well.

-Nate



Re: RFC 164 (v2) Replace =~, !~, m//, s///, and tr// with match(), subst(), and trade()

2000-09-14 Thread Bart Lateur

On 30 Aug 2000 02:13:38 -, Perl6 RFC Librarian wrote:

>Replace =~, !~, m//, s///, and tr// with match(), subst(), and trade()

Why?

What's next, replace the regex syntax with something that more closely
resembles the rest of Perl?

Regexes are a language within the language. And not a tiny one.

So, if regexes are such a completely different sublanguage, I can see
the m// and s/// syntax as just a link between these two entirely
different worlds. I don't care that it has a weird syntax itself. That,
by itself, simply stresses the fact that regexes are indeed "different".

-- 
Bart.



RFC 164 (v2) Replace =~, !~, m//, s///, and tr// with match(), subst(), and trade()

2000-08-29 Thread Perl6 RFC Librarian

This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Replace =~, !~, m//, s///, and tr// with match(), subst(), and trade()

=head1 VERSION

   Maintainer: Nathan Wiger <[EMAIL PROTECTED]>
   Date: 27 Aug 2000
   Last-Modified: 29 Aug 2000
   Version: 2
   Mailing List: [EMAIL PROTECTED]
   Number: 164
   Status: Developing

=head1 CHANGES

   1. Added 100% backwards-compatible syntax

   2. Included C<trade> replacement for C<tr>

   3. Expanded examples and contexts

=head1 ABSTRACT

Several people (including Larry) have expressed a desire to get rid of
C<=~> and C<!~>. This RFC proposes a way to replace C<m//>, C<s///>,
and C<tr///> with three new builtins, C<match>, C<subst>, and C<trade>.
It also proposes a way to allow full backwards-compatible syntax.

=head1 DESCRIPTION

=head2 Overview

Everyone knows how C<=~> and C<!~> work. Several proposals, such as RFCs
135 and 138, attempt to fix some stuff with the current pattern-matching
syntax. Most proposals center around minor modifications to C<m//> and
C<s///>.

This RFC proposes that C<m//>, C<s///>, and C<tr///> be dropped from the
language, and instead be replaced with new C<match>, C<subst>, and
C<trade> builtins, with the following syntaxes:

   $res [, $res] = match /pat/flags, $str [, $str];
   $res [, $res] = subst /pat/new/flags, $str [, $str];
   $res [, $res] = trade /pat/new/flags, $str [, $str];

These subs are designed to mirror the format of C, making them
more consistent. Unlike the current forms, these return the modified
string, leaving the input C<$str> alone.
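
For comparison, Perl 5.14 and later offer a non-destructive C</r> flag on C<s///> and C<tr///> that behaves in a similar spirit. The sketch below uses only that existing flag; the comments mapping each line to the proposed builtins are a loose analogy, not something this RFC specifies.

```perl
use strict;
use warnings;
use 5.014;   # the /r flag needs Perl 5.14 or later

my $old = 'hello world';

# Each statement returns a modified copy and leaves $old untouched.
my $sub = $old =~ s/world/there/r;   # roughly: $sub = subst /world/there/, $old
my $tr  = $old =~ tr/a-z/A-Z/r;      # roughly: $tr  = trade /a-z/A-Z/, $old

print "$old\n";   # hello world
print "$sub\n";   # hello there
print "$tr\n";    # HELLO WORLD
```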

Context modifies the return values just as Perl 5 context does, with
some extensions:

   1. If called in a void context, they act on and modify C<$_>,
      consistent with current behavior.

   2. If called in a scalar context, C<match> returns the number
      of matches (like now), and the rest return the first (or
      only) string.

   3. If called in a list context, a list of the modified strings
      will be returned.

   4. If called in a numeric context, they all return the number
      of substitutions made.
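
The first three rules can be roughly emulated in Perl 5 with C<wantarray>. This is only a sketch: the name subst_sketch, the argument order, and passing the pattern as a C<qr//> object are assumptions, and rule 4 is left out because Perl 5 cannot detect a numeric context.

```perl
use strict;
use warnings;
use 5.014;   # s///r needs Perl 5.14 or later

# Hypothetical emulation of a context-sensitive subst (always global).
sub subst_sketch {
    my ($re, $new, @strs) = @_;
    if (!defined wantarray) {            # void context: modify $_ in place
        s/$re/$new/g;
        return;
    }
    @strs = ($_) unless @strs;           # no strings given: fall back to $_
    my @out = map { s/$re/$new/gr } @strs;   # modified copies only
    return wantarray ? @out : $out[0];   # list: all strings; scalar: the first
}

my @new   = subst_sketch(qr/l+/, 'L', 'hello', 'fill');   # list context
my $first = subst_sketch(qr/l+/, 'L', 'hello', 'fill');   # scalar context

local $_ = 'ball';
subst_sketch(qr/l+/, 'L');                                # void context

print "@new\n";     # heLo fiL
print "$first\n";   # heLo
print "$_\n";       # baL
```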

Extra arguments can be dropped, consistent with C and many other
builtins:

   match;                     # all defaults (pattern is /\w+/)
   match /pat/;               # match $_
   match /pat/, $str;         # match $str
   match /pat/, @strs;        # match any of @strs

   subst;                     # strip leading/trailing whitespace
   subst /pat/new/;           # sub on $_
   subst /pat/new/, $str;     # sub on $str
   subst /pat/new/, @strs;    # return array of modified strings

   trade;                     # nothing
   trade /pat/new/;           # tr on $_
   trade /pat/new/, $str;     # tr on $str
   trade /pat/new/, @str;     # return array of modified strings
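
The defaulting rules for C<match> could be approximated in Perl 5 along these lines. The name match_sketch is hypothetical, the pattern is passed as a C<qr//> object, and the return value is taken here to be the number of strings that match, which is one possible reading of the rules above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the proposed match(); not an existing builtin.
sub match_sketch {
    my ($re, @strs) = @_;
    $re   = qr/\w+/ unless defined $re;    # default pattern stated above
    @strs = ($_)    unless @strs;          # no strings given: fall back to $_
    my $count = grep { $_ =~ $re } @strs;  # number of strings that match
    return $count;
}

my @strs = ('abc', 'a1', 'b2');
my $hits = match_sketch(qr/\d/, @strs);    # two of the three contain a digit

local $_ = 'hello';
my $default = match_sketch();              # /\w+/ against $_

print "$hits $default\n";                  # 2 1
```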

These new builtins eliminate the need for C<=~> and C<!~> altogether,
since they are functions just like C, C, C, and so
on. There are also shortcut forms, see below.

Sometimes examples are easiest, so here are some examples of the new
syntax:

   Perl 5                               Perl 6
   -------------------------------      ----------------------------------
   if ( /\w+/ ) { }                     if ( match ) { }
   die "Bad!" if ( $_ !~ /\w+/ );       die "Bad!" if ( ! match );
   ($res) = m#^(.*)$#g;                 ($res) = match #^(.*)$#g;

   next if /\s+/ || /\w+/;              next if match /\s+/ or match /\w+/;
   next if ($str =~ /\s+/) ||           next if match /\s+/, $str or
           ($str =~ /\w+/)                      match /\w+/, $str;
   next unless $str =~ /^N/;            next unless match /^N/, $str;

   $str =~ s/\w+/$bob/gi;               $str = subst /\w+/$bob/gi, $str;
   s/\w+/this/;                         subst /\w+/this/;

   tr/a-z/Z-A/;                         trade /a-z/Z-A/;
   $new =~ tr/a/b/;                     $new = trade /a/b/, $new;


   # Some become easier and more consistent...

   ($str = $_) =~ s/\d+/&func/ge;       $str = subst /\d+/&func/ge;
   ($new = $old) =~ tr/a/z/;            $new = trade /a/z/, $old;


   # And these are pretty cool...

   foreach (@old) {                     @new = subst /hello/X/gi, @old;
      s/hello/X/gi;
      push @new, $_;
   }

   foreach (@str) {                     @new = trade /a-z/A-Z/, @str;
      tr/a-z/A-Z/;
      push @new, $_;
   }

   foreach (@str) {                     print "Got it" if match /\w+/, @str;
      if (/\w+/) { $gotit = 1 };
   }
   print "Got it" if $gotit;
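
The list-oriented rows in the table can be approximated in existing Perl (5.14 or later) with C<map> and C<grep>. The sample data below is invented, and the comments pairing each line with a proposed builtin are only an analogy.

```perl
use strict;
use warnings;
use 5.014;   # s///r and tr///r need Perl 5.14 or later

# Sample data, invented for illustration only.
my @old = ('Hello there', 'say HELLO', 'bye');

my @new   = map { s/hello/X/gir } @old;   # cf. @new = subst /hello/X/gi, @old
my @upper = map { tr/a-z/A-Z/r } @old;    # cf. @new = trade /a-z/A-Z/, @old
my $gotit = grep { /\w+/ } @old;          # cf. match /\w+/, @old

print join('|', @new), "\n";     # X there|say X|bye
print join('|', @upper), "\n";   # HELLO THERE|SAY HELLO|BYE
print "Got it\n" if $gotit;      # Got it
```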

This gives us a cleaner, more consistent syntax. In addition, it makes
several things easier, is more easily extensible:

   &callsomesub(subst(/old/new/gi, $mystr));
   $str = subst /old/new/i, $r->getsomeval;

and is easier to read English-wise. However, it requires too much
typing. For that reason, we include the shortcut form as well:

=head2 Shortcut Form

RFC 139 describes a way that the C syntax can be expanded to any
function. So, to gain backwards compatibility, we simply allow this
syntax along with the shortcut function names C<m>, C<s>, and C<tr> [1]:

   Shortcut Form                Builtin
   -------------                -------
   s/\w+/W/g;