Re: RFC 164 (v2) Replace =~, !~, m//, s///, and tr// with match(), subst(), and trade()
> I'm opposed to an obligation to replace m// and s///. I won't mind the
> ability to give a prototype of "regex" to functions, or even
> *additional* functions, match and subst.

As the RFC basically proposes. The idea is that s///, tr///, and m//
would stay, seemingly unchanged. But they'd actually just be shortcuts
to the new builtins.

These new builtins can act on lists, be prototyped/overridden, and be
more easily chained together without in-betweener variables. Basically,
they get all the benefits normal functions get, while still being 100%
backwards compatible.

-Nate
Re: RFC 164 (v2) Replace =~, !~, m//, s///, and tr// with match(), subst(), and trade()
On Thu, 14 Sep 2000 08:47:24 -0700, Nathan Wiger wrote:

>One thing to remember is that regex's are already used in a number of
>functions:
>
>   @results = grep /^VAR=\w+/, @values;

You are being misled. You might just as well say that length() is being
used in other functions:

    @results = grep length, @values;

This is not a regex as a parameter. It is an expression. Execution is
postponed, just as for length(). The code is equivalent to

    @results = grep { /^VAR=\w+/ } @values;

a block containing an expression. Now, to be consistent, you should turn
this into:

    @results = grep { match /^VAR=\w+/ } @values;

or

    @results = grep match /^VAR=\w+/, @values;

Er... with the problem that you no longer know which function the
argument list @values belongs to.

    @results = grep match(/^VAR=\w+/), @values;

Is this really worth it?

>   @array = split /\s*:\s*/, $input;

You are correct here.

I'm opposed to an obligation to replace m// and s///. I won't mind the
ability to give a prototype of "regex" to functions, or even
*additional* functions, match and subst. Bare regexes should stay.

As for tr///: this doesn't even use a regex. It looks a bit like one,
but it's an entirely different thing. And to me, the argument of tr///
is actually two arguments: a source character list, and a replacement
character list.

As for s///: same thing, two arguments, a regex, and a substitution
string or function. Your example function

    OLD: ($str = $_) =~ s/\d+/&func/ge;
    NEW: $str = subst /\d+/&func/ge;

should really be

    $str = subst /\d+/g, \&func;

although I have the impression that the //g modifier is in the wrong
place.

--
Bart.
Re: RFC 164 (v2) Replace =~, !~, m//, s///, and tr// with match(), subst(), and trade()
> What's next, replace the regex syntax with something that more closely
> resembles the rest of Perl?

No.

> Regexes are a language within the language. And not a tiny one.

I know... :-)

> So, if regexes are such a completely different sublanguage, I can see
> the m// and s/// syntax as just a link between these two entirely
> different worlds. I don't care that it has a weird syntax itself. That,
> by itself, simply stresses the fact that regexes are indeed "different".

One thing to remember is that regex's are already used in a number of
functions:

    @results = grep /^VAR=\w+/, @values;
    @array = split /\s*:\s*/, $input;

So I don't think there's that much difference just standardizing on this
as:

    @output = trade /$input/$output/, @strings;
    $num = match /\w+/, @these;

Plus you can now easily work with arrays, which you can't currently.

That being said, please read the whole RFC, since it provides a means
for 100% backwards-compatible syntax. That way, existing regex lovers
(myself included) can still use the "true Perl way". ;-) Plus, since =~
winds up being generalized (RFC 170) it is extensible to other
operations as well.

-Nate
Re: RFC 164 (v2) Replace =~, !~, m//, s///, and tr// with match(), subst(), and trade()
On 30 Aug 2000 02:13:38 -, Perl6 RFC Librarian wrote:

>Replace =~, !~, m//, s///, and tr// with match(), subst(), and trade()

Why?

What's next, replace the regex syntax with something that more closely
resembles the rest of Perl?

Regexes are a language within the language. And not a tiny one.

So, if regexes are such a completely different sublanguage, I can see
the m// and s/// syntax as just a link between these two entirely
different worlds. I don't care that it has a weird syntax itself. That,
by itself, simply stresses the fact that regexes are indeed "different".

--
Bart.
RFC 164 (v2) Replace =~, !~, m//, s///, and tr// with match(), subst(), and trade()
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Replace =~, !~, m//, s///, and tr// with match(), subst(), and trade()

=head1 VERSION

  Maintainer: Nathan Wiger <[EMAIL PROTECTED]>
  Date: 27 Aug 2000
  Last-Modified: 29 Aug 2000
  Version: 2
  Mailing List: [EMAIL PROTECTED]
  Number: 164
  Status: Developing

=head1 CHANGES

   1. Added 100% backwards-compatible syntax

   2. Included C<trade> replacement for C<tr//>

   3. Expanded examples and contexts

=head1 ABSTRACT

Several people (including Larry) have expressed a desire to get rid of
C<=~> and C<!~>. This RFC proposes a way to replace C<m//>, C<s///>,
and C<tr//> with three new builtins, C<match>, C<subst>, and C<trade>.
It also proposes a way to allow full backwards-compatible syntax.

=head1 DESCRIPTION

=head2 Overview

Everyone knows how C<=~> and C<!~> work. Several proposals, such as RFCs
135 and 138, attempt to fix some stuff with the current pattern-matching
syntax. Most proposals center around minor modifications to C and C.

This RFC proposes that C<m//>, C<s///>, and C<tr//> be dropped from the
language, and instead be replaced with new C<match>, C<subst>, and
C<trade> builtins, with the following syntaxes:

   $res [, $res] = match /pat/flags, $str [, $str];
   $res [, $res] = subst /pat/new/flags, $str [, $str];
   $res [, $res] = trade /pat/new/flags, $str [, $str];

These subs are designed to mirror the format of C, making them more
consistent. Unlike the current forms, these return the modified string,
leaving the input C<$str> alone.

Context modifies the return values just as Perl 5 context does, with
some extensions:

   1. If called in a void context, they act on and modify C<$_>,
      consistent with current behavior.

   2. If called in a scalar context, C<match> returns the number of
      matches (like now), and the rest return the first (or only)
      string.

   3. If called in a list context, a list of the modified strings
      will be returned.

   4. If called in a numeric context, they all return the number of
      substitutions made.
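As a rough illustration only (this is not part of the RFC), rules 2 and 3 for C<subst> can be approximated in today's Perl 5. The helper name C<subst_copy> and its argument order are hypothetical; it takes a compiled pattern and a plain replacement string, always substitutes globally, and does not attempt the void-context or numeric-context rules.

```perl
use strict;
use warnings;

# Hypothetical helper, not the RFC's builtin: returns modified copies
# and leaves the caller's strings untouched.
sub subst_copy {
    my ($pat, $new, @strs) = @_;
    my @out = map {
        my $copy = $_;            # work on a copy, not the original
        $copy =~ s/$pat/$new/g;   # ordinary Perl 5 in-place substitution
        $copy;
    } @strs;
    # List context: every modified string; scalar context: the first one.
    return wantarray ? @out : $out[0];
}

my @old = ('hello world', 'say hello');
my @new = subst_copy(qr/hello/i, 'X', @old);    # list context
my $one = subst_copy(qr/\d+/, '#', 'abc123');   # scalar context
```

Here C<@old> is left unchanged, which is the main behavioral difference from the current C<=~ s///> form.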
Extra arguments can be dropped, consistent with C and many other
builtins:

   match;                     # all defaults (pattern is /\w+/)
   match /pat/;               # match $_
   match /pat/, $str;         # match $str
   match /pat/, @strs;        # match any of @strs

   subst;                     # strip leading/trailing whitespace
   subst /pat/new/;           # sub on $_
   subst /pat/new/, $str;     # sub on $str
   subst /pat/new/, @strs;    # return array of modified strings

   trade;                     # nothing
   trade /pat/new/;           # tr on $_
   trade /pat/new/, $str;     # tr on $str
   trade /pat/new/, @str;     # return array of modified strings

These new builtins eliminate the need for C<=~> and C<!~> altogether,
since they are functions just like C, C, C, and so on. There are also
shortcut forms, see below.

Sometimes examples are easiest, so here are some examples of the new
syntax:

   Perl 5                              Perl 6
   ----------------------------------  ----------------------------------
   if ( /\w+/ ) { }                    if ( match ) { }
   die "Bad!" if ( $_ !~ /\w+/ );      die "Bad!" if ( ! match );
   ($res) = m#^(.*)$#g;                ($res) = match #^(.*)$#g;

   next if /\s+/ || /\w+/;             next if match /\s+/ or match /\w+/;
   next if ($str =~ /\s+/) ||          next if match /\s+/, $str or
       ($str =~ /\w+/)                     match /\w+/, $str;
   next unless $str =~ /^N/;           next unless match /^N/, $str;

   $str =~ s/\w+/$bob/gi;              $str = subst /\w+/$bob/gi, $str;
   s/\w+/this/;                        subst /\w+/this/;
   tr/a-z/Z-A/;                        trade /a-z/Z-A/;
   $new =~ tr/a/b/;                    $new = trade /a/b/, $new;

   # Some become easier and more consistent...

   ($str = $_) =~ s/\d+/&func/ge;      $str = subst /\d+/&func/ge;
   ($new = $old) =~ tr/a/z/;           $new = trade /a/z/, $old;

   # And these are pretty cool...

   foreach (@old) {                    @new = subst /hello/X/gi, @old;
       s/hello/X/gi;
       push @new, $_;
   }

   foreach (@str) {                    @new = trade /a-z/A-Z/, @str;
       tr/a-z/A-Z/;
       push @new, $_;
   }

   foreach (@str) {                    print "Got it" if match /\w+/, @str;
       if (/\w+/) { $gotit = 1 };
   }
   print "Got it" if $gotit;

This gives us a cleaner, more consistent syntax. In addition, it makes
several things easier, is more easily extensible:

   &callsomesub(subst(/old/new/gi, $mystr));
   $str = subst /old/new/i, $r->getsomeval;

and is easier to read English-wise.
However, it requires too much typing. For that reason, we include the
shortcut form as well:

=head2 Shortcut Form

RFC 139 describes a way that the C syntax can be expanded to any
function. So, to gain backwards compatibility, we simply allow this
syntax along with the shortcut function names C, C, and C [1]:

   Shortcut Form                       Builtin
   ----------------------------------  ----------------------------------
   s/\w+/W/g;