RFC 166 (disambiguator)
Richard Proctor suggests that (?) will match the empty string, so that it can be inserted into regexes to separate elements that need to be separated. For example, /$foo(?)bar/ interpolates the value of $foo and then looks for that pattern followed by 'bar'. You cannot simply write /$foobar/, because then Perl tries to interpolate $foobar, which is not what you wanted.

1. You can already write /${foo}bar/ to get what you wanted. This solution already works inside of double-quoted strings. (?) would not work inside of double-quoted strings.

2. You can already write /$foo(?:)bar/ to get what you wanted. This is almost identical to what Richard proposed anyway.

It is really not clear to me that this problem needs to be solved any better than it is already. I suggest that this section be removed from the RFC.

Mark-Jason Dominus [EMAIL PROTECTED]
I am boycotting Amazon. See http://www.plover.com/~mjd/amazon.html for details.
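
For readers who want to try the two existing workarounds named in this message, here is a minimal sketch. The variable contents and the sample text are invented for illustration; only the /${foo}bar/ and /$foo(?:)bar/ forms come from the message itself.

```perl
use strict;
use warnings;

# Hypothetical sample data, chosen only for this illustration.
my $foo  = 'ab+';
my $text = 'xxabbbbar';

# Braces delimit the variable name, so 'bar' stays literal pattern text.
my $braces = $text =~ /${foo}bar/ ? 1 : 0;

# An empty non-capturing group also ends the variable name.
my $group = $text =~ /$foo(?:)bar/ ? 1 : 0;

# The brace form works in ordinary double-quoted strings as well.
my $string = "${foo}bar";

print "braces=$braces group=$group string=$string\n";
```

Both patterns should behave the same way here, which is the point being made: the separator already exists in two spellings.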
Re: RFC 110 (v3) counting matches
On Mon, 28 Aug 2000, Mark-Jason Dominus wrote:

But there is no convenient way to run the loop once for each date and split the dates into pieces:

    # WRONG
    while (($mo, $dy, $yr) = ($string =~ /(\d\d)-(\d\d)-(\d\d)/g)) { ... }

What I use in a script of mine is:

    while ($string =~ /(\d\d)-(\d\d)-(\d\d)/g) {
        ($mo, $dy, $yr) = ($1, $2, $3);
    }

Although this, of course, also requires that you know the number of backreferences.

The real problem I was trying to discuss was not this particular application. I was trying to point out a larger problem, which is that there are several regex features that are enabled or disabled depending on what context the match is in, so that if you want one scalar-context feature and one list-context feature at the same time, there is no direct way to do it.

Nicer would be to be able to assign from @matchdata or something like that :)

I agree. There are many operations that would be simpler if there was a magic array that contained ($1, $2, $3, ...). If anyone wants to write an RFC on this, I will help.
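
The scalar-context loop quoted in this message can be run as follows. This is only a sketch: the input string and the @dates array are made up for the example, and the loop body is the one from the message.

```perl
use strict;
use warnings;

# Hypothetical input containing two dates.
my $string = 'born 08-28-00, hired 12-31-99';
my @dates;

# In scalar context, /g resumes after the previous match on each
# iteration, so the loop body runs once per date.
while ($string =~ /(\d\d)-(\d\d)-(\d\d)/g) {
    my ($mo, $dy, $yr) = ($1, $2, $3);
    push @dates, "$yr/$mo/$dy";
}

print join(', ', @dates), "\n";
```

The capture variables have to be copied out by hand, which is the inconvenience the thread is complaining about.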
Re: RFC 110 (v2) counting matches
On Tue, 29 Aug 2000 08:47:25 -0400, Mark-Jason Dominus wrote:

    m/.../Count,Insensitive (instead of m/.../ti)

That would escape the problem that we are running out of letters and also the problem that the current letters are hard to remember.

Yes, but wouldn't this give us backward compatibility problems? For example, code like

    $result = m/(.)/Insensitive, ord $1;

No, because that is presently a syntax error. The one you have to watch out for is:

    $result = m/(.)/s,Insensitive, ord $1;

And, I don't really see the need for the comma.

    m/.../CountInsensitive (instead of m/.../ti)

I guess, but to me CountInsensitive looks like one option, not two.
Re: RFC 110 (v3) counting matches
On Tue, 29 Aug 2000 08:51:29 -0400, Mark-Jason Dominus wrote:

There are many operations that would be simpler if there was a magic array that contained ($1, $2, $3, ...). If anyone wants to write an RFC on this, I will help.

Heh. I once complained about the lack of such an array, in comp.lang.perl.misc, *years* ago. My practical problem was something like this, in a translation program. $phrase is one of many patterns in a table, to look for English phrases; %translate contains the French translations; interpolate() is a sub that fills in the parameters (the numbers in the string):

    $_ = "It is 5 past 10.";
    $phrase = 'it is (\d+) past (\d+)';
    s/^$phrase/interpolate($translate{$phrase}, $1, $2)/ie;

The problem is that with variable patterns, you *don't know* how many paren groups there are.

The solution they came up with was @+ and @-. I still can't work with those. An array of matches (e.g. @&) would be a lot easier. It could also be a lot slower; see the discussion on $& for this. (Mystery: how can filling in $& be a lot slower than filling in $1?)

-- Bart.
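
Since the message mentions @+ and @- without showing them in use, here is a hedged sketch of how the captures can be rebuilt from those offset arrays when the number of groups is not known in advance. The names $text and @captures are assumptions made for this illustration; the pattern is the one from the message.

```perl
use strict;
use warnings;

my $text   = 'It is 5 past 10.';
my $phrase = 'it is (\d+) past (\d+)';   # group count unknown to the caller

my @captures;
if ($text =~ /^$phrase/i) {
    # $#+ is the number of groups in the last successful match;
    # $-[$n] and $+[$n] are the start and end offsets of group $n.
    @captures = map {
        defined $-[$_] ? substr($text, $-[$_], $+[$_] - $-[$_]) : undef
    } 1 .. $#+;
}

print "captured: @captures\n";
```

The offsets are relative to the matched string, so the string itself has to be kept around, which is part of why the arrays feel awkward compared with a plain list of captures.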
Re: RFC 110 (v2) counting matches
On Tue, 29 Aug 2000 09:00:43 -0400, Mark-Jason Dominus wrote:

And, I don't really see the need for the comma.

    m/.../CountInsensitive (instead of m/.../ti)

I guess, but to me CountInsensitive looks like one option, not two.

That goes for this too:

    m/.../iCount (instead of m/.../it)

-- Bart.
Re: RFC 110 (v3) counting matches
That empty list to force the proper context irks me. How about a modifier to the RE that forces it (this would solve the "counting matches" problem too).

    $string =~ m{
        (\d\d) - (\d\d) - (\d\d)
        (?{ push @dates, makedate($1,$2,$3) })
    }gxl;

    $count = $string =~ m/foo/gl;    # always list context

The reason why not is because you're adding a special case hack to one particular place, rather than promoting a general mechanism that can be everywhere.

Tell me: which is better and why.

1) A regex switch to specify scalar context, as in a mythical /r:

    push(@got, /bar/r)

2) A general mechanism, say for example, "scalar":

    push(@got, scalar /bar/)

Obviously the "scalar" is better, because it does not require that a new switch be learnt, nor is its use restricted to pattern matching. Furthermore, it's inarguably more mnemonic for the sense of "match this scalarishly".

Likewise, to force list context (a far less common operation, mind you), it is a bad idea to have what amounts to a special argument to just one function to do this. What happens to the next function you want to do this to? How about if I want to force getpwnam() into list context and get back a scalar result?

    $count = getpwnam("tchrist")/l;
    $count = getpwnam("tchrist", LIST);
    $count = getpwnam("tchrist")->as_list;

All of those, frankly, suck. This is much better:

    $count = () = getpwnam("tchrist");

It's better because

* You don't have to invent anything new, whether syntactically or mnemonically. The sucky solutions all require modification of Perl's very syntax. With the list assignment, you just need to learn how to use what you *already have*. I could say as much for (?{...}). Think how many of the suggestions on these lists can be dealt with simply through using existing features that the suggesting party was unaware of.

* It's a general mechanism that isn't tailored for this particular function call. Special-purpose solutions are often inferior to general-purpose ones, because the latter are more likely to be creatively usable in a fashion unforeseen by the author.

* What could possibly be more intuitive for the action of acting as though one were assigning to a list than doing that very thing itself? Since () is the canonical list (it's empty, after all), this follows directly and requires no special knowledge whatsoever.

--tom
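
Applied to the "counting matches" problem this thread is about, the empty-list assignment looks like the sketch below. The sample string is invented; the idiom is the one argued for in the message.

```perl
use strict;
use warnings;

# Hypothetical sample text.
my $string = 'foo bar foo baz foo';

# Assigning to the empty list forces list context on the match;
# a list assignment evaluated in scalar context yields the number
# of elements on its right-hand side.
my $count = () = $string =~ /foo/g;

# Without the empty list the match runs in scalar context and only
# reports whether it succeeded.
my $found = $string =~ /foo/;

print "count=$count found=$found\n";
```

Nothing here is specific to pattern matching: the same `() =` works for any expression whose list-context result you want to count.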
Re: RFC 110 (v2) counting matches
If we want to use uppercase, make these unique as well. That gives us many more combinations, and is not necessarily confusing:

    m//f - fast match
    m//F - first match
    m//i - case-insensitive
    m//I - ignore whitespace

And so on. This seems like a much more productive use, otherwise we're just wasting characters.

Larry's on record as preferring not to have us going down the road of using distinct upper and lower case regex switches. The distance between //c and //C, say, is far too narrow.

--tom
Overlapping RFCs 135 138 164
RFC135: Require explicit m on matches, even with ?? and // as delimiters.

    C<?...?> and C</.../> are what makes Perl hard to tokenize. Requiring them to be written C<m?...?> and C<m/.../> would solve this. (Nathan Torkington)

RFC138: Eliminate =~ operator.

    Replace EXPR =~ m/.../ with m/.../ EXPR, and similarly for s/// and tr///. Force an explicit dereference when using qr/.../. Disallow the implicit treatment of a string as a regular expression to match against. (Steve Fink)

RFC164: Replace =~, !~, m//, and s/// with match() and subst()

    Several people (including Larry) have expressed a desire to get rid of C<=~> and C<!~>. This RFC proposes a way to replace C<m//> and C<s///> with two new builtins, C<match()> and C<subst()>. (Nathan Wiger)

I would like to see these three RFCs merged into one if this is appropriate. I am calling on the three authors to discuss in private email how this may be done. I hope that the discussion will result in the withdrawal of at least two of the three RFCs, and that this private discussion produces a new RFC.

The new RFC should discuss the points raised by all three existing RFCs, should investigate several solutions in parallel, and should compare them with one another and contrast the benefits and drawbacks of each one.

Mark-Jason Dominus [EMAIL PROTECTED]
I am boycotting Amazon. See http://www.plover.com/~mjd/amazon.html for details.
Re: Overlapping RFCs 135 138 164
Mark-Jason Dominus wrote:

RFC135: Require explicit m on matches, even with ?? and // as delimiters.

This one is along a different line from these two:

RFC138: Eliminate =~ operator.
RFC164: Replace =~, !~, m//, and s/// with match() and subst()

Which I could see unifying. I'd ask people to wait until v2 of RFC 164 comes up. It may well include everything from RFC 138 already.

-Nate
Re: RFC 165 (v1) Allow Varibles in tr///
Mark-Jason Dominus wrote:

I think the reason this hasn't been done before is because it's *not* quite straightforward.

Before everyone gets tunnel vision, let me point out one thing: Accepting variables in tr// makes no sense. It defeats the purpose of tr/// - extremely fast, known transliterations.

tr///e is the same as s///g:

    tr/$foo/$bar/e == s/$foo/$bar/g

I don't think this RFC accomplishes anything, personally.

-Nate
Re: RFC 110 (v2) counting matches
Mark-Jason Dominus wrote: It occurs to me that since none of the capital letters are taken, we could adopt the convention that a capital letter as a regex modifier will introduce a *word* which continues up to the next comma. Excelsior! -- David Nicol 816.235.1187 [EMAIL PROTECTED] Yum, sidewalk eggs!
Re: RFC 165 (v1) Allow Varibles in tr///
tr///e is the same as s///g:

    tr/$foo/$bar/e == s/$foo/$bar/g

I suggest you read up on tr///, sir. You are completely wrong.

--tom
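
The message does not spell out why the two operators differ, so here is a small sketch of the difference as Perl documents it. The sample strings and variable names are assumptions for illustration. It also shows the string-eval workaround that is commonly used because tr/// does not interpolate variables.

```perl
use strict;
use warnings;

# Hypothetical input.
my $input = 'cab abc';

# tr/// maps individual characters: a->x, b->y, c->z wherever they occur.
(my $tr_out = $input) =~ tr/abc/xyz/;

# s///g replaces occurrences of the whole pattern "abc".
(my $s_out = $input) =~ s/abc/xyz/g;

# tr/// does not interpolate variables, so a runtime character list
# is usually handled by building the statement as a string and eval-ing it.
my ($from, $to) = ('abc', 'xyz');
my $eval_out = $input;
eval "\$eval_out =~ tr/$from/$to/; 1" or die $@;

print "tr: $tr_out\ns: $s_out\neval: $eval_out\n";
```

The string eval is only safe when $from and $to are trusted, which is one reason the RFC under discussion asks for direct variable support.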
Re: RFC 165 (v1) Allow Varibles in tr///
Tom Christiansen wrote: tr///e is the same as s///g: tr/$foo/$bar/e == s/$foo/$bar/g I suggest you read up on tr///, sir. You are completely wrong. Yep, sorry. I tried to hit cancel and hit send instead. I'll shut up now. -Nate
Re: RFC 110 (v3) counting matches
p.s. Has anybody already suggested that we ought to have a nicer solution to execute perl code inside a string, replacing "${\(...)}" and "@{[...]}", which also won't ever win a beauty contest? Oops, wrong mailing list.

The first one doesn't work, and never did. You want @{[]} and @{[scalar ]} instead.

"Doesn't work"?

    print "The sum of 1 + 2 is ${\(1+2)}.\n";
    -- The sum of 1 + 2 is 3.

I'm surprised you wouldn't have known this. The principle is the same: "${...}" expects a scalar reference inside the block, and '\' provides one. Of course, there shouldn't be a real multi-element list inside the parens, but just one scalar. And often, the parens aren't needed.

I'm surprised that you still don't understand. Notice what I showed you for the replacement above: @{[scalar ]}. Using ${\(...)} doesn't work in the sense that, contrary to popular belief, it fails to provide a scalar context to the contents of those parens. Thus ${ \( fn() ) } is still calling fn() in list context, not scalar context. Witness:

    sub fn { sprintf "called in %s context", wantarray ? "list" : "scalar" }
    print "Test 1: "; print "@{ [fn()] }\n";
    print "Test 2: "; print "${ \(fn()) }\n";
    print "Test 3: "; print "@{ [scalar fn()] }\n";

That, when executed, yields:

    Test 1: called in list context
    Test 2: called in list context
    Test 3: called in scalar context

*That's* why test 2 "doesn't work".

--tom
Re: Overlapping RFCs 135 138 164
($foo = $bar) =~ s/x/y/; will never make much sense to me.

What about these, which are much the same thing in that they all use the lvaluability of assignment:

    chomp($line = <STDIN>);
    ($foo = $bar) += 10;
    ($foo += 3) *= 2;
    func($diddle_me = $protect_me);
    $n = select($rout=$rin, $wout=$win, $eout=$ein, 2.5);

--tom
Re: Overlapping RFCs 135 138 164
What about these, which are much the same thing in that they all use the lvaluability of assignment:

And don't forget:

    for (@new = @old) { s/foo/bar/ }

--tom
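
A runnable sketch of the copy-then-modify idioms listed in this exchange may help show what "lvaluability of assignment" buys. The sample values are invented; the three constructs are taken from the messages.

```perl
use strict;
use warnings;

my $bar = 'xyzzy';

# The assignment yields $foo as an lvalue, so s/// edits the copy
# and leaves $bar alone.
(my $foo = $bar) =~ s/x/y/;

# The same trick on a list: copy @old into @new, then edit each copy.
my @old = ('foo1', 'foo2');
my @new;
for (@new = @old) { s/foo/bar/ }

# Compound assignment is an lvalue too: (3 + 3) * 2.
my $n = 3;
($n += 3) *= 2;

print "$foo | @new | $n\n";
```

In each case the original ($bar, @old) is untouched, which is the reason for writing the assignment inside the parentheses in the first place.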
RFC 170 (v1) Generalize =~ to a special-purpose assignment operator
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Generalize =~ to a special-purpose assignment operator

=head1 VERSION

  Maintainer: Nathan Wiger [EMAIL PROTECTED]
  Date: 29 Aug 2000
  Mailing List: [EMAIL PROTECTED]
  Version: 1
  Number: 170
  Status: Developing
  Requires: RFC 164

=head1 ABSTRACT

Currently, C<=~> is only available for use in specific builtin pattern matches. This is too bad, because it's really a neat operator.

This RFC proposes a simple way to make it more general-purpose.

=head1 DESCRIPTION

First off, this assumes RFC 164. Second, it requires you drop any knowledge of how C<=~> currently works. Finally, it runs directly counter to RFC 139, which proposes another application for C<=~>.

This RFC proposes a simple use for C<=~>: as a last-argument rvalue duplicator. What this means is that an expression such as this:

   $value = dostuff($arg1, $arg2, $value);

Could now be rewritten as:

   $value =~ dostuff($arg1, $arg2);

And C<$value> would be implicitly transferred over to the right side as the last argument. It's simple, but it makes what is being operated on very obvious.

This enables us to rewrite the following constructs:

   ($name) = split /\s+/, $name;
   $string = quotemeta($string);
   @array = reverse @array;
   @vals = sort { $a <=> $b } @vals;

   $string = s/\s+/SPACE/, $string;    # RFC 164
   $matches = m/\w+/, $string;         # RFC 164
   @strs = s/foo/bar/gi, @strs;        # RFC 164

As the shorter and more readable:

   ($name) =~ split /\s+/;
   $string =~ quotemeta;
   @array =~ reverse;
   @vals =~ sort { $a <=> $b };

   $string =~ s/\s+/SPACE/;    # looks familiar
   $string =~ m/\w+/;          # this too [1]
   @strs =~ s/foo/bar/gi;      # cool extension

It's a simple solution, true, but it has a good amount of flexibility and brevity. It could also be the case that multiple values could be called and returned, so that:

   ($name, $email) = special_parsing($name, $email);

Becomes:

   ($name, $email) =~ special_parsing;

Again, it's simple, but seems to have useful applications.

=head1 IMPLEMENTATION

Simplistic (hopefully).

=head1 MIGRATION

This introduces new functionality, which allows backwards compatibility for regular expressions. As such, it should require no special translation of code. This RFC assumes RFC 164 will be adopted (which it may not be) for changes to regular expressions.

True void contexts may also render some parts of this moot, in which case coming up with a more advanced use for C<=~> may be desirable.

=head1 NOTES

[1] That m// one doesn't quite work right, but that's a special case that I would suggest should be caught by some other part of the grammar to maintain backwards compatibility (like bare //).

=head1 REFERENCES

RFC 164: Replace =~, !~, m//, and s/// with match() and subst()

RFC 139: Allow Calling Any Function With A Syntax Like s///