Re: Perl 5's non-greedy matching can be TOO greedy!

2000-12-15 Thread Tom Christiansen
More generally, it seems to me that you're hung up on the description of "*?" as "shortest possible match". That's an ambiguous Yup, that's a bit confusing. It's really "start matching as soon as possible, and stop matching as soon as possible". (The usual greedy one is, of course, "keep
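A minimal sketch of the "start as soon as possible, stop as soon as possible" reading, assuming nothing beyond core Perl. The sample strings and variable names are made up for illustration and do not come from the thread.

```perl
use strict;
use warnings;

# Hypothetical sample data, chosen so that a shorter candidate ("aXc")
# exists later in the string.
my $text = "aXbXaXc";

# The match starts at the leftmost "a" it can, and only then does the
# non-greedy .*? stop at the first "c" reachable from that start.
my ($lazy) = $text =~ /(a.*?c)/;
print "non-greedy: $lazy\n";     # aXbXaXc, not the shorter "aXc"

# With a fixed starting point, the two quantifiers differ only in
# where they stop.
my $text2 = "a1c2c";
my ($lazy2)   = $text2 =~ /(a.*?c)/;   # stops at the first "c"
my ($greedy2) = $text2 =~ /(a.*c)/;    # keeps going to the last "c"
print "non-greedy: $lazy2, greedy: $greedy2\n";   # a1c, a1c2c
```

The leftmost starting position always wins, so "shortest possible match" only describes what happens once the start is fixed.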

Re: Perl 5's non-greedy matching can be TOO greedy!

2000-12-15 Thread Tom Christiansen
Have you thought it through NOW, on a purely semantic level (in isolation from implementation issues and historical precedent), I've said it before, and I'll say it again: you keep using the word "semantic", but I do not think you know what that word means. --tom

Re: RFC 308 (v1) Ban Perl hooks into regexes

2000-09-28 Thread Tom Christiansen
I consider recursive regexps very useful: $a = qr{ (?> [^()]+ ) | \( (??{ $a }) \) }; Yes, they're "useful", but darned tricky sometimes, and in ways other than simple regex-related stuff. For example, consider what happens if you do my $regex = qr{ (?> [^()]+ ) | \( (??{ $regex }) \)

Re: \z vs \Z vs $

2000-09-20 Thread Tom Christiansen
"TC" == Tom Christiansen writes: Could you explain what the problem is? TC /$/ does not only match at the end of the string. TC It also matches one character fewer. This makes TC code like $path =~ /etc$/ "wrong". Sorry, I'm missing it. I know. On

Re: \z vs \Z vs $

2000-09-20 Thread Tom Christiansen
That was my second thought. I kinda like it, because //s would have two effects: + let . match a newline too (current) + let /$/ NOT accept a trailing newline (new) Don't forget /s's other meaning. --tom

\z vs \Z vs $

2000-09-19 Thread Tom Christiansen
What can be done to make $ work "better", so we don't have to make people use /foo\z/ to mean /foo$/? They'll keep writing the $ for things that probably oughtn't abide optional newlines. Remember that /$/ really means /(?=\n?\z)/. And likewise with \Z. --tom
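A small sketch of the difference between the three anchors, using the /etc$/ example from the thread. The path string and variable names are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical input with a trailing newline, e.g. a line that was
# read from a file and never chomped.
my $path = "/usr/local/etc\n";

my $dollar = $path =~ /etc$/          ? 1 : 0;  # 1: $ tolerates one trailing newline
my $big_z  = $path =~ /etc\Z/         ? 1 : 0;  # 1: \Z behaves the same way
my $small  = $path =~ /etc\z/         ? 1 : 0;  # 0: \z is the true end of string
my $look   = $path =~ /etc(?=\n?\z)/  ? 1 : 0;  # 1: the spelled-out form of $

print "\$: $dollar  \\Z: $big_z  \\z: $small  lookahead: $look\n";
```

Without /m, $ and \Z both act like the lookahead (?=\n?\z), which is why /etc$/ accepts a string that does not strictly end in "etc".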

Re: XML/HTML-specific ?< and ?> operators? (was Re: RFC 145 (alternate approach))

2000-09-06 Thread Tom Christiansen
I am working on an RFC to allow boolean logic (&& and || and !) to apply a number of patterns to the same substring to allow easier mining of information out of such constructs. What, you don't like: :-) $pattern = $conjunction eq "AND" ? join('' => map { "(?=.*$_)" }
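A hedged sketch of the "AND" half of that idea only, since the rest of the expression is cut off in the snippet. The word list, the test strings and the leading ^ anchor are assumptions added for illustration.

```perl
use strict;
use warnings;

# Hypothetical sub-patterns that must all match somewhere in the string.
my @words = ("foo", "bar");

# Chain one lookahead per sub-pattern; each one is tried from the same
# position, so together they act as a logical AND.
my $and = join '', map { "(?=.*$_)" } @words;
print "$and\n";                                   # (?=.*foo)(?=.*bar)

# Anchoring at ^ (an assumption here) avoids retrying every offset;
# /s lets . cross newlines.
my $hit  = "bar then foo" =~ /^$and/s ? 1 : 0;    # 1: both present, any order
my $miss = "only foo"     =~ /^$and/s ? 1 : 0;    # 0: "bar" is missing
print "hit: $hit  miss: $miss\n";
```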

Re: copying and s/// (was Re: Overlapping RFCs 135 138 164)

2000-08-30 Thread Tom Christiansen
Uri Guttman wrote: TC ($this = $that) =~ s/foo/bar/; TC for (@these = @those) { s/foo/bar/ } TC You can't really do those in one step without it. RFC 164 v2 has a new syntax that lets you do the above or, if you want: $this = s/foo/bar/, $that; @these = s/foo/bar/,

Re: RFC 110 (v3) counting matches

2000-08-29 Thread Tom Christiansen
That empty list to force the proper context irks me. How about a modifier to the RE that forces it (this would solve the "counting matches" problem too). $string =~ m{ (\d\d) - (\d\d) - (\d\d) (?{ push @dates, makedate($1,$2,$3) }) }gxl; $count =

Re: RFC 110 (v2) counting matches

2000-08-29 Thread Tom Christiansen
If we want to use uppercase, make these unique as well. That gives us many more combinations, and is not necessarily confusing: m//f - fast match m//F - first match m//i - case-insensitive m//I - ignore whitespace And so on. This seems like

Re: RFC 165 (v1) Allow Varibles in tr///

2000-08-29 Thread Tom Christiansen
tr///e is the same as s///g: tr/$foo/$bar/e == s/$foo/$bar/g I suggest you read up on tr///, sir. You are completely wrong. --tom
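A short sketch of why tr/// and s///g are not interchangeable in today's Perl 5, whatever a proposed tr///e might do. All strings and variable names here are illustrative, and the string-eval workaround is the idiom described in perlop, not something taken from this thread.

```perl
use strict;
use warnings;

my ($foo, $bar) = ("abc", "xyz");

# tr/// maps characters one for one: a->x, b->y, c->z.
(my $t = "a cab") =~ tr/abc/xyz/;
print "$t\n";                        # x zxy

# s///g replaces occurrences of a whole pattern; "abc" never appears
# as a substring, so nothing changes.
(my $s = "a cab") =~ s/$foo/$bar/g;
print "$s\n";                        # a cab

# tr/// does not interpolate: this maps the literal characters
# '$', 'f', 'o' to '$', 'b', 'a' (a repeated 'o' uses its first mapping).
(my $u = "foo") =~ tr/$foo/$bar/;
print "$u\n";                        # baa

# Variables in tr/// currently require a string eval.
my $v = "a cab";
eval "\$v =~ tr/$foo/$bar/; 1" or die $@;
print "$v\n";                        # x zxy
```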

Re: RFC 110 (v3) counting matches

2000-08-29 Thread Tom Christiansen
p.s. Has anybody already suggested that we ought to have a nicer solution to execute perl code inside a string, replacing "${\(...)}" and "@{[...]}", which also won't ever win a beauty contest? Oops, wrong mailing list. The first one doesn't work, and never did. You want @{[]} and

Re: Overlapping RFCs 135 138 164

2000-08-29 Thread Tom Christiansen
($foo = $bar) =~ s/x/y/; will never make much sense to me. What about these, which are much the same thing in that they all use the lvaluability of assignment: chomp($line = <STDIN>); ($foo = $bar) += 10; ($foo += 3) *= 2; func($diddle_me = $protect_me); $n =

Re: Overlapping RFCs 135 138 164

2000-08-29 Thread Tom Christiansen
What about these, which are much the same thing in that they all use the lvaluability of assignment: And don't forget: for (@new = @old) { s/foo/bar/ } --tom
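A runnable sketch of the copy-then-modify idioms quoted above, which rely on an assignment being usable as an lvalue. The sample values are invented for illustration.

```perl
use strict;
use warnings;

# Scalar case: the substitution applies to the copy, not the original.
my $bar = "xxx";
(my $foo = $bar) =~ s/x/y/;
print "$foo $bar\n";                 # yxx xxx

# List case: the loop aliases the elements of @new, so @old is untouched.
my @old = ("foo1", "foo2");
my @new;
for (@new = @old) { s/foo/bar/ }
print "@new | @old\n";               # bar1 bar2 | foo1 foo2

# The same lvalue property works with arithmetic assignment operators.
my $n;
($n = 5) += 10;                      # $n is now 15
($n += 3) *= 2;                      # (15 + 3) * 2 = 36
print "$n\n";                        # 36
```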

Re: RFC 110 (v3) counting matches

2000-08-28 Thread Tom Christiansen
Have you ever wanted to count the number of matches of a pattern? s///g returns the number of matches it finds. m//g just returns 1 for matching. Counts can be made using s//$&/g but this is wasteful, or by putting some counting loop round a m//g. But this all seems rather messy. It's
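For reference, a sketch of the existing Perl 5 ways to count matches that this RFC discussion refers to. The date string and variable names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical input containing two date-like substrings.
my $string = "2000-08-28 and 2000-08-29";

# Empty-list idiom: m//g runs in list context, and the list assignment
# evaluated in scalar context yields the number of matches.
my $count = () = $string =~ /\d{4}-\d\d-\d\d/g;

# Counting loop around a scalar-context m//g.
my $loop = 0;
$loop++ while $string =~ /\d{4}-\d\d-\d\d/g;

# s///g returns the number of substitutions, at the cost of rewriting
# the string (done on a copy here).
my $subs = (my $copy = $string) =~ s/(\d{4}-\d\d-\d\d)/$1/g;

print "$count $loop $subs\n";        # 2 2 2
```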

Re: RFC 164 (v1) Replace =~, !~, m//, and s/// with match() and subst()

2000-08-28 Thread Tom Christiansen
Simple solution. If you want to require formats such as m/.../ (which I actually think is a good idea), then make it part of -w, -W, -ww, or -WW, which would be a perl6 enhancement of strictness. That's like having "use strict" enable mandatory perlstyle compliance checks, and rejecting the

Re: RFC 158 (v1) Regular Expression Special Variables

2000-08-25 Thread Tom Christiansen
those early perl3 scripts by lwall floating around in /etc were poorly written. i am glad they are finally out of the distribution. Those weren't the scripts I was thinking about, and it is *NOT* ipso facto true that something which uses $& or $` is poorly written. --tom

Re: RFC 145 (v2) Brace-matching for Perl Regular Expressions

2000-08-25 Thread Tom Christiansen
All in all, though, you're right that neither set of features is particularly well-known/used outside of p5p followers. At least from what I've seen. Virtually every person I've worked with since 5.6 came out has been surprised and amazed at the REx eval stuff. The completely reworked regex

Re: RFC 144 (v1) Behavior of empty regex should be simple

2000-08-24 Thread Tom Christiansen
I propose that this 'last successful match' behavior be discarded entirely, and that an empty pattern always match the empty string. I don't see a consideration for simply s/successful// above, which has also been talked about. This would also match expected usage based upon existing editors.

Re: RFC 150 (v1) Extend regex syntax to provide for return of a hash of matched subpatterns

2000-08-24 Thread Tom Christiansen
This is useful in that it would stop being number dependent. For example, you can't now safely say /$var (foo) \1/ and guarantee for arbitrary contents of $var that you have the right number backref anymore. If I recall correctly, the Python folks addressed this. One might check