Re: Perl 5's "non-greedy" matching can be TOO greedy!

2000-12-16 Thread Bart Lateur
On Fri, 15 Dec 2000 13:42:44 -0700, Kevin Walker wrote: >Deven seems to be advocating thinking about regular expressions >without worrying too much about the implementation, even at a fairly >abstract level. Here's a counter example: /dc/ Shouldn't a non-greed

Re: RFC 348 (v2) Regex assertions in plain Perl code

2000-10-08 Thread Bart Lateur
On Tue, 3 Oct 2000 01:08:31 -0400, James Mastros wrote: >It'd be somewhat useful, I think, if you could return something like \matched >to let paren-catching of the ?{} thingy have something other than "". (Remember, >a ref is always true.) > >For example, that would let you parse something inside a

Re: RFC 332 (v1) Regex: Make /$/ equivalent to /\z/ under the '/s' modifier

2000-10-07 Thread Bart Lateur
On Sat, 7 Oct 2000 00:00:56 -0400, Bennett Todd wrote: >this proposal is hammering out a little >bit of irregularity, removing a subtle difference between the >behavior of $ at the end and ^ at the beginning under /s. I offer >this as another argument in favour of RFC 332. That was the basic ide

Re: RFC 331 (v2) Consolidate the $1 and C<\1> notations

2000-10-02 Thread Bart Lateur
On Mon, 2 Oct 2000 12:46:06 -0700 (PDT), Dave Storrs wrote: > Well, the main reason is that @/ worked best for my particular >brain. But you cannot use it in an ordinary regex, can you? There's no way you can put $/[1] between slashes in s/.../.../. Backslashing it doesn't work. >@& >woul

Re: redraft (v2) for RFC 348 Regex assertions in plain Perl code

2000-10-01 Thread Bart Lateur
On Sun, 01 Oct 2000 18:43:27 +0100, Hugo wrote: >:This makes the implementation very tricky. I >:wouldn't be surprised if precisely this feature is the main reason why >:the current implementation is so notoriously unstable. > >I'm not aware of any instability caused by this. The instability is >

redraft (v2) for RFC 348 Regex assertions in plain Perl code

2000-10-01 Thread Bart Lateur
=head1 TITLE Regex assertions in plain Perl code =head1 VERSION Maintainer: Bart Lateur <[EMAIL PROTECTED]> Date: 28 Sep 2000 Mailing List: <[EMAIL PROTECTED]> Number: 348 Version: 2 Status: Developing (candidate for freeze) =head1 ABSTRACT Likely the most justifiab

redraft for v2: RFC 332 Regex: Make /$/ equivalent to /\z/ under the '/s' modifier

2000-10-01 Thread Bart Lateur
=head1 TITLE Regex: Make /$/ equivalent to /\z/ under the '/s' modifier =head1 VERSION Maintainer: Bart Lateur <[EMAIL PROTECTED]> Date: 1 Oct 2000 Mailing List: [EMAIL PROTECTED] Number: 332 Version: 2 Status: Developing (redraft) =head1 ABSTRACT To most Perlers

Re: RFC 332 (v1) Regex: Make /$/ equivalent to /\z/ under the '/s' modifier

2000-10-01 Thread Bart Lateur
On Thu, 28 Sep 2000 23:54:20 +0100, Hugo wrote: >I still like the idea of $$, as I described it in the original thread. >I've seen no comments for or against at this time. I intend to put this in the RFC: Hugo prefers to add an alternative, like /$$/, which would behave like this. But an al

Re: RFC 331 (v2) Consolidate the $1 and C<\1> notations

2000-10-01 Thread Bart Lateur
On 1 Oct 2000 06:26:40 -, Perl6 RFC Librarian wrote: >If you did the following: > >C<"Bilbo Baggins" =~ /((\w+)\s+(\w+))/> > >Then @/ would contain the following: > >C<$/[0]> the compiled equivalent of C, > >C<$/[1]> the string "Bilbo Baggins" > >C<$/[2]> the string "Bilbo" > >C<$/[3]> the s

Re: RFC 331 (v1) Consolidate the $1 and C<\1> notations

2000-09-30 Thread Bart Lateur
On 28 Sep 2000 20:57:39 -, Perl6 RFC Librarian wrote: >Currently, C<\1> and $1 have only slightly different meanings within a >regex. Let's consolidate them together, eliminate the differences, and >settle on $1 as the standard. I wrote this before, but apparently you didn't hear it. Let me

Re: RFC 72 (v4) Variable-length lookbehind.

2000-09-30 Thread Bart Lateur
On 30 Sep 2000 19:50:27 -, Perl6 RFC Librarian wrote: >In Perl6, lookbehind in regular expressions should be extended to permit >not only fixed-length, but also variable-length lookbehind. I see no mention of negative lookbehind. As I wrote before, in: /(?

Re: RFC 316 (v1) Regex modifier for support of chunk processing and prefix matching

2000-09-30 Thread Bart Lateur
On Tue, 26 Sep 2000 11:55:32 +1100 (EST), Damian Conway wrote: >Wouldn't this interact rather badly with the /gc option (which also leaves >C set on failure)? Yes. The easy way out is to disallow combining /gc with /z. But, since this is typically one of the applications it is aimed for, I should fin

Re: regexp RFCs: freeze 'em or lose 'em

2000-09-30 Thread Bart Lateur
On Sat, 30 Sep 2000 13:55:40 +0100, Hugo wrote: >The RFCs listed below are still listed as 'developing'. The deadline is >given as 1st October, but I'm not sure where the precise cutoff point >is - Nat, can you confirm? > >As I understand it, RFCs not frozen by the deadline will be treated as >wi

Re: RFC 348 (v1) Regex assertions in plain Perl code

2000-09-30 Thread Bart Lateur
On Sat, 30 Sep 2000 00:57:47 +0100, Hugo wrote: >:"local" inside embedded code will no longer be supported, nor will >:conditional regexes. The Perl5 -> Perl6 translator should warn if it >:ever encounters one of these. > >I'm not convinced that removing either of these are necessary to the >main

Re: RFC 316 (v1) Regex modifier for support of chunk processing and prefix matching

2000-09-30 Thread Bart Lateur
On Sat, 30 Sep 2000 00:23:13 +0100, Hugo wrote: >This is a strength of RFC 93 however, since in that context we >don't need to restart the match each time we go off to fetch more >data. In that situation if we run out of data after the 1234E2+2 >we fail the attempt to widen the \d+, match forward

Re: RFC 316 (v1) Regex modifier for support of chunk processing and prefix matching

2000-09-29 Thread Bart Lateur
On Fri, 29 Sep 2000 13:19:47 +0100, Hugo wrote: >I think that involves >rewriting your /p example something like: > if (/^$pat$/z) { >print "found a complete match"; > } elsif (defined pos) { >print "found a prefix match"; > } else { >print "not a match"; > } Except that this isn

Re: RFC 316 (v1) Regex modifier for support of chunk processing and prefix matching

2000-09-29 Thread Bart Lateur
On Fri, 29 Sep 2000 00:29:31 +0100, Hugo wrote: >:I originally had thought of providing a separate, dedicated regex >:modifier, just for the match prefix, but I don't think too many people >:need this that desperately. You can easily build a working application >:with just the '/z' modifier. I

Re: RFC 112 (v3) Asignment within a regex

2000-09-29 Thread Bart Lateur
On Fri, 29 Sep 2000 01:02:40 +0100, Hugo wrote: >It also isn't clear what parts of the expression are interpolated at >compile time; what should the following leave in %foo? > > %foo = (); > $bar = "one"; > "twothree" =~ / (?$bar=two) (?$foo{$bar}=three) /x; It's not just that. You act as if

Re: RFC 332 (v1) Regex: Make /$/ equivalent to /\z/ under the '/s' modifier

2000-09-28 Thread Bart Lateur
On Thu, 28 Sep 2000 23:54:20 +0100, Hugo wrote: >We thought of a few other possibilities too. I think it is a shame you >did not mention them, and explain why your proposal is better. Let me think on it. Is $$ the only alternative, or did I miss more? I don't think I've even seen this $$ mentio

Re: is \1 vs $1 a necessary distinction?

2000-09-28 Thread Bart Lateur
On Wed, 27 Sep 2000 10:34:48 -0500, Jonathan Scott Duff wrote: >If $1 could be made to work properly on the LHS of s///, I'd vote for >that being The Way. I disagree, because \1 is different from a variable $foo in at least two ways: * $foo is compiled into /$foo/ before anything is matched. \
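The distinction drawn above between \1 and an interpolated variable can be seen in a small Perl 5 sketch. The word list and the variable names ($letter, @doubled, @with_oo) are made up for illustration and do not come from the thread.

```perl
use strict;
use warnings;

my @words = qw(book cat llama dog);

# \1 refers to whatever group 1 captured during the current match attempt,
# so this finds any doubled word character.
my @doubled = grep { /(\w)\1/ } @words;

# A variable is interpolated into the pattern before matching starts,
# so this only ever looks for the fixed text "oo".
my $letter  = 'o';
my @with_oo = grep { /$letter$letter/ } @words;

print "doubled: @doubled\n";
print "with oo: @with_oo\n";
```

The first pattern adapts to each word being tested, while the second is fixed once $letter has been interpolated.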

Re: RFC 308 (v1) Ban Perl hooks into regexes

2000-09-26 Thread Bart Lateur
On Tue, 26 Sep 2000 13:32:37 -0400, Michael Maraist wrote: > >I can't believe that there currently isn't a means of killing a back-track >based on perl-code. Looking through perlre it seems like you're right. There is, but as MJD wrote: "it ain't pretty". Now, semantic checks or assertions wou

Re: RFC 308 (v1) Ban Perl hooks into regexes

2000-09-26 Thread Bart Lateur
On 25 Sep 2000 20:14:52 -, Perl6 RFC Librarian wrote: >Remove C, C and friends. I'm putting the finishing touches on an RFC to drop (?{...}) and replace it with something far more localized, hence cleaner: assertions, also in Perl code. That way, /(?

Re: RFC 110 (v6) counting matches

2000-09-21 Thread Bart Lateur
On 20 Sep 2000 21:37:03 -, Perl6 RFC Librarian wrote: >Bart Lateur: '()=' is not perfect. It is also butt ugly. It is a "dirty hack". Please don't hold this against me! I was arguing for a cleaner looking generic alternative for "()=", the now defunct

Re: \z vs \Z vs $

2000-09-21 Thread Bart Lateur
On Wed, 20 Sep 2000 15:16:20 -0600, Tom Christiansen wrote: >>That was my second thought. I kinda like it, because //s would have two >>effects: > >> + let . match a newline too (current) > >> + let /$/ NOT accept a trailing newline (new) > >Don't forget /s's other meaning. I gather you're talki

Re: \z vs \Z vs $

2000-09-20 Thread Bart Lateur
On Wed, 20 Sep 2000 10:03:08 +0100, Hugo wrote: >In <12839.969393548@chthon>, Tom Christiansen writes: >:What can be done to make $ work "better", so we don't have to >:make people use /foo\z/ to mean /foo$/? They'll keep writing >:the $ for things that probably oughtn't abide optional newlines.
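For readers following the \z vs \Z vs $ discussion, here is a minimal Perl 5 sketch of the behaviour being debated. The test strings and variable names are illustrative assumptions, and the /s line shows current Perl 5 behaviour, not what RFC 332 proposes.

```perl
use strict;
use warnings;

my $with_nl    = "foo\n";
my $without_nl = "foo";

# $ and \Z both tolerate one trailing newline; \z does not.
my $dollar_nl   = ($with_nl =~ /foo$/)  ? 1 : 0;
my $big_z_nl    = ($with_nl =~ /foo\Z/) ? 1 : 0;
my $small_z_nl  = ($with_nl =~ /foo\z/) ? 1 : 0;

# In Perl 5, /s only changes what "." matches, so $ still accepts the newline.
my $dollar_s_nl = ($with_nl =~ /foo$/s) ? 1 : 0;

# Without a trailing newline, \z matches as expected.
my $small_z     = ($without_nl =~ /foo\z/) ? 1 : 0;

print "\$ : $dollar_nl, \\Z : $big_z_nl, \\z : $small_z_nl, \$ with /s : $dollar_s_nl, \\z on plain string : $small_z\n";
```

This is why /foo$/ can silently accept "foo\n" where the author meant an exact end-of-string match.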

Re: RFC 164 (v2) Replace =~, !~, m//, s///, and tr// with match(), subst(), and trade()

2000-09-14 Thread Bart Lateur
On Thu, 14 Sep 2000 08:47:24 -0700, Nathan Wiger wrote: >One thing to remember is that regex's are already used in a number of >functions: > > @results = grep /^VAR=\w+/, @values; You are being misled. You might just as well say that length() is being used in other functions: @result

Re: RFC 164 (v2) Replace =~, !~, m//, s///, and tr// with match(), subst(), and trade()

2000-09-14 Thread Bart Lateur
On 30 Aug 2000 02:13:38 -, Perl6 RFC Librarian wrote: >Replace =~, !~, m//, s///, and tr// with match(), subst(), and trade() Why? What's next, replace the regex syntax with something that more closely resembles the rest of Perl? Regexes are a language within the language. And not a tiny

Re: RFC 166 (v1) Additions to regexs

2000-09-13 Thread Bart Lateur
On Tue, 12 Sep 2000 19:01:35 -0400, Mark-Jason Dominus wrote: >I don't know what you mean, but you're mistaken, because it means to >interpolate @foo as in a double-quoted string. Which is precisely the meaning he wants for it, with $" set to '|'. I wonder if we're not trying too hard. What if,
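The idea mentioned above, interpolating @foo into a regex with $" set to '|', already works in Perl 5. The following is only a sketch: the contents of @foo and the candidate strings are invented for illustration.

```perl
use strict;
use warnings;

my @foo = qw(cat dog bird);

my @hits;
{
    # $" is the separator placed between array elements when an array
    # is interpolated into a double-quoted string or a pattern.
    local $" = '|';
    my $re = qr/^(?:@foo)$/;    # interpolates to ^(?:cat|dog|bird)$
    @hits = grep { $_ =~ $re } qw(dog fish bird catdog);
}

print "@hits\n";
```

The elements are interpolated as pattern text, so anything containing regex metacharacters would need quotemeta first. Localizing $" keeps the change from leaking into unrelated string interpolation.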

Re: XML/HTML-specific ?< and ?> operators? (was Re: RFC 145 (alternate approach))

2000-09-07 Thread Bart Lateur
On 06 Sep 2000 18:04:18 -0700, Randal L. Schwartz wrote: >I think the -1 indexing for "end of array" came from there. Or at >least, it was in Perl long before it was in Python, and it was in Icon >before it was in Perl, so I had always presumed Larry had seen Icon. >Larry? Do not assume that th

Re: RFC 72 (v3) Variable-length lookbehind: the regexp engine should also go backward.

2000-09-05 Thread Bart Lateur
On Sat, 2 Sep 2000 15:16:20 -0400, Peter Heslin wrote: >> This looks more natural to me: >> >> /(?`!G+A+T+)GA+C/ >Your version is closer to the way lookbehind works now, so this syntax >might be thought to be clearer; I should add to the RFC an explicit >note about this. Look at your orig

Re: RFC 72 (v3) Variable-length lookbehind: the regexp engine should also go backward.

2000-09-02 Thread Bart Lateur
On 1 Sep 2000 20:50:20 -, Perl6 RFC Librarian wrote: >Imagine a very long input string containing data such as this: > >... GCAAGAATTGAACTGTAG ... > >If you want to match text that matches /GA+C/, but not when it >follows /G+A+T+/, you cannot at present do so easily. Under this >proposal

Re: RFC 165 (v1) Allow Varibles in tr///

2000-08-30 Thread Bart Lateur
On Wed, 30 Aug 2000 10:44:55 -0400, Stephen P. Potter wrote: >| Memory usage is irrelevant compared with speed. > >That's interesting. I could swear I've seen people make a tradeoff before, >rather than always just implementing the fastest solution. Nothing is >irrelevant (except resistance, if

Re: RFC 165 (v1) Allow Varibles in tr///

2000-08-30 Thread Bart Lateur
On Wed, 30 Aug 2000 11:05:41 +0200, Bart Lateur wrote: >Some processors, >like Intel's x86, even have a special machine instruction to do that: >XLAT. In the meantime, I found an x86 instruction set reference on the web. Here is the description for XLAT: <http

Re: RFC 170 (v1) Generalize =~ to a special-purpose assignment operator

2000-08-30 Thread Bart Lateur
On 30 Aug 2000 02:07:24 -, Perl6 RFC Librarian wrote: >Generalize =~ to a special-purpose assignment operator Personally, I think of it as the "apply to" operator, as opposed to the "assign to" operator: = assign the RHS to the LHS =~ apply the RHS to the LHS pe

Re: RFC 165: Allow variables in a tr///

2000-08-30 Thread Bart Lateur
On Wed, 30 Aug 2000 00:08:46 -0400, Stephen P. Potter wrote: >Personally, I would say that q/.../ and friends were a bad idea. A lot of >non-gurus see /.../ (whatever comes before it) and their first impression >is that it has something to do with regex. I would suggest that anything >that isn'

Re: RFC 165 (v1) Allow Varibles in tr///

2000-08-30 Thread Bart Lateur
On Wed, 30 Aug 2000 00:31:24 -0400, Stephen P. Potter wrote: >For every tr/// in a program, >256 bytes have to be allocated? Even if I only do something like tr/a/A/? >Is this really the optimal solution for this Speedwise, it is. You don't have to do any tests on the bytes. All you have to do
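The speed argument for a lookup table can be illustrated in plain Perl. This is a sketch of the technique only, not how perl implements tr/// internally; @table and translate() are hypothetical names, and the code assumes byte strings (character codes below 256).

```perl
use strict;
use warnings;

# Build a 256-entry table: every byte maps to itself unless overridden.
my @table = map { chr($_) } 0 .. 255;
$table[ ord('a') ] = 'A';    # the equivalent of tr/a/A/

# Translate a string with one table lookup per byte and no comparisons.
sub translate {
    my ($string) = @_;
    return join '', map { $table[ ord($_) ] } split //, $string;
}

my $got = translate('banana');

# Compare against the built-in operator.
(my $expected = 'banana') =~ tr/a/A/;

print "$got $expected\n";
```

The table costs a fixed amount of memory regardless of how few characters are remapped, which is the tradeoff being discussed.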

Re: RFC 165 (v1) Allow Varibles in tr///

2000-08-29 Thread Bart Lateur
On Tue, 29 Aug 2000 20:36:46 -0400, Bryan C.Warnock wrote: >> The way tr/// works is that a 256-byte table is constructed at compile >> time that says for each input character what output character is > >Speaking of which, what's going to happen when there are more than 256 >values to map? A bigg

Re: RFC 110 (v3) counting matches

2000-08-29 Thread Bart Lateur
On Tue, 29 Aug 2000 16:21:35 -0600, Tom Christiansen wrote: >>p.s. Has anybody already suggested that we ought to have a nicer >>solution to execute perl code inside a string, replacing "${\(...)}" and >>"@{[...]}", which also won't ever win a beauty contest? Oops, wrong >>mailing list. > >The f

Re: RFC 110 (v3) counting matches

2000-08-29 Thread Bart Lateur
On Tue, 29 Aug 2000 11:13:47 -0600, Tom Christiansen wrote: >We already *HAVE* a token set that forces list context, thank you >very much. It's called "()=". I'm glad you like it. $_ = 'a!a!a!a!a!a'; $count = () = split /!/; print $count; --> 1 '()=' is not pe
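The result of 1 reported above follows from split's documented behaviour: when it is assigned to a list, it uses an implicit LIMIT of one more than the number of variables, which for an empty list is 1. Below is a sketch of two ways to get the full count in Perl 5; the variable names are illustrative.

```perl
use strict;
use warnings;

$_ = 'a!a!a!a!a!a';

# An explicit negative LIMIT overrides the implicit one,
# so split returns every field.
my $count_limit = () = split /!/, $_, -1;

# Assigning to a real array also avoids the implicit LIMIT.
my @fields      = split /!/;
my $count_array = @fields;

print "$count_limit $count_array\n";
```

Both forms work, but neither is obvious, which supports the point that '()=' is not a perfect general-purpose counting idiom.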

Re: RFC 110 (v3) counting matches

2000-08-29 Thread Bart Lateur
On Tue, 29 Aug 2000 08:56:16 -0700, Nathan Wiger wrote: >> $count = () = getpwnam("tchrist"); > >Hmmm. I agree a general purpose mechanism is good, but in this case we >already have "scalar" so why not "list"? > >$count = list getpwnam("tchrist"); > >While I agree that /l is bad, I think

Re: RFC 110 (v2) counting matches

2000-08-29 Thread Bart Lateur
On Tue, 29 Aug 2000 09:00:43 -0400, Mark-Jason Dominus wrote: >> And, I don't really see the need for the comma. >> >> m/.../CountInsensitive (instead of m/.../ti) > >I guess, but to me CountInsensitive looks like one option, not two. That goes for this too. : m/.../iCount

Re: RFC 110 (v3) counting matches

2000-08-29 Thread Bart Lateur
On Tue, 29 Aug 2000 08:51:29 -0400, Mark-Jason Dominus wrote: >There are many operations that would be simpler if there was >a magic array that contained ($1, $2, $3, ...). If anyone wants to >write an RFC on this, I will help. Heh. I once complained about the lack of such an array, in comp.lan
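Short of a magic array, Perl 5 already returns ($1, $2, $3, ...) when a match with capture groups is evaluated in list context. A minimal sketch, reusing the "Bilbo Baggins" example from the RFC 331 thread; the array name @groups is an assumption made for illustration.

```perl
use strict;
use warnings;

# In list context, a successful match returns the captured groups in order.
my @groups = "Bilbo Baggins" =~ /((\w+)\s+(\w+))/;

print scalar(@groups), " groups: ", join(' | ', @groups), "\n";
```

This only helps at the point of the match itself; it does not give later code a standing array to consult, which is the gap the proposed magic array would fill.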

Re: RFC 110 (v2) counting matches

2000-08-29 Thread Bart Lateur
On Tue, 29 Aug 2000 08:47:25 -0400, Mark-Jason Dominus wrote: >m/.../Count,Insensitive (instead of m/.../ti) > >That would escape the problem that we are running out of letters and >also the problem that the current letters are hard to remember. Yes, but wouldn't this give us backward

Re: RFC 110 (v2) counting matches

2000-08-27 Thread Bart Lateur
On 27 Aug 2000 19:01:45 -, Perl6 RFC Librarian wrote: >m//g just returns 1 for matching. Er... but in a scalar context, m//g DOES only match once! If you want more, repeat the match. Or use it in a list context, then it will try to match them all. $_ = "abaabbbababbbabbaaa";
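The scalar-versus-list behaviour of m//g described above can be sketched as follows, using the string from the post. The pattern /b+/ and the counter names are illustrative choices, not taken from the message.

```perl
use strict;
use warnings;

$_ = "abaabbbababbbabbaaa";

# Scalar context: each evaluation finds one match and remembers pos(),
# so the match has to be repeated in a loop.
my $loops = 0;
$loops++ while /b+/g;

# List context: a single evaluation returns all matches at once;
# assigning through () yields their count.
my $count = () = /b+/g;

print "$loops $count\n";
```

Both approaches count the same runs of "b"; the difference lies in how many times the match operator is evaluated.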