Re: what (?x) are in use? (was RFC 145 (alternate approach))

2000-09-11 Thread Mark-Jason Dominus
> In theory, all letters should be reserved to map to future flags for > the same reason. My recollection is that Larry specifically mandated this, and that's why (?p...) was changed to (??...) in 5.6.0.

what (?x) are in use? (was RFC 145 (alternate approach))

2000-09-11 Thread Hugo
[resent, 'cos I can't spell "perl6"] Richard Proctor wrote: :The whole (?x set of thingies is getting complicated... The list of what is :used at present (and in current suggestions is: : :Current Use in perl5 : :(?# comment :(?imsx flags :(?-imsx flags That's actually (?iogcmsx and (?-iog
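
For readers who have not used the constructs Richard lists, here is a minimal sketch of the comment and flag forms as they behave in perl5. The pattern and the test strings are made up for illustration; nothing here comes from the thread itself.

```perl
use strict;
use warnings;

# (?#...) is an inline comment. (?i) switches case-insensitive
# matching on, and (?-i) switches it off again for what follows.
my $re = qr/(?# greeting )(?i)h(?-i)ello/;

# The leading "h" may be either case, but "ello" must be lower case.
print "Hello" =~ $re ? "match\n" : "no match\n";
print "HELLO" =~ $re ? "match\n" : "no match\n";
```

The first line prints "match" and the second "no match", since only the first character is covered by (?i).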

Re: XML/HTML-specific ?< and ?> operators? (was Re: RFC 145 (alternate approach))

2000-09-07 Thread David L. Nicol
Bart Lateur wrote: > > On 06 Sep 2000 18:04:18 -0700, Randal L. Schwartz wrote: > > >I think the -1 indexing for "end of array" came from there. Or at > >least, it was in Perl long before it was in Python, and it was in Icon > >before it was in Perl, so I had always presumed Larry had seen Icon

Re: XML/HTML-specific ?< and ?> operators? (was Re: RFC 145 (alternate approach))

2000-09-07 Thread Michael Maraist
- Original Message - From: "Jonathan Scott Duff" <[EMAIL PROTECTED]> Subject: Re: XML/HTML-specific ?< and ?> operators? (was Re: RFC 145 (alternate approach)) > How about qy() for Quote Yacc :-) This stuff is starting to look > more and more like we

Re: XML/HTML-specific ?< and ?> operators? (was Re: RFC 145 (alternate approach))

2000-09-07 Thread Mark-Jason Dominus
> I think what is needed is something along the line of : Joe McMahon and I are working on something along these lines.

Re: XML/HTML-specific ?< and ?> operators? (was Re: RFC 145 (alternate approach))

2000-09-07 Thread Jarkko Hietaniemi
On Thu, Sep 07, 2000 at 03:42:01PM -0400, Eric Roode wrote: > Richard Proctor wrote: > > > >I think what is needed is something along the line of : > > > > $re = qz{ '(' \$re ')' > >| \$re \$re > >| [^()]+ > > }; > > > >Where qz is

Re: XML/HTML-specific ?< and ?> operators? (was Re: RFC 145 (alternate approach))

2000-09-07 Thread Damian Conway
> What would be useful, would be to leave REs the hell alone; they're > great as-is, and are only getting hairier and hairier. Amen! > What would be useful, would be to create a new non-regular > pattern-matching/parsing "language" within Perl, that combines > the best of Perl

Re: XML/HTML-specific ?< and ?> operators? (was Re: RFC 145 (alternate approach))

2000-09-07 Thread Jonathan Scott Duff
On Thu, Sep 07, 2000 at 08:20:42PM +0100, Richard Proctor wrote: > I think what is needed is something along the line of : > >$re = qz{ '(' \$re ')' > | \$re \$re > | [^()]+ >}; > > Where qz is some hypothetical new quoting s

Re: XML/HTML-specific ?< and ?> operators? (was Re: RFC 145 (alternate approach))

2000-09-07 Thread Eric Roode
Richard Proctor wrote: > >I think what is needed is something along the line of : > > $re = qz{ '(' \$re ')' >| \$re \$re >| [^()]+ > }; > >Where qz is some hypothetical new quoting syntax Well, we currently have qr{}, and ??{} do
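
Eric's message is cut off here, so the following is only a sketch of what qr{} and (??{ }) can already do in perl5, not a reconstruction of his reply. It recognizes a single parenthesized group with arbitrary nesting, which covers the '(' $re ')' and [^()]+ branches of Richard's hypothetical qz{} grammar. The names $group and is_balanced_group are invented for this example.

```perl
use strict;
use warnings;
use re 'eval';   # keeps the deferred block usable after interpolation on older perls

# A balanced group: an opening paren, then any mix of plain text
# and nested balanced groups, then the closing paren.
my $group;
$group = qr{
    \(
    (?:
        (?> [^()]+ )       # a run of non-paren characters, no backtracking into it
      | (??{ $group })     # the pattern itself, looked up at match time
    )*
    \)
}x;

# True when the whole string is exactly one balanced group.
sub is_balanced_group {
    my ($text) = @_;
    return $text =~ /\A$group\z/ ? 1 : 0;
}

print is_balanced_group('(a(b)c)') ? "balanced\n" : "not balanced\n";
print is_balanced_group('(a(b)c')  ? "balanced\n" : "not balanced\n";
```

The self-reference works because (??{ }) is evaluated while matching, after $group has been assigned. Whether that is legible enough is, of course, the question the thread is arguing about.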

Re: XML/HTML-specific ?< and ?> operators? (was Re: RFC 145 (alternate approach))

2000-09-07 Thread Richard Proctor
On Wed 06 Sep, Mark-Jason Dominus wrote: > > I've been thinking the same thing. It seems to me that the attempts to > shoehorn parsers into regex syntax have either been unsuccessful > (yielding an underpowered extension) or illegible or both. > >SNOBOL: > parenstring = '(' *parenstrin

Re: XML/HTML-specific ?< and ?> operators? (was Re: RFC 145 (alternate approach))

2000-09-07 Thread Bart Lateur
On 06 Sep 2000 18:04:18 -0700, Randal L. Schwartz wrote: >I think the -1 indexing for "end of array" came from there. Or at >least, it was in Perl long before it was in Python, and it was in Icon >before it was in Perl, so I had always presumed Larry had seen Icon. >Larry? Do not assume that th

Re: XML/HTML-specific ?< and ?> operators? (was Re: RFC 145 (alternate approach))

2000-09-06 Thread Randal L. Schwartz
> "Jarkko" == Jarkko Hietaniemi <[EMAIL PROTECTED]> writes: >> "You want Icon, you know where to find it..." :) Jarkko> Hey, it's one of the few languages we haven't yet stolen a Jarkko> neat feature or few from... (I don't really count the few Jarkko> regex thingies as full-fledged stealin

Re: XML/HTML-specific ?< and ?> operators? (was Re: RFC 145 (alternate approach))

2000-09-06 Thread Jarkko Hietaniemi
On Wed, Sep 06, 2000 at 03:47:57PM -0700, Randal L. Schwartz wrote: > > "Mark-Jason" == Mark-Jason Dominus <[EMAIL PROTECTED]> writes: > > Mark-Jason> I have some ideas about how to do this, and I will try to > Mark-Jason> write up an RFC this week. > > "You want Icon, you know where to find

Re: XML/HTML-specific ?< and ?> operators? (was Re: RFC 145 (alternate approach))

2000-09-06 Thread Mark-Jason Dominus
> > "Mark-Jason" == Mark-Jason Dominus <[EMAIL PROTECTED]> writes: > > Mark-Jason> I have some ideas about how to do this, and I will try to > Mark-Jason> write up an RFC this week. > > "You want Icon, you know where to find it..." :) That's exactly my motivation. It seems to me that tryi

Re: XML/HTML-specific ?< and ?> operators? (was Re: RFC 145 (alternate approach))

2000-09-06 Thread Randal L. Schwartz
> "Mark-Jason" == Mark-Jason Dominus <[EMAIL PROTECTED]> writes: Mark-Jason> I have some ideas about how to do this, and I will try to Mark-Jason> write up an RFC this week. "You want Icon, you know where to find it..." :) But yes, a way that allows programmatic backtracking sort of "inside

Re: XML/HTML-specific ?< and ?> operators? (was Re: RFC 145 (alternate approach))

2000-09-06 Thread Mark-Jason Dominus
> >...My point is that I think we're approaching this > >the wrong way. We're trying to apply more and more parser power into what > >classically has been the lexer / tokenizer, namely our beloved > >regular-expression engine. I've been thinking the same thing. It seems to me that the attempts

Re: XML/HTML-specific ?< and ?> operators? (was Re: RFC 145 (alternate approach))

2000-09-06 Thread David Corbin
Jonathan Scott Duff wrote: > > On Wed, Sep 06, 2000 at 08:40:37AM -0700, Nathan Wiger wrote: > > What if we added special XML/HTML-parsing ?< and ?> operators? > > What if we just provided deep enough hooks into the RE engine that > specialized parsing constructs like these could easily be added

Re: RFC 145 (alternate approach)

2000-09-06 Thread Michael Maraist
- Original Message - From: "Richard Proctor" <[EMAIL PROTECTED]> Sent: Tuesday, September 05, 2000 1:49 PM Subject: Re: RFC 145 (alternate approach) > On Tue 05 Sep, David Corbin wrote: > > Nathan Wiger wrote: > > > But, how about a new ?m

Re: XML/HTML-specific ?< and ?> operators? (was Re: RFC 145 (alternate approach))

2000-09-06 Thread Tom Christiansen
>I am working on an RFC >to allow boolean logic ( && and || and !) to apply a number of patterns to >the same substring to allow easier mining of information out of such >constructs. What, you don't like: :-) $pattern = $conjunction eq "AND" ? join('' => map { "(?=.*$_)" }
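
Tom's snippet is truncated after the AND branch. A hedged sketch of the idea, with the OR branch filled in as plain alternation (that part is an assumption, as is the name combine_patterns):

```perl
use strict;
use warnings;

# Combine several sub-patterns into one regex.
# AND: chain lookaheads so every sub-pattern must match somewhere.
# OR:  plain alternation (assumed; the quoted code is cut off before it).
sub combine_patterns {
    my ($conjunction, @patterns) = @_;
    if ($conjunction eq 'AND') {
        my $source = join('' => map { "(?=.*$_)" } @patterns);
        return qr/^$source/s;    # /s so that . also crosses newlines
    }
    my $source = join('|' => map { "(?:$_)" } @patterns);
    return qr/$source/;
}

my $both   = combine_patterns('AND', 'foo', 'bar');
my $either = combine_patterns('OR',  'foo', 'bar');

print "bar then foo" =~ $both   ? "AND matched\n" : "AND failed\n";
print "only foo"     =~ $both   ? "AND matched\n" : "AND failed\n";
print "only foo"     =~ $either ? "OR matched\n"  : "OR failed\n";
```

Each lookahead starts from the same position and consumes nothing, which is why the order of the words in the target string does not matter for the AND case.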

Re: XML/HTML-specific ?< and ?> operators? (was Re: RFC 145 (alternate approach))

2000-09-06 Thread Richard Proctor
On Wed 06 Sep, David Corbin wrote: > Nathan Wiger wrote: > > > > > It would be useful (and increasingly more common) to be able to match > > > qr|<\s*(\w+)([^>]*)>| to qr|<\s*/\1\s*>|, and handle the case where > > > those can nest as well. Something like > > > > > > match this with > > >

Re: XML/HTML-specific ?< and ?> operators? (was Re: RFC 145 (alternate approach))

2000-09-06 Thread Tom Christiansen
>...My point is that I think we're approaching this >the wrong way. We're trying to apply more and more parser power into what >classically has been the lexer / tokenizer, namely our beloved >regular-expression engine. >A great deal of string processing is possible with Perl's enhanced NFA >engin

Re: XML/HTML-specific ?< and ?> operators? (was Re: RFC 145 (alternate approach))

2000-09-06 Thread Nathan Wiger
David Corbin wrote: > > m:(?['' => '').*(?]): > > or more generically > > m:(?['<\w+>' => '').*(?]): I think these are good; but I do also like the idea of "automatic reversing" by default, since that's a common operation. Let's combine the ideas, as Richard suggests. How about: 1. When a

Re: XML/HTML-specific ?< and ?> operators? (was Re: RFC 145 (alternate approach))

2000-09-06 Thread Michael Maraist
- Original Message - From: "Jonathan Scott Duff" <[EMAIL PROTECTED]> Subject: Re: XML/HTML-specific ?< and ?> operators? (was Re: RFC 145 (alternate approach)) > On Wed, Sep 06, 2000 at 08:40:37AM -0700, Nathan Wiger wrote: > > What if we added

Re: RFC 145 (alternate approach)

2000-09-06 Thread Richard Proctor
On Tue 05 Sep, Nathan Wiger wrote: >"normal" "reversed" >-- --- >103301 >99aa99 >(( )) ><+ +> >{{[!<_ _>!]}} >{__A1( )A1__} > > That is, when a bracket is encountered, the

Re: XML/HTML-specific ?< and ?> operators? (was Re: RFC 145 (alternate approach))

2000-09-06 Thread Jonathan Scott Duff
On Wed, Sep 06, 2000 at 08:40:37AM -0700, Nathan Wiger wrote: > What if we added special XML/HTML-parsing ?< and ?> operators? What if we just provided deep enough hooks into the RE engine that specialized parsing constructs like these could easily be added by those who need them? -Scott -- Jon

Re: XML/HTML-specific ?< and ?> operators? (was Re: RFC 145 (alternate approach))

2000-09-06 Thread David Corbin
Nathan Wiger wrote: > > > It would be useful (and increasingly more common) to be able to match > > qr|<\s*(\w+)([^>]*)>| to qr|<\s*/\1\s*>|, and handle the case where those > > can nest as well. Something like > > > > match this with > > > > not this but > >this. > > I suspec

XML/HTML-specific ?< and ?> operators? (was Re: RFC 145 (alternate approach))

2000-09-06 Thread Nathan Wiger
> It would be useful (and increasingly more common) to be able to match > qr|<\s*(\w+)([^>]*)>| to qr|<\s*/\1\s*>|, and handle the case where those > can nest as well. Something like > > match this with > > not this but >this. I suspect this is going to need a ?[ and ?] of its
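
To make the nesting problem concrete, here is a small sketch that joins the two quoted qr patterns with a backreference and a non-greedy body. It handles flat elements but stops at the first closing tag when elements nest. The names $element and first_element, and the sample strings, are invented for this example.

```perl
use strict;
use warnings;

# Opening tag, minimal body, closing tag with the same name via \1.
my $element = qr{<\s*(\w+)([^>]*)>(.*?)<\s*/\1\s*>}s;

# Return the pieces of the first element found, or nothing.
sub first_element {
    my ($text) = @_;
    return unless $text =~ $element;
    return { tag => $1, attrs => $2, body => $3 };
}

my $flat   = first_element('<i class="x">hi</i>');
my $nested = first_element('<b>one <b>two</b> three</b>');

print "flat body:   $flat->{body}\n";     # hi
print "nested body: $nested->{body}\n";   # one <b>two
```

The second result pairs the outer opening tag with the inner closing tag, which is exactly the case the proposed ?[ and ?] constructs are meant to address.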

Re: RFC 145 (alternate approach)

2000-09-06 Thread Buddha Buck
At 09:05 AM 9/6/00 -0400, David Corbin wrote: >I'd suggest also, that (?[) (with no specified brackets) have the >default meaning >of the "four standard brackets" : > >(?['('=>')','{'=>'}','['=>']','<'=>'>') > >Note also the subtle syntax change. We are either dealing with strings >or with patter

Re: RFC 145 (alternate approach)

2000-09-06 Thread David Corbin
I'd suggest also, that (?[) (with no specified brackets) have the default meaning of the "four standard brackets" : (?['('=>')','{'=>'}','['=>']','<'=>'>') Note also the subtle syntax change. We are either dealing with strings or with patterns. The consensus seems to be against patterns (I can
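
The opener => closer list David proposes as the default can be emulated outside the regex engine with a hash and a stack. This is only a sketch of the intended pairing semantics, not of the proposed (?[ ) syntax; %closer_for and brackets_balanced are names made up for the illustration.

```perl
use strict;
use warnings;

# The "four standard brackets", as an opener => closer map.
my %closer_for = ('(' => ')', '{' => '}', '[' => ']', '<' => '>');
my %is_closer  = map { $_ => 1 } values %closer_for;

# True when every opener in $text is closed by its own closer,
# in properly nested order. Other characters are ignored.
sub brackets_balanced {
    my ($text) = @_;
    my @expected;                       # closers we are still waiting for
    for my $char (split //, $text) {
        if (exists $closer_for{$char}) {
            push @expected, $closer_for{$char};
        }
        elsif ($is_closer{$char}) {
            my $want = pop @expected;
            return 0 unless defined $want && $want eq $char;
        }
    }
    return @expected == 0 ? 1 : 0;
}

print brackets_balanced('f(a[1], {b => <c>})') ? "ok\n" : "mismatch\n";
print brackets_balanced('f(a[1)]')             ? "ok\n" : "mismatch\n";
```

Keeping the pairs in one map is what makes "( matches ), not (" expressible, which is the point Richard raises elsewhere in the thread.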

Re: RFC 145 (alternate approach)

2000-09-05 Thread David L. Nicol
David Corbin wrote: > > I've got some vague ideas on solving all of these, I'll go into if > > people like the basic concept enough. not just in regexes, but in general, a way to extend the set of brackets that Perl knows about would be very nice. for instance it is very difficult for people us

Re: RFC 145 (alternate approach)

2000-09-05 Thread Nathan Wiger
Nathan Wiger wrote: > >"normal" "reversed" >-- --- >{__A1( )A1__} That should be: {__A1( )1A__} Why you would delimit text this way I have no idea, but it could still work... -Nate
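
The "reversed" column can be derived mechanically: reverse the opener character by character, then flip each bracket to its partner. A minimal sketch, assuming only the four bracket pairs are flipped and everything else is copied unchanged (closer_for is a hypothetical helper name):

```perl
use strict;
use warnings;

# Derive the closing delimiter for an opening one:
# reverse the string, then swap each bracket for its partner.
sub closer_for {
    my ($opener) = @_;
    my $closer = reverse $opener;            # scalar context: reverses the characters
    $closer =~ tr/([{<)]}>/)]}>([{</;        # flip brackets, leave the rest alone
    return $closer;
}

for my $opener ('((', '<+', '{{[!<_', '{__A1(') {
    printf "%-8s %s\n", $opener, closer_for($opener);
}
```

For the last entry this yields )1A__}, in line with the correction in this message.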

Re: RFC 145 (alternate approach)

2000-09-05 Thread Nathan Wiger
Richard Proctor wrote: > > No ?] should match the closest ?[ it should nest the ?[s bound by any > brackets in the regex and act accordingly. Good point. > Also this does not work as a definition of simple bracket matching as you > need ( to match ) not ( to match (. A ?[ list should specify

Re: RFC 145 (alternate approach)

2000-09-05 Thread Richard Proctor
On Tue 05 Sep, Nathan Wiger wrote: > Eric Roode wrote: > Now *that* sounds cool, I like it! > > What if the RFC only suggested the addition of two new constructs, (?[) > and (?]), which did nested matches. The rest would be bound by standard > regex constructs and your imagination! > > That is,

Re: RFC 145 (alternate approach)

2000-09-05 Thread Jonathan Scott Duff
On Tue, Sep 05, 2000 at 02:12:23PM -0400, Eric Roode wrote: > Unfortunately, as Richard Proctor pointed out, ?m is taken. Perhaps > (?[list|of|openers) and (?]list|of|closers) ? That breaks the visual meaning of "|" as alternation if the RE engine is to be smart enough to match the closers wi

Re: RFC 145 (alternate approach)

2000-09-05 Thread Nathan Wiger
Eric Roode wrote: > > Unfortunately, as Richard Proctor pointed out, ?m is taken. Perhaps > (?[list|of|openers) and (?]list|of|closers) ? > > Does that look too bizarre, with the lone square bracket in each? > Or does that serve to make it mnemonic (which is my intention)? Actually, I persona

Re: RFC 145 (alternate approach)

2000-09-05 Thread Eric Roode
I think David's on to something good here. A major problem with holding the bracket-matching possibilities in a special variable (or a pair of them) is that one can't figure out what the RE is going to do just by looking at it -- you have to look elsewhere. Nathan Wiger wrote: >I think it's cool

Re: RFC 145 (alternate approach)

2000-09-05 Thread Richard Proctor
On Tue 05 Sep, David Corbin wrote: > Nathan Wiger wrote: > > > > But, how about a new ?m operator? > > > >/(?m<<|[).*?(?M>>|])/; > > > > Let's combine your operator with my example from above where everything > inside the (?m) or the ?(M) > fits the syntax of a RE. > > /(?m(<<)|\[)

Re: RFC 145 (alternate approach)

2000-09-05 Thread David Corbin
Nathan Wiger wrote: > > I think it's cool too, I don't like the @^g and ^@G either. But I worry > about the double-meaning of the []'s in your solution, and the fact that > these: > >/\m[...]...\M/; >/\d[...]...\D/; Well, it's not really a double meaning. It's a set of characters, just

Re: RFC 145 (alternate approach)

2000-09-05 Thread Nathan Wiger
I think it's cool too, I don't like the @^g and ^@G either. But I worry about the double-meaning of the []'s in your solution, and the fact that these: /\m[...]...\M/; /\d[...]...\D/; Will work so differently. Maybe another character like ()'s that takes a list: /\m(<<,[).*?\M(>>,])/;

Re: RFC 145 (alternate approach)

2000-09-05 Thread David Corbin
I never saw one comment on this, and the more I think about it, the more I like it. So, I thought I'd throw it back out one more time...(If I get no comments this time, I'll be quiet :) David Corbin wrote: > > I haven't given this a WHOLE lot of thought, so please, shoot it full > of holes. > >

RFC 145 (alternate approach)

2000-08-25 Thread David Corbin
I haven't given this a WHOLE lot of thought, so please, shoot it full of holes. I certainly like the goal of this RFC, but I dislike the idea that the characters to be matched are specified outside of the RE. I want to be able to specify a character, set of characters or