perl6-language-regex

Summary report 20000831

RFC 72: The regexp engine should go backward as well as
        forward. (Peter Heslin)

This topic did not attract much discussion until the very end of the
week.  I sent the author a detailed critique, to which he has not
responded. 

RFC 93: Regex: Support for incremental pattern matching  (Damian Conway)

The only disucssion was John Porter's suggestion that lazily evaluated
strings might generalize this proposal.

RFC 110: counting matches  (Richard Proctor)

An extensive side discussion of

        $count = () = m/PAT/g;

developed, including an excursion off into context issues.  I have
asked the author to take this idiom into account in the next version
of the RFC.

I pointed out that there are several features of the match operator
that are enabled or disabled depending on the context in which the
match takes place, and this makes it difficult to combine certain
operations.  For example, there is no good way to get the iterative
semantics of "while (m/.../g)" and the backreference assignment
semantics of "@backrefs = m/.../" at the same time.  However, no
proposal was advanced about how to fix this.  However, it was
suggested that there should be an array which contains all the
backrefence variables ($1, $2, $3, ...).  I would appreciate an RFC on
this.  

Richard wanted to enable his 'counting' feature with a new regex
modifier, /t.  He wanted to use /c, but /c was already taken.  I
suggested that the modifier namespace could be enlarged by adopting
the convention that a capital letter begins the name of a full-word
option.  For example, m/.../Count would be the same as m/.../t, and
m/.../Count,i would be the same as m/.../ti.  I did not submit and
RFC.  


RFC 112: Assignment within a regex  (Richard Proctor)

Very little discussion.  This should be compared with RFC 150 and
perhaps be folded into it.

RFC 135: Require explicit m on matches, even with ?? and // as delimiters.
         (Nathan Torkington)

This RFC should probably have been transferred from the perl6-language
group, but hasn't been.

RFC 138: Eliminate =~ operator.  (Steve Fink)

I started off the discussion by pointing out that the RFC did not
contain any compelling rationale for eliminating =~ at all, but Larry
said that he had been thinking about ways to eliminate =~, which is a
compelling rationale by itself.  I raised several other technical
points.

There was a discussion about whether the =~ operator is intuitive or not.

I suggested that RFC 135, 138, and 164 could all be combined.  Stephen
Potter added that RFC156 could also be combined with these.  Nathan
Wiger said that the next version of RFC 164 might cover everything
proposed by RFC 138.  Steve mentioned some reasons why he did not like
the ideas in RFC 164, but Nathan said that he would like the new
version of RFC 164 better.

RFC 144: Behavior of empty regex should be simple  (Mark Dominus)

There was little discussion.  Tom proposed some additional discussion,
which I added in version 2.  There was no discussion of version 2.

RFC 145: Brace-matching for Perl Regular Expressions  (Eric Roode)

I sent a technical critique.  Eric incorporated several of my
suggestions into version 2 and also added a number of clarifications.
I still wonder if there isn't a more general and more powerful way to
accomplish the same thing, possibly by stealing features from SNOBOL.

RFC 150: Extend regex syntax to provide for return of a hash of
         matched subpatterns  (Kevin Walker)

Tom said he thought this would be useful, and suggested that Kevin
look at the way Python had addressed the same problem.  Kevin said he
would do that, but has not reported back.

RFC 158: Regular Expression Special Variables  (Uri Guttman)

There was only a little discussion of the proposal itself.  There was
some discussion of the regex engine internals (whether $& is stored as
offsets into the original string, or whether the data is actually
copied to a separate buffer) and what the performance penalty is for
using $&.

RFC 164: Replace =~, !~, m//, s///, and tr// with match(), subst(),
         and trade()  (Nathan Wiger)

There were several criticisms of version 1.  Jonathan Scott Duff
pointed out that 'subst' is very close to 'substr'.

There was a long discussion about whether it was wise to drastically
change fundamental and common Perl idioms such as /.../.

Nathan submitted a second version that added backward compatibility
for the syntax, which depends on the adoption of RFC 139 and 170.
There did not appear to be any discussion of the new version.  I would
like to know whether Steve Fink thinks that his proposal of RFC 138 is
suitably included here.

RFC 165: Allow variables in tr///  (Richard Proctor)

Stephen Potter suggested simply replacing the tr/// syntax with a
regular function call tr($search, $replace, ...) which takes care of
the goal of this RFC very neatly.  Nathan Wiger pointed out that this
was covered by RFC 164.

I pointed out that the implementation would have construct the
translation table at run-time, and that this brings in the same issues
as when a regex is constructed at run time.  For example, a new tr///o
option becomes desirable for the same reaosn the m//o is desirable.
Tom and I had a discussion of these issues, but there do not appear
to be any issues here that do not also come up in connection with
interpolated regexes.

There was a sidetrack about the implementation of tr/// in the
presence of Unicode strings.

RFC 166: Additions to regexs  (Richard Proctor)

This RFC unfortunately proposes three totally unrelated changes.

Richard proposed a 'does not match' operator, with the example that

        /a(?^b)c/

would match ac, axc, a---c, but not abc, a-b-c, or abbbc.  But did not
include a complete enough explanation of what it would do to enable
anyone to implement it.  (Nobody has been able to produce a sensible
description of such an operator, which is probably why Perl doesn't
have one yet.)  Richard said he would tighten up the definition, but
version 2 has not appeared yet.

Richard also proposed a (?) operator that would match the empty
string.  You would use this in cases like /$foo(?)bar/ where it is
inappropriate to abut $foo and bar.  It was pointed out that
/${foo}bar/ and /$foo(?:)bar/ already work for this purpose.  Richard
agreed that this was what he wanted.

The third proposal was that (?@foo) be taken to interpolate the string
(join "|", @foo).  There was no discussion of this.
 
RFC 170: Generalize =~ to a special-purpose assignment operator
         (Nathan Wiger)

This is probably the most interesting and far-reaching RFC proposed
this week, but there was essentially no discussion.



Mark-Jason Dominus                                               [EMAIL PROTECTED]
I am boycotting Amazon. See http://www.plover.com/~mjd/amazon.html for details.

Reply via email to