Re: Working with a regex using positional captures stored in a variable

2021-03-22 Thread Ralph Mellor

On Mon, Mar 22, 2021 at 2:50 PM yary  wrote:
>
> how to get all nested captures worked for me ... not sure I agree with the 
> design

I think the current flattening aspect is awkward in a couple of ways:

* Having to specify `<pattern=$pattern>`. I half like Brad's suggestion.

* Having to write `.pairs` to extract the captures.

Are your reservations about the design related to the above awkwardnesses?

I can see scope for improvement on the above in years to come.

Some other commentary:

For regexing, it makes sense to return single match objects or flat lists.
Devs can then manually add structure to those results if they wish. For
parsing, it makes sense to go the other way around, returning nested
structures that devs can manually flatten if they wish (as I did).

For data structures in general, for some tasks, it makes sense to flatten
by default and let devs add structure if they wish, and for other tasks it
makes sense to maintain structure by default and let devs flatten if they
wish. A PL does have to pick one or the other, either on a general basis,
or on a feature-by-feature basis.

Perl generally focused on the flat-by-default approach, with regexes
fitting in that scheme, and has stayed true to its roots, albeit with some
evolution toward improvements for structured data in recent years as
devs have contributed additions to the language.

Raku generally focused on the structured-data-by-default approach.
It also unified regexes and parsing. The natural outcome was a bias
toward structure (parse trees). I anticipate there will be improvements
for flattening structured data as time passes.

--
love raiph

Re: Working with a regex using positional captures stored in a variable

2021-03-22 Thread yary

Hi all,

Thanks Bill for posting your results from my samples. Seems like we both
get lots of warnings/errors from our REPLs, me even with 2021.02.1. I
suspect something is going on with what the REPL is trying to print; after
all, it does want to display the result of every line. I haven't looked at
that since the first post, as I was more interested in the capturing-or-not
rules shown by the code run from a file than in tracking down the REPL
output. Maybe I'll look into it again next weekend, need to get back to
$work today!

Thanks Raiph for the continued examples, this one in particular showing how
to get all nested captures worked for me. While I'm not sure I agree with
the design, it is consistent, and perhaps I will internalize this over time.

On Fri, Mar 19, 2021 at 7:14 PM Ralph Mellor 
wrote: ...

>
> A Raku equivalent:
> my $word = '(\w+)';
> my $AwithB = "$word ' with ' $word";
> my $regex = "$AwithB .* 'is ' $word";
> $_ = 'Interpolating regexes with arbitrary captures is fun!';
> .say for m/<regex=$regex>/.<regex>.pairs;
> displays:
>  0 => ｢regexes｣
>  1 => ｢arbitrary｣
>  2 => ｢fun｣
> 
> > Raku example:
> >
> > my $word = /(\w+)/;
> > my $AwithB = /$word' with '$word/;
> If you interpolate by using `$abc...` or `<$abc...>` instead of `<name=$abc...>`,
> Raku will by default not capture. And the non-capturing is nested, so
> throwing away those captures also throws away the corresponding
> capture within `$word`.

-y

Re: Working with a regex using positional captures stored in a variable

2021-03-20 Thread William Michels via perl6-users

Hi Yary,

I ran your Raku code in a script (on MacOS) and in the REPL (MacOS with
Linenoise). All results below with Rakudo_2020.10:

#Script:

my $word = /(\w+)/;
my $AwithB = /$word' with '$word/;
$_= 'Interpolating regexes with arbitrary captures is fun!';
say "Nested rx";
dd m/$AwithB.*'is '$word/;

say "shallow rx";
dd m/$word' with '$word.*'is '$word/;

say "no interpolation";
dd m/(\w+)' with '(\w+).*'is '(\w+)/;

#Script Result:

admin@mbook:~$ raku yary_named.p6
Nested rx
Match $/ = Match.new(:orig("Interpolating regexes with arbitrary captures
is fun!"), :from(14), :pos(52))
shallow rx
Match $/ = Match.new(:orig("Interpolating regexes with arbitrary captures
is fun!"), :from(14), :pos(52))
no interpolation
Match $/ = Match.new(:orig("Interpolating regexes with arbitrary captures
is fun!"), :from(14), :pos(52), :list((Match.new(:orig("Interpolating
regexes with arbitrary captures is fun!"), :from(14), :pos(21)),
Match.new(:orig("Interpolating regexes with arbitrary captures is fun!"),
:from(27), :pos(36)), Match.new(:orig("Interpolating regexes with arbitrary
captures is fun!"), :from(49), :pos(52)

REPL:

admin@mbook:~$ raku
Welcome to 퐑퐚퐤퐮퐝퐨™ v2020.10.
Implementing the 퐑퐚퐤퐮™ programming language v6.d.
Built on MoarVM version 2020.10.

To exit type 'exit' or '^D'
> my $word = /(\w+)/;
/(\w+)/
> my $AwithB = /$word' with '$word/;
Regex object coerced to string (please use .gist or .raku to do that)
  in any metachar at /Users/admin/rakudo/rakudo-
2020.10/install/share/nqp/lib/NQPP6QRegex.moarvm line 1
  in any termseq at /Users/admin/rakudo/rakudo-
2020.10/install/share/nqp/lib/NQPP6QRegex.moarvm line 1
  in any quote:sym at /Users/admin/rakudo/rakudo-
2020.10/install/share/perl6/lib/Perl6/Grammar.moarvm line 1
  in any quote at
/Users/admin/rakudo/rakudo-2020.10/install/share/perl6/lib/Perl6/Grammar.moarvm
line 1
  in any value:sym at /Users/admin/rakudo/rakudo-
2020.10/install/share/perl6/lib/Perl6/Grammar.moarvm line 1
  in any value at
/Users/admin/rakudo/rakudo-2020.10/install/share/perl6/lib/Perl6/Grammar.moarvm
line 1
  in any term:sym at /Users/admin/rakudo/rakudo-
2020.10/install/share/perl6/lib/Perl6/Grammar.moarvm line 1
  in any term at
/Users/admin/rakudo/rakudo-2020.10/install/share/perl6/lib/Perl6/Grammar.moarvm
line 1

Regex object coerced to string (please use .gist or .raku to do that)
  in any metachar at /Users/admin/rakudo/rakudo-
2020.10/install/share/nqp/lib/NQPP6QRegex.moarvm line 1
  in any termseq at /Users/admin/rakudo/rakudo-
2020.10/install/share/nqp/lib/NQPP6QRegex.moarvm line 1
  in any quote:sym at /Users/admin/rakudo/rakudo-
2020.10/install/share/perl6/lib/Perl6/Grammar.moarvm line 1
  in any quote at
/Users/admin/rakudo/rakudo-2020.10/install/share/perl6/lib/Perl6/Grammar.moarvm
line 1
  in any value:sym at /Users/admin/rakudo/rakudo-
2020.10/install/share/perl6/lib/Perl6/Grammar.moarvm line 1
  in any value at
/Users/admin/rakudo/rakudo-2020.10/install/share/perl6/lib/Perl6/Grammar.moarvm
line 1
  in any term:sym at /Users/admin/rakudo/rakudo-
2020.10/install/share/perl6/lib/Perl6/Grammar.moarvm line 1
  in any term at
/Users/admin/rakudo/rakudo-2020.10/install/share/perl6/lib/Perl6/Grammar.moarvm
line 1

> $_= 'Interpolating regexes with arbitrary captures is fun!';
Interpolating regexes with arbitrary captures is fun!
> say "Nested rx";
Nested rx
> dd m/$AwithB.*'is '$word/;
Regex object coerced to string (please use .gist or .raku to do that)
  in any metachar at /Users/admin/rakudo/rakudo-
2020.10/install/share/nqp/lib/NQPP6QRegex.moarvm line 1
  in any termseq at /Users/admin/rakudo/rakudo-
2020.10/install/share/nqp/lib/NQPP6QRegex.moarvm line 1
  in any quote at
/Users/admin/rakudo/rakudo-2020.10/install/share/perl6/lib/Perl6/Grammar.moarvm
line 1
  in any value:sym at /Users/admin/rakudo/rakudo-
2020.10/install/share/perl6/lib/Perl6/Grammar.moarvm line 1
  in any value at
/Users/admin/rakudo/rakudo-2020.10/install/share/perl6/lib/Perl6/Grammar.moarvm
line 1
  in any term:sym at /Users/admin/rakudo/rakudo-
2020.10/install/share/perl6/lib/Perl6/Grammar.moarvm line 1
  in any term at
/Users/admin/rakudo/rakudo-2020.10/install/share/perl6/lib/Perl6/Grammar.moarvm
line 1

Regex object coerced to string (please use .gist or .raku to do that)
  in any metachar at /Users/admin/rakudo/rakudo-
2020.10/install/share/nqp/lib/NQPP6QRegex.moarvm line 1
  in any termseq at /Users/admin/rakudo/rakudo-
2020.10/install/share/nqp/lib/NQPP6QRegex.moarvm line 1
  in any quote at
/Users/admin/rakudo/rakudo-2020.10/install/share/perl6/lib/Perl6/Grammar.moarvm
line 1
  in any value:sym at /Users/admin/rakudo/rakudo-
2020.10/install/share/perl6/lib/Perl6/Grammar.moarvm line 1
  in any value at
/Users/admin/rakudo/rakudo-2020.10/install/share/perl6/lib/Perl6/Grammar.moarvm
line 1
  in any term:sym at /Users/admin/rakudo/rakudo-
2020.10/install/share/perl6/lib/Perl6/Grammar.moarvm line 1
  in any term at

Re: Working with a regex using positional captures stored in a variable

2021-03-20 Thread Ralph Mellor

On Wed, Mar 17, 2021 at 7:17 PM William Michels via perl6-users
 wrote:
>
> ("If the first character inside is anything other than an alpha it doesn't 
> capture").
> It should be added to the Raku Docs ASAP.

Fyi, here's how Larry Wall expressed it 15-18 years ago:

> A leading alphabetic character means it's capturing

(from https://design.raku.org/S05.html#line_1422)

--
love, raiph

Re: Working with a regex using positional captures stored in a variable

2021-03-19 Thread Ralph Mellor

On Fri, Mar 19, 2021 at 6:12 PM yary  wrote:
>
> I don't know how to get the result.

> DB<1> $word = qr/(\w+)/;
> DB<2> $AwithB = qr/$word with $word/
> DB<3> $_ = 'Interpolating regexes with arbitrary captures is fun!'
> DB<4> x /$AwithB.*is $word/

A Raku equivalent:

my $word = '(\w+)';
my $AwithB = "$word ' with ' $word";
my $regex = "$AwithB .* 'is ' $word";
$_ = 'Interpolating regexes with arbitrary captures is fun!';

.say for m/<regex=$regex>/.<regex>.pairs;

displays:

 0 => ｢regexes｣
 1 => ｢arbitrary｣
 2 => ｢fun｣

> Raku example:
>
> my $word = /(\w+)/;
> my $AwithB = /$word' with '$word/;

If you interpolate by using `$abc...` or `<$abc...>` instead of `<name=$abc...>`,
Raku will by default not capture. And the non-capturing is nested, so
throwing away those captures also throws away the corresponding
capture within `$word`.

> Where my expectation differs from the behavior in my example
> is Raku's discarding the capture groups of the interpolated regexes.

It only discards them if you tell it to discard them.

If a `<...>` construct begins with a letter, it'll capture. If not, it won't.
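
A minimal sketch of that rule, assuming a recent Rakudo and a script run from a file rather than the REPL. The capture names `first` and `second` are made up for illustration:

```raku
my $word = /(\w+)/;
$_ = 'Interpolating regexes with arbitrary captures is fun!';

# `<$word>` begins with `$`, not a letter: it matches but keeps no captures.
if m/ <$word> ' with ' <$word> / {
    say $/.list.elems;    # 0
    say $/.hash.elems;    # 0
}

# `<first=$word>` begins with a letter: the submatch is kept under that name.
if m/ <first=$word> ' with ' <second=$word> / {
    say ~$<first>;        # regexes
    say ~$<second>[0];    # arbitrary
}
```

The `(\w+)` group inside `$word` ends up nested one level down, which is why the last line indexes into `$<second>` instead of reading a top-level `$1`.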

--
love, raiph

Re: Working with a regex using positional captures stored in a variable

2021-03-19 Thread yary

My current expectations are a little different from any previously
expressed, and I don't know how to get the result. I am no longer
considering named captures from Regexes interpolated inside angle brackets
and am now looking at directly interpolating them.

Perl example:

DB<1> $word = qr/(\w+)/;
DB<2> $AwithB = qr/$word with $word/
DB<3> $_ = 'Interpolating regexes with arbitrary captures is fun!'
DB<4> x /$AwithB.*is $word/
0  'regexes'
1  'arbitrary'
2  'fun'

That was simple and I like the results of the capture groups being
first-level.

Raku example:

my $word = /(\w+)/;
my $AwithB = /$word' with '$word/;
$_= 'Interpolating regexes with arbitrary captures is fun!';
say "Nested rx";
dd m/$AwithB.*'is '$word/;

say "shallow rx";
dd m/$word' with '$word.*'is '$word/;

say "no interpolation";
dd m/(\w+)' with '(\w+).*'is '(\w+)/;
# code end results below

Nested rx

Match $/ = Match.new(:orig("Interpolating regexes with arbitrary captures
is fun!"), :from(14), :pos(52))

shallow rx

Match $/ = Match.new(:orig("Interpolating regexes with arbitrary captures
is fun!"), :from(14), :pos(52))

no interpolation

Match $/ = Match.new(:orig("Interpolating regexes with arbitrary captures
is fun!"), :from(14), :pos(52), :list((Match.new(:orig("Interpolating
regexes with arbitrary captures is fun!"), :from(14), :pos(21)),
Match.new(:orig("Interpolating regexes with arbitrary captures is fun!"),
:from(27), :pos(36)), Match.new(:orig("Interpolating regexes with arbitrary
captures is fun!"), :from(49), :pos(52)

Run against

Welcome to 퐑퐚퐤퐮퐝퐨™ v2021.02.1.

Implementing the 퐑퐚퐤퐮™ programming language v6.d.

Built on MoarVM version 2021.02.

What I see from that example code is Raku matching all the regexes as I
expect, regardless of nesting, all without the named-capture angle
brackets. That is what the documentation's example suggests:

my $string = 'Is this a regex or a string: 123\w+False$pattern1 ?';
my $regex  = /\w+/;
say $string.match: / $regex /;   # [4] OUTPUT: «｢Is｣␤»

Where my expectation differs from the behavior in my example is Raku's
discarding the capture groups of the interpolated regexes. The overall
match works, in all cases :from(14) to :pos(52), but Raku treats the
groupings inside the interpolations as non-capturing.

-y


Re: Working with a regex using positional captures stored in a variable

2021-03-18 Thread Ralph Mellor

On Thu, Mar 18, 2021 at 12:59 AM yary  wrote:
>
> As it is I get same kinds of errors in the REPL, perhaps it is MacOS
> with Linenoise that's mucking that up.

I can confirm your new test code also works fine in both program
and repl forms in 2020.12.

Though obviously the case you mark as "interesting" still doesn't do
any sub-capturing. Which is to be expected if you know that aspect
of Raku's regex language.

> I had hoped that by directly interpolating $rd and $rw they would
> fill in the top-level match object and fill in $0, $1 – but it has the
> same issue as Joe's original example.

Are you just saying that your original expectations were the same
as Joe's, but you now understand that's not how Raku regexes
work, but it's trivial to get the same result? Or are you saying you
don't know how to get the same result?

--
love, raiph

Re: Working with a regex using positional captures stored in a variable

2021-03-17 Thread yary

Thanks raiph for everything!

Including getting me to upgrade my Raku: "Welcome to Rakudo(tm) v2021.02.1.
Implementing the Raku(tm) programming language v6.d.
Built on MoarVM version 2021.02."


As it is I get the same kinds of errors in the REPL; perhaps it is MacOS
with Linenoise that's mucking that up. The code in a file does better,
matching the docs. Still, I had hoped to work around these issues by
interpolating the bare variables, not inside angle brackets. Here's the
test code:


my $str = ' grr huh yeah 388 boo! ';
say $str ~~ / (\d+) \s (\w+) /;       # 388 boo
say "Match 0=$0, 1=$1";               # Match 0=388, 1=boo

say "=== below is the interesting case";
my $rd = /(\d+)/;
my $rw = /(\w+)/;
say $str.match: / $rd \s $rw /;       # 388 boo
say "Match 0=$0, 1=$1";               # Match 0=, 1=

say "=== below shows literal string matching";
my $sd = '(\d+)';
my $sw = '(\w+)';
$str ~= "$sd $sw";
say $str.match: / ($sd) \s ($sw) /;   # (\d+) (\w+)
say "Match 0=$0, 1=$1";               # Match 0=(\d+), 1=(\w+)

I had hoped that by directly interpolating $rd and $rw they would fill in
the top-level match object and fill in $0, $1 – but it has the same issue
as Joe's original example. It matches the right text but doesn't fill in
the top-level match.
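
Going by Brad's earlier note that `$0 = <$pattern>` does capture to `$0`, one way to fill in the top-level match might be to alias each interpolated regex to a numbered slot. This is only a sketch under that assumption, not checked against every Rakudo release:

```raku
my $str = ' grr huh yeah 388 boo! ';
my $rd  = /(\d+)/;
my $rw  = /(\w+)/;

# Alias each interpolated regex to a numbered capture so that the
# top-level match object gets $0 and $1.
$str ~~ / $0=<$rd> \s $1=<$rw> /;
say "Match 0=$0, 1=$1";
```

Each of `$0` and `$1` would then itself be a match that still carries the inner `(\d+)` or `(\w+)` group, so the result is one level deeper than the Perl equivalent.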

-y



Re: Working with a regex using positional captures stored in a variable

2021-03-17 Thread Ralph Mellor

> 1. The list you posted is fantastic ("If the first character inside is
> anything other than an alpha it doesn't capture"). It should be added to
> the Raku Docs ASAP.

Not the list, right? Just the rule. (There are dozens of kinds of
assertions. No one is going to remember the list.) If you were to add just
the line you suggest then you'd be able to do it ASAP.

> 2. There are some shortcuts that don't seem to follow a set pattern. For
> example a named capture can be accessed using $<myname> instead of
> $/<myname>; the "/" can be elided. Do you have a method you can share for
> remembering these sorts of shortcuts? Or are they disfavored?

I know you're asking Brad, but fwiw my overall method for mnemonics is to stay
creative and bring in visual and audio elements (like images and rhyming) and
weave them into a little made-up story that fits in with an overall adventure
story about Raku. The elements would be ones that work for a given individual
for a given aspect of a given feature of Raku, with the story being made up by
the person doing the learning.

Thus, for example, I might note that the @ symbol looks something like a `0`
and is sounded out as "at" and have a kid tell me a little story they imagine
getting added to the twitter profile of someone they know about programming
that involves the fact that array indexing is `0` based, thus giving them a
strong reminder of the latter aspect.

Fwiw I'm not seeing much value in developing one for eliding the `/`. If a dev
doesn't know they can elide the `/` when writing code, then no harm done; just
leave it in. If a dev is *reading* code and sees syntax of the form `$<name>`
and wonders what it is, they can type `$<` into the search box in the doc and
get a match. I personally found it really easy to remember because it's so
simple and used so frequently. I think mnemonics

> 3. Finally, I've never seen in the Perl6/Raku literature the motto you cite:
> "One of the mottos of Raku, is that it is ok to confuse a new programmer,
> it is not ok to confuse an expert."

I think that's a reasonable distillation of Larry's perspective. It's another
way of expressing that Python is good as a first language, Raku as a last one.

Consider the options:

* Ok to confuse a new programmer or an expert. (Not a good option.)

* Ok to confuse an expert, not Ok to confuse a new programmer. ScratchJr?

* Not Ok to confuse a new programmer or an expert. ScratchJr?

> [ The motto I prefer is from Larry Wall: "...easy things should stay easy,
> hard things should get easier, and impossible things should get hard... ."

I like that one too. I daresay I prefer it too. But for it to work, it really
needs to be Ok to confuse a new programmer but not Ok to confuse an expert.

Note that easy things being easy does not mean that new programmers
won't get confused. For example, a new Raku programmer who is used
to Perl might be confused that `<$foo>` does not capture in Raku, even
though it does in Perl. But it's still easy to capture; you write
`<foo=$foo>`.

But when *experts* are *systematically* confused, i.e. *all* experts just
keep falling for the same trap, then impossible things won't just be hard,
they'll stop happening.

Note that I say that as a non-expert in many, many areas of Raku. What
I *do* know is that whenever I encounter something that surprises me,
and keep an open mind about what's going on, no matter how annoyed
I am or convinced Raku is being stupid, I almost always eventually arrive
at an interim conclusion it's appropriate as is.

*Almost* always. And always an *interim* conclusion at best. That is to
say, I retain an eternally open mind toward all such things. So if someone
were to add a table to the doc listing all the assertion types, noting which
ones capture and which ones don't, rather than just the one rule ("starts
with an alpha") I'd be open minded about what the outcome would be.
Likewise if Brad reveals some master method he has for coming up with
a mnemonic covering dropping the `/` in `$/<name>`. And if it turns out Larry
has never said something directly to the effect that Brad has mentioned,
I'd be surprised but curious why I was surprised.

--
love, raiph

Re: Working with a regex using positional captures stored in a variable

2021-03-17 Thread Ralph Mellor

And when I cut/paste from the doc, the number example works too,
in both script and repl.


Re: Working with a regex using positional captures stored in a variable

2021-03-17 Thread Ralph Mellor

Er, by wfm I mean it matches ｢Is｣ as the code suggests.

On Wed, Mar 17, 2021 at 10:32 PM Ralph Mellor  wrote:
>
> Works for me in Rakudo 2020.12.
>
> On Wed, Mar 17, 2021 at 9:33 PM yary  wrote:
> >
> > The "Interpolation" section of the Raku docs uses strings as the elements of
> > building up a larger regex from smaller pieces, but the example that looks
> > fruitful isn't working in my raku. This is taken from
> > https://docs.raku.org/language/regexes#Regex_interpolation
> >
> > > my $string   = 'Is this a regex or a string: 123\w+False$pattern1 ?';
> >
> > Is this a regex or a string: 123\w+False$pattern1 ?
> >
> > > my $regex= /\w+/;
> >
> > /\w+/
> >
> > > say $string.match: / $regex /;
> >
> > Regex object coerced to string (please use .gist or .raku to do that)
> >
> >  ... and more error lines, and no result when the docs show matching '123':
> >
> > ｢｣
> >
> >
> > $ raku -v
> >
> > Welcome to 퐑퐚퐤퐮퐝퐨™ v2020.10.
> >
> > Implementing the 퐑퐚퐤퐮™ programming language v6.d.
> >
> > Built on MoarVM version 2020.10.
> >
> >
> >
> > -y
> >
> >
> > On Wed, Mar 17, 2021 at 3:17 PM William Michels via perl6-users 
> >  wrote:
> >>
> >> Dear Brad,
> >>
> >> 1. The list you posted is fantastic ("If the first character inside is 
> >> anything other than an alpha it doesn't capture"). It should be added to 
> >> the Raku Docs ASAP.
> >>
> >> 2. There are some shortcuts that don't seem to follow a set pattern. For
> >> example a named capture can be accessed using $<myname> instead of
> >> $/<myname>; the "/" can be elided. Do you have a method you can share for
> >> remembering these sorts of shortcuts? Or are they disfavored?
> >>
> >> > say ~$<myname> if 'abc' ~~ / $<myname> = [ \w+ ] /;
> >> abc
> >> >
> >> [ Above from the example at https://docs.raku.org/syntax/Named%20captures 
> >> ].
> >>
> >> 3. Finally, I've never seen in the Perl6/Raku literature the motto you 
> >> cite: "One of the mottos of Raku, is that it is ok to confuse a new 
> >> programmer, it is not ok to confuse an expert." Do you have a citation?
> >>
> >> [ The motto I prefer is from Larry Wall: "...easy things should stay easy, 
> >> hard things should get easier, and impossible things should get hard... ." 
> >> Citation: https://www.perl.com/pub/2000/10/23/soto2000.html/ ].
> >>
> >> Best Regards,
> >>
> >> Bill.
> >>
> >>
> >>
> >> On Sat, Mar 13, 2021 at 4:47 PM Brad Gilbert  wrote:
> >>>
> >>> It makes <…> more consistent precisely because <$pattern> doesn't capture.
> >>>
> >>> If the first character inside is anything other than an alpha it doesn't 
> >>> capture.
> >>> Which is a very simple description of when it captures.
> >>>
> >>> <?…> doesn't capture because of the ｢?｣
> >>> <!…> doesn't capture because of the ｢!｣
> >>> <.ws> doesn't capture because of the ｢.｣
> >>> <&…> doesn't capture because of the ｢&｣
> >>> <$pattern> doesn't capture because of the ｢$｣
> >>> <$0> doesn't capture because of the ｢$｣
> >>> <@a> doesn't capture because of the ｢@｣
> >>> <[…]> doesn't capture because of the ｢[｣
> >>> <-[…]> doesn't capture because of the ｢-｣
> >>> <:Ll> doesn't capture because of the ｢:｣
> >>>
> >>> For most of those, you don't actually want it to capture.
> >>> With ｢.｣ the whole point is that it doesn't capture.
> >>>
> >>> <name> does capture because it starts with an alpha
> >>> <name=…> does capture because it starts with an alpha
> >>>
> >>> $0 = <$pattern> doesn't capture to $<pattern>, but does capture to $0
> >>> $<pattern> = <$pattern> captures because of $<pattern> =
> >>>
> >>> It would be a mistake to just make <$pattern> capture.
> >>> Consistency is perhaps Raku's most important feature.
> >>>
> >>> One of the mottos of Raku, is that it is ok to confuse a new programmer, 
> >>> it is not ok to confuse an expert.
> >>> An expert in Raku understands the deep fundamental ways that Raku is 
> >>> consistent.
> >>> So breaking consistency should be very carefully considered.
> >>>
> >>> In this case, there is very little benefit.
> >>> Even worse, you then have to come up with some new syntax to prevent it 
> >>> from capturing when you don't want it to.
> >>> That new syntax wouldn't be as guessible as it currently is. Which again 
> >>> would confuse experts.
> >>>
> >>> If anyone seriously suggests such a change, I will vehemently fight to 
> >>> prevent it from happening.
> >>>
> >>> I would be more likely to accept <=$pattern> being added as a synonym to 
> >>> .
> >>>
> >>> On Sat, Mar 13, 2021 at 3:30 PM Joseph Brenner  wrote:
> 
>  Thanks much for your answer on this.  I think this is the sort of
>  trick I was looking for:
> 
>  Brad Gilbert wrote:
> 
>  > You can put it back in as a named
> 
>  > > $input ~~ / 
>  > ｢9 million｣
>  >  pattern => ｢9 million｣
>  >   0 => ｢9｣
>  >   1 => ｢million｣
> 
>  That's good enough, I guess, though you need to know about the
>  issue... is there some reason it shouldn't happen automatically,
>

Re: Working with a regex using positional captures stored in a variable

2021-03-17 Thread Ralph Mellor

Works for me in Rakudo 2020.12.


Re: Working with a regex using positional captures stored in a variable

2021-03-17 Thread yary

The "Interpolation" section of the Raku docs uses strings as the elements for
building up a larger regex from smaller pieces, but the example that looks
fruitful isn't working in my Raku. This is taken from
https://docs.raku.org/language/regexes#Regex_interpolation

> my $string   = 'Is this a regex or a string: 123\w+False$pattern1 ?';

Is this a regex or a string: 123\w+False$pattern1 ?

> my $regex    = /\w+/;

/\w+/

> say $string.match: / $regex /;

Regex object coerced to string (please use .gist or .raku to do that)
 ... and more error lines, and no result when the docs show matching '123':

｢｣


$ raku -v

Welcome to 𝐑𝐚𝐤𝐮𝐝𝐨™ v2020.10.

Implementing the 𝐑𝐚𝐤𝐮™ programming language v6.d.

Built on MoarVM version 2020.10.


-y



Re: Working with a regex using positional captures stored in a variable

2021-03-17 Thread William Michels via perl6-users

Dear Brad,

1. The list you posted is fantastic ("If the first character inside is
anything other than an alpha it doesn't capture"). It should be added to
the Raku Docs ASAP.

2. There are some shortcuts that don't seem to follow a set pattern. For
example a named capture can be accessed using $<myname> instead of
$/<myname> ; the "/" can be elided. Do you have a method you can share for
remembering these sorts of shortcuts? Or are they disfavored?

> say ~$<myname> if 'abc' ~~ / $<myname> = [ \w+ ] /;
abc
>
[ Above from the example at https://docs.raku.org/syntax/Named%20captures ].

3. Finally, I've never seen in the Perl6/Raku literature the motto you
cite: "One of the mottos of Raku, is that it is ok to confuse a new
programmer, it is not ok to confuse an expert." Do you have a citation?

[ The motto I prefer is from Larry Wall: "...easy things should stay easy,
hard things should get easier, and impossible things should get hard... ."
Citation: https://www.perl.com/pub/2000/10/23/soto2000.html/ ].
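
On point 2, a minimal sketch of the shortcut in question, assuming (as the cited docs page describes) that $<name> is shorthand for $/<name> and $0 for $/[0]. The capture name "myname" and the sample string are only illustrative:

```raku
# Both spellings read the same capture out of the current match object $/.
if 'abc' ~~ / $<myname> = [ \w ] (\w+) / {
    say ~$/<myname>;   # a   (explicit $/)
    say ~$<myname>;    # a   (the "/" elided)
    say ~$/[0];        # bc  (explicit $/)
    say ~$0;           # bc  (positional shortcut)
}
```

The pattern seems to be that any subscript written directly after a bare $ is applied to $/.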

Best Regards,

Bill.




Re: Working with a regex using positional captures stored in a variable

2021-03-17 Thread Joseph Brenner

And once again, thanks much for the explication of all this...
But even after thinking it over, the current state-of-affairs on
this really doesn't strike me as being okay.

As I'm sure everyone here knows, over in perl-land the main trick
you have for creating regexes from components is lexical
interpolation, so something like this:

   my $r1 = qr{ (\d+) }x;
   my $r2 = qr{ (\w+) }x;
   $str =~ m/$r1 \s+ $r2/x;

behaves exactly the same as

   $str =~ m/ (\d+) \s+ (\w+) /x;

A direct translation of this approach to Raku doesn't really
work:

   my $r1 = rx{ (\d+) };
   my $r2 = rx{ (\w+) };
   $str ~~ m/<$r1> \s+ <$r2>/;

And it doesn't work in a potentially insidious way: it can *look*
like it's working and it certainly doesn't throw any warnings.
You might use it for some time before noticing there's a feature
missing.

So as is, the /<$regex>/ construct treats the contents of $regex as a
regex-- *except* that it ignores some key features of regexes. It
silently throws away some information.
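
A minimal sketch of that behavior, and of the named-alias workaround from earlier in the thread. The sample string and the names "num" and "unit" are assumptions made up for illustration:

```raku
my $str = 'There are 9 million bicycles';
my $r1  = rx{ (\d+) };
my $r2  = rx{ (\w+) };

# Direct translation of the Perl idiom: it matches, but no captures survive.
$str ~~ m/ <$r1> \s+ <$r2> /;
say ~$/;            # 9 million
say $0.defined;     # False

# Aliasing each interpolated regex keeps the piece and its inner capture.
$str ~~ m/ <num=$r1> \s+ <unit=$r2> /;
say ~$<num>[0];     # 9
say ~$<unit>[0];    # million
```

Note that the captures come back nested one level down rather than as flat $0 and $1, so this is not a drop-in equivalent of the Perl version.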

Now, it is true that there are other ways of doing regex
composition in Raku that work much better, but I don't think
that's really the issue: more than one way to do it is fine as
long as they all actually work.

> I would be more likely to accept <=$pattern> being added as a synonym to
> <pattern=$pattern>.

That could be an improvement. I was thinking something like
<:$pattern>, in analogy to colon pairs.

(At the very least: this alternate way would get documented, and
then we'd have to distinguish between it and the other one, and
explain that it's missing a feature.)



Re: Working with a regex using positional captures stored in a variable

2021-03-13 Thread Brad Gilbert

It makes <…> more consistent precisely because <$pattern> doesn't capture.

If the first character inside is anything other than an alpha it doesn't
capture.
Which is a very simple description of when it captures.

<?…> doesn't capture because of the ｢?｣
<!…> doesn't capture because of the ｢!｣
<.ws> doesn't capture because of the ｢.｣
<&…> doesn't capture because of the ｢&｣
<$pattern> doesn't capture because of the ｢$｣
<$0> doesn't capture because of the ｢$｣
<@a> doesn't capture because of the ｢@｣
<[…]> doesn't capture because of the ｢[｣
<-[…]> doesn't capture because of the ｢-｣
<:Ll> doesn't capture because of the ｢:｣

For most of those, you don't actually want it to capture.
With ｢.｣ the whole point is that it doesn't capture.

 does capture because it starts with an alpha
 does capture because it starts with an alpha

$0 = <$pattern> doesn't capture to $<pattern>, but does capture to $0
$ = <$pattern> captures because of $ =

It would be a mistake to just make <$pattern> capture.
Consistency is perhaps Raku's most important feature.

One of the mottos of Raku is that it is ok to confuse a new programmer, but
it is not ok to confuse an expert.
An expert in Raku understands the deep fundamental ways that Raku is
consistent.
So breaking consistency should be very carefully considered.

In this case, there is very little benefit.
Even worse, you then have to come up with some new syntax to prevent it
from capturing when you don't want it to.
That new syntax wouldn't be as guessable as the current one is. Which again
would confuse experts.

If anyone seriously suggests such a change, I will vehemently fight to
prevent it from happening.

I would be more likely to accept <=$pattern> being added as a synonym for
<pattern=$pattern>.
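
A small sketch of the first-character rule, using only the built-in <alpha> and <ws> subrules plus an interpolated variable. The sample string and the contents of $pattern are assumptions chosen for illustration:

```raku
my $pattern = rx{ \d+ };

# <alpha> starts with an alpha, so it captures under the key "alpha".
# <.alpha>, <.ws> and <$pattern> start with ｢.｣ or ｢$｣, so they match silently.
if 'ab 42' ~~ / <alpha> <.alpha> <.ws> <$pattern> / {
    say ~$/;              # ab 42
    say $/.hash.keys;     # (alpha)
    say $/.list.elems;    # 0
}
```

Only one named capture comes back even though four subrule calls took part in the match.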


Re: Working with a regex using positional captures stored in a variable

2021-03-13 Thread Joseph Brenner

Thanks much for your answer on this.  I think this is the sort of
trick I was looking for:

Brad Gilbert wrote:

> You can put it back in as a named

> > $input ~~ / <pattern=$pattern> /
> ｢9 million｣
>  pattern => ｢9 million｣
>   0 => ｢9｣
>   1 => ｢million｣

That's good enough, I guess, though you need to know about the
issue... is there some reason it shouldn't happen automatically,
using the variable name to label the captures?

I don't think this particular gotcha is all that well
documented, though I guess there's a reference to this being a
"known trap" in the documentation under "Regex interpolation"--
but that's the sort of remark that makes sense only after you know
what it's talking about.

I have to say, my first reaction was something like "if they
couldn't get this working right, why did they put it in?"



Re: Working with a regex using positional captures stored in a variable

2021-03-11 Thread Brad Gilbert

If you interpolate a regex, it is a sub regex.

If you have something like a sigil, then the match data structure gets
thrown away.

You can put it back in as a named capture:

> $input ~~ / <pattern=$pattern> /
｢9 million｣
 pattern => ｢9 million｣
  0 => ｢9｣
  1 => ｢million｣

Or as a numbered:

> $input ~~ / $0 = <$pattern> /
｢9 million｣
 0 => ｢9 million｣
  0 => ｢9｣
  1 => ｢million｣

Or put it in as a lexical regex

> my regex pattern { (\d+) \s+ (\w+) }
> $input ~~ / <pattern> /
｢9 million｣
 pattern => ｢9 million｣
  0 => ｢9｣
  1 => ｢million｣

Or just use it as the whole regex

> $input ~~ $pattern # variable
｢9 million｣
 0 => ｢9｣
 1 => ｢million｣

> $input ~~ &pattern # my regex pattern /…/
｢9 million｣
 0 => ｢9｣
 1 => ｢million｣
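
For reference, a minimal runnable sketch that puts the variable-based options above side by side. It assumes the same $input and $pattern as in the original question, and the alias name "pattern" is simply the one used in this thread:

```raku
my $input   = 'There are 9 million bicycles in beijing.';
my $pattern = rx{ (\d+) \s+ (\w+) };

# Plain interpolation: the match succeeds, the inner captures are dropped.
$input ~~ / <$pattern> /;
say $/.list.elems;      # 0

# Named alias: the sub-match and its captures are kept under <pattern>.
$input ~~ / <pattern=$pattern> /;
say ~$<pattern>[0];     # 9
say ~$<pattern>[1];     # million

# Numbered alias: the sub-match lands in $0, its captures in $0[0] and $0[1].
$input ~~ / $0 = <$pattern> /;
say ~$0[1];             # million

# Used as the whole regex, the captures are top-level again.
$input ~~ $pattern;
say ~$1;                # million
```

In the two alias forms the original captures sit one level down, which is why $0 and $1 alone do not return them.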


Working with a regex using positional captures stored in a variable

2021-03-11 Thread Joseph Brenner

Does this behavior make sense to anyone?  When you've got a regex
with captures in it, the captures don't work if the regex is
stashed in a variable and then interpolated into a regex.

Do capture groups need to be defined at the top level where the
regex is used?

{ # From a code example in the "Parsing" book by Moritz Lenz, p. 48, section 5.2
   my $input = 'There are 9 million bicycles in beijing.';
   if $input ~~ / (\d+) \s+ (\w+) / {
       say $0.^name;  # Match
       say $0;        # ｢9｣
       say $1.^name;  # Match
       say $1;        # ｢million｣
       say $/;
       # ｢9 million｣
       #  0 => ｢9｣
       #  1 => ｢million｣
   }
}

say '---';

{ # Moving the pattern to a variable which we interpolate into the match
   my $input = 'There are 9 million bicycles in beijing.';
   my $pattern = rx{ (\d+) \s+ (\w+) };
   if $input ~~ / <$pattern> / {
       say $0.^name;  # Nil
       say $0;        # Nil
       say $1.^name;  # Nil
       say $1;        # Nil
       say $/;        # ｢9 million｣
   }
}

In the second case, the match clearly works, but it behaves as
though the capture groups aren't there.


   raku --version

   Welcome to 𝐑𝐚𝐤𝐮𝐝𝐨™ v2020.10.
   Implementing the 𝐑𝐚𝐤𝐮™ programming language v6.d.
