perl 6 beginner: regex questions

inform80 Tue, 04 Feb 2014 07:41:54 -0800

Hello,

I have a few questions about regexes.  I'm new to Perl 6
and I don't know Perl 5.  I'm proficient with grep/sed.


I installed Rakudo with the defaults on Linux (Mandriva 2010.1)
and Windows 7. I get identical output for all these questions
with the two latest Rakudo versions on Linux (2014.01 and 2013.12)
and on Windows (2014.01 and 2013.09).

Questions

(1) :global

This works as expected:
  say ~( 'aaa'.match(/a/,:g) )  # a a a
but this doesn't work:
  say ?( 'aaa' ~~ m:g/a/)       # False

Is the second form, m:g/.../, implemented?
If it is, can you give me an example?


(2) capture to lexical variables, external aliasing

From the synopsis:
  You may capture to existing lexical variables;
  such variables may already be visible from an
  outer scope, or may be declared within the
  regex via a :my declaration.
  my $x; / $x = [...] /  # capture to outer lexical $x
  / :my $x; $x = [...] / # capture to our own lexical $x

Here are two examples that don't work. Why not?
Is this feature implemented?

  my $x; 'a' ~~ / $x=. /; say $x;    # (Any)

  'a' ~~ / :my $x; $x = . {say $x} / # [no output]


(3) arrays

None of these four cases work and I don't know why:

my @x; 'abc' ~~ / @x = .+ /; say @x[0]; # (Any)

'abc' ~~ /^ @<x>=.+ $/;      say @<x>;  # Nil

my @x; 'abc' ~~ /@x = (.)+/; # Could not find sub cuid_1_1384375491.61906

'abc' ~~ / @<x> = (.)+ /;    # Could not find sub cuid_1_1391263931.391


(4) submatch

I'm somewhat puzzled about what a submatch is.

This works:
'ab' ~~ / (..) { say ~($0 ~~ /b/) } / # b

But can a submatch be nested directly in a regex, like this?

'ab' ~~ /(..) [$0 ~~ b] }/ # doesn't work


(5) interpolation

How do I get a variable to interpolate into <[]>?

This doesn't work:
  my $x='a';
  say ~('aaa' ~~ /^ <[$x]>+ $/) # [no output]


(6) Why does this enter an infinite loop instead of matching?

'a' ~~ /^ [.*? {say 'ok'}]+ $/


(7) Why don't these produce the same output?

say 'ab'.subst(/ ''  <?before b> /,'c'); # ab
say 'ab'.subst(/ <?> <?before b> /,'c'); # acb
say 'ab'.subst(/     <?before b> /,'c'); # acb

Similar question for these:

'abbbb' ~~ / (b ** 0..3)/; say ~$0; # [no output]
'abbbb' ~~ / (b ** 1..3)/; say ~$0; # bbb

(8) implicit scanning at the beginning
(a) regex/token/rule
    I don't understand the behavior of regex/token/rule in
    these cases, because I thought they didn't scan implicitly:
    'a11' ~~ regex { (\d+) }; say "$0"; # 11
    'a11' ~~ token { (\d+) }; say "$0"; # 11
    'a11' ~~ rule  { (\d+) }; say "$0"; # 11

    The synopsis seems to suggest the opposite:
      ...
          $string ~~ regex { \d+ }
          $string ~~ token { \d+ }
          $string ~~ rule { \d+ }
      and these are equivalent to
          $string ~~ m/^ \d+ $/;
          $string ~~ m/^ \d+: $/;
          $string ~~ m/^ <.ws> \d+: <.ws> $/;

(b) I find this behavior unexpected:
    say ?('ab' ~~ / <?before a> /);             # True
    say ?('ab' ~~ / <?before b> /);             # True
    say ?('ab' ~~ / <?before a> <?before b> /); # False

    In each case there is scanning at the very beginning, but
    in the 3rd case there is no scanning before <?before b>, as
    if <?before a> had the side effect of cancelling it
    in spite of its "?".

    That is, I was expecting "?" to make <?before a> side-effect free.


(9) Please verify if the following section of the
    "Regexes and Rules" synopsis contains an error:

"In contrast, if the outer quantified structure is a
capturing structure (i.e. a subpattern) then..."

my $text = "foo:food fool\nbar:bard barb";
          # $0-----------------------
          # |                        |
          # | $0[0]    $0[1]---      |
          # | |   |    |       |     |
$text ~~ m/ ( (\w+) \: (\w+ \h*)* \n ) ** 2..* /;

The error seems to be that $text lacks a final \n. It works as expected
if you add one.  You should also verify the diagram above; it may
or may not contain an error. I tried the following instead, and it seems to work:

my $text = "foo:food fool\nbar:bard barb\n";
         # $0-------------------------------
         # |                               |
         # |($0[0][0],$0[0][1]) 1st iter.  |
         # |($0[1][0],$0[1][1]) 2nd iter.  |
         # ||   |    |        |            |
$text ~~ m/((\w+) \: (\w+ \h*)* \n) ** 2..*/;
say "$0[0][0]"; # foo
say "$0[0][1]"; # food  fool
say "$0[1][0]"; # bar
say "$0[1][1]"; # bard  barb


Thank you
