Steve Tolkin wrote:

>   { @$appendline =~ s/<in_marker>/</;
> 
> I think this needs a backslash in front of the < symbol, and a space
> after in_marker, i.e. it should be: 
> 
>   { @$appendline =~ s/<in_marker>/\<<sp>/;

Isn't the replacement part of a substitution still a string?
Having the replacement being a rule would mean that you could write
things like:

  s:e/ \* / <[aeiou]> /;

That would replace asterisks with 'any' vowel, without specifying which
vowel to use.  That makes no sense at all.  (Well, not unless it creates
a superposition, but surely Damian can't have intended to introduce
superpositions like this in the core language?  Can he?)

So since pointy brackets aren't special in strings, it doesn't take a
backslash; similarly spaces should just be written as spaces.
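
As a rough analogy only (this is Python's re module, not Perl 6, and
the sample line and the in_marker stand-in are made up for
illustration), the same split between a pattern on the left and a plain
string on the right looks like this:

```python
import re

# Hypothetical diff line; not taken from the exegesis.
line = "- removed line"

# Stand-in for the in_marker rule: a minus followed by one literal space.
in_marker = r"- "

# The replacement is an ordinary string: "<" needs no backslash and the
# space is written as a plain space.
result = re.sub(in_marker, "< ", line, count=1)
print(result)
```

Only the first argument is interpreted as a pattern; the replacement is
inserted as written.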

> rule fileinfo {
>             <out_marker><3> $oldfile:=(\S+) $olddate:=[\h* (\N+?) \h*?] \n
>             <in_marker><3>  $newfile:=(\S+) $newdate:=[\h* (\N+?) \h*?] \n
>         }
> ....
> rule out_marker { \+ <sp> }
> rule in_marker  {  - <sp> }
> 
> The <sp> means a single literal space.  So I think <out_marker><3>
> means look for "+ + + " rather than "+++" which is what is really
> needed to match a Unified diff.

Yes, you look to be right to me.
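
A small sketch of the difference, using Python regexes as stand-ins for
the Perl 6 rules (the variable names and the sample header line are
assumptions made for illustration, not part of the exegesis):

```python
import re

# Stand-ins for the rules: out_marker is a plus followed by one literal
# space; out_marker_symbol is the plus on its own.
out_marker = r"\+ "
out_marker_symbol = r"\+"

# Analogue of <out_marker><3>: the marker, space included, three times.
marker_x3 = re.compile(rf"(?:{out_marker}){{3}}")
# Analogue of <out_marker_symbol><3>: just the symbol, three times.
symbol_x3 = re.compile(rf"(?:{out_marker_symbol}){{3}}")

# Hypothetical header line in the "+++ file date" shape.
header = "+++ lib/Foo.pm\t2002-08-23"

print(bool(marker_x3.match("+ + + ")))  # marker repeated with spaces
print(bool(marker_x3.match(header)))    # real-looking header
print(bool(symbol_x3.match(header)))    # symbol-only repetition
```

Repeating the whole marker only matches the spaced-out "+ + + " form,
while repeating the bare symbol matches the "+++" that a header line
actually starts with.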

> If these are bugs, then what would be the best way to
> fix the code while retaining as much reuse as possible.

This is one way:

  rule out_marker_symbol { \+ }
  rule in_marker_symbol  {  - }

  rule out_marker { <out_marker_symbol> <sp> }
  rule in_marker  { <in_marker_symbol>  <sp> }

  rule fileinfo {
      <out_marker_symbol><3> $oldfile:=(\S+) $olddate:=[\h* (\N+?) \h*?] \n
      <in_marker_symbol><3>  $newfile:=(\S+) $newdate:=[\h* (\N+?) \h*?] \n
  }

[While we're on the subject of typos, it looks like the final definition
of C<fileinfo> in the exegesis has C<$newfile> where C<$oldfile> is
meant.]

But it'd probably be easier just to do:

  rule fileinfo {
      \+\+\+ $oldfile:=(\S+) $olddate:=[\h* (\N+?) \h*?] \n
      ---    $newfile:=(\S+) $newdate:=[\h* (\N+?) \h*?] \n
  }

Repeating the symbols isn't a terrible cop-out, since there's no
reason why the format has to use the same symbols in both places.

Smylers
