Re: possible bugs in Exegesis 5 code for matching patterns

2002-09-21 Thread Smylers

Steve Tolkin wrote:

>   { @$appendline =~ s// 
> I think this needs a backslash in front of the < symbol, and a space
> after in_marker, i.e. it should be: 
> 
>   { @$appendline =~ s//\ /;

That would replace asterisks with 'any' vowel, without specifying which
vowel to use.  That makes no sense at all.  (Well, not unless it creates
a superposition, but surely Damian can't have intended to introduce
superpositions like this in the core language?  Can he?)

So since pointy brackets aren't special in strings, it doesn't take a
backslash; similarly spaces should just be written as spaces.

> rule fileinfo {
> <out_marker><3> $oldfile:=(\S+) $olddate:=[\h* (\N+?) \h*?] \n
> <in_marker><3>  $newfile:=(\S+) $newdate:=[\h* (\N+?) \h*?] \n
> }
> 
> rule out_marker { \+ <sp> }
> rule in_marker  {  - <sp> }
> 
> The <sp> means a single literal space.  So I think <out_marker><3>
> means look for "+ + + " rather than "+++" which is what is really
> needed to match a Unified diff.

Yes, you look to be right to me.
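
For illustration only, the difference can be checked with a rough Python
analogue.  Python's re module stands in for Perl 6 rules here, and the
names are made up for this sketch: repeating a "plus then space" pattern
three times is not the same as matching three pluses.

```python
import re

# Hypothetical stand-ins for the Perl 6 rules, written as Python regexes.
OUT_MARKER = r"\+ "                          # a plus followed by one literal space
REPEATED_MARKER = rf"(?:{OUT_MARKER}){{3}}"  # the whole marker, three times

unified_header = "+++ new.txt"

# The repeated marker wants "+ + + ", so a unified diff header fails.
print(bool(re.match(REPEATED_MARKER, unified_header)))   # False
print(bool(re.match(REPEATED_MARKER, "+ + + new.txt")))  # True

# Repeating only the symbol, then one space, matches the real header.
print(bool(re.match(r"\+{3} ", unified_header)))         # True
```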

> If these are bugs, then what would be the best way to
> fix the code while retaining as much reuse as possible.

This is one way:

  rule out_marker_symbol { \+ }
  rule in_marker_symbol  {  - }

  rule out_marker { <out_marker_symbol> <sp> }
  rule in_marker  { <in_marker_symbol> <sp> }

  rule fileinfo {
  <out_marker_symbol><3> $oldfile:=(\S+) $olddate:=[\h* (\N+?) \h*?] \n
  <in_marker_symbol><3>  $newfile:=(\S+) $newdate:=[\h* (\N+?) \h*?] \n
  }
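
The same factoring can be sketched in Python, purely as an analogy and
not as Perl 6: the bare symbol is defined once, the marker is built from
it, and the header repeats only the symbol.  All the names below, and the
sample file names, are hypothetical.

```python
import re

# Hypothetical names mirroring the rules above; Python regexes, not Perl 6.
OUT_MARKER_SYMBOL = r"\+"
IN_MARKER_SYMBOL = r"-"

# A marker is its symbol plus one literal space.
OUT_MARKER = OUT_MARKER_SYMBOL + " "
IN_MARKER = IN_MARKER_SYMBOL + " "


def header(symbol: str) -> re.Pattern:
    """Three copies of the bare symbol, optional blanks, then a file name."""
    return re.compile(rf"(?:{symbol}){{3}}[ \t]*(?P<file>\S+)")


out_header = header(OUT_MARKER_SYMBOL)
in_header = header(IN_MARKER_SYMBOL)

print(out_header.match("+++ b/lib.pm").group("file"))  # b/lib.pm
print(in_header.match("--- a/lib.pm").group("file"))   # a/lib.pm
print(bool(re.match(OUT_MARKER, "+ added line")))      # True
```

Both the marker and the header reuse the one symbol definition, so
changing the symbol in one place changes it everywhere.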

[While we're on the subject of typos, it looks like the final definition
of C<fileinfo> in the exegesis has C<$newfile> where C<$oldfile> is
meant.]

But it'd probably be easier just to do:

  rule fileinfo {
  \+\+\+ $oldfile:=(\S+) $olddate:=[\h* (\N+?) \h*?] \n
  ---$newfile:=(\S+) $newdate:=[\h* (\N+?) \h*?] \n
  }

It isn't a terrible cop-out to repeat the symbols, since there's no
reason why the format has to use the same symbols in both places.

Smylers



possible bugs in Exegesis 5 code for matching patterns

2002-09-20 Thread Tolkin, Steve

There is a discussion thread about Exegesis 5
http://www.perl.com/pub/a/2002/08/22/exegesis5.html at
http://developers.slashdot.org/developers/02/08/23/1232230.shtml?tid=145
but the signal-to-noise ratio is too low, with sidetracks into
Monty Python etc.

In section "Smarter alternatives" there is this code:
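{ @$appendline =~ s//
I think this needs a backslash in front of the < symbol, and a space
after in_marker, i.e. it should be:

{ @$appendline =~ s//\ /;

rule fileinfo {
<out_marker><3> $oldfile:=(\S+) $olddate:=[\h* (\N+?) \h*?] \n
<in_marker><3>  $newfile:=(\S+) $newdate:=[\h* (\N+?) \h*?] \n
}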

rule out_marker { \+ <sp> }
rule in_marker  {  - <sp> }

The <sp> means a single literal space.
So I think <out_marker><3> means look for "+ + + "
rather than "+++" which is what is really needed
to match a Unified diff.  Similarly for <in_marker><3>.

Or am I missing something?
If these are bugs, then what would be the best way to
fix the code while retaining as much reuse as possible.

 
Hopefully helpfully yours,
Steve
-- 
Steven Tolkin  [EMAIL PROTECTED]  617-563-0516 
Fidelity Investments   82 Devonshire St. V8D Boston MA 02109
There is nothing so practical as a good theory.  Comments are by me, 
not Fidelity Investments, its subsidiaries or affiliates.