Re: (SPAM?) space-separated tokens (FAQ?)

Ron Smith Mon, 27 Jun 2005 10:19:40 -0700

Scott wrote:

Subject: space-separated tokens (FAQ?)
From: Scott <[EMAIL PROTECTED]>
Date: Sun, 26 Jun 2005 12:28:05 -0600
To: recdescent@perl.org



Please forgive me if this is an FAQ. I am new to the world of RecDescent
and grammars in general. I'm assisting on a project for which the
original grammar was developed by someone else; making my own
additions, learning (sometimes) as I go. Not a pro, just dabbling.

I've tried to distill the grammar down for this post, but again, I
apologize if it is too long to digest. If you don't like it, you can
say, in our full grammar: ex R3 up.

I can't for the life of me figure out how to ensure that the tokens
are separated by whitespace, so that, e.g.,

    5 pu 2n mu 3

would not be confused with

    5 pu 2nm

I tried setting $Parse::RecDescent::skip = '[ \t]+' and every line was
bad. Is this separation something that needs to be dealt with in the
rules, and if so how?

TIA,
Scott Swanson

<code>
package SFNParse;

use Parse::RecDescent;

# Note: the grammar is stripped down, so a lot of these don't appear in it.
our %abbrevs = (
        # Hands (X-axis)
        'L'=>'left','R'=>'right','M'=>'middle',        
        # z-axis
        'n'=>'near','f'=>'far', 'a'=>'axial',
        # x-axis
        'o'=>'outer', 'i'=>'inner', 'm'=>'middle',
        # y-axis
        't'=>'top', 'c'=>'center','b'=>'bottom',       
        # Directions
        'dx'=>'distally','px'=>'proximally',
        'lr'=>'from left to right','rl'=>'from right to left',
        'nf'=>'near to far', 'fn'=>'far to near',
        'up'=>'upwards', 'dn'=>'downwards',
        'da'=>'down and away from you','ua'=>'up and away from you',
        'dt'=>'down and towards you','ut'=>'up and towards you',
        'tw'=>'towards you','aw'=>'away from you',
        'io'=>'from centre out','oi'=>'from outside to centre',
        'lor'=>'left over right', 'rol'=>'right over left',
        # Twists
        '>'=>"a half turn away from you",'>>'=>"a full turn away from you",
        '<'=>"a half turn towards you",'<<'=>"a full turn towards you",
        # 'Descriptor' strings (added VS, changed text of CS, anticipating 'mid')
        'DS'=>'diagonal','SS'=>'straight','XS'=>'crossed','CS'=>'centre','TV'=>'transverse',
        'VS'=>'vertical','DM'=>'diamond',
        # Bodyparts (. means 'in the figure')
        '1'=>'thumb', '2'=>'forefinger', '3'=>'middle finger',
        '4'=>'ring finger','5'=>'little finger',
        'P'=>'palm','H'=>'hand','W'=>'wrist','O'=>'mouth','.'=>' ','B'=>'back of hand',
        'T'=>'toe or etc', 'F'=>'fingers',
        # Commands (added hu,hd)
        'pu'=>'pick up','gr'=>'grasp','hu'=>'hook up','hd'=>'hook down',
        'kl'=>'keep loose','ls'=>'let slip','ht'=>'hold tight','ex'=>'extend',','=>'then..',
        'na'=>'navaho','tr'=>'transfer','fl'=>'flip','<=>'=>'exchange',
        'mo'=>'over','mu'=>'under','mt'=>'through',
        'mb'=>'between',
        'rep'=>'repeat', 'rot'=>'rotate','rel'=>'release','ret'=>'return','pnt'=>'point',
        );

my $grammar = q {
move: (rel_move | take ) stringdesc     {"$item[1] $item[2]";}
take: ("pu" | "gr" | "hu" | "hd") direction(?)
        {$SFNParse::abbrevs{$item[1]}." ".$item[2][0];}
rel_move: ("mo" | "mu" | "mt") direction(?)
        {"move" . "$item[2][0] ".$SFNParse::abbrevs{$item[1]};}
stringdesc: strings {$item[1];}
strings: string(s) {join(" / ", @{$item[1]});}
string:         ( mnoose | mstring )
mnoose:  manipulator "N" height(?){"$item[3][0] $item[1]  noose";}
mstring: manipulator height(?) side(?) lat(?) descriptor(?)
        {join(" ", @{$item[2]},@{$item[3]},@{$item[4]},$item[1],@{$item[5]}," string");}
side:   ("n" | "f" | "a") {$SFNParse::abbrevs{$item[1]};}
lat:    ("i" | "o" | "m") {$SFNParse::abbrevs{$item[1]};}
height: ("t" | "c" | "b") {$SFNParse::abbrevs{$item[1]};}
descriptor: ("DS" | "XS" | "CS" | "SS" | "TV" | "VS" | "DM")
        {$SFNParse::abbrevs{$item[1]};}
direction:("px" | "dx" | "lr" | "rl" | "nf" | "fn")
        {$SFNParse::abbrevs{$item[1]};}
manipulator: hand(?) bodypart(s) {"$item[1][0] ".join("/", @{$item[2]});}
hand: ("L" | "R" | "M")   {$SFNParse::abbrevs{$item[1]};}
bodypart:("1" | "2" | "3" | "4" | "5" | "P" | "H")
        {$SFNParse::abbrevs{$item[1]};}
action: manipulator move(s) {$item[1] . " " . join (", ", @{$item[2]});}
fullmove:  action modifier(?)
        {$item[1].(($item[2][0]) ? " and ". $item[2][0] : "");}
modifier: ("ex" | "ret" | "si" | "kl" | ",")  {$SFNParse::abbrevs{$item[1]};}
validstep: code_ref | repeat | fullmove | voice
stepline: /\\\{?/ validstep /\\\}?/ comment(?)
        {"$item[1] $item[2] $item[3] $item[4][0]";}
comment: "#" /.+/ {"($item[2])";}
code_ref: "[" /\w+/ "]"     { "perform $item[2]";}
voice:  "v" /.+/ {'"'.$item[2].'"';}
repeat: "rep" stepnumber "-" stepnumber {"repeat steps $item[2] to $item[4] "}
stepnumber: /\d+/

};

my $parser = new Parse::RecDescent ($grammar) or die "Bad grammar!\n";

while (<DATA>) {
  my $text = $parser->stepline($_) || "Error: $_";
  print "$text\n";
}
__DATA__
1 pu 5f
1 pu 5tfo
1 mu 2n pu 5cni
1 pu 2n mu 5cni # this is wrong
1mu2npu5cni     # Yucch! But it parses....
</code>

Scott, it is "bad form" to post code that you have not tested. I copied the above verbatim into an editor and every line in your test data causes an "error" message. Moreover, there is nothing in your grammar that handles "comments".

Second, if you really want help, telling the potential helper to go "ex R3 Up" themselves is also not particularly helpful. Ex R3 up to you too... "Ex R3 io" for the horse you rode in on.

But assuming that you meant this somewhat in jest and that your mother didn't raise you to have enough sense to be gracious, I'll provide a few bread crumbs.

First, turn on "$::RD_TRACE = 1;". It will show you what the parser is doing while it is doing it. I find this helpful to work through problems.
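
As a minimal sketch of the tracing suggestion: the two-rule grammar below is hypothetical and exists only to give the trace something to report. The variable is set before parsing, and the trace goes to STDERR.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Enable tracing before parsing; the trace is written to STDERR.
$::RD_TRACE = 1;

# Hypothetical grammar (not the one from the post), kept tiny so the
# trace stays readable.
my $parser = Parse::RecDescent->new(q{
    pair: word word  { "$item[1]+$item[2]" }
    word: /[a-z]+/
}) or die "Bad grammar!\n";

my $result = $parser->pair("pu up");
print defined $result ? "parsed: $result\n" : "no match\n";
```

Redirecting STDERR to a file (for example `perl script.pl 2>trace.txt`) makes a long trace easier to search.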

Second, it seems that what you want to parse is inherently ambiguous because there is no obvious difference between "n mu" and "nm" when you discount white space.

Fundamentally you need to decide if white-space is part of your grammar. If so, then what will help to disambiguate this case is a look-ahead that comprehends the white-space. If white space is not truly required, then what you need is still a look-ahead, but something that parses ahead to see if the "other" case is what *will* happen.


However, in general, designing ambiguous grammars is something to avoid.
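
One way to make white space part of the grammar, sketched under assumptions: the grammar below is a much-reduced stand-in for the posted one, and the rule names (step, finger, move, command, string) are hypothetical. Each token is a single regex terminal that ends in the Perl negative look-ahead `(?!\S)`, so a token only matches when it is followed by white space or the end of the input. The default skip pattern still consumes the white space between tokens.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Simplified stand-in for the posted grammar; rule names are hypothetical.
# Every token is one regex terminal ending in (?!\S): the next character
# must be white space or the end of input, so tokens cannot run together
# and a string's position letters cannot be split off by a space.
my $grammar = q{
    step:    finger move(s) /\z/
                 { "$item[1]: " . join(", ", @{$item[2]}) }
    move:    command string(s)
                 { join(" ", $item[1], @{$item[2]}) }
    finger:  /[1-5](?!\S)/
    command: /(?:pu|gr|hu|hd|mo|mu|mt)(?!\S)/
    string:  /[1-5][tcb]?[nfa]?[iom]?(?!\S)/
};

my $parser = Parse::RecDescent->new($grammar) or die "Bad grammar!\n";

sub parse_step {
    my ($line) = @_;
    return $parser->step($line);    # undef when the line does not parse
}

for my $line ("1 pu 2n mu 3", "1 pu 2nm", "1mu2npu5cni") {
    my $result = parse_step($line);
    printf "%-14s => %s\n", $line, defined $result ? $result : "Error";
}
```

With this sketch "2n mu 3" is read as the string "2n" followed by the move "mu 3", "2nm" is read as one string, and the run-together line "1mu2npu5cni" is rejected. The trade-off is that each string is matched as a whole token, so any per-letter translation would have to happen inside the action rather than in separate subrules.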
