Re: (SPAM?) space-separated tokens (FAQ?)

Scott Tue, 28 Jun 2005 22:49:56 -0700

On Tue, Jun 28, 2005 at 09:58:56AM -0700, Ron Smith wrote:
> Well, what I did was simple.  I cut and pasted your code into an editor, 
> no more, no less.
> 
> I got this result:
> 
> Error: 1 pu 5f
> Error: 1 pu 5tfo
> Error: 1 mu 2n pu 5cni
> Error: 1 pu 2n mu 5cni
> Error: 1mu2npu5cni
>


Well, here are my results (where test.pl was the file I cut and pasted
into the original email):

[0 ~/string/spl]$ perl ./test.pl
  thumb pick up  far  little finger  string  
  thumb pick up  top far outer  little finger  string  
  thumb move under near  forefinger  string, pick up  center near inner  little finger  string  
  thumb pick up  near middle  forefinger  string  
  thumb move under near  forefinger  string, pick up  center near inner  little finger  string  (Yucch!)
[0 ~/string/spl]$ perl -v

This is perl, v5.8.0 built for i386-linux-thread-multi
[...]

[0 ~/string/spl]$ perldoc Parse::RecDescent
[...]
VERSION
    This document describes version 1.80 of Parse::RecDescent, released
    January 20, 2001.
[...]

Are we running wildly disparate versions of these, perhaps?

> 
> >
> >>Second, it seems that what you want to parse is inherently ambiguous 
> >>because there is no obvious difference between "n mu" and "nm" when you 
> >>discount white space.
> >
> >
> >Right....
> >
> >
> >>Fundamentally you need to decide if white-space is part of your
> >>grammar. 
> >
> >
> >As is evident from my question, it is.
> 
> No, see that is the point, it is not "evident" as there is more than one 
> way to do it, and one of those ways may not really require white
> space.

We WANT white space. This is the way we want to do it. How do we do
it? That was the question.

> 
> >notjustawholebunchofstuffglommedtogetherinonestream;guessIwaswrong.
> 
> It is interesting that the above is not ambiguous!  While it may be 
> difficult to read, it is clearly not ambiguous.

Hmmm. Is it 'together' or 'to get her'? Who is she? Who's on first?

> 
> Starting from left to right, you have:
> "n"
> but this is not a word.
> Next it could be:
> "no"
> which is a word, so we have a "possibility" here.  But when you accept 
> "no" as a word, the remainder of the sentence starting with "tjust..." 
> cannot be completely and totally broken into words.  So in the end, a 
> production is forced to accept "not" as the first word, simply because 
> it is the only way to allow a production to find a second word.  And so 
> on.  Parsing the above sentence does *not* require white space even 
> though there are specific instances of ambiguity such as "no" vs. "not".

Isthatreallyhowyoureadtext?IfsothenIcanreallysaveawholelotofwearandtearonmythumbsbynotbotheringtoeverpressthespacebaronthiskeyboard!Thankyouverymuchforthishelp,Iwilltreasureitalways.Wasthata'spacebar'ora'spacebaron'?Whocares,asthereisnospace.Wewantspacescanyoutellushoworisitjustnotapossibility?

> 
> >
> >Thanks for the crumbs,
> >Scott.
> >
> 
> If you want more than crumbs, post code that you can prove to yourself 
> can be cut and pasted into an editor such as emacs and run without any 
> modification.  It is hard enough to reverse engineer someone's intent in 
> a piece of code when it works, let alone try to figure it out when it 
> doesn't.

I've proven it to myself. Above run _was_ done in emacs. (Is there any
other editor?) Sorry it doesn't seem to work out on your setup....

If anyone on this list can address the question of how best to attack
input as a series of space-separated tokens
insteadofasteadystreamofcharacters, please let me know.... Seems to be
something that should be an easy thing to do, but what do I know? Can
someone either tell me how to do it or tell me why it can't be done?

Thanks,
Scott.
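
One possible way to approach the question above is to split the input on
white space first and only then classify each whole token, so that a run
like 1mu2npu5cni is rejected instead of being carved up character by
character. The following is a minimal sketch in plain Perl, not
Parse::RecDescent. The token classes (FINGER, VERB, STRING) and their
patterns are assumptions guessed from the sample inputs in this thread,
and tokenize is a hypothetical helper name; the real notation may differ.

```perl
use strict;
use warnings;

# Hypothetical token classes, guessed from the sample inputs in this
# thread (e.g. "1 mu 2n pu 5cni"); the real notation may well differ.
my @TOKEN_TYPES = (
    [ FINGER => qr/\A[1-5]\z/ ],        # a bare digit, e.g. "1"
    [ VERB   => qr/\A(?:pu|mu)\z/ ],    # "pu" or "mu"
    [ STRING => qr/\A[1-5][a-z]+\z/ ],  # digit plus modifier letters, e.g. "5cni"
);

# Split on white space first, then classify each whole token.
# Returns a list of [TYPE, text] pairs; dies on an unrecognised token.
sub tokenize {
    my ($line) = @_;
    my @tokens;
    TOKEN: for my $word (split ' ', $line) {
        for my $type (@TOKEN_TYPES) {
            my ($name, $re) = @$type;
            if ($word =~ $re) {
                push @tokens, [ $name, $word ];
                next TOKEN;
            }
        }
        die "unrecognised token '$word'\n";
    }
    return @tokens;
}

# Show the classified tokens for one spaced input line.
print join(' ', map { "$_->[0]($_->[1])" } tokenize('1 mu 2n pu 5cni')), "\n";
```

Because every pattern is anchored at both ends with \A and \z, a token
either matches as a whole or is refused, which is what makes white space
significant here. The resulting token list could then be handed to
whatever parser is preferred. The Parse::RecDescent documentation also
describes a <skip:...> directive for controlling what is skipped before
each terminal, which may be another route worth checking.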
