Re: about the dot

Chas. Owens Tue, 22 Jan 2008 21:49:23 -0800

On Jan 22, 2008 10:55 PM, Jeff Pang <[EMAIL PROTECTED]> wrote:
snip
> $ perl -le 'print 3.4 .3. 4'
> Number found where operator expected at -e line 1, near "3. 4"
>         (Missing operator before  4?)
> syntax error at -e line 1, near "3. 4"
> Execution of -e aborted due to compilation errors.
>
> and why this can work?
>
> $ perl -le 'print 3.4 .3 .4'
> 3.434
>
> in the first case, what rules let perl think the last dot is not an operator 
> but a part of the float?
snip


The problem here is that 3. is a valid floating point number.  Take a look at

perl -le 'print 3. 4'

and

perl -le 'print 3. + 4'
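
To make the greedy rule concrete, here is a toy sketch in Python (not Perl's actual lexer). The pattern and the name scan_number are assumptions made up for this illustration, and Perl's real numeric-literal grammar is richer. The point is only that once a digit is seen, a following "." is swallowed into the number whether or not more digits come after it.

```python
import re

# Hypothetical pattern for this sketch: digits, then an optional "."
# followed by optional further digits. Not Perl's real grammar.
NUMBER = re.compile(r"\d+(?:\.\d*)?")

def scan_number(text, pos=0):
    """Return (literal, next_pos) for the longest number starting at pos."""
    match = NUMBER.match(text, pos)
    if match is None:
        raise ValueError(f"no number at position {pos}")
    return match.group(), match.end()

print(scan_number("3. 4"))    # ('3.', 2)  the dot is taken by the number
print(scan_number("3. + 4"))  # ('3.', 2)
print(scan_number("3 .4"))    # ('3', 1)   the space ends the number first
```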

Perl does not backtrack to work out that "3. 4" should be three
tokens (what you want) rather than two (what Perl actually does);
that is an artifact of the type of parser* being used.  Let's pretend
we are the parser.

First let's parse "print 3.4 .3. 4"

The parser sees "p", "r", "i", "n", and "t" followed by a space and
knows the first token is OP_PRINT.  Then it sees "3" and knows it is
dealing with a number.  "." is a valid number character, so it gets
appended, giving "3.".  Then it sees the "4" and the space and knows
the token is "3.4".  Next it sees "."; this could be OP_CONCAT or the
start of a number, but since a number is not allowed after a number it
assumes the token is OP_CONCAT.  It sees the "3" next and knows it is
dealing with a number.  The next character, ".", is also a valid
number character, so it is appended.  Then it sees a space, so the
token is "3.".  Finally, it sees "4", but a number is not allowed
directly after another number, so it throws an error.

Now let's parse "print 3.4 .3 .4"

The parser sees "p", "r", "i", "n", and "t" followed by a space and
knows the first token is OP_PRINT.  Then it sees "3" and knows it is
dealing with a number.  "." is a valid number character, so it gets
appended, giving "3.".  Then it sees the "4" and the space and knows
the token is "3.4".  Next it sees "."; this could be OP_CONCAT or the
start of a number, but since a number is not allowed after a number it
assumes the token is OP_CONCAT.  It sees the "3" next and knows it is
dealing with a number.  The next character is a space, so it knows the
token is "3".  The next character is ".", which could be OP_CONCAT or
the start of a number, but a number is not valid here, so the token
must be OP_CONCAT.  Finally, it sees "4", and since there are no more
characters it knows the token is "4".

So in the first case the input becomes the tokens (OP_PRINT, 3.4,
OP_CONCAT, 3., 4), which is not valid Perl because two numbers sit
next to each other with no operator between them, and in the second
case it becomes (OP_PRINT, 3.4, OP_CONCAT, 3, OP_CONCAT, 4), which is
valid Perl.
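
The two walk-throughs can be reproduced with a small tokenizer. This is a minimal sketch in Python, not Perl's real lexer: the names tokenize, after_term, and has_adjacent_numbers are invented for the illustration, and it only understands "print", digits, ".", and spaces. It reads left to right, never backtracks, and treats "." as OP_CONCAT only when the previous token was a complete number.

```python
def tokenize(src):
    """Greedy, non-backtracking toy tokenizer (an illustration only)."""
    tokens = []
    i = 0
    after_term = False  # True right after a complete number token
    while i < len(src):
        ch = src[i]
        if ch.isspace():
            i += 1
        elif src.startswith("print", i):
            tokens.append("OP_PRINT")
            i += len("print")
            after_term = False
        elif ch == "." and after_term:
            # A number cannot follow a number, so this dot is an operator.
            tokens.append("OP_CONCAT")
            i += 1
            after_term = False
        elif ch.isdigit():
            start = i
            while i < len(src) and src[i].isdigit():
                i += 1
            if i < len(src) and src[i] == ".":
                i += 1  # greedy: the dot joins the number, even with no digits after it
                while i < len(src) and src[i].isdigit():
                    i += 1
            tokens.append(src[start:i])
            after_term = True
        else:
            # Anything else (including a leading ".5") is outside this sketch.
            raise ValueError(f"unexpected character {ch!r} at {i}")
    return tokens

def has_adjacent_numbers(tokens):
    """True if two number tokens are side by side with no operator between."""
    is_num = [t[0].isdigit() for t in tokens]
    return any(a and b for a, b in zip(is_num, is_num[1:]))

bad = tokenize("print 3.4 .3. 4")
good = tokenize("print 3.4 .3 .4")
print(bad)   # ['OP_PRINT', '3.4', 'OP_CONCAT', '3.', '4']
print(good)  # ['OP_PRINT', '3.4', 'OP_CONCAT', '3', 'OP_CONCAT', '4']
print(has_adjacent_numbers(bad), has_adjacent_numbers(good))  # True False
```

The tokenizer itself never complains about the first input. The problem only shows up when something checks the token sequence, which matches the "Number found where operator expected" message in the original error.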

If you found any of this interesting you may want to read the Dragon
book** for more information on how parsers work.

* I should probably say lexer there, but I tend to get mixed up
because they tend to be intertwined, so I just say parser.  Pedants
are welcome to correct my terminology.
** Principles of Compiler Design*** by Alfred V. Aho and Jeffrey D. Ullman
*** It looks like it has been updated, gained two authors, and gotten
a name change: Compilers: Principles, Techniques, and Tools by Alfred
V. Aho, Monica S. Lam, Ravi Sethi, and Jeffrey D. Ullman
