I've started writing the lexer, and I based it on a kind of CFG
(context-free grammar). But I'm not sure I understood what a CFG is
for. Is it supposed to let you build a syntax tree, or just a token
list? I can't figure out how to build the syntax tree from it...
Anyway, I wrote a CFG for basic maths:
Digit
    '0'
    '1'
    '2'
    '3'
    '4'
    '5'
    '6'
    '7'
    '8'
    '9'
Operator
    '+'
    '-'
    '*'
    '/'
Expression
    '(' Expression ')'
    Expression Operator Expression
    Digit
Since I didn't want to write a parser just to parse the CFG for my
parser, I translated it directly into JS:
var rules = ( function ( ) {
    // Declare every rule first so the rules can reference one another.
    var Digit = [ ];
    var Expression = [ ];
    var Operator = [ ];
    // The names are only used for debugging output.
    Digit.name = 'Digit';
    Expression.name = 'Expression';
    Operator.name = 'Operator';
    Digit.push(
        [ '0' ],
        [ '1' ],
        [ '2' ],
        [ '3' ],
        [ '4' ],
        [ '5' ],
        [ '6' ],
        [ '7' ],
        [ '8' ],
        [ '9' ]
    );
    Expression.push(
        [ '(', Expression, ')' ],
        [ Expression, Operator, Expression ],
        [ Digit ]
    );
    Operator.push(
        [ '+' ],
        [ '-' ],
        [ '*' ],
        [ '/' ]
    );
    return {
        Digit: Digit,
        Expression: Expression,
        Operator: Operator
    };
} )( );
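To show how this structure can be traversed, here is a minimal sketch that collects every terminal reachable from Expression. The names collectTerminals and seen are hypothetical and not taken from the fiddle, and the grammar is rebuilt in a compact form only so that the snippet runs on its own. The seen list is what stops the walk from following Expression into itself forever.

```javascript
// Compact copy of the grammar above so the snippet is self-contained.
var Digit = '0123456789'.split( '' ).map( function ( c ) {
    return [ c ];
} );
var Operator = [ [ '+' ], [ '-' ], [ '*' ], [ '/' ] ];
var Expression = [ ];
Expression.push(
    [ '(', Expression, ')' ],
    [ Expression, Operator, Expression ],
    [ Digit ]
);

// Hypothetical helper: gathers every terminal string reachable from `rule`.
// `seen` holds the rules already visited, so a rule that refers to itself
// (like Expression) is only expanded once.
function collectTerminals( rule, seen, out ) {
    if ( seen.indexOf( rule ) !== -1 ) {
        return out;
    }
    seen.push( rule );
    rule.forEach( function ( alternative ) {
        alternative.forEach( function ( symbol ) {
            if ( typeof symbol === 'string' ) {
                // Terminal: record it once.
                if ( out.indexOf( symbol ) === -1 ) {
                    out.push( symbol );
                }
            } else {
                // Non-terminal: descend into the referenced rule.
                collectTerminals( symbol, seen, out );
            }
        } );
    } );
    return out;
}

var terminals = collectTerminals( Expression, [ ], [ ] );
console.log( terminals.join( ' ' ) );
// prints: ( ) + - * / 0 1 2 3 4 5 6 7 8 9
```

Without the seen check, the same walk would recurse without end on the first Expression it meets inside Expression.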
The name properties are there because I ran into some infinite loops,
and having names made them easier to debug...
I declare the arrays first so that the rules can reference one
another (and themselves).
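As an illustration of why the names help, here is a small sketch of a printer that shows a rule by name instead of dumping the nested, self-referencing arrays. The function describe is a hypothetical helper, not something from the fiddle, and Digit is shortened to two alternatives to keep the example brief.

```javascript
// Shortened copy of the grammar; Digit only has two alternatives here.
var Digit = [ [ '0' ], [ '1' ] ];
var Operator = [ [ '+' ], [ '-' ], [ '*' ], [ '/' ] ];
var Expression = [ ];
Digit.name = 'Digit';
Operator.name = 'Operator';
Expression.name = 'Expression';
Expression.push(
    [ '(', Expression, ')' ],
    [ Expression, Operator, Expression ],
    [ Digit ]
);

// Hypothetical helper: formats one rule as "Name -> alt | alt | ...".
// Terminals are quoted; referenced rules are shown by their name only,
// so a self-referencing rule is never expanded.
function describe( rule ) {
    var alternatives = rule.map( function ( alternative ) {
        return alternative.map( function ( symbol ) {
            return typeof symbol === 'string' ? "'" + symbol + "'" : symbol.name;
        } ).join( ' ' );
    } );
    return rule.name + ' -> ' + alternatives.join( ' | ' );
}

console.log( describe( Expression ) );
// prints: Expression -> '(' Expression ')' | Expression Operator Expression | Digit
```

Logging the name of the rule being tried at each step is usually enough to spot which rule keeps calling itself.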
Then, with this, I built a lexer:
http://jsfiddle.net/xavierm02/tUCXY/
Is it any good?
Any comments are welcome. The only thing I plan to change is the
j += newTokens.join( '' ).length; line. I probably need it because I
didn't store indexes properly... or maybe because readTokens calls
itself instead of using a loop.
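For comparison, here is a rough sketch of a loop-based lexer in which every token stores its own start and end index, so the position never has to be recomputed by joining tokens back together. This is not the readTokens function from the fiddle: tokenize, the token fields (type, text, start, end) and the type names are all hypothetical, and grouping consecutive digits into one token is an assumption made only to show tokens longer than one character.

```javascript
// Hypothetical loop-based lexer. Each token remembers where it starts and
// ends in the source, so the loop only ever advances a single index `i`.
function tokenize( source ) {
    var tokens = [ ];
    var i = 0;
    while ( i < source.length ) {
        var start = i;
        var c = source.charAt( i );
        if ( c >= '0' && c <= '9' ) {
            // Assumption: consecutive digits form one Number token.
            while ( i < source.length &&
                    source.charAt( i ) >= '0' && source.charAt( i ) <= '9' ) {
                i++;
            }
            tokens.push( { type: 'Number', text: source.slice( start, i ), start: start, end: i } );
        } else if ( '+-*/'.indexOf( c ) !== -1 ) {
            i++;
            tokens.push( { type: 'Operator', text: c, start: start, end: i } );
        } else if ( c === '(' || c === ')' ) {
            i++;
            tokens.push( { type: 'Paren', text: c, start: start, end: i } );
        } else {
            throw new SyntaxError( 'Unexpected character "' + c + '" at index ' + i );
        }
    }
    return tokens;
}

var tokens = tokenize( '(12+3)*4' );
tokens.forEach( function ( token ) {
    console.log( token.type, token.text, token.start, token.end );
} );
```

With the end index stored on each token, the next read simply starts at the previous token's end.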
Thanks in advance for your comments.
On Mon, Oct 3, 2011 at 9:11 PM, Xavier MONTILLET
<[email protected]> wrote:
> Ok thank you :)
>
> On Sat, Oct 1, 2011 at 4:31 PM, Lasse Reichstein
> <[email protected]> wrote:
>> There is nothing in the complexity of '+' that mandates your Abstract Syntax
>> Tree layout. It's still pure syntax at this point.
>> However, any use of the parsed syntax must behave as if the (+ a b c) syntax
>> tree is equivalent to (+ (+ a b) c), so it won't buy you much, and probably
>> just makes using the parsed syntax more complex.
>> If repeated use of the same operator happens a lot, and you can easily
>> handle the pairing later, by all means do make a single variable-sized node.
>> E.g., in the RegExp grammar, adjacent literal characters are really a
>> sequence of Alternatives of Terms (of Atoms), but it happens so often that
>> you probably want to parse /abel.*/ as containing a single text-node with
>> the text "abel".
>> Just don't optimize the abstract syntax without considering how it's going
>> to be used.
>> /L
>>
>> On Sat, Oct 1, 2011 at 12:28 PM, Xavier MONTILLET <[email protected]>
>> wrote:
>>>
>>> Right. I totally forgot that...
>>> So I have to keep only two operands per operator.
>>> Thank you for answering :)
>>>
>>> On Sat, Oct 1, 2011 at 12:48 AM, Poetro <[email protected]> wrote:
>>> > The + operator is not just for numbers and can have side effects, if
>>> > the operands have different types or the operands are not numbers or
>>> > strings (and even in that case). This makes the + operator tricky.
>>> >
>>> >>>> "boo" + 1 + 2
>>> > "boo12"
>>> >>>> 1 + 2 + "boo"
>>> > "3boo"
>>> >
>>> >>>> var a = {toString: function () { return 1; }, valueOf: function () {
>>> >>>> return 2; }}, b = 0; a + b
>>> > 2
>>> >
>>> > --
>>> > Poetro
>>> >
>>> > --
>>> > To view archived discussions from the original JSMentors Mailman list:
>>> > http://www.mail-archive.com/[email protected]/
>>> >
>>> > To search via a non-Google archive, visit here:
>>> > http://www.mail-archive.com/[email protected]/
>>> >
>>> > To unsubscribe from this group, send email to
>>> > [email protected]
>>> >
>>>
>>
>>
>