I figured out how to get something looking like an AST. http://jsfiddle.net/xavierm02/HG6Bx/1/ But comments and ideas on how to throw errors are still welcome.
On Sun, Oct 9, 2011 at 6:10 PM, Xavier MONTILLET <[email protected]> wrote: > And how do you throw errors with a CFG ? > I mean JSLint does throw errors but the Lexer doesn't use a CFG... > If you use a CFG, how do you throw errors ? You take the longest / > deepest "possibility" that had a meaning ? > > On Sun, Oct 9, 2011 at 4:17 PM, Xavier MONTILLET > <[email protected]> wrote: >> Btw, I have no knowledge in Computer Science at all. So if you know >> tutorials / books explaining how to build a parser without needed too >> much knowledge, please tell me. Because every single tutorial I tried >> to read started with strange formulas I couldn't understand and, >> instead of providing pseudo-code, just described what the code should >> do, gave mathematical methods which I couldn't understand to check >> whether the result was going to be what we wanted or not and so on... >> >> (I know JavaScript quite well because I started doing websites in >> high-school but since I left High-school (last year), I've only been >> doing Maths and Physics in "classes preparatoire" (a France-specific >> system where you do a lot of Maths / Physics for two years in order to >> prepare "contests" which, depending on what your ranking is, will >> allow you to choose a given school or not. And I won't be doing any >> computer science until then...) >> >> On Sun, Oct 9, 2011 at 4:05 PM, Xavier MONTILLET >> <[email protected]> wrote: >>> I've started writing the Lexer and for it I used some kind of CFG. >>> But I'm not sure whether I understood the aim of those things >>> properly. I mean is the CFG supposed to let you build a syntax tree or >>> just a token list ? Because I can't figure out how to build the syntax >>> tree from it... >>> Anyway, I wrote a CFG for basic Maths : >>> >>> Digit >>> '0' >>> '1' >>> '2' >>> '3' >>> '4' >>> '5' >>> '6' >>> '7' >>> '8' >>> '9' >>> >>> Operator >>> '+' >>> '-' >>> '*' >>> '/' >>> >>> Expression >>> '(' Expression ')' >>> Expression Operator Expression >>> Number >>> >>> And as I didn't want to bother writing a parser to parse the CFG for >>> my parser, I transformed it in JS : >>> >>> var rules = ( function ( ) { >>> var Digit = [ ]; >>> var Expression = [ ]; >>> var Operator = [ ]; >>> Digit.name = 'Digit'; >>> Expression.name = 'Expression'; >>> Operator.name = 'Operator'; >>> Digit.push( >>> [ '0' ], >>> [ '1' ], >>> [ '2' ], >>> [ '3' ], >>> [ '4' ], >>> [ '5' ], >>> [ '6' ], >>> [ '7' ], >>> [ '8' ], >>> [ '9' ] >>> ); >>> Expression.push( >>> [ '(', Expression, ')' ], >>> [ Expression, Operator, Expression ], >>> [ Digit ] >>> ); >>> Operator.push( >>> [ '+' ], >>> [ '-' ], >>> [ '*' ], >>> [ '/' ] >>> ); >>> return { >>> Digit: Digit, >>> Expression: Expression, >>> Operator: Operator >>> }; >>> } )( ); >>> >>> The name properties are here because I had some infinite loops so it >>> was easier to debug with it... >>> And I define the arrays before so that I can reference them in others. >>> >>> Then, with this, I built a lexer: >>> >>> http://jsfiddle.net/xavierm02/tUCXY/ >>> >>> Is it good or not ? >>> Any comment would be welcome. The only thing I'm planning on changing >>> is the j += newTokens.join( '' ).length; I probably need it because I >>> didn't store indexes properly... Or maybe because I use readTokens >>> that calls itself instead of a loop. >>> >>> Thanks in advance for your comments. >>> >>> On Mon, Oct 3, 2011 at 9:11 PM, Xavier MONTILLET >>> <[email protected]> wrote: >>>> Ok thank you :) >>>> >>>> On Sat, Oct 1, 2011 at 4:31 PM, Lasse Reichstein >>>> <[email protected]> wrote: >>>>> There is nothing in the complexity of '+' that mandates your Abstract >>>>> Syntax >>>>> Tree layout. It's still pure syntax at this point. >>>>> However, any use of the parsed syntax must behave as if the (+ a b c) >>>>> syntax >>>>> tree is equivalent to (+ (+ a b) c), so it won't buy you much, and >>>>> probably >>>>> just makes using the parsed syntax more complex. >>>>> If repeated use of the same operator happens a lot, and you can easily >>>>> handle the pairing later, by all means do make a single vairable-sized >>>>> node. >>>>> E.g., in the RegExp grammar, adjacent literal characters are really a >>>>> sequence of Alternatives of Terms (of Atoms), but it happens so often that >>>>> you probably want to parse /abel.*/ as containing a single text-node with >>>>> the text "abel". >>>>> Just don't optimize the abstract syntax without considering how it's going >>>>> to be used. >>>>> /L >>>>> >>>>> On Sat, Oct 1, 2011 at 12:28 PM, Xavier MONTILLET >>>>> <[email protected]> >>>>> wrote: >>>>>> >>>>>> Right. I totally forgot that... >>>>>> So I have to keep only two operands per operator. >>>>>> Thank you for answering :) >>>>>> >>>>>> On Sat, Oct 1, 2011 at 12:48 AM, Poetro <[email protected]> wrote: >>>>>> > The + operator is not just for numbers and can have sideeffects, if >>>>>> > the operands have different types or the operands are not numbers or >>>>>> > strings (and even in that case). This makes the + operator tricky. >>>>>> > >>>>>> >>>> "boo" + 1 + 2 >>>>>> > "boo12" >>>>>> >>>> 1 + 2 + "boo" >>>>>> > "3boo" >>>>>> > >>>>>> >>>> var a = {toString: function () { return 1; }, valueOf: function () { >>>>>> >>>> return 2; }}, b = 0; a + b >>>>>> > 2 >>>>>> > >>>>>> > -- >>>>>> > Poetro >>>>>> > >>>>>> > -- >>>>>> > To view archived discussions from the original JSMentors Mailman list: >>>>>> > http://www.mail-archive.com/[email protected]/ >>>>>> > >>>>>> > To search via a non-Google archive, visit here: >>>>>> > http://www.mail-archive.com/[email protected]/ >>>>>> > >>>>>> > To unsubscribe from this group, send email to >>>>>> > [email protected] >>>>>> > >>>>>> >>>>>> -- >>>>>> To view archived discussions from the original JSMentors Mailman list: >>>>>> http://www.mail-archive.com/[email protected]/ >>>>>> >>>>>> To search via a non-Google archive, visit here: >>>>>> http://www.mail-archive.com/[email protected]/ >>>>>> >>>>>> To unsubscribe from this group, send email to >>>>>> [email protected] >>>>> >>>>> -- >>>>> To view archived discussions from the original JSMentors Mailman list: >>>>> http://www.mail-archive.com/[email protected]/ >>>>> >>>>> To search via a non-Google archive, visit here: >>>>> http://www.mail-archive.com/[email protected]/ >>>>> >>>>> To unsubscribe from this group, send email to >>>>> [email protected] >>>>> >>>> >>> >> > -- To view archived discussions from the original JSMentors Mailman list: http://www.mail-archive.com/[email protected]/ To search via a non-Google archive, visit here: http://www.mail-archive.com/[email protected]/ To unsubscribe from this group, send email to [email protected]
