Btw, I have no knowledge in Computer Science at all. So if you know tutorials / books explaining how to build a parser without needed too much knowledge, please tell me. Because every single tutorial I tried to read started with strange formulas I couldn't understand and, instead of providing pseudo-code, just described what the code should do, gave mathematical methods which I couldn't understand to check whether the result was going to be what we wanted or not and so on...
(I know JavaScript quite well because I started doing websites in high-school but since I left High-school (last year), I've only been doing Maths and Physics in "classes preparatoire" (a France-specific system where you do a lot of Maths / Physics for two years in order to prepare "contests" which, depending on what your ranking is, will allow you to choose a given school or not. And I won't be doing any computer science until then...) On Sun, Oct 9, 2011 at 4:05 PM, Xavier MONTILLET <[email protected]> wrote: > I've started writing the Lexer and for it I used some kind of CFG. > But I'm not sure whether I understood the aim of those things > properly. I mean is the CFG supposed to let you build a syntax tree or > just a token list ? Because I can't figure out how to build the syntax > tree from it... > Anyway, I wrote a CFG for basic Maths : > > Digit > '0' > '1' > '2' > '3' > '4' > '5' > '6' > '7' > '8' > '9' > > Operator > '+' > '-' > '*' > '/' > > Expression > '(' Expression ')' > Expression Operator Expression > Number > > And as I didn't want to bother writing a parser to parse the CFG for > my parser, I transformed it in JS : > > var rules = ( function ( ) { > var Digit = [ ]; > var Expression = [ ]; > var Operator = [ ]; > Digit.name = 'Digit'; > Expression.name = 'Expression'; > Operator.name = 'Operator'; > Digit.push( > [ '0' ], > [ '1' ], > [ '2' ], > [ '3' ], > [ '4' ], > [ '5' ], > [ '6' ], > [ '7' ], > [ '8' ], > [ '9' ] > ); > Expression.push( > [ '(', Expression, ')' ], > [ Expression, Operator, Expression ], > [ Digit ] > ); > Operator.push( > [ '+' ], > [ '-' ], > [ '*' ], > [ '/' ] > ); > return { > Digit: Digit, > Expression: Expression, > Operator: Operator > }; > } )( ); > > The name properties are here because I had some infinite loops so it > was easier to debug with it... > And I define the arrays before so that I can reference them in others. > > Then, with this, I built a lexer: > > http://jsfiddle.net/xavierm02/tUCXY/ > > Is it good or not ? > Any comment would be welcome. The only thing I'm planning on changing > is the j += newTokens.join( '' ).length; I probably need it because I > didn't store indexes properly... Or maybe because I use readTokens > that calls itself instead of a loop. > > Thanks in advance for your comments. > > On Mon, Oct 3, 2011 at 9:11 PM, Xavier MONTILLET > <[email protected]> wrote: >> Ok thank you :) >> >> On Sat, Oct 1, 2011 at 4:31 PM, Lasse Reichstein >> <[email protected]> wrote: >>> There is nothing in the complexity of '+' that mandates your Abstract Syntax >>> Tree layout. It's still pure syntax at this point. >>> However, any use of the parsed syntax must behave as if the (+ a b c) syntax >>> tree is equivalent to (+ (+ a b) c), so it won't buy you much, and probably >>> just makes using the parsed syntax more complex. >>> If repeated use of the same operator happens a lot, and you can easily >>> handle the pairing later, by all means do make a single vairable-sized node. >>> E.g., in the RegExp grammar, adjacent literal characters are really a >>> sequence of Alternatives of Terms (of Atoms), but it happens so often that >>> you probably want to parse /abel.*/ as containing a single text-node with >>> the text "abel". >>> Just don't optimize the abstract syntax without considering how it's going >>> to be used. >>> /L >>> >>> On Sat, Oct 1, 2011 at 12:28 PM, Xavier MONTILLET <[email protected]> >>> wrote: >>>> >>>> Right. I totally forgot that... >>>> So I have to keep only two operands per operator. >>>> Thank you for answering :) >>>> >>>> On Sat, Oct 1, 2011 at 12:48 AM, Poetro <[email protected]> wrote: >>>> > The + operator is not just for numbers and can have sideeffects, if >>>> > the operands have different types or the operands are not numbers or >>>> > strings (and even in that case). This makes the + operator tricky. >>>> > >>>> >>>> "boo" + 1 + 2 >>>> > "boo12" >>>> >>>> 1 + 2 + "boo" >>>> > "3boo" >>>> > >>>> >>>> var a = {toString: function () { return 1; }, valueOf: function () { >>>> >>>> return 2; }}, b = 0; a + b >>>> > 2 >>>> > >>>> > -- >>>> > Poetro >>>> > >>>> > -- >>>> > To view archived discussions from the original JSMentors Mailman list: >>>> > http://www.mail-archive.com/[email protected]/ >>>> > >>>> > To search via a non-Google archive, visit here: >>>> > http://www.mail-archive.com/[email protected]/ >>>> > >>>> > To unsubscribe from this group, send email to >>>> > [email protected] >>>> > >>>> >>>> -- >>>> To view archived discussions from the original JSMentors Mailman list: >>>> http://www.mail-archive.com/[email protected]/ >>>> >>>> To search via a non-Google archive, visit here: >>>> http://www.mail-archive.com/[email protected]/ >>>> >>>> To unsubscribe from this group, send email to >>>> [email protected] >>> >>> -- >>> To view archived discussions from the original JSMentors Mailman list: >>> http://www.mail-archive.com/[email protected]/ >>> >>> To search via a non-Google archive, visit here: >>> http://www.mail-archive.com/[email protected]/ >>> >>> To unsubscribe from this group, send email to >>> [email protected] >>> >> > -- To view archived discussions from the original JSMentors Mailman list: http://www.mail-archive.com/[email protected]/ To search via a non-Google archive, visit here: http://www.mail-archive.com/[email protected]/ To unsubscribe from this group, send email to [email protected]
