My parser fails if you add () somewhere... And If I replace AdditiveExpression and so on by Expression, it just stops working...
On Tue, Oct 11, 2011 at 11:57 PM, Xavier MONTILLET <[email protected]> wrote: > Apparently, it is a known problem and the solution is to write the CFG > differently... > http://www.google.fr/url?sa=t&source=web&cd=7&ved=0CEsQFjAG&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.3.9583%26rep%3Drep1%26type%3Dpdf&ei=uLmUTqzQMcfrsgaZv6C-BQ&usg=AFQjCNHI7EfNrHlqmZYMIXEDhxibfnrZcg&sig2=Zv7847-YG0CxbfnMbEc4Lw > > On Tue, Oct 11, 2011 at 8:37 PM, Xavier MONTILLET > <[email protected]> wrote: >> I guess I could try to add a loop that increments a variable >> containing the number of time it should read the same rule while it >> doesn't match the whole string and the number of characters matched >> increases with the variable... >> But it looks like going around the problem without really fixing it... >> >> On Tue, Oct 11, 2011 at 7:57 PM, Xavier MONTILLET >> <[email protected]> wrote: >>> I just reformated the whole thing and it kind of works. >>> http://jsfiddle.net/xavierm02/ndDVa/1/ >>> I get a CST and I think I can manage to turn it into an AST. >>> But I still have a problem with the CFG... As shown, it works >>> >>> But if I change the order of the things to match it fails. At last on >>> recursive rules. >>> Here is an example : >>> >>> Text >>> Text Letter >>> >>> The code about won't work with my "parser" but if i rewrite it to >>> avoid having the recursive rule first, it works: >>> >>> Text >>> Letter Text >>> >>> The above example works... >>> >>> But I'm quite sure I won't always be able to avoid having a recursive >>> rule as first rule so... Do you guys see anything I could do to avoid >>> it? >>> >>> I tried allowing a given number of recursion before simply returning >>> false when seeing a recursive rule but if you have more than that >>> number, it fails :/ >>> And it I don't do it at all, I get an infinite loop. >>> >>> Here's the exact same code except I changed the order in the >>> declaration of MultiplicativeExpression or AdditiveExpression. >>> http://jsfiddle.net/xavierm02/ndDVa/2/ >>> >>> On Sun, Oct 9, 2011 at 8:16 PM, Xavier MONTILLET >>> <[email protected]> wrote: >>>> I figured out how to get something looking like an AST. >>>> http://jsfiddle.net/xavierm02/HG6Bx/1/ >>>> But comments and ideas on how to throw errors are still welcome. >>>> >>>> On Sun, Oct 9, 2011 at 6:10 PM, Xavier MONTILLET >>>> <[email protected]> wrote: >>>>> And how do you throw errors with a CFG ? >>>>> I mean JSLint does throw errors but the Lexer doesn't use a CFG... >>>>> If you use a CFG, how do you throw errors ? You take the longest / >>>>> deepest "possibility" that had a meaning ? >>>>> >>>>> On Sun, Oct 9, 2011 at 4:17 PM, Xavier MONTILLET >>>>> <[email protected]> wrote: >>>>>> Btw, I have no knowledge in Computer Science at all. So if you know >>>>>> tutorials / books explaining how to build a parser without needed too >>>>>> much knowledge, please tell me. Because every single tutorial I tried >>>>>> to read started with strange formulas I couldn't understand and, >>>>>> instead of providing pseudo-code, just described what the code should >>>>>> do, gave mathematical methods which I couldn't understand to check >>>>>> whether the result was going to be what we wanted or not and so on... >>>>>> >>>>>> (I know JavaScript quite well because I started doing websites in >>>>>> high-school but since I left High-school (last year), I've only been >>>>>> doing Maths and Physics in "classes preparatoire" (a France-specific >>>>>> system where you do a lot of Maths / Physics for two years in order to >>>>>> prepare "contests" which, depending on what your ranking is, will >>>>>> allow you to choose a given school or not. And I won't be doing any >>>>>> computer science until then...) >>>>>> >>>>>> On Sun, Oct 9, 2011 at 4:05 PM, Xavier MONTILLET >>>>>> <[email protected]> wrote: >>>>>>> I've started writing the Lexer and for it I used some kind of CFG. >>>>>>> But I'm not sure whether I understood the aim of those things >>>>>>> properly. I mean is the CFG supposed to let you build a syntax tree or >>>>>>> just a token list ? Because I can't figure out how to build the syntax >>>>>>> tree from it... >>>>>>> Anyway, I wrote a CFG for basic Maths : >>>>>>> >>>>>>> Digit >>>>>>> '0' >>>>>>> '1' >>>>>>> '2' >>>>>>> '3' >>>>>>> '4' >>>>>>> '5' >>>>>>> '6' >>>>>>> '7' >>>>>>> '8' >>>>>>> '9' >>>>>>> >>>>>>> Operator >>>>>>> '+' >>>>>>> '-' >>>>>>> '*' >>>>>>> '/' >>>>>>> >>>>>>> Expression >>>>>>> '(' Expression ')' >>>>>>> Expression Operator Expression >>>>>>> Number >>>>>>> >>>>>>> And as I didn't want to bother writing a parser to parse the CFG for >>>>>>> my parser, I transformed it in JS : >>>>>>> >>>>>>> var rules = ( function ( ) { >>>>>>> var Digit = [ ]; >>>>>>> var Expression = [ ]; >>>>>>> var Operator = [ ]; >>>>>>> Digit.name = 'Digit'; >>>>>>> Expression.name = 'Expression'; >>>>>>> Operator.name = 'Operator'; >>>>>>> Digit.push( >>>>>>> [ '0' ], >>>>>>> [ '1' ], >>>>>>> [ '2' ], >>>>>>> [ '3' ], >>>>>>> [ '4' ], >>>>>>> [ '5' ], >>>>>>> [ '6' ], >>>>>>> [ '7' ], >>>>>>> [ '8' ], >>>>>>> [ '9' ] >>>>>>> ); >>>>>>> Expression.push( >>>>>>> [ '(', Expression, ')' ], >>>>>>> [ Expression, Operator, Expression ], >>>>>>> [ Digit ] >>>>>>> ); >>>>>>> Operator.push( >>>>>>> [ '+' ], >>>>>>> [ '-' ], >>>>>>> [ '*' ], >>>>>>> [ '/' ] >>>>>>> ); >>>>>>> return { >>>>>>> Digit: Digit, >>>>>>> Expression: Expression, >>>>>>> Operator: Operator >>>>>>> }; >>>>>>> } )( ); >>>>>>> >>>>>>> The name properties are here because I had some infinite loops so it >>>>>>> was easier to debug with it... >>>>>>> And I define the arrays before so that I can reference them in others. >>>>>>> >>>>>>> Then, with this, I built a lexer: >>>>>>> >>>>>>> http://jsfiddle.net/xavierm02/tUCXY/ >>>>>>> >>>>>>> Is it good or not ? >>>>>>> Any comment would be welcome. The only thing I'm planning on changing >>>>>>> is the j += newTokens.join( '' ).length; I probably need it because I >>>>>>> didn't store indexes properly... Or maybe because I use readTokens >>>>>>> that calls itself instead of a loop. >>>>>>> >>>>>>> Thanks in advance for your comments. >>>>>>> >>>>>>> On Mon, Oct 3, 2011 at 9:11 PM, Xavier MONTILLET >>>>>>> <[email protected]> wrote: >>>>>>>> Ok thank you :) >>>>>>>> >>>>>>>> On Sat, Oct 1, 2011 at 4:31 PM, Lasse Reichstein >>>>>>>> <[email protected]> wrote: >>>>>>>>> There is nothing in the complexity of '+' that mandates your Abstract >>>>>>>>> Syntax >>>>>>>>> Tree layout. It's still pure syntax at this point. >>>>>>>>> However, any use of the parsed syntax must behave as if the (+ a b c) >>>>>>>>> syntax >>>>>>>>> tree is equivalent to (+ (+ a b) c), so it won't buy you much, and >>>>>>>>> probably >>>>>>>>> just makes using the parsed syntax more complex. >>>>>>>>> If repeated use of the same operator happens a lot, and you can easily >>>>>>>>> handle the pairing later, by all means do make a single >>>>>>>>> vairable-sized node. >>>>>>>>> E.g., in the RegExp grammar, adjacent literal characters are really a >>>>>>>>> sequence of Alternatives of Terms (of Atoms), but it happens so often >>>>>>>>> that >>>>>>>>> you probably want to parse /abel.*/ as containing a single text-node >>>>>>>>> with >>>>>>>>> the text "abel". >>>>>>>>> Just don't optimize the abstract syntax without considering how it's >>>>>>>>> going >>>>>>>>> to be used. >>>>>>>>> /L >>>>>>>>> >>>>>>>>> On Sat, Oct 1, 2011 at 12:28 PM, Xavier MONTILLET >>>>>>>>> <[email protected]> >>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> Right. I totally forgot that... >>>>>>>>>> So I have to keep only two operands per operator. >>>>>>>>>> Thank you for answering :) >>>>>>>>>> >>>>>>>>>> On Sat, Oct 1, 2011 at 12:48 AM, Poetro <[email protected]> wrote: >>>>>>>>>> > The + operator is not just for numbers and can have sideeffects, if >>>>>>>>>> > the operands have different types or the operands are not numbers >>>>>>>>>> > or >>>>>>>>>> > strings (and even in that case). This makes the + operator tricky. >>>>>>>>>> > >>>>>>>>>> >>>> "boo" + 1 + 2 >>>>>>>>>> > "boo12" >>>>>>>>>> >>>> 1 + 2 + "boo" >>>>>>>>>> > "3boo" >>>>>>>>>> > >>>>>>>>>> >>>> var a = {toString: function () { return 1; }, valueOf: function >>>>>>>>>> >>>> () { >>>>>>>>>> >>>> return 2; }}, b = 0; a + b >>>>>>>>>> > 2 >>>>>>>>>> > >>>>>>>>>> > -- >>>>>>>>>> > Poetro >>>>>>>>>> > >>>>>>>>>> > -- >>>>>>>>>> > To view archived discussions from the original JSMentors Mailman >>>>>>>>>> > list: >>>>>>>>>> > http://www.mail-archive.com/[email protected]/ >>>>>>>>>> > >>>>>>>>>> > To search via a non-Google archive, visit here: >>>>>>>>>> > http://www.mail-archive.com/[email protected]/ >>>>>>>>>> > >>>>>>>>>> > To unsubscribe from this group, send email to >>>>>>>>>> > [email protected] >>>>>>>>>> > >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> To view archived discussions from the original JSMentors Mailman >>>>>>>>>> list: >>>>>>>>>> http://www.mail-archive.com/[email protected]/ >>>>>>>>>> >>>>>>>>>> To search via a non-Google archive, visit here: >>>>>>>>>> http://www.mail-archive.com/[email protected]/ >>>>>>>>>> >>>>>>>>>> To unsubscribe from this group, send email to >>>>>>>>>> [email protected] >>>>>>>>> >>>>>>>>> -- >>>>>>>>> To view archived discussions from the original JSMentors Mailman list: >>>>>>>>> http://www.mail-archive.com/[email protected]/ >>>>>>>>> >>>>>>>>> To search via a non-Google archive, visit here: >>>>>>>>> http://www.mail-archive.com/[email protected]/ >>>>>>>>> >>>>>>>>> To unsubscribe from this group, send email to >>>>>>>>> [email protected] >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> > -- To view archived discussions from the original JSMentors Mailman list: http://www.mail-archive.com/[email protected]/ To search via a non-Google archive, visit here: http://www.mail-archive.com/[email protected]/ To unsubscribe from this group, send email to [email protected]
