My parser fails if you add () somewhere... And If I replace
AdditiveExpression and so on by Expression, it just stops working...

On Tue, Oct 11, 2011 at 11:57 PM, Xavier MONTILLET
<[email protected]> wrote:
> Apparently, it is a known problem and the solution is to write the CFG
> differently...
> http://www.google.fr/url?sa=t&source=web&cd=7&ved=0CEsQFjAG&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.3.9583%26rep%3Drep1%26type%3Dpdf&ei=uLmUTqzQMcfrsgaZv6C-BQ&usg=AFQjCNHI7EfNrHlqmZYMIXEDhxibfnrZcg&sig2=Zv7847-YG0CxbfnMbEc4Lw
>
> On Tue, Oct 11, 2011 at 8:37 PM, Xavier MONTILLET
> <[email protected]> wrote:
>> I guess I could try to add a loop that increments a variable
>> containing the number of time it should read the same rule while it
>> doesn't match the whole string and the number of characters matched
>> increases with the variable...
>> But it looks like going around the problem without really fixing it...
>>
>> On Tue, Oct 11, 2011 at 7:57 PM, Xavier MONTILLET
>> <[email protected]> wrote:
>>> I just reformated the whole thing and it kind of works.
>>> http://jsfiddle.net/xavierm02/ndDVa/1/
>>> I get a CST and I think I can manage to turn it into an AST.
>>> But I still have a problem with the CFG... As shown, it works
>>>
>>> But if I change the order of the things to match it fails. At last on
>>> recursive rules.
>>> Here is an example :
>>>
>>> Text
>>>    Text Letter
>>>
>>> The code about won't work with my "parser" but if i rewrite it to
>>> avoid having the recursive rule first, it works:
>>>
>>> Text
>>>    Letter Text
>>>
>>> The above example works...
>>>
>>> But I'm quite sure I won't always be able to avoid having a recursive
>>> rule as first rule so... Do you guys see anything I could do to avoid
>>> it?
>>>
>>> I tried allowing a given number of recursion before simply returning
>>> false when seeing a recursive rule but if you have more than that
>>> number, it fails :/
>>> And it I don't do it at all, I get an infinite loop.
>>>
>>> Here's the exact same code except I changed the order in the
>>> declaration of MultiplicativeExpression or AdditiveExpression.
>>> http://jsfiddle.net/xavierm02/ndDVa/2/
>>>
>>> On Sun, Oct 9, 2011 at 8:16 PM, Xavier MONTILLET
>>> <[email protected]> wrote:
>>>> I figured out how to get something looking like an AST.
>>>> http://jsfiddle.net/xavierm02/HG6Bx/1/
>>>> But comments and ideas on how to throw errors are still welcome.
>>>>
>>>> On Sun, Oct 9, 2011 at 6:10 PM, Xavier MONTILLET
>>>> <[email protected]> wrote:
>>>>> And how do you throw errors with a CFG ?
>>>>> I mean JSLint does throw errors but the Lexer doesn't use a CFG...
>>>>> If you use a CFG, how do you throw errors ? You take the longest /
>>>>> deepest "possibility" that had a meaning ?
>>>>>
>>>>> On Sun, Oct 9, 2011 at 4:17 PM, Xavier MONTILLET
>>>>> <[email protected]> wrote:
>>>>>> Btw, I have no knowledge in Computer Science at all. So if you know
>>>>>> tutorials / books explaining how to build a parser without needed too
>>>>>> much knowledge, please tell me. Because every single tutorial I tried
>>>>>> to read started with strange formulas I couldn't understand and,
>>>>>> instead of providing pseudo-code, just described what the code should
>>>>>> do, gave mathematical methods which I couldn't understand to check
>>>>>> whether the result was going to be what we wanted or not and so on...
>>>>>>
>>>>>> (I know JavaScript quite well because I started doing websites in
>>>>>> high-school but since I left High-school (last year), I've only been
>>>>>> doing Maths and Physics in "classes preparatoire" (a France-specific
>>>>>> system where you do a lot of Maths / Physics for two years in order to
>>>>>> prepare "contests" which, depending on what your ranking is, will
>>>>>> allow you to choose a given school or not. And I won't be doing any
>>>>>> computer science until then...)
>>>>>>
>>>>>> On Sun, Oct 9, 2011 at 4:05 PM, Xavier MONTILLET
>>>>>> <[email protected]> wrote:
>>>>>>> I've started writing the Lexer and for it I used some kind of CFG.
>>>>>>> But I'm not sure whether I understood the aim of those things
>>>>>>> properly. I mean is the CFG supposed to let you build a syntax tree or
>>>>>>> just a token list ? Because I can't figure out how to build the syntax
>>>>>>> tree from it...
>>>>>>> Anyway, I wrote a CFG for basic Maths :
>>>>>>>
>>>>>>> Digit
>>>>>>>    '0'
>>>>>>>    '1'
>>>>>>>    '2'
>>>>>>>    '3'
>>>>>>>    '4'
>>>>>>>    '5'
>>>>>>>    '6'
>>>>>>>    '7'
>>>>>>>    '8'
>>>>>>>    '9'
>>>>>>>
>>>>>>> Operator
>>>>>>>    '+'
>>>>>>>    '-'
>>>>>>>    '*'
>>>>>>>    '/'
>>>>>>>
>>>>>>> Expression
>>>>>>>    '(' Expression ')'
>>>>>>>    Expression Operator Expression
>>>>>>>    Number
>>>>>>>
>>>>>>> And as I didn't want to bother writing a parser to parse the CFG for
>>>>>>> my parser, I transformed it in JS :
>>>>>>>
>>>>>>> var rules = ( function ( ) {
>>>>>>>    var Digit = [ ];
>>>>>>>    var Expression = [ ];
>>>>>>>    var Operator = [ ];
>>>>>>>    Digit.name = 'Digit';
>>>>>>>    Expression.name = 'Expression';
>>>>>>>    Operator.name = 'Operator';
>>>>>>>    Digit.push(
>>>>>>>        [ '0' ],
>>>>>>>        [ '1' ],
>>>>>>>        [ '2' ],
>>>>>>>        [ '3' ],
>>>>>>>        [ '4' ],
>>>>>>>        [ '5' ],
>>>>>>>        [ '6' ],
>>>>>>>        [ '7' ],
>>>>>>>        [ '8' ],
>>>>>>>        [ '9' ]
>>>>>>>    );
>>>>>>>    Expression.push(
>>>>>>>        [ '(', Expression, ')' ],
>>>>>>>        [ Expression, Operator, Expression ],
>>>>>>>        [ Digit ]
>>>>>>>    );
>>>>>>>    Operator.push(
>>>>>>>        [ '+' ],
>>>>>>>        [ '-' ],
>>>>>>>        [ '*' ],
>>>>>>>        [ '/' ]
>>>>>>>    );
>>>>>>>    return {
>>>>>>>        Digit: Digit,
>>>>>>>        Expression: Expression,
>>>>>>>        Operator: Operator
>>>>>>>    };
>>>>>>> } )( );
>>>>>>>
>>>>>>> The name properties are here because I had some infinite loops so it
>>>>>>> was easier to debug with it...
>>>>>>> And I define the arrays before so that I can reference them in others.
>>>>>>>
>>>>>>> Then, with this, I built a lexer:
>>>>>>>
>>>>>>> http://jsfiddle.net/xavierm02/tUCXY/
>>>>>>>
>>>>>>> Is it good or not ?
>>>>>>> Any comment would be welcome. The only thing I'm planning on changing
>>>>>>> is the j += newTokens.join( '' ).length; I probably need it because I
>>>>>>> didn't store indexes properly... Or maybe because I use readTokens
>>>>>>> that calls itself instead of a loop.
>>>>>>>
>>>>>>> Thanks in advance for your comments.
>>>>>>>
>>>>>>> On Mon, Oct 3, 2011 at 9:11 PM, Xavier MONTILLET
>>>>>>> <[email protected]> wrote:
>>>>>>>> Ok thank you :)
>>>>>>>>
>>>>>>>> On Sat, Oct 1, 2011 at 4:31 PM, Lasse Reichstein
>>>>>>>> <[email protected]> wrote:
>>>>>>>>> There is nothing in the complexity of '+' that mandates your Abstract 
>>>>>>>>> Syntax
>>>>>>>>> Tree layout. It's still pure syntax at this point.
>>>>>>>>> However, any use of the parsed syntax must behave as if the (+ a b c) 
>>>>>>>>> syntax
>>>>>>>>> tree is equivalent to (+ (+ a b) c), so it won't buy you much, and 
>>>>>>>>> probably
>>>>>>>>> just makes using the parsed syntax more complex.
>>>>>>>>> If repeated use of the same operator happens a lot, and you can easily
>>>>>>>>> handle the pairing later, by all means do make a single 
>>>>>>>>> vairable-sized node.
>>>>>>>>> E.g., in the RegExp grammar, adjacent literal characters are really a
>>>>>>>>> sequence of Alternatives of Terms (of Atoms), but it happens so often 
>>>>>>>>> that
>>>>>>>>> you probably want to parse /abel.*/ as containing a single text-node 
>>>>>>>>> with
>>>>>>>>> the text "abel".
>>>>>>>>> Just don't optimize the abstract syntax without considering how it's 
>>>>>>>>> going
>>>>>>>>> to be used.
>>>>>>>>> /L
>>>>>>>>>
>>>>>>>>> On Sat, Oct 1, 2011 at 12:28 PM, Xavier MONTILLET 
>>>>>>>>> <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> Right. I totally forgot that...
>>>>>>>>>> So I have to keep only two operands per operator.
>>>>>>>>>> Thank you for answering :)
>>>>>>>>>>
>>>>>>>>>> On Sat, Oct 1, 2011 at 12:48 AM, Poetro <[email protected]> wrote:
>>>>>>>>>> > The + operator is not just for numbers and can have sideeffects, if
>>>>>>>>>> > the operands have different types or the operands are not numbers 
>>>>>>>>>> > or
>>>>>>>>>> > strings (and even in that case). This makes the + operator tricky.
>>>>>>>>>> >
>>>>>>>>>> >>>> "boo" + 1 + 2
>>>>>>>>>> > "boo12"
>>>>>>>>>> >>>> 1 + 2 + "boo"
>>>>>>>>>> > "3boo"
>>>>>>>>>> >
>>>>>>>>>> >>>> var a = {toString: function () { return 1; }, valueOf: function 
>>>>>>>>>> >>>> () {
>>>>>>>>>> >>>> return 2; }}, b = 0; a + b
>>>>>>>>>> > 2
>>>>>>>>>> >
>>>>>>>>>> > --
>>>>>>>>>> > Poetro
>>>>>>>>>> >
>>>>>>>>>> > --
>>>>>>>>>> > To view archived discussions from the original JSMentors Mailman 
>>>>>>>>>> > list:
>>>>>>>>>> > http://www.mail-archive.com/[email protected]/
>>>>>>>>>> >
>>>>>>>>>> > To search via a non-Google archive, visit here:
>>>>>>>>>> > http://www.mail-archive.com/[email protected]/
>>>>>>>>>> >
>>>>>>>>>> > To unsubscribe from this group, send email to
>>>>>>>>>> > [email protected]
>>>>>>>>>> >
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> To view archived discussions from the original JSMentors Mailman 
>>>>>>>>>> list:
>>>>>>>>>> http://www.mail-archive.com/[email protected]/
>>>>>>>>>>
>>>>>>>>>> To search via a non-Google archive, visit here:
>>>>>>>>>> http://www.mail-archive.com/[email protected]/
>>>>>>>>>>
>>>>>>>>>> To unsubscribe from this group, send email to
>>>>>>>>>> [email protected]
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> To view archived discussions from the original JSMentors Mailman list:
>>>>>>>>> http://www.mail-archive.com/[email protected]/
>>>>>>>>>
>>>>>>>>> To search via a non-Google archive, visit here:
>>>>>>>>> http://www.mail-archive.com/[email protected]/
>>>>>>>>>
>>>>>>>>> To unsubscribe from this group, send email to
>>>>>>>>> [email protected]
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

-- 
To view archived discussions from the original JSMentors Mailman list: 
http://www.mail-archive.com/[email protected]/

To search via a non-Google archive, visit here: 
http://www.mail-archive.com/[email protected]/

To unsubscribe from this group, send email to
[email protected]

Reply via email to