I currently haven't implemented the precedence parser as part of the ANTLR codegen. I only built the backend to get an idea of what the output might look like. It takes a flat input of arbitrary root elements and tokens that are [part of] some operator (ternary operators can have 2 tokens). The precedence tree is composed of groups of operators, where each group is either left-to-right or right-to-left and contains 1 or more member operators. Operators fall into one of the following categories:
Implemented: UnaryOperator, either prefix or postfix BinaryOperator TernaryOperator Not yet implemented, but considering: UnaryGroupOperator, able to handle what is normally the (expr) root element BinaryGroupOperator, either prefix or postfix, handles function calls and prefix items like Chapel's domain expressions SequencingOperator, such as the C++ comma operator Operators with 1 token use that token as the AST root, optionally setting its token type to some other type (eg. so the walker can distinguish prefix/postfix ++/--). Operators with 2 tokens use the first as the root, optionally setting the type, and optionally including the second operator token as its last child. Sam -----Original Message----- From: Peter Burns [mailto:[email protected]] Sent: Thursday, April 23, 2009 6:35 PM To: Terence Parr Cc: Sam Harwell; ANTLR-dev Dev Subject: Re: [antlr-dev] FW: Unary operator precedence We can handle unary operators at any precedence level. Here's an example: e options {strategy=precedence;} : e ('$'|'$!') e | e ('>>'|'>>=') e | e '||' <associativity=right> e | e '&&' <associativity=right> e | e ('=='|'/='|'<'|'<='|'>='|'>') e | e ':' <associativity=right> e | e ('+'|'-') e | e ('*'|'/') e | '-' e | e ('^' <associativity=right>|'^^' <associativity=right>|'**' <associativity=right>) e | e '.' <associativity=right> e | NUMBER | BOOL | ID ; This passes the following gunit tests. The tenth one ensures that we get unary precedence right: expression: "1234" -> "1234" "-1234" -> (- 1234) "1+1" -> (+ 1 1) "2*2" -> (* 2 2) "1==1" -> (== 1 1) "1*1+2" -> (+ (* 1 1) 2) "1+1*2" -> (+ 1 (* 1 2)) "1^2^3" -> (^ 1 (^ 2 3)) "2^3^^4**5" -> (^ 2 (^^ 3 (** 4 5))) "1*-2+3" -> (+ (* 1 (- 2)) 3) "1:2>>=f>>g" -> (>> (>>= (: 1 2) f) g) "1+1==2&&True||False" -> (|| (&& (== (+ 1 1) 2) True) False) "f.g$h" -> ($ (. f g) h) There's a single string template file that would need to be rewritten for each target language, but it's very straightforward. -Peter On Apr 23, 2009, at 2:32 PM, Terence Parr wrote: > Hi Sam, > > I'm working on an ANTLR preproc with Peter Burns here at USF. > > Peter: we don't require unary higher than binary do we? > > Ter > On Apr 23, 2009, at 10:59 AM, Sam Harwell wrote: > >> I'm working on a generalized expression parser backend that I hope >> to integrate as generated code in ANTLR targets. The forwarded >> email below shows a particular issue that needs to be addressed >> before the expression parser can be used with languages with "odd" >> precedence orders. I'm looking for feedback on my proposed >> resolution in my parser backend. >> >> There shouldn't be a restriction in the code that says all unary >> operators have to have higher precedence than any binary operators. >> As such, I'm planning on handling this by adding the following >> enumeration, and either allowing it (with a default value of >> Elevate) or explicitly requiring it for unary operators that come >> before some binary operator in the precedence order. >> >> This issue fully addresses ternary operators as well, where the >> required elevation involves the use of unary and/or binary >> operators in the middle section, such as the {value if true} >> section of the C++ ternary operator: {condition} ? {value if >> true} : {value if false}. >> >> /// <summary> >> /// Controls the behavior of lower priority, lower cardinality >> operator next to higher priority, >> /// higher cardinality operator. >> /// </summary> >> /// <example> >> /// In languages like Matlab, the unary +/- operators have a lower >> priority than the binary >> /// exponentiation operator. The user needs to be able to configure >> the handling of the unary >> /// +/- operators in the following case: >> /// >> /// 2^-2 >> /// >> /// If the unary operator's precedence remains lower than the >> exponentiation operator, this >> /// is syntactically incorrect. As an option, and one that Matlab >> uses, the operator's >> /// precedence immediately following an exponentiation operator is >> elevated, so the expression >> /// above is allowed. >> /// </example> >> public enum MixedCardinalityAction >> { >> /// <summary> >> /// Never re-order the operator precedence of the lower priority >> operator to allow its >> /// use next to a higher priority operator. >> /// </summary> >> None, >> /// <summary> >> /// When syntactically unambiguous, elevates the lower >> cardinality operator's precedence >> /// to allow its usage next to a higher cardinality operator. >> /// </summary> >> Elevate >> } >> >> Thank you, >> Sam Harwell >> >> From: Sam Harwell >> Sent: Thursday, April 23, 2009 10:05 AM >> To: '[email protected]' >> Subject: Unary operator precedence >> >> Hello, >> >> I am having some trouble with the unary operator precedence in >> certain situations. According to the spec, the multiplication >> operator has higher precedence than the unary - operator, which is >> posing a problem for the following expression: >> >> 3 * -2 >> >> For the fixed precedence levels as documented, this appears to be a >> syntax error unless written like this: >> >> 3 * (-2) >> >> One way to allow this syntax is cause the unary operator >> immediately following a *, /, %, reduce, scan, !, ~, or ** to have >> higher than normal precedence. Operators ! and ~ do not need this >> clarification following *, /, or %. The following is an extract of >> a grammar for a recursive descent LL parser. >> >> not_expression >> : ('!' | '~')* >> reduce_scan_expression >> ; >> >> multiply_expression >> : not_expression >> ( ('*' | '/' | '%') => >> ('*' | '/' | '%') >> not_expression >> )* >> ; >> >> negate_expression >> : ('+' | '-')* >> multiply_expression >> ; >> >> addition_expression >> : negate_expression >> ( ('+' | '-') => >> ('+' | '-') >> negate_expression >> )* >> ; >> >> I would have to make the following alteration which, despite being >> unambiguous, clearly shows that a unary operator following a >> multiplicative operator has a different precedence from the same >> operator in other contexts. >> >> multiply_expression >> : not_expression >> ( ('*' | '/' | '%') => >> ('*' | '/' | '%') >> ('+' | '-')? not_expression >> )* >> ; >> >> negate_expression >> : ('+' | '-')* >> multiply_expression >> ; >> >> Octave (and hence Matlab I'm guessing) follows this approach >> (-2^2=-4 and 2*-2=-4). C++ handles this problem by making .* and - >> >* their own operators, then making all unary operators have a >> higher precedence than any binary operators. Either way it goes, I >> believe the order should be documented so it does not pose a >> problem for people creating a Chapel compiler in the future. :) >> >> Thank you, >> Sam Harwell >> >> _______________________________________________ >> antlr-dev mailing list >> [email protected] >> http://www.antlr.org/mailman/listinfo/antlr-dev > _______________________________________________ antlr-dev mailing list [email protected] http://www.antlr.org/mailman/listinfo/antlr-dev
