Re: Proposal to improve property expression resolution for Wicket 8 by using a parser
Hi, we can look for a better name, but imo it's correct. Lets say that the user wrote a very weir list like: class WeirdList implements List{ int get0(){ return internal;} void set0( p ){ internal = p;} } In this case: - the expression "weirList.0" has "0" correctly parsed to a bean property named "0" - the expression "weirdList[0]" has the "[0]" correctly parsed to an index In the case you pointed out, were we have a non weird list: - the expression "regularList.0" correctly parses "0" to bean property. Yes, our resolver will resolve this expression to a list position as if it was resolving an index expression, but only because it failed to find a bean property - the expression "regularList[0]" correctly parses "[0]" to an index and will allow the resolver to skip any code trying to resolve "[0]" as a bean property Pedro Santos On Wed, Sep 28, 2016 at 3:58 PM, Sven Meierwrote: > Hi, > > > the plan is to change PropertyResolver to use a syntax tree insteadof > the current list of lexemes > > good, I missed that. > > > return resolveListOrArrayIndex(object, expression.beanProperty); > > Well, then expression.beanProperty is wrongly named, isn't it? > > Sven > > > > On 28.09.2016 19:24, Pedro Santos wrote: > >> Hi Sven, thx. Sending the response inline. >> >> Cheers >> >> Pedro Santos >> >> On Wed, Sep 28, 2016 at 3:08 AM, Sven Meier wrote: >> >> Hi Pedro, >>> >>> very interesting work, thank you very much. >>> >>> I have a little nitpick though: >>> >>> The expression "addressList.1" is parsed into a PropertyExpression with >>> JavaProperty-"person" followed by BeanProperty-"1". >>> >>> When PropertyResolver converts the parsed expression into {"addressList", >>> "1"}, the syntactic analysis is lost though. >>> >>> Indeed. The plan is to change PropertyResolver to use a syntax tree >> instead >> of the current list of lexemes. One of the devs can go ahead and make this >> change, but I would rather wait to get the team's thoughts on the parser >> first. As a side note, I moved the current tokenizer code from >> PropertyResolver to the inner class PropertyResolverTokenizer only to make >> the current code clearer; this class should be deleted in the future. >> >> >> With the lexical analysis remaining only, the expression results in >>> invocation of #getAdressList() followed by #get(1) instead, >>> i.e. "1" is interpreted as a list index. >>> >>> It seems our current grammar is more lenient than what your parser is >>> expecting. >>> >>> To be exact, our property resolution logic is lenient, not the current >> grammar. It's perfectly fine to have a lenient property resolution logic >> that falls back the property expression "myArray.1" to "myArray[1]". But >> using a syntax tree, the resolution code would be much clear about it. >> e.g. >> >> PropertyResolver { >>IGetAndSet resolve(object, expression) { >> ... >> if(expression.beanProperty != null) { >>try { >> return resolveBeanProperty(object, expression.beanProperty); >>} catch(ParserException e){ >> if(isANumber(expression.beanProperty)) >>//can't resolve the bean property "1", since it's a number, >> fallback to test a List/Array index as if the beanProperty were an index >> expression like "[1]" >>return resolveListOrArrayIndex(object, >> expression.beanProperty); >> else >>//fallback to test if the beanProperty can be a map key >>... >>} >> } >> ... >>} >> } >> >> >> Regards >>> Sven >>> >>> >>> >>> >>> On 25.09.2016 09:40, Pedro Santos wrote: >>> >>> Hi devs, Currently PropertyResolver is doing all the tokenizer plus property resolution logic to resolve property expressions to IGetAndSet interfaces. To improve its cohesion and to add new messages explaining syntax error in the property expression entered by users, I propose us to use a parser for the grammar below. I created the branch [1] to show the parser and, if the team agrees with the proposal, I will move to the next step that is to change the PropertyResolver to use the generated syntax tree object instead of the current lexemes. I'm using the ticket WICKET-4008 to track the work. Proposed grammar in an EBNF like description: java letter = "_" | "$" | "A" | "a" | "B" | "b" | (...); java letter or digit = java letter | "0" | "1" | (...) ; char = java letter or digit | "." | "(" | ")" | "[" | "]" | "!" | "@" | "#" | (...); index char = char - "]"; empty space = { " " }; java identifier = java letter , {java letter or digit}; property name = java letter or digit , {java letter or digit}; method sign = "(" , empty space , ")"; index = "[" , index char , { index char } , "]"; bean property = property name, [ index ]; java
Re: Proposal to improve property expression resolution for Wicket 8 by using a parser
Hi, > the plan is to change PropertyResolver to use a syntax tree insteadof the current list of lexemes good, I missed that. > return resolveListOrArrayIndex(object, expression.beanProperty); Well, then expression.beanProperty is wrongly named, isn't it? Sven On 28.09.2016 19:24, Pedro Santos wrote: Hi Sven, thx. Sending the response inline. Cheers Pedro Santos On Wed, Sep 28, 2016 at 3:08 AM, Sven Meierwrote: Hi Pedro, very interesting work, thank you very much. I have a little nitpick though: The expression "addressList.1" is parsed into a PropertyExpression with JavaProperty-"person" followed by BeanProperty-"1". When PropertyResolver converts the parsed expression into {"addressList", "1"}, the syntactic analysis is lost though. Indeed. The plan is to change PropertyResolver to use a syntax tree instead of the current list of lexemes. One of the devs can go ahead and make this change, but I would rather wait to get the team's thoughts on the parser first. As a side note, I moved the current tokenizer code from PropertyResolver to the inner class PropertyResolverTokenizer only to make the current code clearer; this class should be deleted in the future. With the lexical analysis remaining only, the expression results in invocation of #getAdressList() followed by #get(1) instead, i.e. "1" is interpreted as a list index. It seems our current grammar is more lenient than what your parser is expecting. To be exact, our property resolution logic is lenient, not the current grammar. It's perfectly fine to have a lenient property resolution logic that falls back the property expression "myArray.1" to "myArray[1]". But using a syntax tree, the resolution code would be much clear about it. e.g. PropertyResolver { IGetAndSet resolve(object, expression) { ... if(expression.beanProperty != null) { try { return resolveBeanProperty(object, expression.beanProperty); } catch(ParserException e){ if(isANumber(expression.beanProperty)) //can't resolve the bean property "1", since it's a number, fallback to test a List/Array index as if the beanProperty were an index expression like "[1]" return resolveListOrArrayIndex(object, expression.beanProperty); else //fallback to test if the beanProperty can be a map key ... } } ... } } Regards Sven On 25.09.2016 09:40, Pedro Santos wrote: Hi devs, Currently PropertyResolver is doing all the tokenizer plus property resolution logic to resolve property expressions to IGetAndSet interfaces. To improve its cohesion and to add new messages explaining syntax error in the property expression entered by users, I propose us to use a parser for the grammar below. I created the branch [1] to show the parser and, if the team agrees with the proposal, I will move to the next step that is to change the PropertyResolver to use the generated syntax tree object instead of the current lexemes. I'm using the ticket WICKET-4008 to track the work. Proposed grammar in an EBNF like description: java letter = "_" | "$" | "A" | "a" | "B" | "b" | (...); java letter or digit = java letter | "0" | "1" | (...) ; char = java letter or digit | "." | "(" | ")" | "[" | "]" | "!" | "@" | "#" | (...); index char = char - "]"; empty space = { " " }; java identifier = java letter , {java letter or digit}; property name = java letter or digit , {java letter or digit}; method sign = "(" , empty space , ")"; index = "[" , index char , { index char } , "]"; bean property = property name, [ index ]; java property = java identifier , [ index | method sign ]; property expression = [ bean property | java property | index ] , { "." , property expression }; Main goals: - To remove from PropertyResolver its current tokenizer code - To validate property expressions syntax To exemplify the new messages explaining syntax errors: - Given the expression: "bean.method(parameter)", we can explain that no parameter is allowed inside the brackets - Given the expression "array[]", we can explain that the empty brackets aren't allowed - Given the expression "bean.prop#name", we can explain that "bean.prop#name" isn't a valid bean property name - Given the expression "bean.0method()", we can explain that "0method" isn't a valid java identifier, thus the method call is invalid Collateral benefits: - We will be able to map keys in map expressions using the '[' character like "may[key[weird]" Obs. 1: even having the open square bracket '[', the parser is able to understand the key because of grammar's production rule: index char = char - "]"; Obs. 2: a better solution was proposed in the thread http://wicket-dev.markmail.org/thread/unwdqpxulw7tcd5l - Allows empty spaces inside method call brackets ( ) Obs.: because of grammar's production rule: method sign = "(" , empty space, ")"; - Improves
Re: Proposal to improve property expression resolution for Wicket 8 by using a parser
Hi Sven, thx. Sending the response inline. Cheers Pedro Santos On Wed, Sep 28, 2016 at 3:08 AM, Sven Meierwrote: > Hi Pedro, > > very interesting work, thank you very much. > > I have a little nitpick though: > > The expression "addressList.1" is parsed into a PropertyExpression with > JavaProperty-"person" followed by BeanProperty-"1". > > When PropertyResolver converts the parsed expression into {"addressList", > "1"}, the syntactic analysis is lost though. > Indeed. The plan is to change PropertyResolver to use a syntax tree instead of the current list of lexemes. One of the devs can go ahead and make this change, but I would rather wait to get the team's thoughts on the parser first. As a side note, I moved the current tokenizer code from PropertyResolver to the inner class PropertyResolverTokenizer only to make the current code clearer; this class should be deleted in the future. > With the lexical analysis remaining only, the expression results in > invocation of #getAdressList() followed by #get(1) instead, > i.e. "1" is interpreted as a list index. > > It seems our current grammar is more lenient than what your parser is > expecting. > To be exact, our property resolution logic is lenient, not the current grammar. It's perfectly fine to have a lenient property resolution logic that falls back the property expression "myArray.1" to "myArray[1]". But using a syntax tree, the resolution code would be much clear about it. e.g. PropertyResolver { IGetAndSet resolve(object, expression) { ... if(expression.beanProperty != null) { try { return resolveBeanProperty(object, expression.beanProperty); } catch(ParserException e){ if(isANumber(expression.beanProperty)) //can't resolve the bean property "1", since it's a number, fallback to test a List/Array index as if the beanProperty were an index expression like "[1]" return resolveListOrArrayIndex(object, expression.beanProperty); else //fallback to test if the beanProperty can be a map key ... } } ... } } > > Regards > Sven > > > > > On 25.09.2016 09:40, Pedro Santos wrote: > >> Hi devs, >> >> Currently PropertyResolver is doing all the tokenizer plus property >> resolution logic to resolve property expressions to IGetAndSet interfaces. >> To improve its cohesion and to add new messages explaining syntax error in >> the property expression entered by users, I propose us to use a parser for >> the grammar below. >> >> I created the branch [1] to show the parser and, if the team agrees with >> the proposal, I will move to the next step that is to change the >> PropertyResolver to use the generated syntax tree object instead of the >> current lexemes. I'm using the ticket WICKET-4008 to track the work. >> >> Proposed grammar in an EBNF like description: >> >>java letter = "_" | "$" | "A" | "a" | "B" | "b" | (...); >> >>java letter or digit = java letter | "0" | "1" | (...) ; >>char = java letter or digit | "." | "(" | ")" | "[" | "]" | "!" | "@" | >> "#" | (...); >>index char = char - "]"; >> >>empty space = { " " }; >>java identifier = java letter , {java letter or digit}; >>property name = java letter or digit , {java letter or digit}; >>method sign = "(" , empty space , ")"; >>index = "[" , index char , { index char } , "]"; >> >>bean property = property name, [ index ]; >>java property = java identifier , [ index | method sign ]; >>property expression = [ bean property | java property | index ] , { >> "." , >> property expression }; >> >> Main goals: >> >> - To remove from PropertyResolver its current tokenizer code >> - To validate property expressions syntax >> >> To exemplify the new messages explaining syntax errors: >> >> - Given the expression: "bean.method(parameter)", we can explain that no >> parameter is allowed inside the brackets >> - Given the expression "array[]", we can explain that the empty brackets >> aren't allowed >> - Given the expression "bean.prop#name", we can explain that >> "bean.prop#name" isn't a valid bean property name >> - Given the expression "bean.0method()", we can explain that "0method" >> isn't a valid java identifier, thus the method call is invalid >> >> Collateral benefits: >> >> - We will be able to map keys in map expressions using the '[' character >> like "may[key[weird]" >> Obs. 1: even having the open square bracket '[', the parser is able >> to >> understand the key because of grammar's production rule: index char = char >> - "]"; >> Obs. 2: a better solution was proposed in the thread >> http://wicket-dev.markmail.org/thread/unwdqpxulw7tcd5l >> - Allows empty spaces inside method call brackets ( ) >> Obs.: because of grammar's production rule: method sign = "(" , empty >> space, ")"; >> - Improves Wicket's performance since every detected syntax error prevents >> PropertyResolver from to test bean's classes via reflection,
Re: Proposal to improve property expression resolution for Wicket 8 by using a parser
Hi Pedro, very interesting work, thank you very much. I have a little nitpick though: The expression "addressList.1" is parsed into a PropertyExpression with JavaProperty-"person" followed by BeanProperty-"1". When PropertyResolver converts the parsed expression into {"addressList", "1"}, the syntactic analysis is lost though. With the lexical analysis remaining only, the expression results in invocation of #getAdressList() followed by #get(1) instead, i.e. "1" is interpreted as a list index. It seems our current grammar is more lenient than what your parser is expecting. Regards Sven On 25.09.2016 09:40, Pedro Santos wrote: Hi devs, Currently PropertyResolver is doing all the tokenizer plus property resolution logic to resolve property expressions to IGetAndSet interfaces. To improve its cohesion and to add new messages explaining syntax error in the property expression entered by users, I propose us to use a parser for the grammar below. I created the branch [1] to show the parser and, if the team agrees with the proposal, I will move to the next step that is to change the PropertyResolver to use the generated syntax tree object instead of the current lexemes. I'm using the ticket WICKET-4008 to track the work. Proposed grammar in an EBNF like description: java letter = "_" | "$" | "A" | "a" | "B" | "b" | (...); java letter or digit = java letter | "0" | "1" | (...) ; char = java letter or digit | "." | "(" | ")" | "[" | "]" | "!" | "@" | "#" | (...); index char = char - "]"; empty space = { " " }; java identifier = java letter , {java letter or digit}; property name = java letter or digit , {java letter or digit}; method sign = "(" , empty space , ")"; index = "[" , index char , { index char } , "]"; bean property = property name, [ index ]; java property = java identifier , [ index | method sign ]; property expression = [ bean property | java property | index ] , { "." , property expression }; Main goals: - To remove from PropertyResolver its current tokenizer code - To validate property expressions syntax To exemplify the new messages explaining syntax errors: - Given the expression: "bean.method(parameter)", we can explain that no parameter is allowed inside the brackets - Given the expression "array[]", we can explain that the empty brackets aren't allowed - Given the expression "bean.prop#name", we can explain that "bean.prop#name" isn't a valid bean property name - Given the expression "bean.0method()", we can explain that "0method" isn't a valid java identifier, thus the method call is invalid Collateral benefits: - We will be able to map keys in map expressions using the '[' character like "may[key[weird]" Obs. 1: even having the open square bracket '[', the parser is able to understand the key because of grammar's production rule: index char = char - "]"; Obs. 2: a better solution was proposed in the thread http://wicket-dev.markmail.org/thread/unwdqpxulw7tcd5l - Allows empty spaces inside method call brackets ( ) Obs.: because of grammar's production rule: method sign = "(" , empty space, ")"; - Improves Wicket's performance since every detected syntax error prevents PropertyResolver from to test bean's classes via reflection, know to add performance overhead [2]. Future benefits: - The parser enables us to easily enforce that expressions like "bean.map[key]" will have the key tested only against a map, and won't be resolved as to object's get/set methods. - PropertyResolver resolves expressions like "bean.attributeAt.1" to the methods: bean.getAttributeAt(1) and bean.setAttributeAt(1, value). This resolution logic can be improved to only test for this kind of method signatures if the analyzed input allows such usage (currently the resolver always test for such signatures). 1 - https://github.com/apache/wicket/commits/WICKET-4008-property-expression-parser 2 - http://docs.oracle.com/javase/tutorial/reflect/index.html Cheers Pedro Santos
Proposal to improve property expression resolution for Wicket 8 by using a parser
Hi devs, Currently PropertyResolver is doing all the tokenizer plus property resolution logic to resolve property expressions to IGetAndSet interfaces. To improve its cohesion and to add new messages explaining syntax error in the property expression entered by users, I propose us to use a parser for the grammar below. I created the branch [1] to show the parser and, if the team agrees with the proposal, I will move to the next step that is to change the PropertyResolver to use the generated syntax tree object instead of the current lexemes. I'm using the ticket WICKET-4008 to track the work. Proposed grammar in an EBNF like description: java letter = "_" | "$" | "A" | "a" | "B" | "b" | (...); java letter or digit = java letter | "0" | "1" | (...) ; char = java letter or digit | "." | "(" | ")" | "[" | "]" | "!" | "@" | "#" | (...); index char = char - "]"; empty space = { " " }; java identifier = java letter , {java letter or digit}; property name = java letter or digit , {java letter or digit}; method sign = "(" , empty space , ")"; index = "[" , index char , { index char } , "]"; bean property = property name, [ index ]; java property = java identifier , [ index | method sign ]; property expression = [ bean property | java property | index ] , { "." , property expression }; Main goals: - To remove from PropertyResolver its current tokenizer code - To validate property expressions syntax To exemplify the new messages explaining syntax errors: - Given the expression: "bean.method(parameter)", we can explain that no parameter is allowed inside the brackets - Given the expression "array[]", we can explain that the empty brackets aren't allowed - Given the expression "bean.prop#name", we can explain that "bean.prop#name" isn't a valid bean property name - Given the expression "bean.0method()", we can explain that "0method" isn't a valid java identifier, thus the method call is invalid Collateral benefits: - We will be able to map keys in map expressions using the '[' character like "may[key[weird]" Obs. 1: even having the open square bracket '[', the parser is able to understand the key because of grammar's production rule: index char = char - "]"; Obs. 2: a better solution was proposed in the thread http://wicket-dev.markmail.org/thread/unwdqpxulw7tcd5l - Allows empty spaces inside method call brackets ( ) Obs.: because of grammar's production rule: method sign = "(" , empty space, ")"; - Improves Wicket's performance since every detected syntax error prevents PropertyResolver from to test bean's classes via reflection, know to add performance overhead [2]. Future benefits: - The parser enables us to easily enforce that expressions like "bean.map[key]" will have the key tested only against a map, and won't be resolved as to object's get/set methods. - PropertyResolver resolves expressions like "bean.attributeAt.1" to the methods: bean.getAttributeAt(1) and bean.setAttributeAt(1, value). This resolution logic can be improved to only test for this kind of method signatures if the analyzed input allows such usage (currently the resolver always test for such signatures). 1 - https://github.com/apache/wicket/commits/WICKET-4008-property-expression-parser 2 - http://docs.oracle.com/javase/tutorial/reflect/index.html Cheers Pedro Santos