Re: Proposal to improve property expression resolution for Wicket 8 by using a parser

2016-09-28 Thread Pedro Santos
Hi,

we can look for a better name, but imo it's correct. Lets say that the user
wrote a very weir list like:

class WeirdList implements List{
  int get0(){ return internal;}
  void set0( p ){ internal = p;}
}

In this case:
 - the expression "weirList.0" has "0" correctly parsed to a bean property
named "0"
 - the expression "weirdList[0]" has the "[0]" correctly parsed to an index

In the case you pointed out, were we have a non weird list:

 - the expression "regularList.0" correctly parses "0" to bean property.
Yes, our resolver will resolve this expression to a list position as if it
was resolving an index expression, but only because it failed to find a
bean property
 - the expression "regularList[0]" correctly parses "[0]" to an index and
will allow the resolver to skip any code trying to resolve "[0]" as a bean
property

Pedro Santos

On Wed, Sep 28, 2016 at 3:58 PM, Sven Meier  wrote:

> Hi,
>
> > the plan is to change PropertyResolver to use a syntax tree insteadof
> the current list of lexemes
>
> good, I missed that.
>
> > return resolveListOrArrayIndex(object, expression.beanProperty);
>
> Well, then expression.beanProperty is wrongly named, isn't it?
>
> Sven
>
>
>
> On 28.09.2016 19:24, Pedro Santos wrote:
>
>> Hi Sven, thx. Sending the response inline.
>>
>> Cheers
>>
>> Pedro Santos
>>
>> On Wed, Sep 28, 2016 at 3:08 AM, Sven Meier  wrote:
>>
>> Hi Pedro,
>>>
>>> very interesting work, thank you very much.
>>>
>>> I have a little nitpick though:
>>>
>>> The expression "addressList.1" is parsed into a PropertyExpression with
>>> JavaProperty-"person" followed by BeanProperty-"1".
>>>
>>> When PropertyResolver converts the parsed expression into {"addressList",
>>> "1"}, the syntactic analysis is lost though.
>>>
>>> Indeed. The plan is to change PropertyResolver to use a syntax tree
>> instead
>> of the current list of lexemes. One of the devs can go ahead and make this
>> change, but I would rather wait to get the team's thoughts on the parser
>> first. As a side note, I moved the current tokenizer code from
>> PropertyResolver to the inner class PropertyResolverTokenizer only to make
>> the current code clearer; this class should be deleted in the future.
>>
>>
>> With the lexical analysis remaining only, the expression results in
>>> invocation of #getAdressList() followed by #get(1) instead,
>>> i.e. "1" is interpreted as a list index.
>>>
>>> It seems our current grammar is more lenient than what your parser is
>>> expecting.
>>>
>>> To be exact, our property resolution logic is lenient, not the current
>> grammar. It's perfectly fine to have a lenient property resolution logic
>> that falls back the property expression "myArray.1" to "myArray[1]". But
>> using a syntax tree, the resolution code would be much clear about it.
>> e.g.
>>
>> PropertyResolver {
>>IGetAndSet resolve(object, expression) {
>> ...
>>  if(expression.beanProperty != null) {
>>try {
>>  return resolveBeanProperty(object, expression.beanProperty);
>>} catch(ParserException e){
>>  if(isANumber(expression.beanProperty))
>>//can't resolve the bean property "1", since it's a number,
>> fallback to test a List/Array index as if the beanProperty were an index
>> expression like "[1]"
>>return resolveListOrArrayIndex(object,
>> expression.beanProperty);
>>  else
>>//fallback to test if the beanProperty can be a map key
>>...
>>}
>>  }
>> ...
>>}
>> }
>>
>>
>> Regards
>>> Sven
>>>
>>>
>>>
>>>
>>> On 25.09.2016 09:40, Pedro Santos wrote:
>>>
>>> Hi devs,

 Currently PropertyResolver is doing all the tokenizer plus property
 resolution logic to resolve property expressions to IGetAndSet
 interfaces.
 To improve its cohesion and to add new messages explaining syntax error
 in
 the property expression entered by users, I propose us to use a parser
 for
 the grammar below.

 I created the branch [1] to show the parser and, if the team agrees with
 the proposal, I will move to the next step that is to change the
 PropertyResolver to use the generated syntax tree object instead of the
 current lexemes. I'm using the ticket WICKET-4008 to track the work.

 Proposed grammar in an EBNF like description:

 java letter = "_" | "$" | "A" | "a" | "B" | "b" | (...);

 java letter or digit = java letter | "0" | "1" | (...) ;
 char = java letter or digit | "." | "(" | ")" | "[" | "]" | "!" |
 "@" |
 "#" | (...);
 index char = char - "]";

 empty space = { " " };
 java identifier = java letter , {java letter or digit};
 property name = java letter or digit , {java letter or digit};
 method sign = "(" , empty space , ")";
 index = "[" , index char , { index char } , "]";

 bean property = property name, [ index ];
 java 

Re: Proposal to improve property expression resolution for Wicket 8 by using a parser

2016-09-28 Thread Sven Meier

Hi,

> the plan is to change PropertyResolver to use a syntax tree insteadof 
the current list of lexemes


good, I missed that.

> return resolveListOrArrayIndex(object, expression.beanProperty);

Well, then expression.beanProperty is wrongly named, isn't it?

Sven


On 28.09.2016 19:24, Pedro Santos wrote:

Hi Sven, thx. Sending the response inline.

Cheers

Pedro Santos

On Wed, Sep 28, 2016 at 3:08 AM, Sven Meier  wrote:


Hi Pedro,

very interesting work, thank you very much.

I have a little nitpick though:

The expression "addressList.1" is parsed into a PropertyExpression with
JavaProperty-"person" followed by BeanProperty-"1".

When PropertyResolver converts the parsed expression into {"addressList",
"1"}, the syntactic analysis is lost though.


Indeed. The plan is to change PropertyResolver to use a syntax tree instead
of the current list of lexemes. One of the devs can go ahead and make this
change, but I would rather wait to get the team's thoughts on the parser
first. As a side note, I moved the current tokenizer code from
PropertyResolver to the inner class PropertyResolverTokenizer only to make
the current code clearer; this class should be deleted in the future.



With the lexical analysis remaining only, the expression results in
invocation of #getAdressList() followed by #get(1) instead,
i.e. "1" is interpreted as a list index.

It seems our current grammar is more lenient than what your parser is
expecting.


To be exact, our property resolution logic is lenient, not the current
grammar. It's perfectly fine to have a lenient property resolution logic
that falls back the property expression "myArray.1" to "myArray[1]". But
using a syntax tree, the resolution code would be much clear about it. e.g.

PropertyResolver {
   IGetAndSet resolve(object, expression) {
...
 if(expression.beanProperty != null) {
   try {
 return resolveBeanProperty(object, expression.beanProperty);
   } catch(ParserException e){
 if(isANumber(expression.beanProperty))
   //can't resolve the bean property "1", since it's a number,
fallback to test a List/Array index as if the beanProperty were an index
expression like "[1]"
   return resolveListOrArrayIndex(object, expression.beanProperty);
 else
   //fallback to test if the beanProperty can be a map key
   ...
   }
 }
...
   }
}



Regards
Sven




On 25.09.2016 09:40, Pedro Santos wrote:


Hi devs,

Currently PropertyResolver is doing all the tokenizer plus property
resolution logic to resolve property expressions to IGetAndSet interfaces.
To improve its cohesion and to add new messages explaining syntax error in
the property expression entered by users, I propose us to use a parser for
the grammar below.

I created the branch [1] to show the parser and, if the team agrees with
the proposal, I will move to the next step that is to change the
PropertyResolver to use the generated syntax tree object instead of the
current lexemes. I'm using the ticket WICKET-4008 to track the work.

Proposed grammar in an EBNF like description:

java letter = "_" | "$" | "A" | "a" | "B" | "b" | (...);

java letter or digit = java letter | "0" | "1" | (...) ;
char = java letter or digit | "." | "(" | ")" | "[" | "]" | "!" | "@" |
"#" | (...);
index char = char - "]";

empty space = { " " };
java identifier = java letter , {java letter or digit};
property name = java letter or digit , {java letter or digit};
method sign = "(" , empty space , ")";
index = "[" , index char , { index char } , "]";

bean property = property name, [ index ];
java property = java identifier , [ index | method sign ];
property expression = [ bean property | java property | index ] , {
"." ,
property expression };

Main goals:

- To remove from PropertyResolver its current tokenizer code
- To validate property expressions syntax

To exemplify the new messages explaining syntax errors:

- Given the expression: "bean.method(parameter)", we can explain that no
parameter is allowed inside the brackets
- Given the expression "array[]", we can explain that the empty brackets
aren't allowed
- Given the expression "bean.prop#name", we can explain that
"bean.prop#name" isn't a valid bean property name
- Given the expression "bean.0method()", we can explain that "0method"
isn't a valid java identifier, thus the method call is invalid

Collateral benefits:

- We will be able to map keys in map expressions using the '[' character
like "may[key[weird]"
   Obs. 1: even having the open square bracket '[', the parser is able
to
understand the key because of grammar's production rule: index char = char
- "]";
   Obs. 2: a better solution was proposed in the thread
http://wicket-dev.markmail.org/thread/unwdqpxulw7tcd5l
- Allows empty spaces inside method call brackets ( )
  Obs.: because of grammar's production rule: method sign = "(" , empty
space, ")";
- Improves 

Re: Proposal to improve property expression resolution for Wicket 8 by using a parser

2016-09-28 Thread Pedro Santos
Hi Sven, thx. Sending the response inline.

Cheers

Pedro Santos

On Wed, Sep 28, 2016 at 3:08 AM, Sven Meier  wrote:

> Hi Pedro,
>
> very interesting work, thank you very much.
>
> I have a little nitpick though:
>
> The expression "addressList.1" is parsed into a PropertyExpression with
> JavaProperty-"person" followed by BeanProperty-"1".
>
> When PropertyResolver converts the parsed expression into {"addressList",
> "1"}, the syntactic analysis is lost though.
>

Indeed. The plan is to change PropertyResolver to use a syntax tree instead
of the current list of lexemes. One of the devs can go ahead and make this
change, but I would rather wait to get the team's thoughts on the parser
first. As a side note, I moved the current tokenizer code from
PropertyResolver to the inner class PropertyResolverTokenizer only to make
the current code clearer; this class should be deleted in the future.


> With the lexical analysis remaining only, the expression results in
> invocation of #getAdressList() followed by #get(1) instead,
> i.e. "1" is interpreted as a list index.
>
> It seems our current grammar is more lenient than what your parser is
> expecting.
>

To be exact, our property resolution logic is lenient, not the current
grammar. It's perfectly fine to have a lenient property resolution logic
that falls back the property expression "myArray.1" to "myArray[1]". But
using a syntax tree, the resolution code would be much clear about it. e.g.

PropertyResolver {
  IGetAndSet resolve(object, expression) {
...
if(expression.beanProperty != null) {
  try {
return resolveBeanProperty(object, expression.beanProperty);
  } catch(ParserException e){
if(isANumber(expression.beanProperty))
  //can't resolve the bean property "1", since it's a number,
fallback to test a List/Array index as if the beanProperty were an index
expression like "[1]"
  return resolveListOrArrayIndex(object, expression.beanProperty);
else
  //fallback to test if the beanProperty can be a map key
  ...
  }
}
...
  }
}


>
> Regards
> Sven
>
>
>
>
> On 25.09.2016 09:40, Pedro Santos wrote:
>
>> Hi devs,
>>
>> Currently PropertyResolver is doing all the tokenizer plus property
>> resolution logic to resolve property expressions to IGetAndSet interfaces.
>> To improve its cohesion and to add new messages explaining syntax error in
>> the property expression entered by users, I propose us to use a parser for
>> the grammar below.
>>
>> I created the branch [1] to show the parser and, if the team agrees with
>> the proposal, I will move to the next step that is to change the
>> PropertyResolver to use the generated syntax tree object instead of the
>> current lexemes. I'm using the ticket WICKET-4008 to track the work.
>>
>> Proposed grammar in an EBNF like description:
>>
>>java letter = "_" | "$" | "A" | "a" | "B" | "b" | (...);
>>
>>java letter or digit = java letter | "0" | "1" | (...) ;
>>char = java letter or digit | "." | "(" | ")" | "[" | "]" | "!" | "@" |
>> "#" | (...);
>>index char = char - "]";
>>
>>empty space = { " " };
>>java identifier = java letter , {java letter or digit};
>>property name = java letter or digit , {java letter or digit};
>>method sign = "(" , empty space , ")";
>>index = "[" , index char , { index char } , "]";
>>
>>bean property = property name, [ index ];
>>java property = java identifier , [ index | method sign ];
>>property expression = [ bean property | java property | index ] , {
>> "." ,
>> property expression };
>>
>> Main goals:
>>
>> - To remove from PropertyResolver its current tokenizer code
>> - To validate property expressions syntax
>>
>> To exemplify the new messages explaining syntax errors:
>>
>> - Given the expression: "bean.method(parameter)", we can explain that no
>> parameter is allowed inside the brackets
>> - Given the expression "array[]", we can explain that the empty brackets
>> aren't allowed
>> - Given the expression "bean.prop#name", we can explain that
>> "bean.prop#name" isn't a valid bean property name
>> - Given the expression "bean.0method()", we can explain that "0method"
>> isn't a valid java identifier, thus the method call is invalid
>>
>> Collateral benefits:
>>
>> - We will be able to map keys in map expressions using the '[' character
>> like "may[key[weird]"
>>   Obs. 1: even having the open square bracket '[', the parser is able
>> to
>> understand the key because of grammar's production rule: index char = char
>> - "]";
>>   Obs. 2: a better solution was proposed in the thread
>> http://wicket-dev.markmail.org/thread/unwdqpxulw7tcd5l
>> - Allows empty spaces inside method call brackets ( )
>>  Obs.: because of grammar's production rule: method sign = "(" , empty
>> space, ")";
>> - Improves Wicket's performance since every detected syntax error prevents
>> PropertyResolver from to test bean's classes via reflection, 

Re: Proposal to improve property expression resolution for Wicket 8 by using a parser

2016-09-28 Thread Sven Meier

Hi Pedro,

very interesting work, thank you very much.

I have a little nitpick though:

The expression "addressList.1" is parsed into a PropertyExpression with 
JavaProperty-"person" followed by BeanProperty-"1".


When PropertyResolver converts the parsed expression into 
{"addressList", "1"}, the syntactic analysis is lost though.


With the lexical analysis remaining only, the expression results in 
invocation of #getAdressList() followed by #get(1) instead,

i.e. "1" is interpreted as a list index.

It seems our current grammar is more lenient than what your parser is 
expecting.


Regards
Sven



On 25.09.2016 09:40, Pedro Santos wrote:

Hi devs,

Currently PropertyResolver is doing all the tokenizer plus property
resolution logic to resolve property expressions to IGetAndSet interfaces.
To improve its cohesion and to add new messages explaining syntax error in
the property expression entered by users, I propose us to use a parser for
the grammar below.

I created the branch [1] to show the parser and, if the team agrees with
the proposal, I will move to the next step that is to change the
PropertyResolver to use the generated syntax tree object instead of the
current lexemes. I'm using the ticket WICKET-4008 to track the work.

Proposed grammar in an EBNF like description:

   java letter = "_" | "$" | "A" | "a" | "B" | "b" | (...);

   java letter or digit = java letter | "0" | "1" | (...) ;
   char = java letter or digit | "." | "(" | ")" | "[" | "]" | "!" | "@" |
"#" | (...);
   index char = char - "]";

   empty space = { " " };
   java identifier = java letter , {java letter or digit};
   property name = java letter or digit , {java letter or digit};
   method sign = "(" , empty space , ")";
   index = "[" , index char , { index char } , "]";

   bean property = property name, [ index ];
   java property = java identifier , [ index | method sign ];
   property expression = [ bean property | java property | index ] , { "." ,
property expression };

Main goals:

- To remove from PropertyResolver its current tokenizer code
- To validate property expressions syntax

To exemplify the new messages explaining syntax errors:

- Given the expression: "bean.method(parameter)", we can explain that no
parameter is allowed inside the brackets
- Given the expression "array[]", we can explain that the empty brackets
aren't allowed
- Given the expression "bean.prop#name", we can explain that
"bean.prop#name" isn't a valid bean property name
- Given the expression "bean.0method()", we can explain that "0method"
isn't a valid java identifier, thus the method call is invalid

Collateral benefits:

- We will be able to map keys in map expressions using the '[' character
like "may[key[weird]"
  Obs. 1: even having the open square bracket '[', the parser is able to
understand the key because of grammar's production rule: index char = char
- "]";
  Obs. 2: a better solution was proposed in the thread
http://wicket-dev.markmail.org/thread/unwdqpxulw7tcd5l
- Allows empty spaces inside method call brackets ( )
 Obs.: because of grammar's production rule: method sign = "(" , empty
space, ")";
- Improves Wicket's performance since every detected syntax error prevents
PropertyResolver from to test bean's classes via reflection, know to add
performance overhead [2].

Future benefits:

- The parser enables us to easily enforce that expressions like
"bean.map[key]" will have the key tested only against a map, and won't be
resolved as to object's get/set methods.
- PropertyResolver resolves expressions like "bean.attributeAt.1" to the
methods: bean.getAttributeAt(1) and bean.setAttributeAt(1, value). This
resolution logic can be improved to only test for this kind of method
signatures if the analyzed input allows such usage (currently the resolver
always test for such signatures).

1 -
https://github.com/apache/wicket/commits/WICKET-4008-property-expression-parser
2 - http://docs.oracle.com/javase/tutorial/reflect/index.html

Cheers

Pedro Santos





Proposal to improve property expression resolution for Wicket 8 by using a parser

2016-09-25 Thread Pedro Santos
Hi devs,

Currently PropertyResolver is doing all the tokenizer plus property
resolution logic to resolve property expressions to IGetAndSet interfaces.
To improve its cohesion and to add new messages explaining syntax error in
the property expression entered by users, I propose us to use a parser for
the grammar below.

I created the branch [1] to show the parser and, if the team agrees with
the proposal, I will move to the next step that is to change the
PropertyResolver to use the generated syntax tree object instead of the
current lexemes. I'm using the ticket WICKET-4008 to track the work.

Proposed grammar in an EBNF like description:

  java letter = "_" | "$" | "A" | "a" | "B" | "b" | (...);

  java letter or digit = java letter | "0" | "1" | (...) ;
  char = java letter or digit | "." | "(" | ")" | "[" | "]" | "!" | "@" |
"#" | (...);
  index char = char - "]";

  empty space = { " " };
  java identifier = java letter , {java letter or digit};
  property name = java letter or digit , {java letter or digit};
  method sign = "(" , empty space , ")";
  index = "[" , index char , { index char } , "]";

  bean property = property name, [ index ];
  java property = java identifier , [ index | method sign ];
  property expression = [ bean property | java property | index ] , { "." ,
property expression };

Main goals:

- To remove from PropertyResolver its current tokenizer code
- To validate property expressions syntax

To exemplify the new messages explaining syntax errors:

- Given the expression: "bean.method(parameter)", we can explain that no
parameter is allowed inside the brackets
- Given the expression "array[]", we can explain that the empty brackets
aren't allowed
- Given the expression "bean.prop#name", we can explain that
"bean.prop#name" isn't a valid bean property name
- Given the expression "bean.0method()", we can explain that "0method"
isn't a valid java identifier, thus the method call is invalid

Collateral benefits:

- We will be able to map keys in map expressions using the '[' character
like "may[key[weird]"
 Obs. 1: even having the open square bracket '[', the parser is able to
understand the key because of grammar's production rule: index char = char
- "]";
 Obs. 2: a better solution was proposed in the thread
http://wicket-dev.markmail.org/thread/unwdqpxulw7tcd5l
- Allows empty spaces inside method call brackets ( )
Obs.: because of grammar's production rule: method sign = "(" , empty
space, ")";
- Improves Wicket's performance since every detected syntax error prevents
PropertyResolver from to test bean's classes via reflection, know to add
performance overhead [2].

Future benefits:

- The parser enables us to easily enforce that expressions like
"bean.map[key]" will have the key tested only against a map, and won't be
resolved as to object's get/set methods.
- PropertyResolver resolves expressions like "bean.attributeAt.1" to the
methods: bean.getAttributeAt(1) and bean.setAttributeAt(1, value). This
resolution logic can be improved to only test for this kind of method
signatures if the analyzed input allows such usage (currently the resolver
always test for such signatures).

1 -
https://github.com/apache/wicket/commits/WICKET-4008-property-expression-parser
2 - http://docs.oracle.com/javase/tutorial/reflect/index.html

Cheers

Pedro Santos