Re: [Pharo-dev] RBPattern matching dynamic array expressions

John Brant Mon, 12 Dec 2016 15:25:35 -0800

> On Dec 11, 2016, at 1:06 PM, Yuriy Tymchuk <[email protected]> wrote:
> 
> Hi John,
> 
> thank you for your reply, it is really helpful.
> 
> Can you please also spend a bit of your time to talk about the 
> "permissiveness" of the pattern syntax. For example, the string '`.head 
> `.@tail' is parsed into a message node with selector `.@tail and a receiver 
> variable node `.head. To my knowledge this is not valid syntax, as names 
> that begin with a dot should become matchers for statements. Also, 
> if you write `#@literal, the list marker does not do anything 
> when combined with the literal pound as far as I know, and some people are 
> confused about this. Was there any reason why this does not simply raise a 
> syntax error?
>


The @ in a literal can make sense when you are matching inside of a literal 
array:
        (RBParser parseRewriteExpression: '#(`#@a 1 `#@b)')
                match: (RBParser parseExpression: '#(2 1 3 4)')
                inContext: Dictionary new

As for the permissiveness, I built the rewriter to be used internally for the 
refactorings so I wasn’t too concerned about that when it was built. Only after 
using the rewrites internally and seeing how useful they were, did we decide to 
surface them to the user. The rewriter was easier to build without all of the 
validation checks since we could just extend the existing parser to check for 
pattern tokens in a few places. If you add the validation to the parser, then 
you’ll need to override several methods to check for valid input. 

Another validation approach is to validate the tree after it has been created. 
I think Niall Ross did this for the VA Smalltalk rewriter about a decade ago. 
Using this approach one can also validate that the replacement expression only 
references patterns defined in the search expression. I think this is probably 
the best way to add validation.
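
A minimal sketch of that last check, assuming the Pharo RB AST protocol: 
nodesDo:, isVariable, isPatternNode and name are assumed to be understood by 
the parsed nodes, and namesIn is a made-up helper block. It only looks at 
pattern variable nodes and ignores pattern message selectors:
        | search replace namesIn defined undefined |
        search := RBParser parseRewriteExpression: '`@rcv foo: `@arg'.
        replace := RBParser parseRewriteExpression: '`@rcv bar: `@other'.
        "namesIn is a hypothetical helper: it answers the names of all
         pattern variable nodes found in a parsed tree"
        namesIn := [:tree | | names |
                names := Set new.
                tree nodesDo: [:node |
                        (node isVariable and: [node isPatternNode])
                                ifTrue: [names add: node name]].
                names].
        defined := namesIn value: search.
        "keep only the names that the search expression never binds"
        undefined := (namesIn value: replace)
                reject: [:each | defined includes: each].
        undefined
Here undefined should end up holding only the name used for `@other, which 
the search expression never binds, so a validator could report it instead of 
letting the rewrite fail later.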

BTW, the scanner actually allows any characters between the initial backquote 
and the first letter character, so you can have 
`!@#$%^&*()0987654321~`<>?,./{}|[]\a for a pattern variable name. The code only 
processes the first modifier characters that it knows about. For the example 
above, since it doesn’t know how to process the ! modifier character, it stops 
looking so the pattern above will only match a variable node. 
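
A shorter way to try this, using only the calls from the literal array 
example above. The pattern `!@a is an illustrative choice, not something 
from the RB tests. If the scanner behaves as described, the @ after the 
unknown ! is never processed, so the first match should succeed and the 
second should fail:
        | pattern |
        pattern := RBParser parseRewriteExpression: '`!@a'.
        "a plain variable node"
        pattern
                match: (RBParser parseExpression: 'foo')
                inContext: Dictionary new.
        "a message send, which `@a on its own would match"
        pattern
                match: (RBParser parseExpression: 'foo bar')
                inContext: Dictionary new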

If I were defining the pattern modifiers again using the prefix backquote, as 
is done in the existing RB, I think I would make `name match anything (so you 
wouldn’t need to add an @ so often, since that is the most common case), and 
add a | to signify a variable match (e.g., `|var), although the bar character 
may be too easily confused with a lower case L. Lists, @, could become * or + 
or {2,3} to look like regexes. For example, you could have a pattern like this:
        | `|{3,}temps |
        `.*Statement
This would match sequence nodes with 3 or more temps and 0 or more statements.
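
For comparison, here is a sketch of the nearest pattern in the existing 
syntax, again using only the calls from the literal array example. The names 
temps and statements are arbitrary. The existing @ puts no bound on the 
number of items, so the "3 or more" part is not expressed here:
        (RBParser parseRewriteExpression: '| `@temps | `@.statements')
                match: (RBParser parseExpression: '| a b c | a := 1. b := 2')
                inContext: Dictionary new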

In SmaCC, I took the approach of spelling out what I wanted. The patterns begin 
and end with a backquote. For example, if I wanted a list, instead of using a 
special @ character, I would send the #beList message (e.g., `name{beList}`). 
These can be combined/cascaded: `name{beList; testBlock: [:nodes | … do 
something to test the nodes …]}`.

While some can understand and use these pattern rewrites, I don’t know if they 
will ever be easy enough for most people to use. Instead I think some other 
interface should be built for these people. For example, I can envision a 
system where a user provides a sample of something to find. The system would 
build some potential patterns and find potential matches, and then the user 
would look at the matches and tell the tool which were valid or not. After a 
few iterations, hopefully the tool has figured out the pattern the user wanted. 


John Brant
