> On Dec 11, 2016, at 1:06 PM, Yuriy Tymchuk <[email protected]> wrote:
>
> Hi John,
>
> thank you for your reply, it is really helpful.
>
> Can you please also spend a bit of your time to talk about the
> "permissiveness" of the pattern syntax. For example, the string '`.head
> `.@tail' is parsed into a message node with selector `.@tail and a receiver
> variable node `.head. To my knowledge this is not valid syntax, as names
> that begin with a dot should become matchers for statements. Also, if you
> write `#@literal, the list marker does nothing when combined with the
> literal pound sign as far as I know, and some people are confused by this.
> Was there any reason why this does not simply raise a syntax error?
>

The @ in a literal can make sense when you are matching inside of a literal
array:

    (RBParser parseRewriteExpression: '#(`#@a 1 `#@b)')
        match: (RBParser parseExpression: '#(2 1 3 4)')
        inContext: Dictionary new

As for the permissiveness, I built the rewriter to be used internally for the
refactorings, so I wasn’t too concerned about that when it was built. Only after
using the rewrites internally and seeing how useful they were did we decide to
surface them to the user. The rewriter was easier to build without all of the
validation checks, since we could just extend the existing parser to check for
pattern tokens in a few places. If you add the validation to the parser, then
you’ll need to override several methods to check for valid input.

Another validation approach is to validate the tree after it has been created.
I think Niall Ross did this for the VA Smalltalk rewriter about a decade ago.
Using this approach, one can also validate that the replacement expression only
references patterns defined in the search expression. I think this is probably
the best way to add validation.
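
A rough sketch of that post-parse check follows. It is not the VA Smalltalk implementation, which is not shown here. It assumes a Pharo image of this era in which RBParser, nodesDo:, isVariable, isPatternNode and name are available on the parse nodes; the variable names and the sample search and replacement strings are made up for illustration.

```smalltalk
"Illustrative sketch: walk each parsed pattern tree and collect the
 names of its pattern variables, then compare the two sets."
| patternNames defined undefined |
patternNames := [ :source | | names |
	names := Set new.
	(RBParser parseRewriteExpression: source) nodesDo: [ :node |
		(node isVariable and: [ node isPatternNode ])
			ifTrue: [ names add: node name ] ].
	names ].

"Hypothetical search and replacement expressions."
defined := patternNames value: '`@receiver foo: `@arg'.
undefined := (patternNames value: '`@receiver bar: `@other')
	reject: [ :name | defined includes: name ].

"A non-empty result means the replacement refers to a pattern
 variable that the search expression never binds."
undefined
```

A validator along these lines could signal an error when the result is not empty, without touching the parser itself.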

BTW, the scanner actually allows any characters between the initial backquote
and the first letter character, so you can have
`!@#$%^&*()0987654321~`<>?,./{}|[]\a for a pattern variable name. The code
processes the leading modifier characters only until it reaches one that it
does not know about. For the example above, since it doesn’t know how to
process the ! modifier character, it stops looking, so the pattern above will
only match a variable node.
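
The stop-at-the-first-unknown-modifier behaviour can be modelled in a few lines. This is a standalone illustration and not the RB source; the names knownModifiers and modifiersOf are invented for the sketch, and it only covers the four standard modifier characters (@ list, # literal, . statement, ` recurse into).

```smalltalk
"Illustrative model: read modifier characters after the leading
 backquote and stop at the first character that is not a known modifier."
| knownModifiers modifiersOf |
knownModifiers := Dictionary new.
knownModifiers
	at: $@ put: #list;
	at: $# put: #literal;
	at: $. put: #statement;
	at: $` put: #recurseInto.

modifiersOf := [ :patternName | | found index |
	found := OrderedCollection new.
	index := 2.	"index 1 is the initial backquote"
	[ index <= patternName size
		and: [ knownModifiers includesKey: (patternName at: index) ] ]
		whileTrue: [
			found add: (knownModifiers at: (patternName at: index)).
			index := index + 1 ].
	found ].

modifiersOf value: '`.@tail'.	"statement, then list"
modifiersOf value: '`!@#a'.	"empty: ! is unknown, so scanning stops at once"
```

With no modifiers recognised, the second name is treated as a plain pattern variable, which is the behaviour described above.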

If I were defining the pattern modifiers again using the prefix backquote, as
is done in the existing RB, I think I would make `name match anything (so you
didn’t need to add an @ so often, since that is the most common case), and add a
| to signify a variable match (e.g., `|var), although the bar character may
be too easily confused with a lower case L. Lists, @, could become * or + or
{2,3} to look like regexes. For example, you could have a pattern like this:

    | `|{3,}temps |
    `.*Statement

This would match sequence nodes with 3 or more temps and 0 or more statements.
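
For comparison, here is a sketch of the nearest pattern in the existing syntax, assuming the same Pharo RBParser API as the literal-array example earlier in this message. The existing syntax has no count qualifier, so `@temps accepts any number of temps, including none. The sample code string is made up.

```smalltalk
"Existing RB syntax: a list of temps and a list of statements."
| pattern code |
pattern := RBParser parseRewriteExpression: '| `@temps | `.@Statements'.
code := RBParser parseExpression: '| a b c | a := 1. b := 2. c := a + b'.

"Should answer true if the existing syntax behaves as described above."
pattern match: code inContext: Dictionary new
```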

In SmaCC, I took the approach of spelling out what I wanted. The patterns begin
and end with a backquote. For example, if I wanted a list, instead of using a
special @ character, I would send the #beList message (e.g., `name{beList}`).
These can be combined/cascaded: `name{beList; testBlock: [:nodes | … do
something to test the nodes …]}`.

While some can understand and use these pattern rewrites, I don’t know if they
will ever be easy enough for most people to use. Instead, I think some other
interface should be built for these people. For example, I can envision a
system where a user provides a sample of something to find. The system would
build some potential patterns and find potential matches, and then the user
would look at the matches and tell the tool which were valid or not. After a
few iterations, hopefully the tool has figured out the pattern the user wanted.

John Brant