[ 
https://issues.apache.org/jira/browse/DAFFODIL-1437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17343205#comment-17343205
 ] 

Steve Lawrence commented on DAFFODIL-1437:
------------------------------------------

Great! Looking more at this, this might not be a great first bug, just because 
it requires coming up with a name for about 80 new objects. And knowing what a 
good name is going to require a pretty deep understanding about what daffodil 
is doing in some fairly complicated code. The changes themselves are actually 
pretty straightforword (although about 80 of them), but knowing what to name 
things is perhaps a challenge. But here's the idea for this change:

A little background first, Daffodil implements a DFA  (Deterministic Finite 
Automaton) used to scan for specific text in the data. This DFA is essentially 
a bunch of States, each containing a list of Rules. Each Rule has a test() and 
an act() function. When we are in a State we loop over it's list of Rules and 
call the test function. When a test function succeeds then we execute the act 
function, the act function whill do something (e.g. parse a character, change a 
state, etc.).

You can fine the Rule traight and companion object in the 
[dfa/Rules.scala|https://github.com/apache/daffodil/blob/master/daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/dfa/Runtime.scala#L159-L180]
 file. Note that the Rule companion object allows a way to easily create a new 
Rule without having to name it. For example, you can just do something like this

{code:scala}
Rule {
  (r: Registers) => { // test contents here },
  (r: Registers) => { // act contents here }
}
{code}

The above calls the companion object passing in two anonymous functions. The 
first is the test function and the second is the act function. Each function 
takes a single Registers parameter and then does something with that in the 
contents.

You can fine all the Rules that we create in 
[dfa/Rules.scala|https://github.com/apache/daffodil/blob/master/daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/dfa/Rules.scala].
 There's just over 70 of these Rules.

The idea behind this bug is that even though this works, it is pretty hard to 
read and is not self documenting at all. It's really difficult to understand 
what each rule is testing for and how it will act when that test passes. So 
instead of doing the above, for each Rule we come up with a reasonable name 
that describes that rule, and do something like this:

{code:scala}
object DescriptionOfRuleNumberOne extends Rule {
  override def test(r: Registers): Unit = { // test contents here }
  override def act(r: Registers): Unit = { // act contents here }
}
...
{code}

And then the list of rules in each state is changed to something like:

{code:scala}
val rules = ArrayBuffer(
  DescriptionOfRuleNumberOne,
  DescriptionOfRuleNumberTwo,
  ...
)
{code}

Once we do that for all rules, we can delete the Rule companion object and 
require that all rules have good descriptive nams.

But as I mentioned before, coming up with good descriptive names is the hard 
part. Alot of these Rules might not have obvious names, even to peopel very 
familiar with the code. And there's over 70 that need to be determined. Some 
are probably easier than others, but it's still not a easy task.

If you're up for the challenge, we're happen to help. But it might also be 
worth taking a look at some bugs we've labeled as "beginner" here:  


> Clarity: DFA Rules - eliminate anonymous classes and functions
> --------------------------------------------------------------
>
>                 Key: DAFFODIL-1437
>                 URL: https://issues.apache.org/jira/browse/DAFFODIL-1437
>             Project: Daffodil
>          Issue Type: Improvement
>          Components: Back End, Clean Ups
>            Reporter: Mike Beckerle
>            Priority: Major
>
> This is a lisp sort of thing to do as written, but there really is no reason 
> these can't be ordinary objects  with two methods, test and act. The code 
> would be clearer and easier to debug. Particularly if the classes have names 
> like 
> {code}
> final class StartState(states: => ArrayBuffer[State], val stateNum: Int) 
> extends State(states) {
>    type R = Registers // put into State base class, protected
>    object Got_EC_goto_ESCState extends Rule {
>        def test(r: R) = { ... }
>        def act(r: R) = { ... }
>      }
>   object Got_EEC_goto_ESCESCState extends Rule {
>        def test(r: R) = { ... }
>        def act(r: R) = { ... }
>      }
> ...
>   override val rules = ArrayBuffer(
>      Got_EC_goto_ESCState, 
>      Got_EEC_goto_EECState,
>     ...
>   )
> ...
> }
> {code}
> In the above, the nesting makes it clear each "Rule" which is a state-machine 
> transition, originates at a particular state.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to