[
https://issues.apache.org/jira/browse/DAFFODIL-2831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18046691#comment-18046691
]
Mike Beckerle commented on DAFFODIL-2831:
-----------------------------------------
That's a good observation. We could gather all the initiators, of any length,
into one big list and create a scanner that scans for any of them. We would
then assign a unique integer based on which one was found, and use a
choice-by-dispatch-like mechanism keyed on that integer to select a branch.
This is then independent of whether the initiators are fixed or variable
length, whether some branches have multiple initiators as alternatives, etc.
It even allows initiators to contain the various wildcards, like '%WSP*;'.
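
To make the idea concrete, here is a minimal sketch in Python (not Daffodil
code). It assumes plain literal initiators with no wildcards, and the branch
list and the names build_dispatch_table and scan_for_branch are hypothetical.
This naive version takes the longest match across all branches, so it is only
faithful when no initiator of one branch is a prefix of an initiator of
another.

```python
# Hypothetical initiators, one list per choice branch (assumption, not from
# any real schema). A branch may list several alternatives.
BRANCH_INITIATORS = [
    ["HDR:"],        # branch 0
    ["ITEM", "IT"],  # branch 1, two alternative initiators
    ["END"],         # branch 2
]


def build_dispatch_table(branch_initiators):
    """Flatten every initiator into (literal, branch_index) pairs."""
    return [
        (literal, index)
        for index, literals in enumerate(branch_initiators)
        for literal in literals
    ]


def scan_for_branch(data, pos, table):
    """Return (branch_index, match_length) for the longest initiator
    found at pos, or None when no initiator matches there."""
    best = None
    for literal, index in table:
        if data.startswith(literal, pos):
            if best is None or len(literal) > best[1]:
                best = (index, len(literal))
    return best


table = build_dispatch_table(BRANCH_INITIATORS)
print(scan_for_branch("ITEM42", 0, table))
```

The returned integer plays the role of the dispatch key: the parser would jump
straight to that branch instead of trying each branch in turn.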
One issue is making sure this still has sequential-order semantics. I.e., a
longer match to a delimiter expressed in a later branch must not win over a
shorter match to a delimiter in an earlier branch. The spec says the behavior
is equivalent to the branches being tried one by one in sequence.
So if an earlier branch has dfdl:initiator="A",
a later branch has dfdl:initiator="AA",
and the data is "AA123",
then DFDL semantics require that the first branch win, even though the later
branch has a longer match. This is true even when using
dfdl:initiatedContent="yes".
Longest-match behavior applies only among the initiators listed in *one*
dfdl:initiator property, not across branches. I.e., if dfdl:initiator="A AA"
then it should find the "AA" and not just stop with "A".
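
The two rules can be illustrated with a small Python sketch (an illustration of
the semantics as described here, not Daffodil's implementation). It assumes
literal initiators only; longest_match and select_branch_sequential are
hypothetical names.

```python
def longest_match(data, pos, initiators):
    """Longest-match rule within ONE dfdl:initiator property: return the
    length of the longest alternative matching at pos, or None."""
    lengths = [len(i) for i in initiators if data.startswith(i, pos)]
    return max(lengths) if lengths else None


def select_branch_sequential(data, pos, branch_initiators):
    """Sequential rule ACROSS branches: the first branch whose initiator
    matches wins, even if a later branch would match more data."""
    for index, initiators in enumerate(branch_initiators):
        length = longest_match(data, pos, initiators)
        if length is not None:
            return index, length
    return None


# Earlier branch "A", later branch "AA": the earlier branch wins.
print(select_branch_sequential("AA123", 0, [["A"], ["AA"]]))  # (0, 1)
# One branch with dfdl:initiator="A AA": the longer "AA" is found.
print(select_branch_sequential("AA123", 0, [["A", "AA"]]))    # (0, 2)
```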
This could be handled by verifying that no delimiter in an earlier branch can
be a prefix of a delimiter in a later branch. That's not an easy check if the
delimiters are full of wildcards and such, but the check need not always
produce a definite answer. Either it is known that no earlier branch has an
initiator that is a prefix of a later one, or it is unknown, and we can only
do the optimization when it is known.
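
A conservative version of that check might look like the following Python
sketch. It is an assumption-laden illustration: known_prefix_free is a
hypothetical name, and any initiator containing '%' (a DFDL entity or
character class such as '%WSP*;') is simply treated as "unknown" rather than
analyzed.

```python
def known_prefix_free(branch_initiators):
    """Return True only when it is KNOWN that no initiator of an earlier
    branch is a prefix of an initiator of a later branch.
    False means either a real conflict or 'unknown'."""
    for earlier_index, earlier in enumerate(branch_initiators):
        for later in branch_initiators[earlier_index + 1:]:
            for e in earlier:
                for l in later:
                    # Entities and wildcards are not analyzed in this
                    # sketch: give up and report "not known".
                    if "%" in e or "%" in l:
                        return False
                    # An earlier initiator that prefixes a later one
                    # would break sequential-order semantics.
                    if l.startswith(e):
                        return False
    return True


print(known_prefix_free([["A"], ["AA"]]))  # False: "A" prefixes "AA"
print(known_prefix_free([["AA"], ["A"]]))  # True
```

Returning False for both "conflict" and "unknown" keeps the decision simple:
the optimization is applied only on a True result.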
I would want a setting/property indicating that you do not want backtracking,
which causes an SDE (schema definition error) if the optimization can't be
done.
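
As a rough Python sketch of how such a setting could behave (the flag name
require_no_backtracking, the function choose_strategy, and the exception class
are all hypothetical; no such property exists in the source):

```python
class SchemaDefinitionError(Exception):
    """Stand-in for a schema-compile-time error (SDE)."""


def choose_strategy(optimizable, require_no_backtracking):
    """Pick a parse strategy for an initiated-content choice.

    optimizable: True only if the prefix check gave a definite 'safe'.
    require_no_backtracking: hypothetical setting requested by the
    schema author.
    """
    if optimizable:
        return "dispatch"
    if require_no_backtracking:
        # The author asked for no backtracking but it cannot be
        # guaranteed, so fail at schema compile time.
        raise SchemaDefinitionError(
            "initiated-content choice cannot be optimized to a dispatch"
        )
    return "backtracking"


print(choose_strategy(optimizable=False, require_no_backtracking=False))
```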
> InitiatedContent performance isn't equivalent to choicedispatchkey performance
> ------------------------------------------------------------------------------
>
> Key: DAFFODIL-2831
> URL: https://issues.apache.org/jira/browse/DAFFODIL-2831
> Project: Daffodil
> Issue Type: Improvement
> Components: Middle "End", Performance
> Affects Versions: 3.5.0
> Reporter: Olabusayo Kilo
> Priority: Major
>
> One should achieve similar performance by an optimization of
> initiatedContent="yes". Having to do a by-hand optimization/workaround of
> choiceDispatchKey, with the corresponding ugly outputValueCalc, is definitely
> to be avoided.