mbeckerle commented on code in PR #987:
URL: https://github.com/apache/daffodil/pull/987#discussion_r1907644147
##########
daffodil-core/src/main/scala/org/apache/daffodil/core/grammar/HasStatementsGrammarMixin.scala:
##########
@@ -23,11 +23,11 @@ trait HasStatementsGrammarMixin extends GrammarMixin {
self: Term =>
// Includes setVariable as well as assert/discriminator statements that
Review Comment:
Not sure this comment is correct any more. expressionStatements doesn't
include any setVariables per code above.
Perhaps we should call this asssertDiscrimExpressionStatements ?
##########
daffodil-core/src/main/scala/org/apache/daffodil/core/dsom/DFDLStatementMixin.scala:
##########
@@ -216,8 +216,8 @@ trait ProvidesDFDLStatementMixin extends ThrowsSDE with
HasTermCheck {
final lazy val patternStatements: Seq[DFDLStatement] = patternAsserts ++
patternDiscrims
Review Comment:
I think we should put pattern discrims before the pattern asserts. But I am
raising this ambiguity with the DFDL Workgroup on next call.
##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/PState.scala:
##########
@@ -588,6 +588,24 @@ final class PState private (
}
}
+ /**
+ * This function is used for cases where a parse must be performed after
there
+ * has already been a failed parse of an enclosing element/sequence. Most
+ * common example of this would be a choice branch containing a sequence with
+ * an annotated assert expression. According to 9.5.2 of the DFDL spec this
Review Comment:
This kind of mandatory evaluation applies only to discriminators, not
"ordinary assert" assertions.
The last sentence of this comment "assert may be used as a discriminator"...
I think we should rephrase and just use the term "discriminator" and avoid the
blurring of assertions being a class which includes discrminators.
##########
daffodil-test/src/test/resources/org/apache/daffodil/section07/discriminators/discriminator.tdml:
##########
@@ -765,13 +814,66 @@
<tdml:parserTestCase name="discrimExpression_04"
root="e3" model="discrimExpression"
- description="Property Name: test - DFDL-7-076R">
+ description="Verify that a discriminator will prevent backtracking
+ regardless of whether the sequence content parses succesfully">
<tdml:document><![CDATA[abc]]></tdml:document>
+ <tdml:errors>
+ <tdml:error>Parse Error</tdml:error>
+ <tdml:error>Failed to populate ex:ints</tdml:error>
+ <tdml:error>Unable to parse xs:int from text: a</tdml:error>
+ </tdml:errors>
+ </tdml:parserTestCase>
+
+ <tdml:parserTestCase name="discrimExpression_05"
+ root="e4" model="discrimExpression"
+ description="Verify that a failed discriminator will be handled correctly
+ when the body of the sequence parses successfully">
+ <tdml:document><![CDATA[123]]></tdml:document>
<tdml:infoset>
<tdml:dfdlInfoset>
- <e3>
+ <e4>
+ <string>123</string>
+ </e4>
+ </tdml:dfdlInfoset>
+ </tdml:infoset>
+ </tdml:parserTestCase>
+
+ <tdml:parserTestCase name="discrimExpression_06"
+ root="e4" model="discrimExpression"
+ description="Verify correct behavior when the body of a sequence fails and
+ the discriminator fails">
+ <tdml:document><![CDATA[abc]]></tdml:document>
+ <tdml:infoset>
+ <tdml:dfdlInfoset>
+ <e4>
<string>abc</string>
- </e3>
+ </e4>
+ </tdml:dfdlInfoset>
+ </tdml:infoset>
+ </tdml:parserTestCase>
+
+ <tdml:parserTestCase name="discrimExpression_07"
+ root="e5" model="discrimExpression"
+ description="Sequence body fails, but discriminator succeeds and references
+ elements in partial body infoset">
+ <tdml:document><![CDATA[1ab]]></tdml:document>
+ <tdml:errors>
+ <tdml:error>Parse Error</tdml:error>
+ <tdml:error>Failed to populate ex:ints</tdml:error>
+ <tdml:error>Unable to parse xs:int from text: a</tdml:error>
+ </tdml:errors>
+ </tdml:parserTestCase>
+
+ <tdml:parserTestCase name="discrimExpression_08"
Review Comment:
I'm looking for a test where the sequence body fails, the discriminator
succeeds and references things in the sequence body that in fact exist and are
correct, so the behavior is as if the parse failed after the discrminator
already evaluated to true.
The use case that motivates all this complexity is a bunch of record types,
each a sequence of child elements, each record type has a discriminator that
looks at one or a few of the fields of the record to decide if the data is in
fact of this record type. Those discriminators *could* be hoisted and evaluated
as soon as the fields of the record needed by the expressions have been parsed.
If parsing of other later fields of the record then fails, the discriminator
has already been set true, so those failures don't cause other record types to
be attempted. Rather the whole choice fails.
This is very close conceptually to a choice by dispatch. Once the dispatch
key has been computed and the branch selected, a subsequent failure does not
cause backtracking of that choice, but a failure of the whole choice.
##########
daffodil-core/src/main/scala/org/apache/daffodil/core/dsom/DFDLStatementMixin.scala:
##########
@@ -216,8 +216,8 @@ trait ProvidesDFDLStatementMixin extends ThrowsSDE with
HasTermCheck {
final lazy val patternStatements: Seq[DFDLStatement] = patternAsserts ++
patternDiscrims
- final lazy val lowPriorityStatements: Seq[DFDLStatement] =
- setVariableStatements ++ nonPatternAsserts ++ nonPatternDiscrims
+ final lazy val expressionStatements: Seq[DFDLStatement] =
+ nonPatternAsserts ++ nonPatternDiscrims
Review Comment:
This feels backward to me. We should evaluate the discriminators before any
asserts.
Below I believe these just get put into the grammar in the order they appear
here, so this is going to evaluate discriminators last after the assertions,
which is still 'allowed' so long as we don't prevent them from evaluating when
an assertion fails. But, we can avoid that complexity by just evaluating the
discriminators first.
This hoist, where we move the discriminators before the asserts, is always
allowed, because an assert can't produce anything that a discriminator needs.
##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/Parser.scala:
##########
@@ -227,26 +227,31 @@ abstract class CombinatorParser(override val context:
RuntimeData)
extends Parser
with CombinatorProcessor
-final class SeqCompParser(context: RuntimeData, val childParsers:
Vector[Parser])
- extends CombinatorParser(context) {
+final class SeqCompParser(
+ context: RuntimeData,
+ val childParsers: Vector[Parser],
+) extends CombinatorParser(context) {
override lazy val runtimeDependencies = Vector()
override def childProcessors = childParsers
override def nom = "seq"
- val numChildParsers = childParsers.size
+ val discrimExpressions = childParsers.collect{ case ae:
AssertExpressionEvaluationParser if (ae.discrim) => ae }
+ val nonDiscrimChildren = childParsers.diff(discrimExpressions)
def parse(pstate: PState): Unit = {
- var i: Int = 0
- while (i < numChildParsers) {
- val parser = childParsers(i)
- parser.parse1(pstate)
- if (pstate.processorStatus ne Success)
- return
- i += 1
+ for (p <- nonDiscrimChildren) {
Review Comment:
We use var and while loops in the runtime on purpose because many scala
idioms don't generate as efficient code as just plain while loops.
I don't know whether these 'for comprehensions' as they're called, generate
a tight loop or a more expensive foreach/flatmap kind of thing. We should know
that the byte code is equivalent to a tight while loop before rewriting loops
as for comprehensions. My concern is that a simple for comprehension might in
fact create tight iteration, but once you start using more features of the for
comprehensions the efficiency may change.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]