Re: [PR] Evaluate assert/discriminator expressions after groupContent: WIP [daffodil]

via GitHub Wed, 08 Jan 2025 11:10:48 -0800


mbeckerle commented on code in PR #987:
URL: https://github.com/apache/daffodil/pull/987#discussion_r1907644147



##########
daffodil-core/src/main/scala/org/apache/daffodil/core/grammar/HasStatementsGrammarMixin.scala:
##########
@@ -23,11 +23,11 @@ trait HasStatementsGrammarMixin extends GrammarMixin { 
self: Term =>
 
   // Includes setVariable as well as assert/discriminator statements that

Review Comment:
   Not sure this comment is correct any more. expressionStatements doesn't 
include any setVariables per code above. 
   
   Perhaps we should call this asssertDiscrimExpressionStatements ? 



##########
daffodil-core/src/main/scala/org/apache/daffodil/core/dsom/DFDLStatementMixin.scala:
##########
@@ -216,8 +216,8 @@ trait ProvidesDFDLStatementMixin extends ThrowsSDE with 
HasTermCheck {
 
   final lazy val patternStatements: Seq[DFDLStatement] = patternAsserts ++ 
patternDiscrims

Review Comment:
   I think we should put pattern discrims before the pattern asserts. But I am 
raising this ambiguity with the DFDL Workgroup on next call. 



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/PState.scala:
##########
@@ -588,6 +588,24 @@ final class PState private (
     }
   }
 
+  /**
+   * This function is used for cases where a parse must be performed after 
there
+   * has already been a failed parse of an enclosing element/sequence. Most
+   * common example of this would be a choice branch containing a sequence with
+   * an annotated assert expression. According to 9.5.2 of the DFDL spec this

Review Comment:
   This kind of mandatory evaluation applies only to discriminators, not 
"ordinary assert" assertions. 
   
   The last sentence of this comment "assert may be used as a discriminator"... 
I think we should rephrase and just use the term "discriminator" and avoid the 
blurring of assertions being a class which includes discrminators.



##########
daffodil-test/src/test/resources/org/apache/daffodil/section07/discriminators/discriminator.tdml:
##########
@@ -765,13 +814,66 @@
 
   <tdml:parserTestCase name="discrimExpression_04"
     root="e3" model="discrimExpression"
-    description="Property Name: test - DFDL-7-076R">
+    description="Verify that a discriminator will prevent backtracking
+      regardless of whether the sequence content parses succesfully">
     <tdml:document><![CDATA[abc]]></tdml:document>
+    <tdml:errors>
+      <tdml:error>Parse Error</tdml:error>
+      <tdml:error>Failed to populate ex:ints</tdml:error>
+      <tdml:error>Unable to parse xs:int from text: a</tdml:error>
+    </tdml:errors>
+  </tdml:parserTestCase>
+
+  <tdml:parserTestCase name="discrimExpression_05"
+    root="e4" model="discrimExpression"
+    description="Verify that a failed discriminator will be handled correctly
+    when the body of the sequence parses successfully">
+    <tdml:document><![CDATA[123]]></tdml:document>
     <tdml:infoset>
       <tdml:dfdlInfoset>
-        <e3>
+        <e4>
+          <string>123</string>
+        </e4>
+      </tdml:dfdlInfoset>
+    </tdml:infoset>
+  </tdml:parserTestCase>
+
+  <tdml:parserTestCase name="discrimExpression_06"
+    root="e4" model="discrimExpression"
+    description="Verify correct behavior when the body of a sequence fails and
+    the discriminator fails">
+    <tdml:document><![CDATA[abc]]></tdml:document>
+    <tdml:infoset>
+      <tdml:dfdlInfoset>
+        <e4>
           <string>abc</string>
-        </e3>
+        </e4>
+      </tdml:dfdlInfoset>
+    </tdml:infoset>
+  </tdml:parserTestCase>
+
+  <tdml:parserTestCase name="discrimExpression_07"
+    root="e5" model="discrimExpression"
+    description="Sequence body fails, but discriminator succeeds and references
+    elements in partial body infoset">
+    <tdml:document><![CDATA[1ab]]></tdml:document>
+    <tdml:errors>
+      <tdml:error>Parse Error</tdml:error>
+      <tdml:error>Failed to populate ex:ints</tdml:error>
+      <tdml:error>Unable to parse xs:int from text: a</tdml:error>
+    </tdml:errors>
+  </tdml:parserTestCase>
+
+  <tdml:parserTestCase name="discrimExpression_08"

Review Comment:
   I'm looking for a test where the sequence body fails, the discriminator 
succeeds and references things in the sequence body that in fact exist and are 
correct, so the behavior is as if the parse failed after the discrminator 
already evaluated to true. 
   
   The use case that motivates all this complexity is a bunch of record types, 
each a sequence of child elements, each record type has a discriminator that 
looks at one or a few of the fields of the record to decide if the data is in 
fact of this record type. Those discriminators *could* be hoisted and evaluated 
as soon as the fields of the record needed by the expressions have been parsed. 
If parsing of other later fields of the record then fails, the discriminator 
has already been set true, so those failures don't cause other record types to 
be attempted. Rather the whole choice fails. 
   
   This is very close conceptually to a choice by dispatch. Once the dispatch 
key has been computed and the branch selected, a subsequent failure does not 
cause backtracking of that choice, but a failure of the whole choice. 
   
   



##########
daffodil-core/src/main/scala/org/apache/daffodil/core/dsom/DFDLStatementMixin.scala:
##########
@@ -216,8 +216,8 @@ trait ProvidesDFDLStatementMixin extends ThrowsSDE with 
HasTermCheck {
 
   final lazy val patternStatements: Seq[DFDLStatement] = patternAsserts ++ 
patternDiscrims
 
-  final lazy val lowPriorityStatements: Seq[DFDLStatement] =
-    setVariableStatements ++ nonPatternAsserts ++ nonPatternDiscrims
+  final lazy val expressionStatements: Seq[DFDLStatement] =
+    nonPatternAsserts ++ nonPatternDiscrims

Review Comment:
   This feels backward to me. We should evaluate the discriminators before any 
asserts. 
   Below I believe these just get put into the grammar in the order they appear 
here, so this is going to evaluate discriminators last after the assertions, 
which is still 'allowed' so long as we don't prevent them from evaluating when 
an assertion fails. But, we can avoid that complexity by just evaluating the 
discriminators first. 
   
   This hoist, where we move the discriminators before the asserts, is always 
allowed, because an assert can't produce anything that a discriminator needs. 



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/Parser.scala:
##########
@@ -227,26 +227,31 @@ abstract class CombinatorParser(override val context: 
RuntimeData)
   extends Parser
   with CombinatorProcessor
 
-final class SeqCompParser(context: RuntimeData, val childParsers: 
Vector[Parser])
-  extends CombinatorParser(context) {
+final class SeqCompParser(
+  context: RuntimeData,
+  val childParsers: Vector[Parser],
+) extends CombinatorParser(context) {
   override lazy val runtimeDependencies = Vector()
   override def childProcessors = childParsers
 
   override def nom = "seq"
 
-  val numChildParsers = childParsers.size
+  val discrimExpressions = childParsers.collect{ case ae: 
AssertExpressionEvaluationParser if (ae.discrim) => ae }
+  val nonDiscrimChildren = childParsers.diff(discrimExpressions)
 
   def parse(pstate: PState): Unit = {
-    var i: Int = 0
-    while (i < numChildParsers) {
-      val parser = childParsers(i)
-      parser.parse1(pstate)
-      if (pstate.processorStatus ne Success)
-        return
-      i += 1
+    for (p <- nonDiscrimChildren) {

Review Comment:
   We use var and while loops in the runtime on purpose because many scala 
idioms don't generate as efficient code as just plain while loops.
   
   I don't know whether these 'for comprehensions' as they're called, generate 
a tight loop or a more expensive foreach/flatmap kind of thing. We should know 
that the byte code is equivalent to a tight while loop before rewriting loops 
as for comprehensions.  My concern is that a simple for comprehension might in 
fact create tight iteration, but once you start using more features of the for 
comprehensions the efficiency may change. 
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] Evaluate assert/discriminator expressions after groupContent: WIP [daffodil]

Reply via email to