tuxji commented on code in PR #901:
URL: https://github.com/apache/daffodil/pull/901#discussion_r1066344188


##########
daffodil-runtime2/src/main/resources/org/apache/daffodil/runtime2/c/libcli/daffodil_main.c:
##########
@@ -88,6 +88,15 @@ main(int argc, char *argv[])
         XMLWriter xmlWriter = {xmlWriterMethods, output, {NULL, NULL, 0}};
         error = walkInfoset((VisitEventHandler *)&xmlWriter, root);
         continue_or_exit(error);
+
+        // Any diagnostics will fail the parse if validate mode is on
+        if (daffodil_parse.validate && pstate.diagnostics)
+        {
+            static Error error = {CLI_DIAGNOSTICS, {0}};
+
+            error.arg.d64 = pstate.diagnostics->length;
+            continue_or_exit(&error);
+        }

Review Comment:
   In runtime2, `pstate.diagnostics` accumulates only validation diagnostics.  
There are no processor diagnostics and only one processor error object.  Any 
problem while parsing immediately skips the rest of parsing and returns only 
one failure object in `pstate.error` directly to `daffodil_main`.  The `-V 
on|limited` option turns the validation diagnostics in `pstate.diagnostics` 
from validation warnings (exit status 0) to validation errors (exit status 1).  
I realize this simplification makes runtime2 different from runtime1.  I don't 
know if we will ever need runtime2 to accumulate both processor and validation 
warnings and errors all mixed together in `pstate.diagostics`.



##########
daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/generators/BinaryValueCodeGenerator.scala:
##########
@@ -21,81 +21,65 @@ import org.apache.daffodil.dsom.ElementBase
 import org.apache.daffodil.schema.annotation.props.gen.BitOrder
 import org.apache.daffodil.schema.annotation.props.gen.ByteOrder
 import org.apache.daffodil.schema.annotation.props.gen.LengthKind
-import org.apache.daffodil.schema.annotation.props.gen.OccursCountKind
 import org.apache.daffodil.util.Maybe.Nope
 
 // Base trait which provides common code to generate C code for primitive 
value elements
 trait BinaryValueCodeGenerator {
 
-  // Generate C code for a primitive value element by calculating how many 
times the element occurs
-  // and applying the given partially applied function values the necessary 
number of times.  Intended
-  // to be called by other traits which extend this trait, not directly by 
Runtime2CodeGenerator.
-  def binaryValueGenerateCode(e: ElementBase, addField: String => Unit, 
validateFixed: String => Unit): Unit = {
+  // Generate C code for a primitive value element differently depending on 
how many times the element occurs.
+  // Intended to be called by other traits which extend this trait, not 
directly by Runtime2CodeGenerator.
+  def binaryValueGenerateCode(e: ElementBase, addField: String => Unit, 
validateFixed: String => Unit, cgState: CodeGeneratorState): Unit = {
     // For the time being this is a very limited back end.
     // So there are some restrictions to enforce.
     e.schemaDefinitionUnless(e.bitOrder eq BitOrder.MostSignificantBitFirst, 
"Only dfdl:bitOrder 'mostSignificantBitFirst' is supported.")
     e.schemaDefinitionUnless(e.maybeByteOrderEv == Nope || 
e.byteOrderEv.isConstant, "Runtime dfdl:byteOrder expressions not supported.")
     e.schemaDefinitionUnless(e.lengthKind == LengthKind.Prefixed || 
e.elementLengthInBitsEv.isConstant, "Runtime dfdl:length expressions not 
supported.")
 
-    // Calculate how many times the element occurs in the data.
-    val arraySize = e.occursCountKind match {
-      case OccursCountKind.Fixed if e.maxOccurs > 1 => e.maxOccurs
-      case OccursCountKind.Fixed if e.maxOccurs == 1 => 0
-      case OccursCountKind.Implicit if e.minOccurs == 1 && e.maxOccurs == 1 => 0
-      case _ => e.SDE("occursCountKind %s minOccurs %d maxOccurs %d is not 
supported in C code generator",
-        e.occursCountKind.toString, e.minOccurs, e.maxOccurs)
+    // Call the given partially applied function values with their remaining 
unbound argument (deref)
+    val hasArray = cgState.arrayMaxOccurs(e) > 0

Review Comment:
   Please suggest a less confusing name than `arrayMaxOccurs` which is actually 
trying to do several things at the same time.  It's trying to convert all the 
elements it sees into arrays that it can handle (maxOccurs > 0), scalars that 
it can ignore (fixed maxOccurs=1 arrays turn into 0), or SDEs 
(OccursCountKind.Parse will call e.SDE("... is not supported in C code 
generator") as you expect).



##########
daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/generators/CodeGeneratorState.scala:
##########
@@ -58,38 +56,209 @@ class CodeGeneratorState(private val root: ElementBase) {
     sb
   }
 
+  // Returns a name for the given element's element resource descriptor

Review Comment:
   I must have gotten mixed up at the time I wrote this comment.  ERD indeed 
means "element runtime data needed to parse/unparse objects" in runtime2 as 
well (a comment in infoset.h at line 81 says so).  The ERDs in runtime2 are 
statically initialized data structures with constant values that are never 
changed at runtime.  The only "runtime" thing about these data structures 
(which could be called element runtime descriptors, but not element resource 
descriptors) is that an element's ERD pointer might point to one out of several 
different ERD descriptors at runtime if the element is a choice element.



##########
daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/generators/CodeGeneratorState.scala:
##########
@@ -58,38 +56,209 @@ class CodeGeneratorState(private val root: ElementBase) {
     sb
   }
 
+  // Returns a name for the given element's element resource descriptor
+  private def erdName(context: ElementBase): String = {
+    val sb = buildName(context, new StringBuilder) ++= "ERD"
+    val name = sb.toString
+    name
+  }
+
   // Returns a name for the given element's C struct identifier
   private def cStructName(context: ElementBase): String = {
     val sb = buildName(context, new StringBuilder)
     val name = sb.toString
     name
   }
 
-  // Returns a name for the given element's element resource descriptor
-  private def erdName(context: ElementBase): String = {
-    val sb = buildName(context, new StringBuilder) ++= "ERD"
-    val name = sb.toString
-    name
+  // Returns the notation needed to access a C struct field.  We make some 
simplifying
+  // assumptions to make generating the field access easier:
+  // - the expression contains only a relative or absolute path, nothing else 
(e.g.,
+  //   the expression doesn't call any functions or perform any computation)
+  // - we can convert an absolute path to a rootElement()-> indirection
+  // - we can convert a relative path beginning with up dirs to a parents-> 
indirection
+  // - we can convert a relative path without any up dirs to an instance-> 
indirection
+  // - we can convert slashes in the path to dots in a C struct field access 
notation
+  private def cStructFieldAccess(expr: String): String = {
+    // Strip any namespace prefixes from expr and the root element's local 
name since
+    // C code uses only struct fields' names.
+    val rootName = root.namedQName.local
+    val exprPath = expr.replaceAll("/[^/:]+:", "/").stripPrefix(s"/$rootName")
+    val fieldAccess = if (exprPath.startsWith("/")) {
+      // Convert exprPath to a rootElement()-> indirection
+      val C = cStructName(root)
+      s"""(($C *)rootElement())->${exprPath.stripPrefix("/")}"""
+    } else if (exprPath.startsWith("../")) {
+      // Split exprPath into the up dirs and after the up dirs
+      val afterUpDirs = exprPath.split("\\.\\./").mkString
+      val upDirs = exprPath.stripSuffix(afterUpDirs)
+      // Count how many up dirs there are
+      val nUpDirs = upDirs.split('/').length
+      // Go up the stack that many times to get that struct's C type
+      val C = structs(nUpDirs).C
+      // Convert the up dirs to parents
+      val parents = upDirs.replaceAllLiterally("../", 
"parent->").stripSuffix("->")
+      // Convert exprPath to a parents-> indirection
+      s"""(($C *)instance->_base.$parents)->$afterUpDirs"""
+    } else {
+      // Convert exprPath to an instance-> indirection
+      s"""instance->$exprPath"""
+    }
+    // Finally, convert the field access to C struct dot notation
+    val notation = fieldAccess.replace('/', '.')
+    notation
   }

Review Comment:
   Yes, I agree that this method is fragile.  This method's code to parse the 
original DPath expression as a string broke in tests and had to go through 
several iterations before arriving at this point and a DFDL expression walker 
over a compiled DPath expression might do a better job converting the 
expression to C code.  Does the compiled DPath support a walk API at this time 
or if not, how hard would it be to add the walk API?



##########
daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/generators/CodeGeneratorState.scala:
##########
@@ -168,64 +330,42 @@ class CodeGeneratorState(private val root: ElementBase) {
     qNameInit
   }
 
-  /**
-   * We want to convert a choiceDispatchKey expression into C struct dot
-   * notation (rootElement->[subElement.field]) which will access the C
-   * struct field containing the choiceDispatchKey's runtime value.
-   *
-   * We make some assumptions to make generating the dot notation easier:
-   * - the expression starts with '{xs:string( and ends with )}'
-   * - the expression returns the value of a previous element without
-   *   changing the value in any way (except converting it to xs:string)
-   * - both the expression and the C code use only local names (for now...)
-   * - we can map the context node's path to a Unix-like slash path
-   * - all dpath operations look like Unix-like relative paths (../tag)
-   * - we can normalize the new path and convert it to C struct dot notation
-   * - we can store the accessed value in an int64_t local variable safely
-   */
-  private def choiceDispatchField(context: ElementBase): String = {
-    // We want to call SchemaComponent.scPath but it's private so duplicate it 
here for now
-    def scPath(sc: SchemaComponent): Seq[SchemaComponent] = 
sc.optLexicalParent.map { scPath }.getOrElse(Nil) :+ sc
-
-    // We handle only direct dispatch choices, so ignore other elements
-    context.complexType.modelGroup match {
-      case choice: Choice if choice.isDirectDispatch =>
-        // Get parent path against which to perform up paths
-        val parentNames = scPath(context).map {
-          case _: Root => ""
-          case er: AbstractElementRef => er.refQName.local
-          case eb: ElementBase => eb.namedQName.local
-          case ed: GlobalElementDecl => ed.namedQName.local
-          case _ => ""
-        }
-        val parentPath = parentNames.mkString("/")
-
-        // Convert expression to a relative path (may have up paths)
-        val expr = 
choice.choiceDispatchKeyEv.expr.toBriefXML().filterNot(_.isWhitespace)
-        val before = "'{xs:string("
-        val after = ")}'"
-        val relativePath = if (expr.startsWith(before) && expr.endsWith(after))
-          expr.substring(before.length, expr.length - after.length) else expr
-
-        // Remove redundant slashes (//) and up paths (../)
-        val normalizedURI = new URI(parentPath + "/" + relativePath).normalize
-
-        // Strip namespace prefixes since C code uses only local names (for 
now)
-        val dispatchPath = normalizedURI.getPath.replaceAll("/[^/:]+:", "/")
+  // Returns true if the element has not been seen before (checking if a
+  // map already contains the element, otherwise adding it to the map)
+  def elementNotSeenYet(context: ElementBase): Boolean = {
+    val key = cStructName(context)
+    val alreadySeen = elementsAlreadySeen.contains(key)
+    if (!alreadySeen)
+      elementsAlreadySeen += (key -> context)
+    !alreadySeen
+  }
 
-        // Convert to C struct dot notation without any leading dot
-        val notation = dispatchPath.replace('/', '.').substring(1)
-        notation
-      // We get called on every group element, so we need to return "" for 
non-choice elements
-      case _ => ""
+  // Returns the given element's maxOccurs if it is an array element
+  // with a finite maxOccurs > 0, otherwise returns zero for scalar
+  // elements and array elements with unbounded maxOccurs (we don't
+  // support unbounded arrays in C right now)
+  def arrayMaxOccurs(e: ElementBase): Int = {

Review Comment:
   Perhaps a better name might be `getElementMaxOccurs`?  The special zero 
value just means that `val hasArray = getElementMaxOccurs(e) > 0` is false for 
both scalar elements and fixed arrays with maxOccurs=1, but we don't HAVE to 
give the second case special handling.  Schema writers are very unlikely to 
declare such a fixed array with only one occurrence since that's only a 
slightly more verbose way of declaring a scalar element.  



##########
daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/generators/CodeGeneratorState.scala:
##########
@@ -404,76 +542,117 @@ class CodeGeneratorState(private val root: ElementBase) {
     finalStructs += struct
   }
 
-  def addSimpleTypeStatements(initERDStatement: String, initSelfStatement: 
String, parseStatement: String, unparseStatement: String): Unit = {
-    if (initERDStatement.nonEmpty) structs.top.initERDStatements += 
initERDStatement
-    if (initSelfStatement.nonEmpty) structs.top.initSelfStatements += 
initSelfStatement
+  // Add C code to initialize, parse, and unparse a primitive value element
+  def addSimpleTypeStatements(initERDStatement: String, parseStatement: 
String, unparseStatement: String): Unit = {
+    if (initERDStatement.nonEmpty) {
+      if (hasChoice)
+        structs.top.initChoiceStatements += initERDStatement
+      else
+        structs.top.initERDStatements += initERDStatement
+    }
     if (parseStatement.nonEmpty) structs.top.parserStatements += parseStatement
     if (unparseStatement.nonEmpty) structs.top.unparserStatements += 
unparseStatement
   }
 
-  def addComplexTypeStatements(child: ElementBase): Unit = {
+  // Generate C code to initialize, parse, and unparse a child element
+  def addPerChildStatements(child: ElementBase): Unit = {
     val C = cStructName(child)
-    val e = child.name
-    val hasChoice = structs.top.initChoiceStatements.nonEmpty
-    val arraySize = if (child.occursCountKind == OccursCountKind.Fixed) 
child.maxOccurs else 0
+    val e = child.namedQName.local
 
+    // Generate a choice statement case if the child is in a choice element
+    val hasChoice = structs.top.initChoiceStatements.nonEmpty
     if (hasChoice) {
+      val position = child.position
       structs.top.initChoiceStatements ++= 
ChoiceBranchKeyCooker.convertConstant(
         child.choiceBranchKey, child, forUnparse = false).map { key => s"    
case $key:"}
-
-      val offset = child.position - 1
-      val initChoiceStatement = s"        instance->_choice = $offset;"
-      val parseStatement = s"    case $offset:"
-      val unparseStatement = s"    case $offset:"
-
-      structs.top.initChoiceStatements += initChoiceStatement
-      structs.top.parserStatements += parseStatement
-      structs.top.unparserStatements += unparseStatement
+      structs.top.initChoiceStatements += s"        instance->_choice = 
$position;"
+      structs.top.parserStatements += s"    case $position:"
+      structs.top.unparserStatements += s"    case $position:"
     }
+    val indent1 = if (hasChoice) "    " else ""
 
-    def addStatements(deref: String): Unit = {
-      val initChoiceStatement = s"        ${C}_initERD(&instance->$e$deref);"
-      val initERDStatement = s"    ${C}_initERD(&instance->$e$deref);"
-      val initSelfStatement = s"    ${C}_initSelf(&instance->$e$deref);"
-      val moreIndent = if (hasChoice) "    " else ""
-      val parseStatement =
-        s"""$moreIndent    ${C}_parseSelf(&instance->$e$deref, pstate);
-           |$moreIndent    if (pstate->error) return;""".stripMargin
-      val unparseStatement =
-        s"""$moreIndent    ${C}_unparseSelf(&instance->$e$deref, ustate);
-           |$moreIndent    if (ustate->error) return;""".stripMargin
+    // Add an array if the child occurs multiple times
+    val hasArray = arrayMaxOccurs(child) > 0
+    val deref = if (hasArray) "[i]" else ""
+    val indent2 = if (hasArray) "    " else ""
+    if (hasArray) pushArray(child)
 
+    // Generate statements to initialize, parse, and unparse the child
+    Runtime2CodeGenerator.generateCode(child.enclosedElement, this)
+    if (!child.isSimpleType) {
       if (hasChoice)
-        structs.top.initChoiceStatements += initChoiceStatement
+        structs.top.initChoiceStatements += s"$indent2        
${C}_initERD(&instance->$e$deref, (InfosetBase *)instance);"
       else
-        structs.top.initERDStatements += initERDStatement
-      structs.top.initSelfStatements += initSelfStatement
-      structs.top.parserStatements += parseStatement
-      structs.top.unparserStatements += unparseStatement
+        structs.top.initERDStatements += s"$indent2    
${C}_initERD(&instance->$e$deref, (InfosetBase *)instance);"
+      structs.top.parserStatements +=
+        s"""$indent1$indent2    ${C}_parseSelf(&instance->$e$deref, pstate);
+           |$indent1$indent2    if (pstate->error) return;""".stripMargin
+      structs.top.unparserStatements +=
+        s"""$indent1$indent2    ${C}_unparseSelf(&instance->$e$deref, ustate);
+           |$indent1$indent2    if (ustate->error) return;""".stripMargin
     }
-    if (arraySize > 0)
-      for (i <- 0 until arraySize)
-        addStatements(s"[$i]")
-    else
-      addStatements("")
 
-    if (hasChoice) {
-      val initChoiceStatement = s"        break;"
-      val parseStatement = s"        break;"
-      val unparseStatement = s"        break;"
+    // Generate statements to align the next element
+    val mustAlign = child.endingAlignmentApprox.nBits > 
child.priorAlignmentApprox.nBits
+    if (mustAlign) {
+      val nBits = child.endingAlignmentApprox.nBits
+      structs.top.parserStatements +=
+        s"""$indent1$indent2    // Fill to closest alignment
+           |$indent1$indent2    parse_fill_bits((pstate->bitPos0b + ${nBits - 
1}) / $nBits * $nBits, pstate);
+           |$indent1$indent2    if (pstate->error) return;""".stripMargin
+      structs.top.unparserStatements +=
+        s"""$indent1$indent2    // Fill to closest alignment
+           |$indent1$indent2    unparse_fill_bits((ustate->bitPos0b + ${nBits 
- 1}) / $nBits * $nBits, '\\0', ustate);
+           |$indent1$indent2    if (ustate->error) return;""".stripMargin
+    }
 
-      structs.top.initChoiceStatements += initChoiceStatement
-      structs.top.parserStatements += parseStatement
-      structs.top.unparserStatements += unparseStatement
+    // Terminate the array if the child occurs multiple times
+    if (hasArray) popArray(child)
+
+    // Generate a choice statement break if the child is in a choice element
+    if (hasChoice) {
+      val break = s"        break;"
+      structs.top.initChoiceStatements += break
+      structs.top.parserStatements += break
+      structs.top.unparserStatements += break
     }
   }
 
+  private def pushArray(e: ElementBase): Unit = {
+    val C = structs.top.C
+    val name = "%5Bi%5D"

Review Comment:
   This is holdover code from an earlier attempt to convert a DPath occursCount 
expression to C code.  Without going into too many details, I tried to use the 
name field to build a normalized URL before converting it to C struct dot 
notation, but that approach needed to use an `[i]` dereference with an i of the 
right value which wasn't feasible.  I'll remove the `name` field since I was 
keeping it only for debugging to help see the difference between duplicate 
"structs" with the same class (C) in the stack at this point.



##########
daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/generators/HexBinaryCodeGenerator.scala:
##########
@@ -60,76 +62,70 @@ trait HexBinaryCodeGenerator extends 
BinaryValueCodeGenerator {
     val primType = s"$intType$intLen"
     val conv = if (e.prefixedLengthElementDecl.byteOrderEv.constValue == 
ByteOrder.BigEndian) "be" else "le"
     val function = s"${conv}_$primType"
-    val i = if (deref.length > 2) deref.substring(1, deref.length - 1) else ""
-    val lenVar = s"_l_$localName$i"
+    val lenVar = s"_l_$localName"
 
     val initERDStatement =
-      s"""    $field.array = NULL;
-         |    $field.lengthInBytes = 0;
-         |    $field.dynamic = true;""".stripMargin
-    val initSelfStatement = ""
+      s"""$indent1$indent2    $field.dynamic = true;""".stripMargin
     val parseStatement =
-      s"""    ${primType}_t $lenVar;
-         |    parse_$function(&$lenVar, $intLen, pstate);
-         |    if (pstate->error) return;
-         |    alloc_hexBinary(&$field, $lenVar, pstate);
-         |    if (pstate->error) return;
-         |    parse_hexBinary(&$field, pstate);
-         |    if (pstate->error) return;""".stripMargin
+      s"""$indent1$indent2    ${primType}_t $lenVar;
+         |$indent1$indent2    parse_$function(&$lenVar, $intLen, pstate);
+         |$indent1$indent2    if (pstate->error) return;
+         |$indent1$indent2    alloc_hexBinary(&$field, $lenVar, pstate);
+         |$indent1$indent2    if (pstate->error) return;
+         |$indent1$indent2    parse_hexBinary(&$field, pstate);
+         |$indent1$indent2    if (pstate->error) return;""".stripMargin
     val unparseStatement =
-      s"""    unparse_$function($field.lengthInBytes, $intLen, ustate);
-         |    if (ustate->error) return;
-         |    unparse_hexBinary($field, ustate);
-         |    if (ustate->error) return;""".stripMargin
-    cgState.addSimpleTypeStatements(initERDStatement, initSelfStatement, 
parseStatement, unparseStatement)
+      s"""$indent1$indent2    unparse_$function($field.lengthInBytes, $intLen, 
ustate);
+         |$indent1$indent2    if (ustate->error) return;
+         |$indent1$indent2    unparse_hexBinary($field, ustate);
+         |$indent1$indent2    if (ustate->error) return;""".stripMargin
+    cgState.addSimpleTypeStatements(initERDStatement, parseStatement, 
unparseStatement)
   }
 
   // Generate C code to initialize, parse, and unparse a hexBinary specified 
length element
   private def hexBinarySpecifiedLengthAddField(e: ElementBase, deref: String, 
cgState: CodeGeneratorState): Unit = {
+    val indent1 = if (cgState.hasChoice) "    " else ""
+    val indent2 = if (deref.nonEmpty) "    " else ""
     val localName = e.namedQName.local
     val field = s"instance->$localName$deref"
     val fieldArray = s"instance->_a_$localName$deref"
     val specifiedLength = e.elementLengthInBitsEv.constValue.get
 
     val initERDStatement = if (specifiedLength > 0)
-      s"""    $field.array = $fieldArray;
-         |    $field.lengthInBytes = sizeof($fieldArray);
-         |    $field.dynamic = false;""".stripMargin
+      s"""$indent1$indent2    $field.array = $fieldArray;
+         |$indent1$indent2    $field.lengthInBytes = sizeof($fieldArray);
+         |$indent1$indent2    $field.dynamic = false;""".stripMargin
     else
-      s"""    $field.array = NULL;
-         |    $field.lengthInBytes = 0;
-         |    $field.dynamic = false;""".stripMargin
-    val initSelfStatement = if (specifiedLength > 0)
-      s"    memset($fieldArray, 0x77, sizeof($fieldArray));"
-    else
-      ""
+      s"""$indent1$indent2    $field.array = NULL;
+         |$indent1$indent2    $field.lengthInBytes = 0;
+         |$indent1$indent2    $field.dynamic = false;""".stripMargin
     val parseStatement =
-      s"""    parse_hexBinary(&$field, pstate);
-         |    if (pstate->error) return;""".stripMargin
+      s"""$indent1$indent2    parse_hexBinary(&$field, pstate);
+         |$indent1$indent2    if (pstate->error) return;""".stripMargin
     val unparseStatement =
-      s"""    unparse_hexBinary($field, ustate);
-         |    if (ustate->error) return;""".stripMargin
-    cgState.addSimpleTypeStatements(initERDStatement, initSelfStatement, 
parseStatement, unparseStatement)
+      s"""$indent1$indent2    unparse_hexBinary($field, ustate);
+         |$indent1$indent2    if (ustate->error) return;""".stripMargin
+    cgState.addSimpleTypeStatements(initERDStatement, parseStatement, 
unparseStatement)
   }
 
   // Generate C code to validate a hexBinary element against its fixed value
   private def hexBinaryValidateFixed(e: ElementBase, deref: String, cgState: 
CodeGeneratorState): Unit = {
+    val indent1 = if (cgState.hasChoice) "    " else ""
+    val indent2 = if (deref.nonEmpty) "    " else ""
     val localName = e.namedQName.local
     val field = s"instance->$localName$deref"
-    val i = if (deref.length > 2) deref.substring(1, deref.length - 1) else ""
-    val fixed = s"${localName}_fixed$i"
+    val fixed = s"${localName}_fixed"
     val array = e.fixedValueAsString.grouped(2).mkString("0x", ", 0x", "")
 
     val initERDStatement = ""
-    val initSelfStatement = ""
     val parseStatement =
-      s"""    uint8_t $fixed[] = {$array};
-         |    parse_validate_fixed(memcmp($field.array, $fixed, 
sizeof($fixed)) == 0, "$localName", pstate);
-         |    if (pstate->error) return;""".stripMargin
+      s"""$indent1$indent2    uint8_t $fixed[] = {$array};
+         |$indent1$indent2    parse_validate_fixed(memcmp($field.array, 
$fixed, sizeof($fixed)) == 0, "$localName", pstate);
+         |$indent1$indent2    if (pstate->error) return;""".stripMargin
     val unparseStatement =
-      s"""    uint8_t $fixed[] = {$array};
-         |    unparse_validate_fixed(memcmp($field.array, $fixed, 
sizeof($fixed)) == 0, "$localName", ustate);
-         |    if (ustate->error) return;""".stripMargin
-    cgState.addSimpleTypeStatements(initERDStatement, initSelfStatement, 
parseStatement, unparseStatement)
+      s"""$indent1$indent2    uint8_t $fixed[] = {$array};
+         |$indent1$indent2    unparse_validate_fixed(memcmp($field.array, 
$fixed, sizeof($fixed)) == 0, "$localName", ustate);
+         |$indent1$indent2    if (ustate->error) return;""".stripMargin

Review Comment:
   I like the idea, but how about calling the function `stripMargin(indent: 
Int)` if Scala allows us to overload the original `stripMargin` function?  
Keeping the original name and allowing any number of spaces makes the function 
more general and powerful.
   
   ```scala
   val indent = 2 + 2 * Seq(cgState.hasChoice, deref.nonEmpty).count(_ == true)
   
   s"""|uint8_t $fixed[] = {$array};
       |unparse_validate_fixed(memcmp($field.array, $fixed, sizeof($fixed)) == 
0, "$localName", ustate);
       |if (ustate->error) return;
       |""".stripMargin(indent)
   ```



##########
daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/generators/CodeGeneratorState.scala:
##########
@@ -404,76 +542,117 @@ class CodeGeneratorState(private val root: ElementBase) {
     finalStructs += struct
   }
 
-  def addSimpleTypeStatements(initERDStatement: String, initSelfStatement: 
String, parseStatement: String, unparseStatement: String): Unit = {
-    if (initERDStatement.nonEmpty) structs.top.initERDStatements += 
initERDStatement
-    if (initSelfStatement.nonEmpty) structs.top.initSelfStatements += 
initSelfStatement
+  // Add C code to initialize, parse, and unparse a primitive value element
+  def addSimpleTypeStatements(initERDStatement: String, parseStatement: 
String, unparseStatement: String): Unit = {
+    if (initERDStatement.nonEmpty) {
+      if (hasChoice)
+        structs.top.initChoiceStatements += initERDStatement
+      else
+        structs.top.initERDStatements += initERDStatement
+    }
     if (parseStatement.nonEmpty) structs.top.parserStatements += parseStatement
     if (unparseStatement.nonEmpty) structs.top.unparserStatements += 
unparseStatement
   }
 
-  def addComplexTypeStatements(child: ElementBase): Unit = {
+  // Generate C code to initialize, parse, and unparse a child element
+  def addPerChildStatements(child: ElementBase): Unit = {
     val C = cStructName(child)
-    val e = child.name
-    val hasChoice = structs.top.initChoiceStatements.nonEmpty
-    val arraySize = if (child.occursCountKind == OccursCountKind.Fixed) 
child.maxOccurs else 0
+    val e = child.namedQName.local
 
+    // Generate a choice statement case if the child is in a choice element
+    val hasChoice = structs.top.initChoiceStatements.nonEmpty
     if (hasChoice) {
+      val position = child.position
       structs.top.initChoiceStatements ++= 
ChoiceBranchKeyCooker.convertConstant(
         child.choiceBranchKey, child, forUnparse = false).map { key => s"    
case $key:"}
-
-      val offset = child.position - 1
-      val initChoiceStatement = s"        instance->_choice = $offset;"
-      val parseStatement = s"    case $offset:"
-      val unparseStatement = s"    case $offset:"
-
-      structs.top.initChoiceStatements += initChoiceStatement
-      structs.top.parserStatements += parseStatement
-      structs.top.unparserStatements += unparseStatement
+      structs.top.initChoiceStatements += s"        instance->_choice = 
$position;"
+      structs.top.parserStatements += s"    case $position:"
+      structs.top.unparserStatements += s"    case $position:"
     }
+    val indent1 = if (hasChoice) "    " else ""
 
-    def addStatements(deref: String): Unit = {
-      val initChoiceStatement = s"        ${C}_initERD(&instance->$e$deref);"
-      val initERDStatement = s"    ${C}_initERD(&instance->$e$deref);"
-      val initSelfStatement = s"    ${C}_initSelf(&instance->$e$deref);"
-      val moreIndent = if (hasChoice) "    " else ""
-      val parseStatement =
-        s"""$moreIndent    ${C}_parseSelf(&instance->$e$deref, pstate);
-           |$moreIndent    if (pstate->error) return;""".stripMargin
-      val unparseStatement =
-        s"""$moreIndent    ${C}_unparseSelf(&instance->$e$deref, ustate);
-           |$moreIndent    if (ustate->error) return;""".stripMargin
+    // Add an array if the child occurs multiple times
+    val hasArray = arrayMaxOccurs(child) > 0
+    val deref = if (hasArray) "[i]" else ""
+    val indent2 = if (hasArray) "    " else ""
+    if (hasArray) pushArray(child)
 
+    // Generate statements to initialize, parse, and unparse the child
+    Runtime2CodeGenerator.generateCode(child.enclosedElement, this)
+    if (!child.isSimpleType) {
       if (hasChoice)
-        structs.top.initChoiceStatements += initChoiceStatement
+        structs.top.initChoiceStatements += s"$indent2        
${C}_initERD(&instance->$e$deref, (InfosetBase *)instance);"
       else
-        structs.top.initERDStatements += initERDStatement
-      structs.top.initSelfStatements += initSelfStatement
-      structs.top.parserStatements += parseStatement
-      structs.top.unparserStatements += unparseStatement
+        structs.top.initERDStatements += s"$indent2    
${C}_initERD(&instance->$e$deref, (InfosetBase *)instance);"
+      structs.top.parserStatements +=
+        s"""$indent1$indent2    ${C}_parseSelf(&instance->$e$deref, pstate);
+           |$indent1$indent2    if (pstate->error) return;""".stripMargin
+      structs.top.unparserStatements +=
+        s"""$indent1$indent2    ${C}_unparseSelf(&instance->$e$deref, ustate);
+           |$indent1$indent2    if (ustate->error) return;""".stripMargin
     }
-    if (arraySize > 0)
-      for (i <- 0 until arraySize)
-        addStatements(s"[$i]")
-    else
-      addStatements("")
 
-    if (hasChoice) {
-      val initChoiceStatement = s"        break;"
-      val parseStatement = s"        break;"
-      val unparseStatement = s"        break;"
+    // Generate statements to align the next element
+    val mustAlign = child.endingAlignmentApprox.nBits > 
child.priorAlignmentApprox.nBits
+    if (mustAlign) {
+      val nBits = child.endingAlignmentApprox.nBits
+      structs.top.parserStatements +=
+        s"""$indent1$indent2    // Fill to closest alignment
+           |$indent1$indent2    parse_fill_bits((pstate->bitPos0b + ${nBits - 
1}) / $nBits * $nBits, pstate);
+           |$indent1$indent2    if (pstate->error) return;""".stripMargin
+      structs.top.unparserStatements +=
+        s"""$indent1$indent2    // Fill to closest alignment
+           |$indent1$indent2    unparse_fill_bits((ustate->bitPos0b + ${nBits 
- 1}) / $nBits * $nBits, '\\0', ustate);
+           |$indent1$indent2    if (ustate->error) return;""".stripMargin
+    }
 
-      structs.top.initChoiceStatements += initChoiceStatement
-      structs.top.parserStatements += parseStatement
-      structs.top.unparserStatements += unparseStatement
+    // Terminate the array if the child occurs multiple times
+    if (hasArray) popArray(child)
+
+    // Generate a choice statement break if the child is in a choice element
+    if (hasChoice) {
+      val break = s"        break;"
+      structs.top.initChoiceStatements += break
+      structs.top.parserStatements += break
+      structs.top.unparserStatements += break
     }
   }
 
+  private def pushArray(e: ElementBase): Unit = {
+    val C = structs.top.C
+    val name = "%5Bi%5D"
+    structs.push(new ComplexCGState(C, name))
+  }
+
+  private def popArray(e: ElementBase): Unit = {
+    addArrayImplementation(e)
+    structs.pop()
+
+    // Now call the array's methods instead of the array's element's methods
+    val C = structs.top.C
+    val arrayName = s"array_${cStructName(e)}$C"
+    val indent = if (hasChoice) "    " else ""
+    if (hasChoice)
+      structs.top.initChoiceStatements += s"$indent    
${arrayName}_initERD(instance, parent);"
+    else
+      structs.top.initERDStatements += s"$indent    
${arrayName}_initERD(instance, parent);"
+    structs.top.parserStatements +=
+      s"""$indent    ${arrayName}_parseSelf(instance, pstate);
+         |$indent    if (pstate->error) return;""".stripMargin
+    structs.top.unparserStatements +=
+      s"""$indent    ${arrayName}_unparseSelf(instance, ustate);
+         |$indent    if (ustate->error) return;""".stripMargin
+  }
+
   def pushComplexElement(context: ElementBase): Unit = {
     val C = cStructName(context)
-    structs.push(new ComplexCGState(C))
+    val deref = if (arrayMaxOccurs(context) > 0) "%5B0%5D" else ""

Review Comment:
   Yes, this was a `[0]`.  I will remove both the deref and the name.



##########
daffodil-runtime2/src/main/resources/org/apache/daffodil/runtime2/c/libcli/daffodil_getopt.c:
##########
@@ -85,6 +86,18 @@ parse_daffodil_cli(int argc, char *argv[])
             daffodil_unparse.outfile = optarg;
             break;
         case 'V':
+            if (strcmp("limited", optarg) == 0 || strcmp("on", optarg) == 0)
+            {
+                daffodil_parse.validate = true;

Review Comment:
   Yes, the C code checks only individual fields' fixed="value" attributes and 
doesn't perform any other schema validation yet.  I checked and you're right 
that `xmllint --schema ex_nums.dfdl.xsd infosets/ex_nums.dat.xml` can validate 
an infoset (infosets/ex_nums.dat.xml) against its schema (ex_nums.dfdl.xsd) as 
long as xmllint can load "org/apache/daffodil/xsd/DFDLGeneralFormat.dfdl.xsd" 
from the current directory.  If xmllint can find that included schema file, it 
will issue 7 validation error messages for ex_nums.error.dat.xml, otherwise it 
will fail to compile the schema (ex_nums.dfdl.xsd) and only parse the infoset.
   
   However, I don't think we'd ever want `c/daffodil parse -V on` to fork and 
exec `xmllint --schema` to validate the infoset similarly to how `daffodil 
parse -V on` validates the infoset with Xerces once parsing completes.  First, 
c/daffodil doesn't know how to find the original DFDL schema so it can pass the 
schema's filename to xmllint.  Second, it'd be a lot easier for the TDML 
processor to validate the new infoset with Xerces too than for either Scala or 
C code to call `xmllint --schema` since Xerces knows how to find the schema 
file and all included files thanks to Xerces's integration into Daffodil.



##########
daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/generators/CodeGeneratorState.scala:
##########
@@ -58,38 +56,209 @@ class CodeGeneratorState(private val root: ElementBase) {
     sb
   }
 
+  // Returns a name for the given element's element resource descriptor
+  private def erdName(context: ElementBase): String = {
+    val sb = buildName(context, new StringBuilder) ++= "ERD"
+    val name = sb.toString
+    name
+  }
+
   // Returns a name for the given element's C struct identifier
   private def cStructName(context: ElementBase): String = {
     val sb = buildName(context, new StringBuilder)
     val name = sb.toString
     name
   }
 
-  // Returns a name for the given element's element resource descriptor
-  private def erdName(context: ElementBase): String = {
-    val sb = buildName(context, new StringBuilder) ++= "ERD"
-    val name = sb.toString
-    name
+  // Returns the notation needed to access a C struct field.  We make some 
simplifying
+  // assumptions to make generating the field access easier:
+  // - the expression contains only a relative or absolute path, nothing else 
(e.g.,
+  //   the expression doesn't call any functions or perform any computation)
+  // - we can convert an absolute path to a rootElement()-> indirection
+  // - we can convert a relative path beginning with up dirs to a parents-> 
indirection
+  // - we can convert a relative path without any up dirs to an instance-> 
indirection
+  // - we can convert slashes in the path to dots in a C struct field access 
notation
+  private def cStructFieldAccess(expr: String): String = {
+    // Strip any namespace prefixes from expr and the root element's local 
name since
+    // C code uses only struct fields' names.
+    val rootName = root.namedQName.local
+    val exprPath = expr.replaceAll("/[^/:]+:", "/").stripPrefix(s"/$rootName")
+    val fieldAccess = if (exprPath.startsWith("/")) {
+      // Convert exprPath to a rootElement()-> indirection
+      val C = cStructName(root)
+      s"""(($C *)rootElement())->${exprPath.stripPrefix("/")}"""
+    } else if (exprPath.startsWith("../")) {
+      // Split exprPath into the up dirs and after the up dirs
+      val afterUpDirs = exprPath.split("\\.\\./").mkString
+      val upDirs = exprPath.stripSuffix(afterUpDirs)
+      // Count how many up dirs there are
+      val nUpDirs = upDirs.split('/').length
+      // Go up the stack that many times to get that struct's C type
+      val C = structs(nUpDirs).C
+      // Convert the up dirs to parents
+      val parents = upDirs.replaceAllLiterally("../", 
"parent->").stripSuffix("->")
+      // Convert exprPath to a parents-> indirection
+      s"""(($C *)instance->_base.$parents)->$afterUpDirs"""
+    } else {
+      // Convert exprPath to an instance-> indirection
+      s"""instance->$exprPath"""
+    }
+    // Finally, convert the field access to C struct dot notation
+    val notation = fieldAccess.replace('/', '.')
+    notation
   }
 
-  // Returns true if the element has not been seen before (checking if a
-  // map already contains the element, otherwise adding it to the map)
-  def elementNotSeenYet(context: ElementBase): Boolean = {
-    val key = cStructName(context)
-    val alreadySeen = elementsAlreadySeen.contains(key)
-    if (!alreadySeen)
-      elementsAlreadySeen += (key -> context)
-    !alreadySeen
+  // Returns the code needed to get the size of an array of elements, which
+  // may be either a constant (maxOccurs) or an expression (occursCount)
+  // which accesses a particular C struct field.
+  private def getOccursCount(e: ElementBase): String = {
+    val occursCount = e.occursCountKind match {
+      case OccursCountKind.Fixed =>
+        s"""    UNUSED(instance);
+           |    return ${e.maxOccurs};""".stripMargin
+      case OccursCountKind.Expression =>
+        // Extract expression from {...} in element's occursCount attribute
+        val xml = e.occursCountEv.toBriefXML()

Review Comment:
   I'd looked at the e.occursCountEv in the debugger, tried to find a way to 
get the original expression out of it as a string, and this toBriefXML() had 
been the only way I'd found.  



##########
daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/generators/CodeGeneratorState.scala:
##########
@@ -404,76 +542,117 @@ class CodeGeneratorState(private val root: ElementBase) {
     finalStructs += struct
   }
 
-  def addSimpleTypeStatements(initERDStatement: String, initSelfStatement: 
String, parseStatement: String, unparseStatement: String): Unit = {
-    if (initERDStatement.nonEmpty) structs.top.initERDStatements += 
initERDStatement
-    if (initSelfStatement.nonEmpty) structs.top.initSelfStatements += 
initSelfStatement
+  // Add C code to initialize, parse, and unparse a primitive value element
+  def addSimpleTypeStatements(initERDStatement: String, parseStatement: 
String, unparseStatement: String): Unit = {
+    if (initERDStatement.nonEmpty) {
+      if (hasChoice)
+        structs.top.initChoiceStatements += initERDStatement
+      else
+        structs.top.initERDStatements += initERDStatement
+    }
     if (parseStatement.nonEmpty) structs.top.parserStatements += parseStatement
     if (unparseStatement.nonEmpty) structs.top.unparserStatements += 
unparseStatement
   }
 
-  def addComplexTypeStatements(child: ElementBase): Unit = {
+  // Generate C code to initialize, parse, and unparse a child element
+  def addPerChildStatements(child: ElementBase): Unit = {
     val C = cStructName(child)
-    val e = child.name
-    val hasChoice = structs.top.initChoiceStatements.nonEmpty
-    val arraySize = if (child.occursCountKind == OccursCountKind.Fixed) 
child.maxOccurs else 0
+    val e = child.namedQName.local
 
+    // Generate a choice statement case if the child is in a choice element
+    val hasChoice = structs.top.initChoiceStatements.nonEmpty
     if (hasChoice) {
+      val position = child.position
       structs.top.initChoiceStatements ++= 
ChoiceBranchKeyCooker.convertConstant(
         child.choiceBranchKey, child, forUnparse = false).map { key => s"    
case $key:"}
-
-      val offset = child.position - 1
-      val initChoiceStatement = s"        instance->_choice = $offset;"
-      val parseStatement = s"    case $offset:"
-      val unparseStatement = s"    case $offset:"
-
-      structs.top.initChoiceStatements += initChoiceStatement
-      structs.top.parserStatements += parseStatement
-      structs.top.unparserStatements += unparseStatement
+      structs.top.initChoiceStatements += s"        instance->_choice = 
$position;"
+      structs.top.parserStatements += s"    case $position:"
+      structs.top.unparserStatements += s"    case $position:"
     }
+    val indent1 = if (hasChoice) "    " else ""
 
-    def addStatements(deref: String): Unit = {
-      val initChoiceStatement = s"        ${C}_initERD(&instance->$e$deref);"
-      val initERDStatement = s"    ${C}_initERD(&instance->$e$deref);"
-      val initSelfStatement = s"    ${C}_initSelf(&instance->$e$deref);"
-      val moreIndent = if (hasChoice) "    " else ""
-      val parseStatement =
-        s"""$moreIndent    ${C}_parseSelf(&instance->$e$deref, pstate);
-           |$moreIndent    if (pstate->error) return;""".stripMargin
-      val unparseStatement =
-        s"""$moreIndent    ${C}_unparseSelf(&instance->$e$deref, ustate);
-           |$moreIndent    if (ustate->error) return;""".stripMargin
+    // Add an array if the child occurs multiple times
+    val hasArray = arrayMaxOccurs(child) > 0
+    val deref = if (hasArray) "[i]" else ""
+    val indent2 = if (hasArray) "    " else ""
+    if (hasArray) pushArray(child)
 
+    // Generate statements to initialize, parse, and unparse the child
+    Runtime2CodeGenerator.generateCode(child.enclosedElement, this)
+    if (!child.isSimpleType) {
       if (hasChoice)
-        structs.top.initChoiceStatements += initChoiceStatement
+        structs.top.initChoiceStatements += s"$indent2        
${C}_initERD(&instance->$e$deref, (InfosetBase *)instance);"
       else
-        structs.top.initERDStatements += initERDStatement
-      structs.top.initSelfStatements += initSelfStatement
-      structs.top.parserStatements += parseStatement
-      structs.top.unparserStatements += unparseStatement
+        structs.top.initERDStatements += s"$indent2    
${C}_initERD(&instance->$e$deref, (InfosetBase *)instance);"
+      structs.top.parserStatements +=
+        s"""$indent1$indent2    ${C}_parseSelf(&instance->$e$deref, pstate);
+           |$indent1$indent2    if (pstate->error) return;""".stripMargin
+      structs.top.unparserStatements +=
+        s"""$indent1$indent2    ${C}_unparseSelf(&instance->$e$deref, ustate);
+           |$indent1$indent2    if (ustate->error) return;""".stripMargin
     }
-    if (arraySize > 0)
-      for (i <- 0 until arraySize)
-        addStatements(s"[$i]")
-    else
-      addStatements("")
 
-    if (hasChoice) {
-      val initChoiceStatement = s"        break;"
-      val parseStatement = s"        break;"
-      val unparseStatement = s"        break;"
+    // Generate statements to align the next element
+    val mustAlign = child.endingAlignmentApprox.nBits > 
child.priorAlignmentApprox.nBits

Review Comment:
   Thanks, I'll try that code and check if it works as well.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to