[GitHub] [daffodil] tuxji commented on a change in pull request #650: Extend C code generator to support hexBinary elements

GitBox Thu, 07 Oct 2021 16:04:08 -0700


tuxji commented on a change in pull request #650:
URL: https://github.com/apache/daffodil/pull/650#discussion_r722694112




##########
File path: 
daffodil-runtime2/src/main/resources/org/apache/daffodil/runtime2/c/libcli/xml_reader.c
##########
@@ -361,6 +436,9 @@ xmlNumberElem(XMLReader *reader, const ERD *erd, void 
*number)
             case PRIMITIVE_DOUBLE:
                 *(double *)number = strtodnum(number_from_xml, &error);
                 return error;
+            case PRIMITIVE_HEXBINARY:
+                error = hexToBinary(number_from_xml, (HexBinary *)number);

Review comment:
       All good suggestions.  I'm renaming `hexToBinary` to `strtohexbinary`, 
passing value out parameter `valueptr` to all `strtoX` functions, and getting 
`error` from their return value.  I had picked the other way around to make the 
functions more similar to the C library's own `strtoX` functions but this way 
makes sense.

##########
File path: 
daffodil-runtime2/src/main/resources/org/apache/daffodil/runtime2/c/libruntime/infoset.h
##########
@@ -91,6 +94,15 @@ typedef struct ERD
     const InitChoiceRD   initChoice;
 } ERD;
 
+// HexBinary - parameters of a hexBinary element
+
+typedef struct HexBinary
+{
+    uint8_t *array;   // pointer to element's byte array
+    size_t   size;    // size of element's byte array (in bytes)

Review comment:
       Good suggestions, thanks.  I'm fixing the comments and renaming `size` 
to `lengthInBytes`.

##########
File path: 
daffodil-runtime2/src/main/resources/org/apache/daffodil/runtime2/c/libcli/xml_writer.c
##########
@@ -108,26 +109,65 @@ xmlEndComplex(XMLWriter *writer, const InfosetBase *base)
 static void
 fixNumberIfNeeded(const char *text)
 {
-    if (text[0] == 'N' && text[1] == 'A')
-    {
-        // xsd:float requires NaN to be capitalized correctly
-        char *modifyInPlace = (char *)text;
-        modifyInPlace[1] = 'a';
-    }
     // We aren't writing code to remove all textual differences
     // between runtime1/runtime2 because it would be better to make
     // Daffodil compare numbers numerically, not textually.  If we did
     // have to match runtime1 perfectly, we would have to:
     //  - Strip + from <f>E+<e> to get <f>E<e>
     //  - Add .0 to 1 to get 1.0
     //  - Change number of significant digits to match runtime1
+    if (text && text[0] == 'N' && text[1] == 'A')
+    {
+        // xsd:float requires NaN to be capitalized correctly
+        char *modifyInPlace = (char *)text;
+        modifyInPlace[1] = 'a';
+    }
+}
+
+// Convert a byte array to a string of hexadecimal characters (two
+// nibbles per byte).  Return NULL if no dynamic memory could be
+// allocated for string.  Reuse same dynamic memory next time if big
+// enough.  Caller is responsible for copying string before next call.
+
+static const char *
+binaryToHex(HexBinary hexBinary)
+{
+    static char * hexString = NULL;
+    static size_t capacity = 256;

Review comment:
       Currently the CLI is the only caller of xml_writer.c's functions and the 
CLI exits immediately, so I hadn't considered that behavior to be a problem.  
I'm modifying this function to take another parameter and call it in 
xmlEndDocument with that parameter to free the memory.

##########
File path: 
daffodil-runtime2/src/main/resources/org/apache/daffodil/runtime2/c/libruntime/parsers.h
##########
@@ -65,6 +65,14 @@ extern void parse_le_uint8(uint8_t *number, PState *pstate);
 
 extern void parse_fill_bytes(size_t end_position, PState *pstate);
 
+// Allocate or reallocate memory for hexBinary field
+
+extern void parse_alloc(HexBinary *hexBinary, size_t num_bytes, PState 
*pstate);

Review comment:
       I made two separate `malloc` calls because there are two different code 
paths, this function which gets called from generated parse methods that parse 
a prefixed length from binary data and then need to allocate the corresponding 
byte array, and the other code path in xml_reader.c which reads an XML 
document, processes a prefixed length hexBinary element, and needs to allocate 
a byte array large enough to hold the converted data in the newly initialized 
infoset.  The way an error gets propagated is different enough (stored into 
pstate->error in the first path, returned by return value in the second path) 
that I figured it was easier to have two `malloc` calls than to make 
xml_reader.c call `alloc_hexBinary` as well.

##########
File path: 
daffodil-runtime2/src/main/resources/org/apache/daffodil/runtime2/c/libcli/xml_writer.c
##########
@@ -159,6 +199,9 @@ xmlNumberElem(XMLWriter *writer, const ERD *erd, const void 
*number)
         text = mxmlNewOpaquef(simple, "%.17lG", *(const double *)number);
         fixNumberIfNeeded(mxmlGetOpaque(text));
         break;
+    case PRIMITIVE_HEXBINARY:
+        text = mxmlNewOpaque(simple, binaryToHex(*(const HexBinary *)number));

Review comment:
       If the `malloc` fails and `binaryToHex` returns NULL, `mxmlNewOpaque` 
will return NULL too.  Code further below will detect the NULL and return an 
error, although with an unspecific error message (`CLI_XML_ELEMENT`).

##########
File path: 
daffodil-runtime2/src/main/resources/org/apache/daffodil/runtime2/c/libruntime/unparsers.c
##########
@@ -133,10 +133,18 @@ unparse_fill_bytes(size_t end_position, const char 
fill_byte, UState *ustate)
 
     while (ustate->position < end_position)
     {
-        write_stream_update_position;
+        write_stream_update_position(buffer.c_val, 1);
     }
 }
 
+// Unparse 8-bit bytes from hexBinary field
+
+void
+unparse_hexBinary(HexBinary hexBinary, UState *ustate)

Review comment:
       Yes, I was striving for uniformity.  If you look at the prototypes in 
parsers.h and unparsers.h, you'll see that every parse method takes a value out 
pointer and every unparse method uses pass-by-value instead of taking a 
pointer.  I figured passing the struct by value would look better stylistically 
and wouldn't cause a lot of extra work (only three member fields versus one 
pointer to copy by value).

##########
File path: 
daffodil-runtime2/src/main/resources/org/apache/daffodil/runtime2/c/libcli/xml_writer.c
##########
@@ -108,26 +109,65 @@ xmlEndComplex(XMLWriter *writer, const InfosetBase *base)
 static void
 fixNumberIfNeeded(const char *text)
 {
-    if (text[0] == 'N' && text[1] == 'A')
-    {
-        // xsd:float requires NaN to be capitalized correctly
-        char *modifyInPlace = (char *)text;
-        modifyInPlace[1] = 'a';
-    }
     // We aren't writing code to remove all textual differences
     // between runtime1/runtime2 because it would be better to make
     // Daffodil compare numbers numerically, not textually.  If we did
     // have to match runtime1 perfectly, we would have to:
     //  - Strip + from <f>E+<e> to get <f>E<e>
     //  - Add .0 to 1 to get 1.0
     //  - Change number of significant digits to match runtime1
+    if (text && text[0] == 'N' && text[1] == 'A')
+    {
+        // xsd:float requires NaN to be capitalized correctly
+        char *modifyInPlace = (char *)text;
+        modifyInPlace[1] = 'a';
+    }
+}
+
+// Convert a byte array to a string of hexadecimal characters (two
+// nibbles per byte).  Return NULL if no dynamic memory could be
+// allocated for string.  Reuse same dynamic memory next time if big
+// enough.  Caller is responsible for copying string before next call.
+
+static const char *
+binaryToHex(HexBinary hexBinary)
+{
+    static char * hexString = NULL;
+    static size_t capacity = 256;
+
+    // Make room for hexadecimal characters if necessary
+    size_t numNibbles = hexBinary.size * 2;
+    if (hexString == NULL || capacity < numNibbles)
+    {
+        free(hexString);
+        capacity = numNibbles > capacity ? numNibbles : capacity;
+        hexString = malloc(capacity + 1);
+    }
+
+    // Check whether dynamic memory allocation succeeded
+    if (hexString == NULL)
+    {
+        return NULL;
+    }
+
+    // Convert each binary byte to two hexadecimal characters
+    char *nibble = hexString;
+    for (size_t i = 0; i < hexBinary.size; i++)
+    {
+        static char hexDigit[] = "0123456789ABCDEF";
+        *(nibble++) = hexDigit[hexBinary.array[i] / 16]; // high nibble
+        *(nibble++) = hexDigit[hexBinary.array[i] % 16]; // low nibble
+    }
+    *(nibble++) = '\0';
+
+    return hexString;
 }
 
 // Write a boolean, 32-bit or 64-bit real number, or 8, 16, 32, or
 // 64-bit signed or unsigned integer as an XML element's value
 
 static const Error *
-xmlNumberElem(XMLWriter *writer, const ERD *erd, const void *number)
+xmlSimpleElem(XMLWriter *writer, const ERD *erd, const void *number)

Review comment:
       Agreed, I'm replacing `number` with `valueptr`.

##########
File path: 
daffodil-runtime2/src/main/resources/org/apache/daffodil/runtime2/c/libcli/xml_reader.c
##########
@@ -326,7 +401,7 @@ xmlEndComplex(XMLReader *reader, const InfosetBase *base)
 // 64-bit signed or unsigned integer from XML data
 
 static const Error *
-xmlNumberElem(XMLReader *reader, const ERD *erd, void *number)
+xmlSimpleElem(XMLReader *reader, const ERD *erd, void *number)

Review comment:
       Agreed, I'm renaming `number` to `valueptr`.

##########
File path: 
daffodil-runtime2/src/main/resources/org/apache/daffodil/runtime2/c/libcli/xml_reader.c
##########
@@ -341,7 +416,7 @@ xmlNumberElem(XMLReader *reader, const ERD *erd, void 
*number)
     reader->node = mxmlWalkNext(reader->node, reader->xml, MXML_DESCEND);
 
     // Check whether we are walking both XML data and infoset in lockstep
-    if (name_from_xml && name_from_erd)
+    if (name_from_xml && name_from_erd && number_from_xml)

Review comment:
       Agreed, l'm renaming `number_from_xml` to `text`.

##########
File path: 
daffodil-runtime2/src/main/resources/org/apache/daffodil/runtime2/c/libcli/xml_reader.c
##########
@@ -217,6 +217,81 @@ strtounum(const char *numptr, uintmax_t maxval, const 
Error **errorptr)
     return value;
 }
 
+// Store an XML element's string of hexadecimal characters (two
+// nibbles per byte) into a byte array.  Allocate memory for byte
+// array if needed.  Return error if string does not fit into byte
+// array or does not contain valid hexadecimal characters.
+
+static const Error *
+hexToBinary(const char *hexstring, HexBinary *hexBinary)
+{
+    // Check whether string has even number of nibbles
+    size_t numNibbles = strlen(hexstring);
+    size_t numBytes = numNibbles / 2;
+    if (numNibbles == 0 || (numNibbles % 2) != 0)

Review comment:
       I think allowing this is worth the effort.  I'm modifying the logic to 
allow zero-length hex binary elements and adding test zero-length elements to 
verify no code or function dereferences `hexBinary->array` when its length is 
zero.

##########
File path: 
daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/generators/HexBinaryCodeGenerator.scala
##########
@@ -0,0 +1,146 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.daffodil.runtime2.generators
+
+import org.apache.daffodil.dpath.NodeInfo.PrimType
+import org.apache.daffodil.grammar.primitives.HexBinaryLengthPrefixed
+import org.apache.daffodil.grammar.primitives.HexBinarySpecifiedLength
+import org.apache.daffodil.schema.annotation.props.gen.BitOrder
+import org.apache.daffodil.schema.annotation.props.gen.ByteOrder
+import org.apache.daffodil.schema.annotation.props.gen.OccursCountKind
+
+trait HexBinaryCodeGenerator extends BinaryAbstractCodeGenerator {
+
+  def hexBinaryLengthPrefixedGenerateCode(g: HexBinaryLengthPrefixed, cgState: 
CodeGeneratorState): Unit = {
+    // For the time being this is a very limited back end.
+    // So there are some restrictions to enforce.
+    val e = g.e
+    e.schemaDefinitionUnless(e.bitOrder eq BitOrder.MostSignificantBitFirst, 
"Only dfdl:bitOrder 'mostSignificantBitFirst' is supported.")
+
+    val fieldName = e.namedQName.local
+    val arraySize = if (e.occursCountKind == OccursCountKind.Fixed) 
e.maxOccurs else 0
+    val conv = if (e.prefixedLengthElementDecl.byteOrderEv.constValue == 
ByteOrder.BigEndian) "be" else "le"
+    val intType = e.prefixedLengthElementDecl.optPrimType.get match {
+      case PrimType.Byte
+           | PrimType.Short
+           | PrimType.Int
+           | PrimType.Long
+           | PrimType.Integer => "int"
+      case PrimType.UnsignedByte
+           | PrimType.UnsignedShort
+           | PrimType.UnsignedInt
+           | PrimType.UnsignedLong
+           | PrimType.NonNegativeInteger => "uint"
+      case p => e.SDE("Prefixed length PrimType %s is not supported in C code 
generator.", p.toString)
+    }
+    val intLen = 
e.prefixedLengthElementDecl.elementLengthInBitsEv.constValue.get
+
+    def addStatements(deref: String): Unit = {
+      val field = s"instance->$fieldName$deref"
+      val i = if (deref.length > 2) deref.substring(1, deref.length - 1) else 
""
+      val lenVar = s"_l_$fieldName$i"
+      val initStatement =
+        s"""    $field.array = NULL;
+           |    $field.size = 0;
+           |    $field.dynamic = true;""".stripMargin
+      val parseStatement =
+        s"""    $intType${intLen}_t $lenVar;
+           |    parse_${conv}_$intType$intLen(&$lenVar, pstate);
+           |    if (pstate->error) return;
+           |    parse_alloc(&$field, $lenVar, pstate);
+           |    if (pstate->error) return;
+           |    parse_hexBinary(&$field, pstate);
+           |    if (pstate->error) return;""".stripMargin
+      val unparseStatement =
+        s"""    unparse_${conv}_$intType$intLen($field.size, ustate);
+           |    if (ustate->error) return;
+           |    unparse_hexBinary($field, ustate);
+           |    if (ustate->error) return;""".stripMargin
+      cgState.addSimpleTypeStatements(initStatement, parseStatement, 
unparseStatement)
+
+      if (e.hasFixedValue) {
+        val variable = s"${fieldName}_fixed$i"
+        val value = e.fixedValueAsString.grouped(2).mkString(", 0x")
+        val init2 = ""
+        val parse2 =
+          s"""    uint8_t $variable[] = {0x$value};
+             |    parse_validate_fixed(memcmp($field.array, $variable, 
sizeof($variable)) == 0, "$fieldName", pstate);
+             |    if (pstate->error) return;""".stripMargin
+        val unparse2 =
+          s"""    uint8_t $variable[] = {0x$value};
+             |    unparse_validate_fixed(memcmp($field.array, $variable, 
sizeof($variable)) == 0, "$fieldName", ustate);
+             |    if (ustate->error) return;""".stripMargin
+        cgState.addSimpleTypeStatements(init2, parse2, unparse2)
+      }

Review comment:
       This code should already be covered since ex_nums.dfdl.xsd puts fixed 
attributes on all of its types.  I'll check again.

##########
File path: 
daffodil-runtime2/src/main/resources/org/apache/daffodil/runtime2/c/libruntime/parsers.c
##########
@@ -149,10 +150,47 @@ parse_fill_bytes(size_t end_position, PState *pstate)
 
     while (pstate->position < end_position)
     {
-        read_stream_update_position;
+        read_stream_update_position(&buffer.c_val, 1);
     }
 }
 
+// Allocate or reallocate memory for hexBinary field
+
+void
+parse_alloc(HexBinary *hexBinary, size_t num_bytes, PState *pstate)

Review comment:
       Agreed, I'm renaming `parse_alloc` to `alloc_hexBinary` in case we ever 
need to allocate text strings later.

##########
File path: 
daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/generators/HexBinaryCodeGenerator.scala
##########
@@ -0,0 +1,146 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.daffodil.runtime2.generators
+
+import org.apache.daffodil.dpath.NodeInfo.PrimType
+import org.apache.daffodil.grammar.primitives.HexBinaryLengthPrefixed
+import org.apache.daffodil.grammar.primitives.HexBinarySpecifiedLength
+import org.apache.daffodil.schema.annotation.props.gen.BitOrder
+import org.apache.daffodil.schema.annotation.props.gen.ByteOrder
+import org.apache.daffodil.schema.annotation.props.gen.OccursCountKind
+
+trait HexBinaryCodeGenerator extends BinaryAbstractCodeGenerator {
+
+  def hexBinaryLengthPrefixedGenerateCode(g: HexBinaryLengthPrefixed, cgState: 
CodeGeneratorState): Unit = {
+    // For the time being this is a very limited back end.
+    // So there are some restrictions to enforce.
+    val e = g.e
+    e.schemaDefinitionUnless(e.bitOrder eq BitOrder.MostSignificantBitFirst, 
"Only dfdl:bitOrder 'mostSignificantBitFirst' is supported.")
+
+    val fieldName = e.namedQName.local
+    val arraySize = if (e.occursCountKind == OccursCountKind.Fixed) 
e.maxOccurs else 0
+    val conv = if (e.prefixedLengthElementDecl.byteOrderEv.constValue == 
ByteOrder.BigEndian) "be" else "le"
+    val intType = e.prefixedLengthElementDecl.optPrimType.get match {
+      case PrimType.Byte
+           | PrimType.Short
+           | PrimType.Int
+           | PrimType.Long
+           | PrimType.Integer => "int"
+      case PrimType.UnsignedByte
+           | PrimType.UnsignedShort
+           | PrimType.UnsignedInt
+           | PrimType.UnsignedLong
+           | PrimType.NonNegativeInteger => "uint"
+      case p => e.SDE("Prefixed length PrimType %s is not supported in C code 
generator.", p.toString)
+    }
+    val intLen = 
e.prefixedLengthElementDecl.elementLengthInBitsEv.constValue.get
+
+    def addStatements(deref: String): Unit = {
+      val field = s"instance->$fieldName$deref"
+      val i = if (deref.length > 2) deref.substring(1, deref.length - 1) else 
""
+      val lenVar = s"_l_$fieldName$i"
+      val initStatement =
+        s"""    $field.array = NULL;
+           |    $field.size = 0;
+           |    $field.dynamic = true;""".stripMargin
+      val parseStatement =
+        s"""    $intType${intLen}_t $lenVar;
+           |    parse_${conv}_$intType$intLen(&$lenVar, pstate);
+           |    if (pstate->error) return;
+           |    parse_alloc(&$field, $lenVar, pstate);
+           |    if (pstate->error) return;
+           |    parse_hexBinary(&$field, pstate);
+           |    if (pstate->error) return;""".stripMargin
+      val unparseStatement =
+        s"""    unparse_${conv}_$intType$intLen($field.size, ustate);
+           |    if (ustate->error) return;
+           |    unparse_hexBinary($field, ustate);
+           |    if (ustate->error) return;""".stripMargin
+      cgState.addSimpleTypeStatements(initStatement, parseStatement, 
unparseStatement)
+
+      if (e.hasFixedValue) {
+        val variable = s"${fieldName}_fixed$i"
+        val value = e.fixedValueAsString.grouped(2).mkString(", 0x")

Review comment:
       Agreed, not only does `.mkString("0x", ", 0x", "")` put the logic in one 
place, it also handles zero-length elements correctly.  Haven't decided if a 
test is worth it since `.grouped(2)` will return a single nibble as the last 
byte which is valid C syntax and some other code probably will report an error 
anyway.  

##########
File path: 
daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/generators/HexBinaryCodeGenerator.scala
##########
@@ -0,0 +1,146 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.daffodil.runtime2.generators
+
+import org.apache.daffodil.dpath.NodeInfo.PrimType
+import org.apache.daffodil.grammar.primitives.HexBinaryLengthPrefixed
+import org.apache.daffodil.grammar.primitives.HexBinarySpecifiedLength
+import org.apache.daffodil.schema.annotation.props.gen.BitOrder
+import org.apache.daffodil.schema.annotation.props.gen.ByteOrder
+import org.apache.daffodil.schema.annotation.props.gen.OccursCountKind
+
+trait HexBinaryCodeGenerator extends BinaryAbstractCodeGenerator {
+
+  def hexBinaryLengthPrefixedGenerateCode(g: HexBinaryLengthPrefixed, cgState: 
CodeGeneratorState): Unit = {
+    // For the time being this is a very limited back end.
+    // So there are some restrictions to enforce.
+    val e = g.e
+    e.schemaDefinitionUnless(e.bitOrder eq BitOrder.MostSignificantBitFirst, 
"Only dfdl:bitOrder 'mostSignificantBitFirst' is supported.")
+
+    val fieldName = e.namedQName.local
+    val arraySize = if (e.occursCountKind == OccursCountKind.Fixed) 
e.maxOccurs else 0
+    val conv = if (e.prefixedLengthElementDecl.byteOrderEv.constValue == 
ByteOrder.BigEndian) "be" else "le"
+    val intType = e.prefixedLengthElementDecl.optPrimType.get match {
+      case PrimType.Byte
+           | PrimType.Short
+           | PrimType.Int
+           | PrimType.Long
+           | PrimType.Integer => "int"
+      case PrimType.UnsignedByte
+           | PrimType.UnsignedShort
+           | PrimType.UnsignedInt
+           | PrimType.UnsignedLong
+           | PrimType.NonNegativeInteger => "uint"
+      case p => e.SDE("Prefixed length PrimType %s is not supported in C code 
generator.", p.toString)
+    }
+    val intLen = 
e.prefixedLengthElementDecl.elementLengthInBitsEv.constValue.get
+
+    def addStatements(deref: String): Unit = {
+      val field = s"instance->$fieldName$deref"
+      val i = if (deref.length > 2) deref.substring(1, deref.length - 1) else 
""
+      val lenVar = s"_l_$fieldName$i"
+      val initStatement =
+        s"""    $field.array = NULL;
+           |    $field.size = 0;
+           |    $field.dynamic = true;""".stripMargin
+      val parseStatement =
+        s"""    $intType${intLen}_t $lenVar;
+           |    parse_${conv}_$intType$intLen(&$lenVar, pstate);
+           |    if (pstate->error) return;
+           |    parse_alloc(&$field, $lenVar, pstate);
+           |    if (pstate->error) return;
+           |    parse_hexBinary(&$field, pstate);
+           |    if (pstate->error) return;""".stripMargin
+      val unparseStatement =
+        s"""    unparse_${conv}_$intType$intLen($field.size, ustate);
+           |    if (ustate->error) return;
+           |    unparse_hexBinary($field, ustate);
+           |    if (ustate->error) return;""".stripMargin
+      cgState.addSimpleTypeStatements(initStatement, parseStatement, 
unparseStatement)
+
+      if (e.hasFixedValue) {
+        val variable = s"${fieldName}_fixed$i"
+        val value = e.fixedValueAsString.grouped(2).mkString(", 0x")
+        val init2 = ""
+        val parse2 =
+          s"""    uint8_t $variable[] = {0x$value};
+             |    parse_validate_fixed(memcmp($field.array, $variable, 
sizeof($variable)) == 0, "$fieldName", pstate);
+             |    if (pstate->error) return;""".stripMargin
+        val unparse2 =
+          s"""    uint8_t $variable[] = {0x$value};
+             |    unparse_validate_fixed(memcmp($field.array, $variable, 
sizeof($variable)) == 0, "$fieldName", ustate);
+             |    if (ustate->error) return;""".stripMargin
+        cgState.addSimpleTypeStatements(init2, parse2, unparse2)
+      }
+    }
+    if (arraySize > 0)
+      for (i <- 0 until arraySize)
+        addStatements(s"[$i]")
+    else
+      addStatements("")
+  }
+
+  def hexBinarySpecifiedLengthGenerateCode(g: HexBinarySpecifiedLength, 
cgState: CodeGeneratorState): Unit = {
+    // For the time being this is a very limited back end.
+    // So there are some restrictions to enforce.
+    val e = g.e
+    e.schemaDefinitionUnless(e.bitOrder eq BitOrder.MostSignificantBitFirst, 
"Only dfdl:bitOrder 'mostSignificantBitFirst' is supported.")
+    e.schemaDefinitionUnless(e.elementLengthInBitsEv.isConstant, "Runtime 
dfdl:length expressions not supported.")
+
+    val fieldName = e.namedQName.local
+    val arraySize = if (e.occursCountKind == OccursCountKind.Fixed) 
e.maxOccurs else 0
+
+    def addStatements(deref: String): Unit = {
+      val field = s"instance->$fieldName$deref"
+      val fieldArray = s"instance->_a_$fieldName$deref"
+      val initStatement =
+        s"""    $field.array = $fieldArray;
+           |    $field.size = sizeof($fieldArray);
+           |    $field.dynamic = false;
+           |    memset($fieldArray, 0x77, sizeof($fieldArray));""".stripMargin
+      val parseStatement =
+        s"""    parse_hexBinary(&$field, pstate);
+           |    if (pstate->error) return;""".stripMargin
+      val unparseStatement =
+        s"""    unparse_hexBinary($field, ustate);
+           |    if (ustate->error) return;""".stripMargin
+      cgState.addSimpleTypeStatements(initStatement, parseStatement, 
unparseStatement)
+
+      if (e.hasFixedValue) {
+        val i = if (deref.length > 2) deref.substring(1, deref.length - 1) 
else ""
+        val variable = s"${fieldName}_fixed$i"
+        val value = e.fixedValueAsString.grouped(2).mkString(", 0x")
+        val init2 = ""
+        val parse2 =
+          s"""    uint8_t $variable[] = {0x$value};
+             |    parse_validate_fixed(memcmp($field.array, $variable, 
sizeof($variable)) == 0, "$fieldName", pstate);
+             |    if (pstate->error) return;""".stripMargin
+        val unparse2 =
+          s"""    uint8_t $variable[] = {0x$value};
+             |    unparse_validate_fixed(memcmp($field.array, $variable, 
sizeof($variable)) == 0, "$fieldName", ustate);
+             |    if (ustate->error) return;""".stripMargin
+        cgState.addSimpleTypeStatements(init2, parse2, unparse2)
+      }
+    }
+    if (arraySize > 0)
+      for (i <- 0 until arraySize)
+        addStatements(s"[$i]")
+    else
+      addStatements("")
+  }

Review comment:
       Easier said than done.  I'll have to think of a way.

##########
File path: 
daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/generators/HexBinaryCodeGenerator.scala
##########
@@ -0,0 +1,146 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.daffodil.runtime2.generators
+
+import org.apache.daffodil.dpath.NodeInfo.PrimType
+import org.apache.daffodil.grammar.primitives.HexBinaryLengthPrefixed
+import org.apache.daffodil.grammar.primitives.HexBinarySpecifiedLength
+import org.apache.daffodil.schema.annotation.props.gen.BitOrder
+import org.apache.daffodil.schema.annotation.props.gen.ByteOrder
+import org.apache.daffodil.schema.annotation.props.gen.OccursCountKind
+
+trait HexBinaryCodeGenerator extends BinaryAbstractCodeGenerator {
+
+  def hexBinaryLengthPrefixedGenerateCode(g: HexBinaryLengthPrefixed, cgState: 
CodeGeneratorState): Unit = {
+    // For the time being this is a very limited back end.
+    // So there are some restrictions to enforce.
+    val e = g.e
+    e.schemaDefinitionUnless(e.bitOrder eq BitOrder.MostSignificantBitFirst, 
"Only dfdl:bitOrder 'mostSignificantBitFirst' is supported.")
+
+    val fieldName = e.namedQName.local
+    val arraySize = if (e.occursCountKind == OccursCountKind.Fixed) 
e.maxOccurs else 0
+    val conv = if (e.prefixedLengthElementDecl.byteOrderEv.constValue == 
ByteOrder.BigEndian) "be" else "le"
+    val intType = e.prefixedLengthElementDecl.optPrimType.get match {
+      case PrimType.Byte
+           | PrimType.Short
+           | PrimType.Int
+           | PrimType.Long
+           | PrimType.Integer => "int"
+      case PrimType.UnsignedByte
+           | PrimType.UnsignedShort
+           | PrimType.UnsignedInt
+           | PrimType.UnsignedLong
+           | PrimType.NonNegativeInteger => "uint"
+      case p => e.SDE("Prefixed length PrimType %s is not supported in C code 
generator.", p.toString)
+    }
+    val intLen = 
e.prefixedLengthElementDecl.elementLengthInBitsEv.constValue.get
+
+    def addStatements(deref: String): Unit = {
+      val field = s"instance->$fieldName$deref"
+      val i = if (deref.length > 2) deref.substring(1, deref.length - 1) else 
""
+      val lenVar = s"_l_$fieldName$i"
+      val initStatement =
+        s"""    $field.array = NULL;
+           |    $field.size = 0;
+           |    $field.dynamic = true;""".stripMargin
+      val parseStatement =
+        s"""    $intType${intLen}_t $lenVar;
+           |    parse_${conv}_$intType$intLen(&$lenVar, pstate);
+           |    if (pstate->error) return;
+           |    parse_alloc(&$field, $lenVar, pstate);
+           |    if (pstate->error) return;
+           |    parse_hexBinary(&$field, pstate);
+           |    if (pstate->error) return;""".stripMargin
+      val unparseStatement =
+        s"""    unparse_${conv}_$intType$intLen($field.size, ustate);
+           |    if (ustate->error) return;
+           |    unparse_hexBinary($field, ustate);
+           |    if (ustate->error) return;""".stripMargin
+      cgState.addSimpleTypeStatements(initStatement, parseStatement, 
unparseStatement)
+
+      if (e.hasFixedValue) {
+        val variable = s"${fieldName}_fixed$i"
+        val value = e.fixedValueAsString.grouped(2).mkString(", 0x")
+        val init2 = ""
+        val parse2 =
+          s"""    uint8_t $variable[] = {0x$value};
+             |    parse_validate_fixed(memcmp($field.array, $variable, 
sizeof($variable)) == 0, "$fieldName", pstate);
+             |    if (pstate->error) return;""".stripMargin
+        val unparse2 =
+          s"""    uint8_t $variable[] = {0x$value};
+             |    unparse_validate_fixed(memcmp($field.array, $variable, 
sizeof($variable)) == 0, "$fieldName", ustate);
+             |    if (ustate->error) return;""".stripMargin
+        cgState.addSimpleTypeStatements(init2, parse2, unparse2)
+      }
+    }
+    if (arraySize > 0)
+      for (i <- 0 until arraySize)
+        addStatements(s"[$i]")
+    else

Review comment:
       Agreed, I'm adding an array of prefixed length hex binary.

##########
File path: 
daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/generators/HexBinaryCodeGenerator.scala
##########
@@ -0,0 +1,146 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.daffodil.runtime2.generators
+
+import org.apache.daffodil.dpath.NodeInfo.PrimType
+import org.apache.daffodil.grammar.primitives.HexBinaryLengthPrefixed
+import org.apache.daffodil.grammar.primitives.HexBinarySpecifiedLength
+import org.apache.daffodil.schema.annotation.props.gen.BitOrder
+import org.apache.daffodil.schema.annotation.props.gen.ByteOrder
+import org.apache.daffodil.schema.annotation.props.gen.OccursCountKind
+
+trait HexBinaryCodeGenerator extends BinaryAbstractCodeGenerator {
+
+  def hexBinaryLengthPrefixedGenerateCode(g: HexBinaryLengthPrefixed, cgState: 
CodeGeneratorState): Unit = {
+    // For the time being this is a very limited back end.
+    // So there are some restrictions to enforce.
+    val e = g.e
+    e.schemaDefinitionUnless(e.bitOrder eq BitOrder.MostSignificantBitFirst, 
"Only dfdl:bitOrder 'mostSignificantBitFirst' is supported.")
+
+    val fieldName = e.namedQName.local
+    val arraySize = if (e.occursCountKind == OccursCountKind.Fixed) 
e.maxOccurs else 0
+    val conv = if (e.prefixedLengthElementDecl.byteOrderEv.constValue == 
ByteOrder.BigEndian) "be" else "le"
+    val intType = e.prefixedLengthElementDecl.optPrimType.get match {
+      case PrimType.Byte
+           | PrimType.Short
+           | PrimType.Int
+           | PrimType.Long
+           | PrimType.Integer => "int"
+      case PrimType.UnsignedByte
+           | PrimType.UnsignedShort
+           | PrimType.UnsignedInt
+           | PrimType.UnsignedLong
+           | PrimType.NonNegativeInteger => "uint"
+      case p => e.SDE("Prefixed length PrimType %s is not supported in C code 
generator.", p.toString)
+    }
+    val intLen = 
e.prefixedLengthElementDecl.elementLengthInBitsEv.constValue.get
+
+    def addStatements(deref: String): Unit = {
+      val field = s"instance->$fieldName$deref"
+      val i = if (deref.length > 2) deref.substring(1, deref.length - 1) else 
""
+      val lenVar = s"_l_$fieldName$i"
+      val initStatement =
+        s"""    $field.array = NULL;
+           |    $field.size = 0;
+           |    $field.dynamic = true;""".stripMargin
+      val parseStatement =
+        s"""    $intType${intLen}_t $lenVar;
+           |    parse_${conv}_$intType$intLen(&$lenVar, pstate);
+           |    if (pstate->error) return;
+           |    parse_alloc(&$field, $lenVar, pstate);
+           |    if (pstate->error) return;
+           |    parse_hexBinary(&$field, pstate);
+           |    if (pstate->error) return;""".stripMargin
+      val unparseStatement =
+        s"""    unparse_${conv}_$intType$intLen($field.size, ustate);
+           |    if (ustate->error) return;
+           |    unparse_hexBinary($field, ustate);
+           |    if (ustate->error) return;""".stripMargin
+      cgState.addSimpleTypeStatements(initStatement, parseStatement, 
unparseStatement)
+
+      if (e.hasFixedValue) {
+        val variable = s"${fieldName}_fixed$i"
+        val value = e.fixedValueAsString.grouped(2).mkString(", 0x")
+        val init2 = ""
+        val parse2 =
+          s"""    uint8_t $variable[] = {0x$value};
+             |    parse_validate_fixed(memcmp($field.array, $variable, 
sizeof($variable)) == 0, "$fieldName", pstate);
+             |    if (pstate->error) return;""".stripMargin
+        val unparse2 =
+          s"""    uint8_t $variable[] = {0x$value};
+             |    unparse_validate_fixed(memcmp($field.array, $variable, 
sizeof($variable)) == 0, "$fieldName", ustate);
+             |    if (ustate->error) return;""".stripMargin
+        cgState.addSimpleTypeStatements(init2, parse2, unparse2)
+      }
+    }
+    if (arraySize > 0)
+      for (i <- 0 until arraySize)
+        addStatements(s"[$i]")
+    else
+      addStatements("")
+  }
+
+  def hexBinarySpecifiedLengthGenerateCode(g: HexBinarySpecifiedLength, 
cgState: CodeGeneratorState): Unit = {
+    // For the time being this is a very limited back end.
+    // So there are some restrictions to enforce.
+    val e = g.e
+    e.schemaDefinitionUnless(e.bitOrder eq BitOrder.MostSignificantBitFirst, 
"Only dfdl:bitOrder 'mostSignificantBitFirst' is supported.")
+    e.schemaDefinitionUnless(e.elementLengthInBitsEv.isConstant, "Runtime 
dfdl:length expressions not supported.")
+
+    val fieldName = e.namedQName.local
+    val arraySize = if (e.occursCountKind == OccursCountKind.Fixed) 
e.maxOccurs else 0

Review comment:
       I already compute arraySize redundantly in 3 places.  If I use your 
suggested check, I'll have to refactor like you also pointed below.
   
   ```shell
   interran@GH3WPL13E:~/apache/daffodil-runtime2$ rg 'val arraySize = if 
\(e.occursCountKind'
   
daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/generators/HexBinaryCodeGenerator.scala
   36:    val arraySize = if (e.occursCountKind == OccursCountKind.Fixed) 
e.maxOccurs else 0
   106:    val arraySize = if (e.occursCountKind == OccursCountKind.Fixed) 
e.maxOccurs else 0
   
   
daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/generators/BinaryAbstractCodeGenerator.scala
   39:    val arraySize = if (e.occursCountKind == OccursCountKind.Fixed) 
e.maxOccurs else 0
   interran@GH3WPL13E:~/apache/daffodil-runtime2$
   ```
   

##########
File path: 
daffodil-runtime2/src/main/resources/org/apache/daffodil/runtime2/c/libcli/xml_reader.c
##########
@@ -217,6 +217,81 @@ strtounum(const char *numptr, uintmax_t maxval, const 
Error **errorptr)
     return value;
 }
 
+// Store an XML element's string of hexadecimal characters (two
+// nibbles per byte) into a byte array.  Allocate memory for byte
+// array if needed.  Return error if string does not fit into byte
+// array or does not contain valid hexadecimal characters.
+
+static const Error *
+hexToBinary(const char *hexstring, HexBinary *hexBinary)

Review comment:
       I checked and the Mini-XML library doesn't have any concept of schema 
types, so it doesn't have any hexBinary code.  GNOME's libxml2 library does 
have occurrences of hexBinary in its code 
(https://github.com/GNOME/libxml2/search?q=hexBinary), but it's not worth 
switching XML libraries for this small function.

##########
File path: 
daffodil-runtime2/src/test/resources/org/apache/daffodil/runtime2/ex_nums_unparse_runtime2.xml
##########
@@ -62,11 +67,14 @@
     <le_uint64>64</le_uint64>
     <le_uint8>8</le_uint8>
     <le_nonNegativeInteger8>8</le_nonNegativeInteger8>
+    <hexBinary7>1A2B3C4D5E6F70</hexBinary7>
+    <hexBinaryPrefixed>DEADBEEF</hexBinaryPrefixed>

Review comment:
       There are no tests with larger hex binary sizes, but the only code using 
256 as a size is in xml_writer.c's binaryToHex function and looks like this:
   
   ```c
       static char * text = NULL;
       static size_t capacity = 256;
       size_t numNibbles = hexBinary.size * 2;
       if (text == NULL || capacity < numNibbles)
       {
           free(text);
           capacity = numNibbles > capacity ? numNibbles : capacity;
           text = malloc(capacity + 1);
       }
   ```
   
   That code is already tested except for the `capacity < numNibbles` condition 
in the if statement and the `numNibbles` expression of the ternary operator `? 
:`.

##########
File path: 
daffodil-runtime2/src/main/resources/org/apache/daffodil/runtime2/c/libruntime/parsers.c
##########
@@ -149,10 +150,47 @@ parse_fill_bytes(size_t end_position, PState *pstate)
 
     while (pstate->position < end_position)
     {
-        read_stream_update_position;
+        read_stream_update_position(&buffer.c_val, 1);
     }
 }
 
+// Allocate or reallocate memory for hexBinary field
+
+void
+parse_alloc(HexBinary *hexBinary, size_t num_bytes, PState *pstate)
+{
+    // Reuse old array if possible
+    if (hexBinary->size >= num_bytes)
+    {
+        hexBinary->size = num_bytes;
+        return;
+    }

Review comment:
       Agreed, but rather than adding a `capacity` field, I'm changing the code 
to always call `free` and `malloc` together.  That code will be simpler and if 
we get into a situation where we need to reuse previously allocated arrays, we 
can take measurements, modify this code, and take measurements again.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

[GitHub] [daffodil] tuxji commented on a change in pull request #650: Extend C code generator to support hexBinary elements

Reply via email to