[GitHub] [daffodil] tuxji commented on a diff in pull request #800: Choose TDML implementation to run TDML tests

GitBox Mon, 06 Jun 2022 16:02:52 -0700


tuxji commented on code in PR #800:
URL: https://github.com/apache/daffodil/pull/800#discussion_r890631556



##########
daffodil-tdml-lib/src/main/scala/org/apache/daffodil/tdml/RunnerFactory.scala:
##########
@@ -37,28 +39,48 @@ import org.apache.daffodil.util.Misc
  */
 object Runner {
   def apply(dir: String, file: String,
+    useTDMLImplementation: TDMLImplementation = TDMLImplementation.Daffodil,

Review Comment:
   I made the change you suggested (TDMLImplementation.Daffodil), and I think 
it's nice to keep the type as `TDMLImplementation` instead of 
`Option[TDMLImplementation]`.  I think we need fewer and more concise lines of 
code that way, and we still can type something like 
`TDML_IMPLEMENTATION="daffodilC" sbt test....` which works as you would expect.
   
   



##########
daffodil-propgen/src/main/resources/org/apache/daffodil/xsd/dafext.xsd:
##########
@@ -502,11 +502,11 @@
             </xs:documentation>
           </xs:annotation>
         </xs:element>
-        <xs:element name="tdmlImplementation" type="xs:string" 
default="daffodil" minOccurs="0">
+        <xs:element name="tdmlImplementation" 
type="daf:TunableTDMLImplementation" default="daffodil" minOccurs="0">

Review Comment:
   I considered removing the tunable, but having the tunable turns out to be 
useful after all.  There's still a good use case for a TDML test like 
`ex_nums.tdml` to contain both runtime1-specific and runtime2-specific test 
cases side by side so you can compare how runtime1 and runtime2 differ in their 
error behavior.  Once you have runtime1 and runtime2 test cases side by side 
for comparison, it makes sense to configure each test case with the tunable so 
you can run the appropriate implementation on each test case with one single 
"daffodil test -i ex_nums.tdml" command.  
   
   You may wonder how "daffodil test -I daffodilC" affects a test case 
configured with "config-runtime1" and vice versa.  What happens is that the `-I 
implementation` option takes precedence over the tunable **IF** the option is 
anything but "-I daffodil".  You still can override any test's tunable with an 
explicit "-I daffodilC" or "-I ibm" option or a TDML_IMPLEMENTATION="daffodilC" 
variable, although an implementations attribute will still skip the test for a 
non-compatible implementation.



##########
daffodil-cli/src/main/scala/org/apache/daffodil/Main.scala:
##########
@@ -397,6 +407,10 @@ class CLIConf(arguments: Array[String]) extends 
scallop.ScallopConf(arguments) {
     descr("List or execute TDML tests")
     helpWidth(width)
 
+    val implementation = opt[TDMLImplementation](short = 'I', argName = 
"implementation",
+      descr = "Implementation to run TDML tests. Choose one of %s. Defaults to 
%s."
+        .format(TDMLImplementation.allValues.mkString(", "), 
TDMLImplementation.allValues.head.toString),
+      default = TDMLImplementation.optionStringToEnum("I", 
sys.env.getOrElse("TDML_IMPLEMENTATION", "")))

Review Comment:
   The current `default` at line 413 is very concise and has one shortcoming 
although I like its behavior otherwise.  I rely on the fact that 
TDMLImplementation.optionStringToEnum ignores the empty string "" and invalid 
strings coming from TDML_IMPLEMENTATION by returning `None` instead of a 
TDMLImplementation.  I don't like that you see no error message if you make a 
mistake defining TDML_IMPLEMENTATION, but I like that TDML_IMPLEMENTATION will 
only provide a default, not override an explicitly passed `-I implementation` 
option.  Further down, line 1274 needs only one concise `getOrElse` to figure 
out the TDML implementation (all I had to write was two lines of code :),
   
   `val tdmlImplementation = 
testOpts.implementation.toOption.getOrElse(TDMLImplementation.Daffodil)` 
   
   I can't find the other comments you wrote in Main.scala, but I've changed 
all `TDMLImplementations.allValues.head` to `TDMLImplementation.Daffodil` as 
you suggested since I agree the default implementation shouldn't depend on the 
order of the implementations.
   
   I've actually tested running tests with "test -I ibm" and what happens is 
that all tests will run with the daffodil implementation unless you set up your 
classpath to load the ibmCrossTester classes before the regular daffodil 
classes.  I think it's nice to allow `-I ibm` in case someone figures out how 
to load both the daffodil and ibm implementations side by side in the future.



##########
daffodil-runtime2/src/test/resources/org/apache/daffodil/runtime2/ex_nums.tdml:
##########
@@ -38,21 +38,25 @@
 
   <tdml:defineConfig name="config-runtime2">
     <daf:tunables>
-      <daf:tdmlImplementation>daffodil-runtime2</daf:tdmlImplementation>
+      <daf:tdmlImplementation>daffodilC</daf:tdmlImplementation>
     </daf:tunables>
   </tdml:defineConfig>

Review Comment:
   I've removed the tunable from most runtime2 TDML files already.  This TDML 
file is different because it has error test cases that compare how runtime1 and 
runtime2 handle non-fixed values in elements with fixed attributes slightly 
differently (runtime1 requires validation="on" on parse, and produces no error 
messages on unparse).  I think it's convenient when changing this TDML file to 
be able to run this TDML file after each change with "daffodil test" only once 
in order to test both runtime1 and runtime2 implementations (no need to run 
"daffodil test" twice, the second time with "-I daffodilC").



##########
daffodil-runtime2/src/test/resources/org/apache/daffodil/runtime2/ex_nums.tdml:
##########
@@ -158,4 +187,78 @@
     </tdml:validationErrors>
   </tdml:parserTestCase>
 
+  <tdml:parserTestCase
+    config="config-runtime2"
+    model="ex_nums.dfdl.xsd"
+    name="runtime2_error_on"
+    validation="on">
+    <tdml:document>
+      <tdml:documentPart type="file">ex_nums.error.dat</tdml:documentPart>
+    </tdml:document>
+    <tdml:infoset>
+      <tdml:dfdlInfoset type="file">ex_nums.error.dat.xml</tdml:dfdlInfoset>
+    </tdml:infoset>
+    <tdml:validationErrors>
+      <tdml:error>value</tdml:error>
+      <tdml:error>boolean_false</tdml:error>
+      <tdml:error>does not match</tdml:error>
+    </tdml:validationErrors>
+  </tdml:parserTestCase>
+
+  <!--
+      In runtime2, unparse always writes non-fixed values with validation
+      diagnostics (doesn't even have a validation option)
+
+      c/daffodil unparse -o c/ex_nums.error.dat ex_nums.error.dat.xml
+  -->
+
+  <tdml:unparserTestCase
+    config="config-runtime2"
+    model="ex_nums.dfdl.xsd"
+    name="runtime2_error_unparse">
+    <tdml:infoset>

Review Comment:
   Yes, it's worth adding `implementations="daffodilC"` here and 
`implementations="daffodil"` above so that running with "daffodil test -I 
daffodilC" or "daffodil test -I ibm" only skips some tests instead of saying 
some tests failed.



##########
daffodil-runtime2/src/main/resources/org/apache/daffodil/runtime2/c/libruntime/parsers.c:
##########
@@ -507,26 +507,25 @@ parse_le_uint8(uint8_t *number, size_t num_bits, PState 
*pstate)
     *number = (uint8_t)integer;
 }
 
-// Skip fill bytes until end bitPos0b is reached
+// Parse fill bits until end bitPos0b is reached
 
 void
-parse_fill_bytes(size_t end_bitPos0b, PState *pstate)
+parse_fill_bits(size_t end_bitPos0b, PState *pstate)
 {

Review Comment:
   Thanks for reviewing anyway this time.



##########
daffodil-propgen/src/main/resources/org/apache/daffodil/xsd/dafext.xsd:
##########
@@ -587,15 +587,23 @@
         <xs:restriction base="dfdlx:ParseUnparsePolicyEnum" />
       </xs:simpleType>
       <xs:simpleType>
-        <xs:restriction base="xs:string">
+        <xs:restriction base="xs:token">
           <xs:enumeration value="fromRoot" />
         </xs:restriction>
       </xs:simpleType>
     </xs:union>
   </xs:simpleType>
 
+  <xs:simpleType name="TunableTDMLImplementation">

Review Comment:
   If we keep the `tdmlImplementation` tunable, do you still want to call 
`TunableTDMLImplementation` something else?  It's a very localized change so I 
don't mind renaming it.



##########
daffodil-runtime2/src/main/resources/org/apache/daffodil/runtime2/c/libruntime/unparsers.c:
##########
@@ -401,27 +401,24 @@ unparse_le_uint8(uint8_t number, size_t num_bits, UState 
*ustate)
     unparse_endian_uint64(LITTLE_ENDIAN_DATA, number, num_bits, ustate);
 }
 
-// Add fill bytes until end bitPos0b is reached
+// Unparse fill bits until end bitPos0b is reached
 
 void
-unparse_fill_bytes(size_t end_bitPos0b, const uint8_t fill_byte, UState 
*ustate)
+unparse_fill_bits(size_t end_bitPos0b, const uint8_t fill_byte, UState *ustate)
 {
-    union
-    {
-        uint8_t bytes[1];
-    } buffer;
-    buffer.bytes[0] = fill_byte;
+    assert(ustate->bitPos0b <= end_bitPos0b);
 
-    // Update our last successful write position only if this loop
-    // finishes without any errors
-    size_t current_bitPos0b = ustate->bitPos0b;
-    while (current_bitPos0b < end_bitPos0b)
+    size_t fill_bits = end_bitPos0b - ustate->bitPos0b;
+    while (fill_bits)
     {
-        write_bits(buffer.bytes, BYTE_WIDTH, ustate);
+        size_t num_bits = (fill_bits >= BYTE_WIDTH) ? BYTE_WIDTH : fill_bits;
+        write_bits(&fill_byte, num_bits, ustate);

Review Comment:
   I think you may be right that we need to implement different behavior in 
both runtime1 and runtime2.  When I have time, I will look at your schema, 
write a test for it, investigate what "really happens" versus what "should 
happen", and create a new JIRA issue for it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

[GitHub] [daffodil] tuxji commented on a diff in pull request #800: Choose TDML implementation to run TDML tests

Reply via email to