Re: [PR] Fix Bug with valueLength being overwritten after Trim [daffodil]

via GitHub Mon, 04 Nov 2024 08:57:27 -0800


stevedlawrence commented on code in PR #1338:
URL: https://github.com/apache/daffodil/pull/1338#discussion_r1827928917



##########
daffodil-core/src/main/scala/org/apache/daffodil/core/grammar/ElementBaseGrammarMixin.scala:
##########
@@ -1341,7 +1356,13 @@ trait ElementBaseGrammarMixin
       case LengthKind.Delimited => body
       case LengthKind.Pattern => new SpecifiedLengthPattern(this, body)
       case LengthKind.Explicit if bitsMultiplier != 0 =>
-        new SpecifiedLengthExplicit(this, body, bitsMultiplier)
+        if (isSimpleType && primType == PrimType.HexBinary) {
+          // hexBinary has some checks that need to be done that 
SpecifiedLengthExplicit
+          // gets in the way of

Review Comment:
   I think instead of this comment, we can say HexBinary has it's own 
HexBinarySpecifiedLength parser that handles calculating the length, so we do 
not need the SpecifiedLengthExplicit parser?
   
   In fact, do we need to exclude a number of other primitive types that do 
their own explicit length handling? Looking at the current code base, I think 
maybe only simple types that are strings and complex types use the 
SpecifiedLengthExplicit parser? I think all other primitives implement their 
own specified length handling?
   
   So maybe this wants to be
   
   ```scala
   if (isComplexType || primType == PrimType.String) {
     SpecifiedLengthExplicit(...)
   } else {
     // non-string simple types have their own custom parsers/unparsers for 
handling explicit lengths
     body
   }
   ```
   
   In fact, I wonder if we eventually want to refactor all of this to 
completely get rid of all the custom explicit/implicit length parsers? We just 
have various SpecifiedLength parser that sets a bit limit (based on a pattern, 
a prefix length, evaluaating a length expression etc) and then we just have a 
single parser that just reads all bit up until that current bit limit. 
Separation of concerns kind of thing. It would get rid of this condiation and 
all these BinaryIntegerKnownLength/RuntimeLength/PrefixLength/etc parsers. 
There's just a single BinaryNumberParser, and it just gets the length from the 
bitLimit.
   
    Maybe that generality would take performance hit? I'm also not exactly sure 
how that would work with unparsing--the SpecifiedLengthUnparser would need to 
somehow pass the calculated length to the child unparser, I guess it could 
still use bitLimit since that is a thing in UState?



##########
daffodil-core/src/main/scala/org/apache/daffodil/core/grammar/primitives/PrimitivesLengthKind.scala:
##########
@@ -186,21 +185,11 @@ case class HexBinaryEndOfBitLimit(e: ElementBase) extends 
Terminal(e, true) {
 case class HexBinaryLengthPrefixed(e: ElementBase) extends Terminal(e, true) {

Review Comment:
   This class is now exactly the same as HexBinaryEndOfBitLimit, suggest we 
just delete it an use that for prefixed hex binary.



##########
daffodil-core/src/main/scala/org/apache/daffodil/core/grammar/primitives/PrimitivesBCD.scala:
##########
@@ -52,20 +52,10 @@ class BCDIntegerKnownLength(val e: ElementBase, 
lengthInBits: Long) extends Term
 
 class BCDIntegerPrefixedLength(val e: ElementBase) extends Terminal(e, true) {
 
-  override lazy val parser = new BCDIntegerPrefixedLengthParser(
-    e.elementRuntimeData,
-    e.prefixedLengthBody.parser,
-    e.prefixedLengthElementDecl.elementRuntimeData,
-    e.lengthUnits,
-    e.prefixedLengthAdjustmentInUnits
-  )
+  override lazy val parser = new 
BCDIntegerPrefixedLengthParser(e.elementRuntimeData)

Review Comment:
   I wonder if we want to rename these something like 
`BCDIntegerBitLimitLengthParser` and `BCDIntegerMinimumLengthUnparser`, making 
it clear that they aren't really doing anything specifically with prefixed 
length, and better describes the behavior of how they actuall parse/unparse 
things?
   
   And when we implement things like lengthKind endOfParent, we could probably 
just use the same BitLimitParser, for example.



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/BinaryNumberTraits.scala:
##########
@@ -165,3 +165,21 @@ trait PrefixedLengthParserMixin {
     }
   }
 }
+
+/**
+ * This mixin doesn't require parsing the prefix length element and just uses
+ * the state's bitLimit and position to get the bitLength instead
+ */
+trait PrefixedLengthParserMixin2 {

Review Comment:
   Need a different name for this. Numbers are not descriptive enough. Maybe we 
renames these something like `PrefixLengthFromParserMixin` and 
`PrefixLengthFromBitLimitMixin`? Something to make it more clear how they 
differ without having to read the code.



##########
daffodil-core/src/main/scala/org/apache/daffodil/core/grammar/ElementBaseGrammarMixin.scala:
##########
@@ -1341,7 +1356,13 @@ trait ElementBaseGrammarMixin
       case LengthKind.Delimited => body
       case LengthKind.Pattern => new SpecifiedLengthPattern(this, body)
       case LengthKind.Explicit if bitsMultiplier != 0 =>
-        new SpecifiedLengthExplicit(this, body, bitsMultiplier)
+        if (isSimpleType && primType == PrimType.HexBinary) {
+          // hexBinary has some checks that need to be done that 
SpecifiedLengthExplicit
+          // gets in the way of
+          body
+        } else {
+          new SpecifiedLengthExplicit(this, body, bitsMultiplier)
+        }
       case LengthKind.Explicit => {
         Assert.invariant(!knownEncodingIsFixedWidth)
         Assert.invariant(lengthUnits eq LengthUnits.Characters)

Review Comment:
   Below this we have cases for implicit lengths. Do we need to do anything 
special for non-string simple types? I think those primitives have custom 
parsers that handle the implict length logic and don't need a 
SpecifiedLengthImplicit gramar? My concern is we could be adding that grammar 
and it would do something like set a bit limit, but the child paser that 
actually parsrers a the thing would just use it's own calculate and wouldn't 
need the bit limit, so we are just wasting effort.



##########
daffodil-core/src/main/scala/org/apache/daffodil/core/grammar/ElementBaseGrammarMixin.scala:
##########
@@ -72,13 +72,21 @@ trait ElementBaseGrammarMixin
     }
   }
 
-  protected lazy val isDelimitedPrefixedPattern = {
+  protected lazy val isPrefixed: Boolean = {
+    import LengthKind._
+    lengthKind match {
+      case Prefixed => true
+      case _ => false
+    }
+  }

Review Comment:
   This can just be `lengthKind eq LengthKind.Prefixed`. Don't really need a 
match/case if we are just going to return true/false. 



##########
daffodil-core/src/main/scala/org/apache/daffodil/core/grammar/ElementBaseGrammarMixin.scala:
##########
@@ -72,13 +72,21 @@ trait ElementBaseGrammarMixin
     }
   }
 
-  protected lazy val isDelimitedPrefixedPattern = {
+  protected lazy val isPrefixed: Boolean = {
+    import LengthKind._
+    lengthKind match {
+      case Prefixed => true
+      case _ => false
+    }
+  }
+
+  protected lazy val isDelimitedPrefixedPattern: Boolean = {
     import LengthKind._
     lengthKind match {
       case Delimited =>
         true // don't test for hasDelimiters because it might not be our 
delimiter, but a surrounding group's separator, or it's terminator, etc.
       case Pattern => true
-      case Prefixed => true
+      case Prefixed => isPrefixed

Review Comment:
   I would suggest just returning true here, isPrefixed doesn't do anything 
except check lengthKind == Prefixed, which is exactly what this match case does.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] Fix Bug with valueLength being overwritten after Trim [daffodil]

Reply via email to