[GitHub] [daffodil] stevedlawrence commented on a diff in pull request #988: Update parse error message

via GitHub Thu, 23 Mar 2023 07:52:51 -0700


stevedlawrence commented on code in PR #988:
URL: https://github.com/apache/daffodil/pull/988#discussion_r1146244774



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/HexBinaryLengthParsers.scala:
##########
@@ -64,9 +64,12 @@ sealed abstract class HexBinaryLengthParser(override val 
context: ElementRuntime
       // limiting the size of hex binarie data.
       val array = new Array[Byte](nBytes.toInt)
       val buffer = ByteBuffer.wrap(array)
-      writeBitsInChunks(start, nBits) { case (bytes, nBytes) =>
+      val written = writeBitsInChunks(start, nBits) { case (bytes, nBytes) =>
         buffer.put(bytes, 0, nBytes)
       }
+      if (written != nBits) {
+        PENotEnoughBits(start, nBits, start.dataInputStream)
+      }

Review Comment:
   Does this also need the same logic to capture `DataLocation` and 
`SchemaFileLocation` since it's calling `writeBitsInChunks`?
   
   If so, does it make sense for `writeBitsInChunks` to capture `DataLocation` 
and `SchemaFileLocation` and call `PENotEnoughBits` itself instead of requiring 
that callers do it, and maybe use the wrong PENotEnoughBits if they don't do 
the capture/



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/Parser.scala:
##########
@@ -90,6 +85,53 @@ sealed trait Parser extends Processor {
     )
   }
 
+  def PENotEnoughBits(
+    pstate: PState,
+    sfl: SchemaFileLocation,
+    dataLoc: DataLocation,
+    neededBits: Long,
+    source: InputSourceDataInputStream,
+  ): Unit = {
+    val startPos = dataLoc.bitPos1b - 1
+    val remainingLimitedBits = {
+      if (source.bitLimit0b.isEmpty) MaybeULong.Nope
+      else {
+        val lim = source.bitLimit0b.get
+        Assert.invariant(lim >= 0)
+        val nBits = lim - startPos
+        MaybeULong(nBits)
+      }
+    }
+
+    val remainingBits = {
+      if (source.hasReachedEndOfData) {
+        val bitsAvailable = {
+          val fragmentBitsRead = startPos % 8
+          source.knownBytesAvailable * 8 - fragmentBitsRead
+        }
+        if (remainingLimitedBits.isEmpty || bitsAvailable < 
remainingLimitedBits.get)
+          MaybeULong(bitsAvailable)
+        else
+          remainingLimitedBits
+      } else remainingLimitedBits
+    }

Review Comment:
   I wonder if `remainingBits` should be a `ULong` instead of a `MaybeULong` 
now? If we ever call `PENotEnoughBits`, then must have hit the bit limit and 
`remainingLimitedBits` must be defined, or we hit the end of data, and 
bitsAvailable must be defined. Either way, we always have some number that 
represents the number of available bits. So maybe this last part just wants to 
be:
   
   
   ```scala
        {
           ...
           if (remainingLimitedBits.isEmpty || bitsAvailable < 
remainingLimitedBits.get)
             bitsAvailable
           else
             remainingLimitedBits.getULong
         } else remainingLimitedBits.getULong
   ```
   
   And if you do the startPos adjustment at the end, then that becomes a bit 
cleaner since remainingBits is just a `ULong` instead of a `MaybeULong` and you 
don't have to check if it's defined.



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/Parser.scala:
##########
@@ -90,6 +85,53 @@ sealed trait Parser extends Processor {
     )
   }
 
+  def PENotEnoughBits(

Review Comment:
   I wonder if we should remove the above `PENotEnoughBits` that accepts a 
`remainingBits` parameer? Nothing should every calculate their own remaining 
bits and call that since they'll likely do it wrong and not use information 
from DataInputStream, DataLocation, etc.



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/Parser.scala:
##########
@@ -90,6 +85,53 @@ sealed trait Parser extends Processor {
     )
   }
 
+  def PENotEnoughBits(
+    pstate: PState,
+    sfl: SchemaFileLocation,
+    dataLoc: DataLocation,
+    neededBits: Long,
+    source: InputSourceDataInputStream,
+  ): Unit = {
+    val startPos = dataLoc.bitPos1b - 1
+    val remainingLimitedBits = {
+      if (source.bitLimit0b.isEmpty) MaybeULong.Nope
+      else {
+        val lim = source.bitLimit0b.get
+        Assert.invariant(lim >= 0)
+        val nBits = lim - startPos
+        MaybeULong(nBits)
+      }
+    }
+
+    val remainingBits = {
+      if (source.hasReachedEndOfData) {
+        val bitsAvailable = {
+          val fragmentBitsRead = startPos % 8
+          source.knownBytesAvailable * 8 - fragmentBitsRead

Review Comment:
   I think this is incorrect? It only uses `startPos` to know how many fragment 
bits to remove, but doesn't adjust the total known bits available based on 
`startPos`, since `knownBytesAvailable` is based on `dis.bitPos0b`.
   
   I think we could do something like this:
   ```scala
   val bitsAvailable = {
     val fragmentBitsReadFromSourcePos = source.bitPos0b % 8
     val bitsAvailableFromSourcePos = source.knownBytesAvailable * 8 - 
fragmentBitsReadFromSourcePos
     val bitsBetweenStartPosAndSourcePos = source.bitPos0b - startPos
     val bitsAvailableFromStartPos = bitsAvailableFromSourcePos + 
bitsBetweenStartPosandSource
     bitsAvailableFromStartPos
   }
   ```
   
   There might be better variable names, but the long ones do make it clear 
what's being calculated and why.
   
   So this uses the previous logic to figure out how many bits are available 
starting from the dis.bitPos0b, then just adds the number of bits between the 
`DataLocation` and the `pstate currentLocation`, since the two might not be the 
same. 
   
   In fact, and it might be cleaner, you could maybe have all your old logic to 
calculate remainingBits (which is all dis.bitPos0b based), and then at the very 
end do something like this to adjust the remaining bits based on the 
DataLocation, e.g.
   
   ```scala
   val remainingBits = ... // all old logic
   val bitsBetweenStartPosAndCurPos = (dis.bitPos0b - startPos)
   val remainingBitsFromStartPos = remainingBits + bitsBetweenStartPosAndCurPos 
   PENotEnoughBits(..., neededBits, remainingBitsFromStartPos)
   ```
   If `dis.bitPos0b` and `sourcePos` are the same (i.e. most cases except for 
things that use chunkWithBits), then `bitsBetweenStartPosAndCurPos` is just 0 
and doesn't have an affect. But in the case of chunkWithBits where they aren't 
the same, we just add the difference between where chunksWithBits started 
(startPos) and where it we start calculating remainingBits from (dis.bitPos0b).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

[GitHub] [daffodil] stevedlawrence commented on a diff in pull request #988: Update parse error message

Reply via email to