[GitHub] [daffodil] stevedlawrence commented on a diff in pull request #988: Update parse error message

via GitHub Fri, 17 Mar 2023 06:22:35 -0700


stevedlawrence commented on code in PR #988:
URL: https://github.com/apache/daffodil/pull/988#discussion_r1140230245



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/BlobLengthParsers.scala:
##########
@@ -54,14 +53,7 @@ trait ByteChunkWriter { self: PrimParser =>
         writeChunk(array, bytesToPut.toInt)
         remainingBitsToGet -= bitsToGet
       } else {
-        val remainingBits =
-          if (dis.remainingBits.isDefined) {
-            val totalBitsRead = nBits - remainingBitsToGet
-            MaybeULong(dis.remainingBits.get + totalBitsRead)
-          } else {
-            MaybeULong.Nope
-          }
-        self.PENotEnoughBits(start, nBits, remainingBits)
+        self.PENotEnoughBits(start, nBits, dis)

Review Comment:
   This change could potentially lead to confusion because blobs are read in 
chunks?
   
   As an example, let's say we have 3 bytes in the data, our blob size if 4 
bytes, and the chunk size is 1 byte.
   
   So `nBits` will be 4 * 8 = 32 bits to read, with only 24 bits available. 
Because we chunk the reads, we will read 8 bits at a time and successfully read 
all 24 bits. Then we will go to read the last byte since we need 32 total. That 
doesn't exist so we call `PENotENoughBits` passing in 32 as the bits we needed. 
The remaining bits calculation will say there are zero bits available, so the 
error would say something like
   
   > bit position 24, 32 bits needed, 0 available
   
   Which is may be confusing since we needed 32 bits starting at bitPos = 0, 
not 24.
   
   Really, we probably want to say:
   
   > bit position 0, 32 bits needed, 24 available
   
   But that's hard because by the time we get to this point the bit position 
has changed because of the chunk reads, and there aren't actually 24 bits 
available at this new position, so PENotENoughBits can't easily report that.
   
   Maybe instead we change this line to this:
   
   ```scala
   self.PENotEnoughBits(start, bitsToGet, dis)
   ```
   
   Which would report something like this:
   
   > bit position 24, 8 bits needed, 0 available
   
   That's maybe a bit confusing if users aren't aware chunking is going on. 
They might say "my blob length is 32, why does it say only 8 are needed". But 
maybe that's fine? Maybe we just teach users that blobs are chunked and if 
there isn't enough data the error will be about the last chunk read that failed 
and at least these numbers are accurate?
   



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/Parser.scala:
##########
@@ -65,7 +66,37 @@ sealed trait Parser extends Processor {
     pstate.setFailed(new ParseError(One(sfl), One(dataLoc), s, args: _*))
   }
 
-  def PENotEnoughBits(pstate: PState, neededBits: Long, remainingBits: 
MaybeULong) = {
+  def PENotEnoughBits(
+    pstate: PState,
+    neededBits: Long,
+    source: InputSourceDataInputStream,
+  ): Unit = {
+    val remainingLimitedBits = {
+      if (source.bitLimit0b.isEmpty) MaybeULong.Nope
+      else {
+        val lim = source.bitLimit0b.get
+        Assert.invariant(lim >= 0)
+        val nBits = lim - source.bitPos0b
+        MaybeULong(nBits)
+      }
+    }
+
+    val remainingBits = {
+      if (source.hasReachedEndOfData) {
+        val bitsAvailable = {
+          if (source.bitPos0b % 8 == 0)
+            source.knownBytesAvailable * 8
+          else {
+            source.knownBytesAvailable * 8 - (source.bitPos0b % 8)

Review Comment:
   Technically we don't need this if-statement. If `sourceBitPos0b % 8` is 
zero, then it will just subtract zero. Maybe if we want to make it more clear 
why this % 8 thing is involved, we could do something like
   
   ```scala
   val fragmentBitsConsumed = source.bitPos0b % 8
   val bitsAvailable = (source.knownBytesAvailable * 8) - fragmentBitsConsumed
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

[GitHub] [daffodil] stevedlawrence commented on a diff in pull request #988: Update parse error message

Reply via email to