[GitHub] [daffodil] stevedlawrence commented on a diff in pull request #988: Update parse error message

via GitHub Fri, 17 Mar 2023 08:09:08 -0700


stevedlawrence commented on code in PR #988:
URL: https://github.com/apache/daffodil/pull/988#discussion_r1140359093



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/BlobLengthParsers.scala:
##########
@@ -54,14 +53,7 @@ trait ByteChunkWriter { self: PrimParser =>
         writeChunk(array, bytesToPut.toInt)
         remainingBitsToGet -= bitsToGet
       } else {
-        val remainingBits =
-          if (dis.remainingBits.isDefined) {
-            val totalBitsRead = nBits - remainingBitsToGet
-            MaybeULong(dis.remainingBits.get + totalBitsRead)
-          } else {
-            MaybeULong.Nope
-          }
-        self.PENotEnoughBits(start, nBits, remainingBits)
+        self.PENotEnoughBits(start, nBits, dis)

Review Comment:
   We can do capture the bitPosition0b (and in some cases we do), but the PE 
that gets created still uses the state a the time of creation. So we would need 
to reset the bitPosition back to the beginning of the blob prior to calling 
PENotEnoughBits. But, that might cause additional problems. For example, the 
BucketingInputStream has a maximum size it will store, after which it starts 
throwing away older bytes--if we reset back into those discarded bytes we might 
get other errors. And with a big blob that doesn't seem impossible.
   
   Maybe alternatively we just save the DataLocation instead of the bitPosition 
from the beginning of the blob and manually create a PE with that DataLocation?
   
   There is an `PENotEnoughBits` function that does accept a DataLocation just 
below the one you modified:
   ```
     def PENotEnoughBits(
       pstate: PState,
       sfl: SchemaFileLocation,
       dataLoc: DataLocation,
       neededBits: Long,
       remainingBits: MaybeULong,
     ) = {
   ```
   That gets a correct starting position. And we would need to adjust your 
`remainingBits` logic to use the `bitPosition` from the `DataLocation` instead 
of the `DataInputstream` and account for differences between the 
DataInputStream and DataLocation bitPositions, but I *think* that should be 
doable? Might also make sense to have the PENotEnoughBits function you modified 
to call this one,  and move your logic into that so they use the same logic.
   
   All doable, but could get messy.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

[GitHub] [daffodil] stevedlawrence commented on a diff in pull request #988: Update parse error message

Reply via email to