stevedlawrence commented on code in PR #988:
URL: https://github.com/apache/daffodil/pull/988#discussion_r1140359093
##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/BlobLengthParsers.scala:
##########
@@ -54,14 +53,7 @@ trait ByteChunkWriter { self: PrimParser =>
writeChunk(array, bytesToPut.toInt)
remainingBitsToGet -= bitsToGet
} else {
- val remainingBits =
- if (dis.remainingBits.isDefined) {
- val totalBitsRead = nBits - remainingBitsToGet
- MaybeULong(dis.remainingBits.get + totalBitsRead)
- } else {
- MaybeULong.Nope
- }
- self.PENotEnoughBits(start, nBits, remainingBits)
+ self.PENotEnoughBits(start, nBits, dis)
Review Comment:
We can do capture the bitPosition0b (and in some cases we do), but the PE
that gets created still uses the state a the time of creation. So we would need
to reset the bitPosition back to the beginning of the blob prior to calling
PENotEnoughBits. But, that might cause additional problems. For example, the
BucketingInputStream has a maximum size it will store, after which it starts
throwing away older bytes--if we reset back into those discarded bytes we might
get other errors. And with a big blob that doesn't seem impossible.
Maybe alternatively we just save the DataLocation instead of the bitPosition
from the beginning of the blob and manually create a PE with that DataLocation?
There is an `PENotEnoughBits` function that does accept a DataLocation just
below the one you modified:
```
def PENotEnoughBits(
pstate: PState,
sfl: SchemaFileLocation,
dataLoc: DataLocation,
neededBits: Long,
remainingBits: MaybeULong,
) = {
```
That gets a correct starting position. And we would need to adjust your
`remainingBits` logic to use the `bitPosition` from the `DataLocation` instead
of the `DataInputstream` and account for differences between the
DataInputStream and DataLocation bitPositions, but I *think* that should be
doable? Might also make sense to have the PENotEnoughBits function you modified
to call this one, and move your logic into that so they use the same logic.
All doable, but could get messy.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]