stevedlawrence commented on code in PR #988:
URL: https://github.com/apache/daffodil/pull/988#discussion_r1140230245
##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/BlobLengthParsers.scala:
##########
@@ -54,14 +53,7 @@ trait ByteChunkWriter { self: PrimParser =>
writeChunk(array, bytesToPut.toInt)
remainingBitsToGet -= bitsToGet
} else {
- val remainingBits =
- if (dis.remainingBits.isDefined) {
- val totalBitsRead = nBits - remainingBitsToGet
- MaybeULong(dis.remainingBits.get + totalBitsRead)
- } else {
- MaybeULong.Nope
- }
- self.PENotEnoughBits(start, nBits, remainingBits)
+ self.PENotEnoughBits(start, nBits, dis)
Review Comment:
This change could potentially lead to confusion because blobs are read in
chunks?
As an example, let's say we have 3 bytes in the data, our blob size if 4
bytes, and the chunk size is 1 byte.
So `nBits` will be 4 * 8 = 32 bits to read, with only 24 bits available.
Because we chunk the reads, we will read 8 bits at a time and successfully read
all 24 bits. Then we will go to read the last byte since we need 32 total. That
doesn't exist so we call `PENotENoughBits` passing in 32 as the bits we needed.
The remaining bits calculation will say there are zero bits available, so the
error would say something like
> bit position 24, 32 bits needed, 0 available
Which is may be confusing since we needed 32 bits starting at bitPos = 0,
not 24.
Really, we probably want to say:
> bit position 0, 32 bits needed, 24 available
But that's hard because by the time we get to this point the bit position
has changed because of the chunk reads, and there aren't actually 24 bits
available at this new position, so PENotENoughBits can't easily report that.
Maybe instead we change this line to this:
```scala
self.PENotEnoughBits(start, bitsToGet, dis)
```
Which would report something like this:
> bit position 24, 8 bits needed, 0 available
That's maybe a bit confusing if users aren't aware chunking is going on.
They might say "my blob length is 32, why does it say only 8 are needed". But
maybe that's fine? Maybe we just teach users that blobs are chunked and if
there isn't enough data the error will be about the last chunk read that failed
and at least these numbers are accurate?
##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/Parser.scala:
##########
@@ -65,7 +66,37 @@ sealed trait Parser extends Processor {
pstate.setFailed(new ParseError(One(sfl), One(dataLoc), s, args: _*))
}
- def PENotEnoughBits(pstate: PState, neededBits: Long, remainingBits:
MaybeULong) = {
+ def PENotEnoughBits(
+ pstate: PState,
+ neededBits: Long,
+ source: InputSourceDataInputStream,
+ ): Unit = {
+ val remainingLimitedBits = {
+ if (source.bitLimit0b.isEmpty) MaybeULong.Nope
+ else {
+ val lim = source.bitLimit0b.get
+ Assert.invariant(lim >= 0)
+ val nBits = lim - source.bitPos0b
+ MaybeULong(nBits)
+ }
+ }
+
+ val remainingBits = {
+ if (source.hasReachedEndOfData) {
+ val bitsAvailable = {
+ if (source.bitPos0b % 8 == 0)
+ source.knownBytesAvailable * 8
+ else {
+ source.knownBytesAvailable * 8 - (source.bitPos0b % 8)
Review Comment:
Technically we don't need this if-statement. If `sourceBitPos0b % 8` is
zero, then it will just subtract zero. Maybe if we want to make it more clear
why this % 8 thing is involved, we could do something like
```scala
val fragmentBitsConsumed = source.bitPos0b % 8
val bitsAvailable = (source.knownBytesAvailable * 8) - fragmentBitsConsumed
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]