stevedlawrence commented on code in PR #988:
URL: https://github.com/apache/daffodil/pull/988#discussion_r1146244774
##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/HexBinaryLengthParsers.scala:
##########
@@ -64,9 +64,12 @@ sealed abstract class HexBinaryLengthParser(override val
context: ElementRuntime
// limiting the size of hex binarie data.
val array = new Array[Byte](nBytes.toInt)
val buffer = ByteBuffer.wrap(array)
- writeBitsInChunks(start, nBits) { case (bytes, nBytes) =>
+ val written = writeBitsInChunks(start, nBits) { case (bytes, nBytes) =>
buffer.put(bytes, 0, nBytes)
}
+ if (written != nBits) {
+ PENotEnoughBits(start, nBits, start.dataInputStream)
+ }
Review Comment:
Does this also need the same logic to capture `DataLocation` and
`SchemaFileLocation` since it's calling `writeBitsInChunks`?
If so, does it make sense for `writeBitsInChunks` to capture `DataLocation`
and `SchemaFileLocation` and call `PENotEnoughBits` itself instead of requiring
that callers do it, and maybe use the wrong PENotEnoughBits if they don't do
the capture/
##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/Parser.scala:
##########
@@ -90,6 +85,53 @@ sealed trait Parser extends Processor {
)
}
+ def PENotEnoughBits(
+ pstate: PState,
+ sfl: SchemaFileLocation,
+ dataLoc: DataLocation,
+ neededBits: Long,
+ source: InputSourceDataInputStream,
+ ): Unit = {
+ val startPos = dataLoc.bitPos1b - 1
+ val remainingLimitedBits = {
+ if (source.bitLimit0b.isEmpty) MaybeULong.Nope
+ else {
+ val lim = source.bitLimit0b.get
+ Assert.invariant(lim >= 0)
+ val nBits = lim - startPos
+ MaybeULong(nBits)
+ }
+ }
+
+ val remainingBits = {
+ if (source.hasReachedEndOfData) {
+ val bitsAvailable = {
+ val fragmentBitsRead = startPos % 8
+ source.knownBytesAvailable * 8 - fragmentBitsRead
+ }
+ if (remainingLimitedBits.isEmpty || bitsAvailable <
remainingLimitedBits.get)
+ MaybeULong(bitsAvailable)
+ else
+ remainingLimitedBits
+ } else remainingLimitedBits
+ }
Review Comment:
I wonder if `remainingBits` should be a `ULong` instead of a `MaybeULong`
now? If we ever call `PENotEnoughBits`, then must have hit the bit limit and
`remainingLimitedBits` must be defined, or we hit the end of data, and
bitsAvailable must be defined. Either way, we always have some number that
represents the number of available bits. So maybe this last part just wants to
be:
```scala
{
...
if (remainingLimitedBits.isEmpty || bitsAvailable <
remainingLimitedBits.get)
bitsAvailable
else
remainingLimitedBits.getULong
} else remainingLimitedBits.getULong
```
And if you do the startPos adjustment at the end, then that becomes a bit
cleaner since remainingBits is just a `ULong` instead of a `MaybeULong` and you
don't have to check if it's defined.
##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/Parser.scala:
##########
@@ -90,6 +85,53 @@ sealed trait Parser extends Processor {
)
}
+ def PENotEnoughBits(
Review Comment:
I wonder if we should remove the above `PENotEnoughBits` that accepts a
`remainingBits` parameer? Nothing should every calculate their own remaining
bits and call that since they'll likely do it wrong and not use information
from DataInputStream, DataLocation, etc.
##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/Parser.scala:
##########
@@ -90,6 +85,53 @@ sealed trait Parser extends Processor {
)
}
+ def PENotEnoughBits(
+ pstate: PState,
+ sfl: SchemaFileLocation,
+ dataLoc: DataLocation,
+ neededBits: Long,
+ source: InputSourceDataInputStream,
+ ): Unit = {
+ val startPos = dataLoc.bitPos1b - 1
+ val remainingLimitedBits = {
+ if (source.bitLimit0b.isEmpty) MaybeULong.Nope
+ else {
+ val lim = source.bitLimit0b.get
+ Assert.invariant(lim >= 0)
+ val nBits = lim - startPos
+ MaybeULong(nBits)
+ }
+ }
+
+ val remainingBits = {
+ if (source.hasReachedEndOfData) {
+ val bitsAvailable = {
+ val fragmentBitsRead = startPos % 8
+ source.knownBytesAvailable * 8 - fragmentBitsRead
Review Comment:
I think this is incorrect? It only uses `startPos` to know how many fragment
bits to remove, but doesn't adjust the total known bits available based on
`startPos`, since `knownBytesAvailable` is based on `dis.bitPos0b`.
I think we could do something like this:
```scala
val bitsAvailable = {
val fragmentBitsReadFromSourcePos = source.bitPos0b % 8
val bitsAvailableFromSourcePos = source.knownBytesAvailable * 8 -
fragmentBitsReadFromSourcePos
val bitsBetweenStartPosAndSourcePos = source.bitPos0b - startPos
val bitsAvailableFromStartPos = bitsAvailableFromSourcePos +
bitsBetweenStartPosandSource
bitsAvailableFromStartPos
}
```
There might be better variable names, but the long ones do make it clear
what's being calculated and why.
So this uses the previous logic to figure out how many bits are available
starting from the dis.bitPos0b, then just adds the number of bits between the
`DataLocation` and the `pstate currentLocation`, since the two might not be the
same.
In fact, and it might be cleaner, you could maybe have all your old logic to
calculate remainingBits (which is all dis.bitPos0b based), and then at the very
end do something like this to adjust the remaining bits based on the
DataLocation, e.g.
```scala
val remainingBits = ... // all old logic
val bitsBetweenStartPosAndCurPos = (dis.bitPos0b - startPos)
val remainingBitsFromStartPos = remainingBits + bitsBetweenStartPosAndCurPos
PENotEnoughBits(..., neededBits, remainingBitsFromStartPos)
```
If `dis.bitPos0b` and `sourcePos` are the same (i.e. most cases except for
things that use chunkWithBits), then `bitsBetweenStartPosAndCurPos` is just 0
and doesn't have an affect. But in the case of chunkWithBits where they aren't
the same, we just add the difference between where chunksWithBits started
(startPos) and where it we start calculating remainingBits from (dis.bitPos0b).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]