mbeckerle commented on a change in pull request #539:
URL: https://github.com/apache/daffodil/pull/539#discussion_r620376115
##########
File path:
daffodil-io/src/main/scala/org/apache/daffodil/io/InputSourceDataInputStream.scala
##########
@@ -117,6 +118,57 @@ final class InputSourceDataInputStream private (val
inputSource: InputSource)
@inline
override final def bitLimit0b: MaybeULong = cst.bitLimit0b
+ /**
+ * Tells us if the underlying input source has detected end-of-data
+ * (the read(...) call returned -1.
+ *
+ * But this does NOT tell us we are positioned at the end, only whether
+ * in the course of reading, we encountered the end of data. If we
+ * backtracked we could have seen the end of data, but backed up in
+ * the data to an earlier position.
+ */
+ def hasReachedEndOfData: Boolean = inputSource.hasReachedEndOfData
+
+ /**
+ * Determine if we're positioned at the end of data without
+ * doing any additional blocking operation such as reading more
+ * data to test if there is any.
+ *
+ * This depends on the underlying inputSource keeping track of
+ * whether it has previously hit the end of data or not.
+ *
+ * @return
+ */
+ final def isAtEnd(): Boolean = {
+ if (bitLimit0b.isDefined) {
+ bitPos0b == bitLimit0b.get
Review comment:
Ah, yes it appears we have a stack discipline issue here. We narrow the
available data by assigning a bit limit whenever we have a specified length.
I just searched for uses of isAtEnd.
I'm quite convinced we should remove isAtEnd from the API. We only use it in
tests. We know it is incorrect for it to be reading more data. You point out
here that because it records the -1/End-of-Data in state of the ISDIS, we can't
call it on an inner narrowed specified length scope or we'll be getting isAtEnd
.
Algorithmically, the right thing is for applications to call
ISDIS.areBytesAvailable *before* a parse, and not expect upon return from the
parse for it to be known whether we are at the end of the data stream or not.
This code thunk we use to document streaming is wrong:
`
InputSourceDataInputStream is = new InputSourceDataInputStream(dataStream);
JDOMInfosetOutputter jdomOutputter = new JDOMInfosetOutputter();
boolean keepParsing = true;
while (keepParsing) {
jdomOutputter.reset();
ParseResult pr = dp.parse(is, jdomOutputter);
...
keepParsing = !pr.location().isAtEnd() && !pr.isError();
}`
It should be:
`
InputSourceDataInputStream is = new InputSourceDataInputStream(dataStream);
JDOMInfosetOutputter jdomOutputter = new JDOMInfosetOutputter();
var keepParsing = true
while (keepParsing && is.isDefinedForLength(1)) { // there is at least 1 bit
to parse!
jdomOutputter.reset();
ParseResult pr = dp.parse(is, jdomOutputter);
...
keepParsing = !pr.isError();
}`
where we test for data availability at the top, before the call to parse,
and there is no checking of isAtEnd at all.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]