The underlying cause is that Daffodil doesn't read all the data in at once. Instead it only reads and parses data in small-ish chunks and discards chunks once it is done with them. The benefit of this approach is that it allows Daffodil to use less memory than it would need if it read the entire file in all at once. In fact, it even allows Daffodil to parse files that are larger than could fit in JVM memory.

However, this means that when a parse fails, Daffodil might not have read the entire file, and so it doesn't actually know how much is really left. All it knows about is the data in the chunks it has already read but not yet consumed, which is why the message says "at least".
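
As a rough illustration, here is a toy Python model of that behavior. It is not Daffodil's actual code: the function name and the 8192-byte chunk size are assumptions made up for the example. It reads whole chunks until the parse position is covered and then reports only the bits that happen to be buffered past that position.

```python
import io

CHUNK_SIZE = 8192  # hypothetical chunk size in bytes, not Daffodil's real value


def known_remaining_bits(stream, consumed_bytes, chunk_size=CHUNK_SIZE):
    """Read whole chunks until `consumed_bytes` is covered, then return how
    many already-buffered bits lie past the parse position."""
    buffered = 0
    while buffered < consumed_bytes:
        chunk = stream.read(chunk_size)
        if not chunk:  # hit EOF before covering the parse position
            break
        buffered += len(chunk)
    return (buffered - consumed_bytes) * 8


data = bytes(100_000)  # 100,000 bytes of input
# Parse stops just before a chunk boundary: few buffered bits remain.
print(known_remaining_bits(io.BytesIO(data), 8000))  # 1536
# Parse stops just after a chunk boundary: many more buffered bits remain.
print(known_remaining_bits(io.BytesIO(data), 9000))  # 59072
```

In this model the second parse consumes more data yet reports more "remaining" bits, simply because of where the parse position falls relative to the last chunk read.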

If we wanted to fix this, once parsing completes we could try to consume data until we hit EOF, which would give us an accurate number of remaining bits. But if the input is coming from a stream, then EOF could take a while or not actually happen, and Daffodil would appear to hang. So instead we just bail and report as much as we know about.
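
A minimal sketch of what "consume data until we hit EOF" would look like, again in Python rather than Daffodil's own code, with a hypothetical function name and chunk size:

```python
import io


def drain_and_count_bits(stream, chunk_size=8192):
    """Read and discard everything left in `stream`, returning the number of
    bits seen. On a socket or pipe whose writer never closes, the read call
    blocks indefinitely, which is the hang described above."""
    remaining_bytes = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty read means EOF
            return remaining_bytes * 8
        remaining_bytes += len(chunk)


stream = io.BytesIO(bytes(20_000))
stream.read(5_000)  # pretend the parser consumed the first 5,000 bytes
print(drain_and_count_bits(stream))  # 120000
```

This gives an exact answer for finite inputs, but only at the cost of reading the rest of the data.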

Alternatively, we could check whether the input is a file rather than a stream and then do a simple file size calculation, but thus far it hasn't been a high enough priority to justify the extra complication.
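
For what it's worth, the file-vs-stream check could look something like the following Python sketch. The function name is hypothetical and this is not how Daffodil is implemented; it only shows the idea of using the file size when the input is a regular file and giving up otherwise.

```python
import os
import stat


def exact_remaining_bits(fileobj, consumed_bits):
    """Return the exact number of unparsed bits if `fileobj` is backed by a
    regular file, or None if it is a pipe, socket, or in-memory stream."""
    try:
        st = os.fstat(fileobj.fileno())
    except (AttributeError, OSError):
        # No file descriptor available (e.g. an in-memory stream).
        return None
    if not stat.S_ISREG(st.st_mode):
        # Pipes and sockets have no meaningful total size.
        return None
    return st.st_size * 8 - consumed_bits
```

A caller would fall back to the current "at least N bit(s)" wording whenever this returns None.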


On 2023-04-13 12:10 PM, Roger L Costello wrote:
I am gradually adding to my DFDL schema. I expect there to be "Left over data". 
But it would be nice if Daffodil accurately told me how much left over data there is. Or 
at least, it would be nice if Daffodil didn't (apparently) make things up. Let me explain.

I ran my DFDL schema and got this message:

[error] Left over data. Consumed 109767424 bit(s) with at least 5376 bit(s) 
remaining.

Okay, so I have about 5,000 bits remaining to be parsed.

I added more stuff into my DFDL schema. The schema now gobbles up more of the 
input. I expect the number of bits consumed to increase and the number of 
left-over bits to decrease. Here's what Daffodil gives:

[error] Left over data. Consumed 191712176 bit(s) with at least 46160 bit(s) 
remaining.

Daffodil reports that more bits were consumed: 109,767,424 --> 191,712,176

Good. Makes sense.

Daffodil reports that there are more remaining bits:  5,376 --> 46,160

Huh? That's crazy.

Why can't Daffodil accurately tell the number of remaining bits?

/Roger
