I first encountered "left over data" with a dead-simple file format: just a
top-level element named "records" containing a minOccurs="0"
maxOccurs="unbounded" array of elements named "record".

Due to minOccurs="0" such a schema is very happy to "successfully" parse
zero records, and tell you the entire file contents are "left over data".

I learned that one often wants minOccurs="1" instead, so that the parse
must succeed on at least one record rather than quietly accepting none.
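
A minimal sketch of the kind of schema I mean, with the fix applied. The
xs:, dfdl: and ex: prefixes and the ex:recordType type are placeholders, not
taken from a real schema, and the usual dfdl:format defaults are omitted. I
am also assuming dfdl:occursCountKind="implicit" here; as I understand the
DFDL spec, with occursCountKind="parsed" minOccurs is only checked by
validation and does not constrain the parse.

```xml
<xs:element name="records">
  <xs:complexType>
    <xs:sequence>
      <!-- minOccurs="1": at least one record has to parse. With
           minOccurs="0" the parser may stop after zero records and
           report the entire file as left over data. -->
      <xs:element name="record" type="ex:recordType"
                  minOccurs="1" maxOccurs="unbounded"
                  dfdl:occursCountKind="implicit"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```

With this change a file whose first record does not match should produce a
parse error about that record, which is usually a better clue than "left
over data".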



On Fri, Apr 15, 2022 at 9:48 AM Roger L Costello <coste...@mitre.org> wrote:

> Hi Folks,
>
> Have you encountered the “left over data” error message? If you’ve worked
> with Daffodil for more than 5 minutes, you undoubtedly have.
>
> The problem with that error message is it gives you absolutely no clue
> what’s causing the problem.
>
> Perhaps if we start cataloging the things that triggered the error
> message, then the Daffodil team will be able to provide better diagnostics.
> Here’s my contribution to said catalog.
>
> -----------------------
>
> In recent weeks I have encountered the dreaded “left over data” error
> message twice. After enormous effort I was able to figure out what the
> problems were in my DFDL schema. First I need to describe my DFDL schema.
>
> My DFDL schema consists of a series of element declarations and within
> each element are declarations of subelements:
>
> A
>     A.1
>     A.2
>     …
> B
>     B.1
>     B.2
>     …
> …
>
> Each subelement is of type string and uses a regex to describe the
> subelement’s data (i.e., the subelements use dfdl:lengthKind="pattern" and
> dfdl:lengthPattern="regex").
>
> The first time that I got the “left over data” error message I found the
> cause was due to this bug in my DFDL schema: a dfdl:lengthPattern listed
> the regex alternatives in the wrong order (shortest to longest instead of
> longest to shortest). The error message said that Daffodil stopped
> consuming input at element G. The actual element containing the regex in
> wrong order was element G.2 (Daffodil stopped consuming input pretty near
> the problem)
>
> After I fixed that bug I immediately got another “left over data” error at
> element J. After much more effort I found the bug: a regex erroneously had
> spaces in it. In this case, the error message said that Daffodil stopped
> consuming input at element J. The actual element containing the regex with
> spaces was element K.5 (Daffodil stopped consuming input pretty far from
> the problem)
>
> /Roger
>

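To illustrate the first bug in the quoted message (regex alternatives listed
shortest to longest), here is a sketch with a made-up element name and
made-up alternatives. It assumes Java-style regex semantics, where
alternatives are tried left to right and the first one that matches wins,
not the longest one.

```xml
<!-- Wrong order: on the input "ABCD" the pattern matches "AB" first,
     so only two characters are consumed and "CD" is left for the next
     element, or ends up as left over data. -->
<xs:element name="code" type="xs:string"
            dfdl:lengthKind="pattern"
            dfdl:lengthPattern="AB|ABC|ABCD"/>

<!-- Longest alternative first: "ABCD" is consumed whole. -->
<xs:element name="code" type="xs:string"
            dfdl:lengthKind="pattern"
            dfdl:lengthPattern="ABCD|ABC|AB"/>
```

The two declarations are alternatives to compare, not meant to appear
together in one schema.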