I first encountered "left over data" with a dead-simple file format: just a top-level element named "records" containing a minOccurs="0" maxOccurs="unbounded" array of elements named "record".
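As a rough illustration of that schema shape, here is a toy model in Python. It is not Daffodil and does not use any Daffodil API; the function name `parse_records`, the `min_occurs` parameter, and the "digits followed by a newline" record format are all made up for the sketch.

```python
import re

# Hypothetical record format, for illustration only: digits ended by a newline.
RECORD = re.compile(r"\d+\n")

def parse_records(data, min_occurs=0):
    """Toy model of an unbounded array of "record" elements.

    Returns (records, leftover). Raises ValueError when fewer than
    min_occurs records match, standing in for a real parse error.
    """
    records = []
    pos = 0
    while True:
        m = RECORD.match(data, pos)
        if m is None:
            break  # the array simply ends at the first record that fails
        records.append(m.group())
        pos = m.end()
    if len(records) < min_occurs:
        raise ValueError(
            f"expected at least {min_occurs} record(s), "
            f"found {len(records)}; stopped at offset {pos}"
        )
    return records, data[pos:]

bad_input = "x12\n34\n"  # the very first record is malformed
print(parse_records(bad_input, min_occurs=0))
```

Calling the same function with `min_occurs=1` on `bad_input` raises an error at offset 0 rather than returning an empty result plus unconsumed input.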
Because of minOccurs="0", such a schema is very happy to "successfully" parse zero records and then tell you the entire file contents are "left over data". I learned that one often wants minOccurs="1" to force the parse to succeed on at least one record.

On Fri, Apr 15, 2022 at 9:48 AM Roger L Costello <coste...@mitre.org> wrote:

> Hi Folks,
>
> Have you encountered the “left over data” error message? If you’ve worked
> with Daffodil for more than 5 minutes, you undoubtedly have.
>
> The problem with that error message is it gives you absolutely no clue
> what’s causing the problem.
>
> Perhaps if we start cataloging the things that triggered the error
> message, then the Daffodil team will be able to provide better diagnostics.
> Here’s my contribution to said catalog.
>
> -----------------------
>
> In recent weeks I have encountered the dreaded “left over data” error
> message twice. After enormous effort I was able to figure out what the
> problems were in my DFDL schema. First I need to describe my DFDL schema.
>
> My DFDL schema consists of a series of element declarations and within
> each element are declarations of subelements:
>
> A
>     A.1
>     A.2
>     …
> B
>     B.1
>     B.2
>     …
> …
>
> Each subelement is of type string and uses a regex to describe the
> subelement’s data (i.e., the subelements use dfdl:lengthKind="pattern" and
> dfdl:lengthPattern="regex")
>
> The first time that I got the “left over data” error message I found the
> cause was due to this bug in my DFDL schema: a dfdl:lengthPattern listed
> the regex alternatives in the wrong order (shortest to longest instead of
> longest to shortest). The error message said that Daffodil stopped
> consuming input at element G. The actual element containing the regex in
> wrong order was element G.2 (Daffodil stopped consuming input pretty near
> the problem)
>
> After I fixed that bug I immediately got another “left over data” error at
> element J. After much more effort I found the bug: a regex erroneously had
> spaces in it. In this case, the error message said that Daffodil stopped
> consuming input at element J. The actual element containing the regex with
> spaces was element K.5 (Daffodil stopped consuming input pretty far from
> the problem)
>
> /Roger
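
Both regex mistakes described above can be reproduced in miniature. The sketch below uses Python's `re` module as a stand-in, on the assumption that the engine behind dfdl:lengthPattern treats alternation the same way (first matching alternative wins, as is typical of backtracking engines). The patterns and sample data are invented for illustration and are not taken from Roger's schema.

```python
import re

data = "abc"

# Alternatives listed shortest to longest: the engine takes the first
# alternative that matches, so only "ab" is consumed.
short_first = re.match(r"ab|abc", data)

# Alternatives listed longest to shortest: the full "abc" is consumed.
long_first = re.match(r"abc|ab", data)

def leftover(match, text):
    """Return whatever the pattern did not consume."""
    return text[match.end():] if match else text

print(repr(leftover(short_first, data)))  # 'c'
print(repr(leftover(long_first, data)))   # ''

# A stray space inside a pattern is a literal space (unless the engine
# is in a "verbose"/comments mode), so this fails on data with no space.
stray_space = re.match(r"[A-Z]{3} [0-9]+", "XYZ123")
print(stray_space)  # None
```

In the first case one character is left unconsumed right where the faulty pattern sits. In the second the pattern does not match at all, and where that surfaces depends on how the surrounding schema reacts to the failure.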