I think that's right, but might be a bit of an oversimplification. For example, it doesn't talk about defaults, only sort of implies the concept of speculative parsing, doesn't talk about what happens if M occurrences aren't found, etc. A more exact, but maybe a bit more complex description is section 16.1.2 in the spec [1]:
> The enum 'implicit' should be used when the number of occurrences is to be
> established using speculative parsing, and there are lower and upper bounds
> to control the speculation. The bounds are provided by the XSDL minOccurs and
> XSDL maxOccurs properties.
>
> When parsing, up to maxOccurs occurrences are expected in the data. It is a
> processing error if less than minOccurs occurrences are found or defaulted.
> The parser stops looking for occurrences when either minOccurs have been
> found or defaulted and speculative parsing does not find another occurrence,
> or maxOccurs have been found or defaulted.
>
> When unparsing, up to maxOccurs occurrences are expected in the infoset. It
> is a processing error if less than minOccurs occurrences are found or
> defaulted, or if more than maxOccurs occurrences are found.

[1] https://daffodil.apache.org/docs/dfdl/#_Toc398030791

On 5/10/19 1:48 PM, Costello, Roger L. wrote:
> Excellent! Thank you Steve.
>
> Is the following an accurate description of dfdl:occursCountKind='implicit'?
>
> Suppose an element declaration has dfdl:occursCountKind='implicit' with
> minOccurs="M" and maxOccurs="N"… This instructs Daffodil to consume between M
> and N values. There's no concept of lookahead or smarts about how many values
> might appear after the element. Daffodil just keeps consuming values until
> either it consumes N values or one of the values fails to parse (i.e., the
> value fails to meet the element’s requirements).
>
> -----Original Message-----
> From: Steve Lawrence <[email protected]>
> Sent: Friday, May 10, 2019 11:56 AM
> To: [email protected]
> Subject: [EXT] Re: Why am I getting this error message: Failed to parse infix
> separator. Cause: Parse Error: Separator '%NL;' not found.
>
> dfdl:occursCountKind="implicit" just says to parse somewhere between minOccurs
> and maxOccurs elements. There's no concept of lookahead or smarts about how
> many elements might appear after it.
> It literally just keeps trying to parse B elements until either we reach
> maxOccurs of them or one of them fails to parse. The assert was used to cause
> it to fail to parse when it reached something that didn't look like a B.
>
> And yeah, my schema is just plain wrong. The assert pattern is matched against
> the data stream, but my intention was to match the parsed value. The assert
> pattern could probably be changed, but I think it's a bit more clear to put a
> pattern restriction on the B element and change the assert to call
> checkConstraints. So something like this:
>
> <xs:element name="B" maxOccurs="50" dfdl:occursCountKind="implicit">
>   <xs:simpleType>
>     <xs:annotation>
>       <xs:appinfo source="http://www.ogf.org/dfdl/">
>         <dfdl:assert test="{ dfdl:checkConstraints(.) }" />
>       </xs:appinfo>
>     </xs:annotation>
>     <xs:restriction base="xs:string">
>       <xs:pattern value=".*[^0-9].*" />
>     </xs:restriction>
>   </xs:simpleType>
> </xs:element>
>
> So each B is parsed, then we assert that the parsed value validates according
> to the pattern value. When a value doesn't validate, that's how we know we
> have reached the C elements.
>
> - Steve
>
> On 5/10/19 11:36 AM, Costello, Roger L. wrote:
> > Hi Steve,
> >
> > I guess that I don't understand dfdl:occursCountKind="implicit". I thought it
> > means: "Hey Daffodil, figure out the appropriate occurrences of B elements by
> > inferring from the occurrence needs of its following elements." In this case,
> > C's are the following elements and the number of occurrences of C is equal to
> > the value of the first A element. That is, the occurrence needs for C is
> > easily determined, so the occurrence needs of B should be easily inferred.
> > That is, it seems to me that Daffodil should be able to recognize that these
> > values:
> >
> > 100
> > 200
> > 300
> > 400
> > 500
> > 600
> >
> > are for the C element and the declaration for the B element should not need
> > an assert to specify, "Give me only strings up till the point where digits
> > are encountered." By adding dfdl:assert to the schema it is effectively
> > neutering the dfdl:occursCountKind="implicit". I am confused.
> >
> > Second question: I modified the schema as you suggested. See below. However,
> > I now get this error message:
> >
> > [error] Parse Error: Failed to populate C[2]. Missing infix separator.
> > Cause: Parse Error: Separator '%NL;' not found.
> >
> > <xs:element name="input">
> >   <xs:complexType>
> >     <xs:sequence dfdl:separator="%NL;" dfdl:separatorPosition="infix">
> >       <xs:element name="A" type="xs:integer" minOccurs="3"
> >                   maxOccurs="3" dfdl:occursCountKind="fixed" />
> >       <xs:element name="B" type="xs:string" maxOccurs="50"
> >                   dfdl:occursCountKind="implicit">
> >         <xs:annotation>
> >           <xs:appinfo source="http://www.ogf.org/dfdl/">
> >             <dfdl:assert testKind="pattern" testPattern=".*[^0-9].*" />
> >           </xs:appinfo>
> >         </xs:annotation>
> >       </xs:element>
> >       <xs:element name="C" type="xs:integer" maxOccurs="unbounded"
> >                   dfdl:occursCountKind="expression"
> >                   dfdl:occursCount="{ ../A[1] }" />
> >     </xs:sequence>
> >   </xs:complexType>
> > </xs:element>
> >
> > -----Original Message-----
> > From: Steve Lawrence <[email protected]>
> > Sent: Friday, May 10, 2019 9:08 AM
> > To: [email protected]
> > Subject: [EXT] Re: Why am I getting this error message: Failed to parse infix
> > separator. Cause: Parse Error: Separator '%NL;' not found.
> >
> > The issue is that element B can be 50 or fewer strings. And although 100,
> > 200, etc. look like numbers, they are also completely valid strings.
> > So Daffodil will just keep consuming every line after the first three
> > numbers as B elements. Daffodil still expects a separator followed by some
> > C's, but we hit the end of the data and error out saying we were looking for
> > that separator.
> >
> > So we need to somehow tell Daffodil to stop looking for B's. One solution
> > here is to add an assertion to test that each B element does not look like a
> > number. The DFDL expression language doesn't have a good way to test if a
> > string is a number or not, but a regex pattern test could work:
> >
> > <xs:element name="B" type="xs:string" maxOccurs="50"
> >             dfdl:occursCountKind="implicit">
> >   <xs:annotation>
> >     <xs:appinfo source="http://www.ogf.org/dfdl/">
> >       <dfdl:assert testKind="pattern" testPattern=".*[^0-9].*" />
> >     </xs:appinfo>
> >   </xs:annotation>
> > </xs:element>
> >
> > This regular expression says that every B element must contain at least one
> > character that is not a numeric digit. So when Daffodil gets to "100", the
> > assertion will fail since it is all numbers, and we'll stop parsing B's and
> > start looking for C's.
> >
> > - Steve
> >
> > On 5/10/19 8:00 AM, Costello, Roger L. wrote:
> > > Hello DFDL community,
> > >
> > > My input file consists of exactly 3 integers, each on a new line,
> > > followed by an arbitrary number of strings, again, each on a new
> > > line, followed by a number of integers, the number being determined by the
> > > first integer in the file. For example:
> > >
> > > 6
> > > 1
> > > 2
> > > Banana
> > > Orange
> > > Apple
> > > Grape
> > > 100
> > > 200
> > > 300
> > > 400
> > > 500
> > > 600
> > >
> > > Below is my DFDL schema. It generates this error:
> > >
> > > *[error] Parse Error: Failed to parse infix separator. Cause: Parse Error:
> > > Separator '%NL;' not found.*
> > >
> > > Why is that error being generated? How can I fix the DFDL schema?
> > > /Roger
> > >
> > > <xs:element name="input">
> > >   <xs:complexType>
> > >     <xs:sequence dfdl:separator="%NL;" dfdl:separatorPosition="infix">
> > >       <xs:element name="A" type="xs:integer"
> > >                   minOccurs="3" maxOccurs="3"
> > >                   dfdl:occursCountKind="fixed" />
> > >       <xs:element name="B" type="xs:string" maxOccurs="50"
> > >                   dfdl:occursCountKind="implicit" />
> > >       <xs:element name="C" type="xs:integer" maxOccurs="unbounded"
> > >                   dfdl:occursCountKind="expression"
> > >                   dfdl:occursCount="{ ../A[1] }" />
> > >     </xs:sequence>
> > >   </xs:complexType>
> > > </xs:element>
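
For readers following the thread, the stopping rule discussed above can be modelled with a small Python sketch. This is only a toy under stated assumptions, not Daffodil code: every function name here (is_integer, has_non_digit, parse_implicit, parse_fixed, parse_input) is hypothetical, each input line stands for one separated value, and separators, defaulting, and the real error messages are ignored. B's minOccurs is taken as 1, the XSD default when the attribute is omitted.

```python
# Toy model of the A/B/C sequence from the thread. All names are hypothetical;
# this is an illustration of the stopping rule, not how Daffodil is implemented.

DIGITS = "0123456789"


def is_integer(line):
    """Rough stand-in for xs:integer: optional sign followed by ASCII digits."""
    body = line[1:] if line[:1] in ("+", "-") else line
    return body != "" and all(ch in DIGITS for ch in body)


def has_non_digit(line):
    """Stand-in for the pattern .*[^0-9].* applied to the parsed value."""
    return any(ch not in DIGITS for ch in line)


def parse_implicit(lines, pos, accepts, min_occurs, max_occurs):
    """Consume lines until max_occurs is reached or one line is rejected."""
    items = []
    while len(items) < max_occurs and pos < len(lines) and accepts(lines[pos]):
        items.append(lines[pos])
        pos += 1
    if len(items) < min_occurs:
        raise ValueError(f"expected at least {min_occurs}, found {len(items)}")
    return items, pos


def parse_fixed(lines, pos, accepts, count, name):
    """Consume exactly `count` lines, failing on the first missing or bad one."""
    items = []
    for i in range(count):
        if pos >= len(lines) or not accepts(lines[pos]):
            raise ValueError(f"Failed to populate {name}[{i + 1}]")
        items.append(lines[pos])
        pos += 1
    return items, pos


def parse_input(lines, b_accepts):
    """A is fixed at 3, B is implicit (1..50), C's count comes from A[1]."""
    a, pos = parse_fixed(lines, 0, is_integer, 3, "A")
    b, pos = parse_implicit(lines, pos, b_accepts, 1, 50)
    c, pos = parse_fixed(lines, pos, is_integer, int(a[0]), "C")
    return a, b, c


DATA = ["6", "1", "2", "Banana", "Orange", "Apple", "Grape",
        "100", "200", "300", "400", "500", "600"]

# B as an unconstrained string accepts every remaining line, so no C is left.
try:
    parse_input(DATA, lambda line: True)
except ValueError as err:
    print("unconstrained B:", err)

# B restricted to values containing a non-digit stops at "100".
a, b, c = parse_input(DATA, has_non_digit)
print("B:", b)
print("C:", c)
```

In this model the unconstrained B swallows 100 through 600 because nothing ever rejects them, which is the behaviour Steve describes; the constraint on B is what gives the parser a point at which an occurrence fails.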
