Can't say I have seen this. Of the various parses that are possible in this seemingly ambiguous situation, which one is correct? Or is any one of them correct?
If you create a DFDL schema for this, you'll always get one outer and one inner sequence, where the inner sequence has the maximum number of elements in it.

On Wed, Dec 15, 2021 at 8:31 AM Roger L Costello <coste...@mitre.org> wrote:
>
> Hi Folks,
>
> Have you seen a data format like this: there is a pair of nested lists -- an
> outerList that has an innerList. The lists can be repeated an arbitrary
> number of times. There is no punctuation (separator) at the end of each
> outerList, but there is punctuation at the end.
>
> For example, suppose the data format consists of input data that is a series
> of "A" characters and at the end is a "Z" character. Here are sample inputs:
>
> Z
> AZ
> AAZ
> AAAZ
> AAAAZ
> ...
>
> Here is a grammar for this:
> ------------------------------------------------------------
> start: outerList 'Z'
>
> outerList: /* empty */
>          | outerList outerListItem
>
> outerListItem: innerList
>
> innerList: /* empty */
>          | innerList innerListItem
>
> innerListItem: 'A'
> ------------------------------------------------------------
> So, this input:
>
> AAAZ
>
> could be due to one outerListItem and three innerListItems, or two
> outerListItems and (0,3), (1,2), (2,1), or (3,0) innerListItems, or ...
>
> It is my understanding that this is rare but does exist. A book that I am
> reading says:
>
>     In practice, it's pretty rare to have a pair of nested
>     lists with no punctuation. It's confusing to parsers,
>     and it's confusing to humans, too.
>
> Have you seen a data format like this?
>
> /Roger
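
To make the ambiguity in the quoted grammar concrete, here is a minimal Python sketch. It is not DFDL and not anyone's actual parser. The function names `splits` and `greedy_parse` are hypothetical, invented for illustration. `splits(n, k)` lists every way n "A" characters can be divided among k outerListItems, and `greedy_parse` models the "one outer, maximal inner" result described above, assuming the parser simply puts every "A" into a single innerList. Because an innerList may be empty, any number of empty outerListItems can be added, so the total number of parses is unbounded.

```python
def splits(n, k):
    """Yield every way to distribute n 'A's over k outerListItems.

    Each tuple holds the innerList length of each outerListItem;
    an innerList may be empty, so zeros are allowed.
    """
    if k == 0:
        # With no outerListItems left, only zero remaining 'A's is valid.
        if n == 0:
            yield ()
        return
    for first in range(n + 1):
        for rest in splits(n - first, k - 1):
            yield (first,) + rest


def greedy_parse(data):
    """Model of the 'maximal inner sequence' result (an assumption, not DFDL).

    Returns a one-element tuple: a single outerListItem whose innerList
    holds every 'A' that precedes the terminating 'Z'.
    """
    if not data.endswith("Z") or set(data[:-1]) - {"A"}:
        raise ValueError("input must be zero or more 'A' followed by 'Z'")
    return (len(data) - 1,)


if __name__ == "__main__":
    # The four two-item parses of AAAZ listed in the quoted message.
    print(list(splits(3, 2)))
    # The single parse a greedy strategy would pick.
    print(greedy_parse("AAAZ"))
```

For `AAAZ`, `splits(3, 2)` yields the four pairs from the quoted message, and raising `k` keeps producing more parses, which is why a parser has to pick one by policy (here, greedily) rather than by the grammar alone.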