My initial survey of Link16 found 8 instances of this, which I don't think 
changes the calculus substantially.


I suspect that adding lookahead to Daffodil would be less work than getting the 
Link16 schema working with late discriminators, and we would then have support 
for lookahead in the future should we encounter other formats that require it.


If this were a more complicated/invasive feature, I would share your concern 
about adding it to DFDL, but it does not strike me as something that would 
impose significant maintenance or future development constraints.

________________________________
From: Beckerle, Mike <[email protected]>
Sent: Wednesday, May 29, 2019 6:39:27 PM
To: [email protected]
Subject: Re: Proposal: Add support for lookahead in DFDL expressions


Yup. This is a real situation. I think this happens in fixed length data for 
very mundane reasons explained in the attached slides.


Practically speaking, in Link16 this comes up only two or three times, so this 
isn't worth enhancing DFDL/Daffodil for unless this phenomenon is observed in 
more places. I have only seen it in Link16, though per the attached slides the 
problem could occur a lot in fixed length legacy data formats.


The workaround is just a choice, where you have a "late discriminator" on the 
tag, when you finally get to it.
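
To make the control flow of that workaround concrete, here is a minimal Python sketch. It only emulates the idea (speculatively parse a branch, reach the tag, then confirm or backtrack); it is not Daffodil code, and the names read_bits, BRANCHES and parse_with_late_discriminator are hypothetical. In a DFDL schema this would presumably be written as a dfdl:discriminator on the tag element inside each choice branch. The 16-bit body, 8-bit tag and MSB-first bit order are assumptions taken from the example schema later in this thread.

```python
def read_bits(buf: bytes, start: int, length: int) -> int:
    """Read `length` bits at bit offset `start` as an unsigned, MSB-first integer."""
    end = start + length
    if start < 0 or length <= 0 or end > len(buf) * 8:
        raise ValueError("read falls outside the data")
    value = int.from_bytes(buf, "big")
    return (value >> (len(buf) * 8 - end)) & ((1 << length) - 1)


# Hypothetical branches: (element name, tag value that confirms the branch).
BRANCHES = [("a", 1), ("b", 2)]


def parse_with_late_discriminator(buf: bytes) -> dict:
    """Try each branch in order and keep the first whose trailing tag matches."""
    for name, expected_tag in BRANCHES:
        pos = 0                          # backtrack to the start of the choice
        body = read_bits(buf, pos, 16)   # speculatively parse the branch body
        pos += 16
        tag = read_bits(buf, pos, 8)     # the tag is only reached now
        if tag == expected_tag:          # the "late discriminator" check
            return {name: body, "tag": tag}
    raise ValueError("no branch matched the tag")


# 16-bit body 0x1234 followed by tag 2: branch "a" is tried and rejected first.
result = parse_with_late_discriminator(bytes([0x12, 0x34, 0x02]))
```

The cost of this approach is that every branch ahead of the correct one is parsed and then discarded.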

________________________________
From: Sloane, Brandon <[email protected]>
Sent: Wednesday, May 29, 2019 1:00:04 PM
To: [email protected]
Subject: Proposal: Add support for lookahead in DFDL expressions

In developing a schema for Link16, I have encountered a situation that I do not 
believe Daffodil currently has a good solution for. It is a tagged union, 
where the tag comes after the union.


In theory, the schema would look like:


<xs:choice dfdl:choiceDispatchKey="{ tag }">
  <xs:element name="a" type="xs:int" dfdl:length="16" dfdl:choiceBranchKey="1"/>
  <xs:element name="b" type="xs:int" dfdl:length="16" dfdl:choiceBranchKey="2"/>
</xs:choice>
<xs:element name="tag" type="xs:int" dfdl:length="8" />


Obviously, this schema does not work because it requires lookahead. In general 
it is not possible to make this sort of schema work, because in order to 
determine where in the bitstream <tag> is, one would first need to know the 
length of the choice, which cannot (generally speaking) be determined before 
parsing completes.


However, in this case, the lookahead is possible in principle, because the 
choice happens to be fixed length (as it should be in any sane format where the 
tag follows the union).


I believe that we can support this use case with a much less invasive mechanism 
than infoset lookahead. In particular, we can support this with bytestream 
lookahead in the DFDL expressions, as below:


<xs:choice dfdl:choiceDispatchKey="{ dfdl:lookAhead(16,8) }">
  <xs:element name="a" type="xs:int" dfdl:length="16" dfdl:choiceBranchKey="1"/>
  <xs:element name="b" type="xs:int" dfdl:length="16" dfdl:choiceBranchKey="2"/>
</xs:choice>
<xs:element name="tag" type="xs:int" dfdl:length="8" />


The dfdl:lookAhead function takes as input a relative offset, o, and length, n, 
and returns the n bits located o bits past the current location, interpreted 
as an unsigned integer.
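
As a rough illustration of those semantics, here is a minimal Python sketch. It is not Daffodil code: the name look_ahead, the explicit bit_pos argument, and the MSB-first bit numbering are all assumptions made for the example. The one property it is meant to show is that the read does not advance the parser's position.

```python
def look_ahead(buf: bytes, bit_pos: int, offset: int, length: int) -> int:
    """Return `length` bits starting `offset` bits past `bit_pos`, unsigned.

    Bits are numbered most-significant-first within each byte (an assumption).
    The caller's position is left untouched; this is a pure peek.
    """
    start = bit_pos + offset
    end = start + length
    if bit_pos < 0 or offset < 0 or length <= 0 or end > len(buf) * 8:
        raise ValueError("look-ahead range falls outside the buffer")
    first_byte = start // 8
    chunk = buf[first_byte:(end + 7) // 8]      # only the bytes that are needed
    value = int.from_bytes(chunk, "big")
    shift = len(chunk) * 8 - (end - first_byte * 8)
    return (value >> shift) & ((1 << length) - 1)


# 16-bit union body 0x1234 followed by the 8-bit tag 0x02, as in the schema above.
record = bytes([0x12, 0x34, 0x02])
tag = look_ahead(record, bit_pos=0, offset=16, length=8)

# Dispatch on the peeked tag, mirroring choiceBranchKey="1" and "2".
branch = {1: "a", 2: "b"}[tag]
```

With the tag known up front, the choice can dispatch directly to branch b without speculatively parsing branch a first.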


From an implementation standpoint, there should be no difficulty in adding 
this, as the parser need only peek into the buffer it already has.


Brandon T. Sloane

Associate, Services

[email protected] | tresys.com
