Re: Is it okay to officially publish a DFDL schema that produces warnings on valid input data?

Steve Lawrence Sun, 10 Nov 2019 06:32:53 -0800

When unparsing a choice, we use the infoset to determine which branch of
the choice to unparse. For example, say we had this choice:


  <xs:choice>
    <xs:element name="A" type="xs:string" ... />
    <xs:element name="B" type="xs:int" ... />
  </xs:choice>

If the infoset contained the "A" element, then we would unparse the
first branch of the choice. If the infoset contained the "B" element,
then we would unparse the second.
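
As a small illustration, here are two infoset fragments that would select
different branches. The parent element name "rec" and the values are
hypothetical; only "A" and "B" come from the choice above:

  <rec><A>hello</A></rec>   <!-- "A" is present: unparse the first branch -->
  <rec><B>42</B></rec>      <!-- "B" is present: unparse the second branch -->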

However, in the new choice you have, both branches contain only a
sequence, and sequences have no representation in the infoset. So when
unparsing, we cannot tell from the infoset which branch to take.

That warning is trying to alert you that Daffodil will just have to pick
one, and that it might not be the one you expected. Daffodil will
currently always unparse the first of the ambiguous branches.

So this warning is actually normal and expected in this case. I think it
would be reasonable to ignore this warning.
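
If the ambiguity does matter to you, Mike's other suggestion in this
thread (model the final newline as data) could look roughly like the
sketch below. The element name 'finalNewLine' is his; the
dfdl:lengthKind and dfdl:length spellings are my reading of "explicit
length 0", and I have not run this fragment through Daffodil:

  <xs:choice>
    <!-- Appears in the infoset only when a trailing newline was parsed -->
    <xs:element name="finalNewLine" type="xs:string"
                dfdl:initiator="%NL;"
                dfdl:lengthKind="explicit" dfdl:length="0" />
    <!-- Taken when there is no trailing newline -->
    <xs:sequence />
  </xs:choice>

Because the first branch now has an element in the infoset, the unparser
should be able to tell the two branches apart, at the cost of no longer
canonicalizing the output.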


On 11/10/19 8:54 AM, Costello, Roger L. wrote:
> Mike wrote:
> 
> I suggest adding this
> 
> <choice>
> 
>    <sequence dfdl:initiator="%NL;" />
> 
>    <sequence />
> 
> </choice>
> 
> At the end of the schema after the repeating row element.
> 
> This will absorb and discard any final newline.
> 
> Oh! That is a wicked cool idea! I gave it a try. Daffodil doesn’t seem to 
> like it:
> 
> [warning] Schema Definition Warning: Multiple choice branches are associated 
> with the end of element {}csv.
> 
> Note that elements with dfdl:outputValueCalc cannot be used to distinguish 
> choice branches.
> 
> Note that choice branches with entirely optional content are not allowed.
> 
> What does that message mean? How to fix it?
> 
> /Roger
> 
> *From:* Beckerle, Mike <[email protected]>
> *Sent:* Sunday, November 10, 2019 7:56 AM
> *To:* [email protected]
> *Subject:* [EXT] Re: Is it okay to officially publish a DFDL schema that 
> produces warnings on valid input data?
> 
> I would avoid this.
> 
> One thing you need to take a position on is whether on unparsing you generate 
> this final new line, or not, or try to preserve whatever the file had 
> originally.
> 
> Choosing to always generate this, or always omit it is canonicalization.
> 
> I suggest adding this
> 
> <choice>
> 
>    <sequence dfdl:initiator="%NL;" />
> 
>    <sequence />
> 
> </choice>
> 
> At the end of the schema after the repeating row element.
> 
> This will absorb and discard any final newline.
> 
> If you want to preserve the final newline then you have to model it as data
> so change the first branch of the choice above and make it an element named
> 'finalNewLine' with initiator and type string with explicit length 0.
> 
> --------------------------------------------------------------------------------
> 
> *From:* Costello, Roger L. <[email protected]>
> *Sent:* Saturday, November 9, 2019 8:05:19 AM
> *To:* [email protected]
> *Subject:* Is it okay to officially publish a DFDL schema that produces
> warnings on valid input data?
> 
> Hi Folks,
> 
> Suppose you are creating the official, standard DFDL schema for a data
> format. Would you be okay with officially releasing a schema that generates
> warnings on data that is valid?
> 
> Here’s an example. The RFC for CSV (RFC 4180) says that CSV files consist of 
> records separated by newlines. Each record consists of fields separated by 
> commas. The last record may or may not have a new line.
> 
> Suppose the last record of a CSV file has a newline. My DFDL schema
> generates this warning:
> 
> *[warning] Left over data. Consumed 1680 bit(s) with at least 16 bit(s) 
> remaining.*
> 
> I am thinking that that warning is okay. Why? Because when the last record
> has a newline, then the file /really does/ have left over data – the newline
> on the last record. So, a warning is not unreasonable.
> 
> Well, that’s what I think. I might be thinking wrongly. What do you think?
> Would you ever officially release a DFDL schema that generates warnings on
> valid input data?
> 
> /Roger
>
