Multiple issues here. Definitely two bugs, and maybe a third.
First: the "group[4]" notation in the diagnostic message.
Second: the warning occurs only if you remove the maxOccurs="unbounded".
Third: how should I write this so as not to get the warning?

First Issue: Let me first discuss the "group[4]" notation and the poor diagnostic message.

I created bug ticket https://issues.apache.org/jira/browse/DAFFODIL-2190

This is clearly a bug because you don't know what this notation means, and even if you did, it's confusing because it is inconsistent with XPath. There is only one group definition, and only one group reference, which is the hidden one, and that doesn't even use the word "group" except in the middle of the dfdl:hiddenGroupRef property name.

We want a way to refer to a component of the schema that is unambiguous about where the problem lies, and ideally this identifier should not just be the line number information.

There is something called XSD component designators (https://www.w3.org/TR/2010/CR-xmlschema-ref-20100119/), which is a W3C thing. We reviewed this and it's very verbose, lacks important distinctions, etc. It's also unfamiliar to most users, and I've not seen it in use much.

So setting that aside for the moment, maybe a file URI + XPath into your schema, like:

    fileURI?xpath=xs:schema/xs:element[@name="input"]/xs:complexType/xs:sequence/xs:sequence

That's more precise, but still pretty verbose.

We also have a Daffodil-internal thing called a short schema component designator, which would be like this:

    e=input:ct:s:s

That's more compact, but you have to learn the notation, and it is Daffodil-specific.

I'm not sure we can do better than "xs:sequence at Line 22" plus a filename.

Second issue: warning occurs only if you remove the maxOccurs="unbounded"

Created https://issues.apache.org/jira/browse/DAFFODIL-2191

Just adding or removing the maxOccurs="unbounded" shouldn't change whether this warning is issued. If this warning is going to occur, it should occur regardless of maxOccurs.
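To illustrate the "file URI + XPath" style of designator from the first issue, here is a minimal Python sketch that resolves such a path against a cut-down copy of the schema. It is only an illustration of the idea, not how Daffodil builds or resolves diagnostics. The `resolve` helper and the trimmed `SCHEMA` string are hypothetical, and the path is written relative to the xs:schema root because that is how the standard library's ElementTree evaluates it.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
DFDL = "http://www.ogf.org/dfdl/dfdl-1.0/"

# Cut-down stand-in for the real schema (assumption: only the parts the path touches).
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">
  <xs:element name="input">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Section_3" type="Section_3_type" minOccurs="0"/>
        <xs:sequence dfdl:hiddenGroupRef="padToByteBoundary"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def resolve(schema_text, path):
    """Hypothetical helper: return the schema components matched by a path
    written relative to the xs:schema root element."""
    root = ET.fromstring(schema_text)
    return root.findall(path, {"xs": XS})


# Same steps as the XPath in the message, minus the leading xs:schema step.
PATH = "xs:element[@name='input']/xs:complexType/xs:sequence/xs:sequence"

matches = resolve(SCHEMA, PATH)
# Attributes in a namespace are keyed as "{namespace-uri}localname".
hidden_ref = matches[0].get("{%s}hiddenGroupRef" % DFDL)
print(len(matches), hidden_ref)
```

The path selects exactly one component here, the hidden group reference, which is the kind of unambiguous pointer the diagnostic is missing.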
Third Issue: How should you write this to achieve what you want, and suppress the warning?

So you do have a format here where there is an optional thing, and its alignment and the alignment of what comes after it aren't the same. This kind of thing could cause ambiguity when parsing, except that in your case the padToByteBoundary is last in the sequence, and it is a zero-length sequence that exists purely to force alignment. So the ambiguity that the warning is about can't arise in your case.

The warning is annoying because it seems spurious. Here's one way to eliminate it. Revise padToByteBoundary:

    <xs:group name="padToByteBoundary">
      <xs:sequence dfdl:alignment="1"> <!-- <<<<<< Notice: alignment 1 here -->
        <xs:sequence dfdl:alignment="8" dfdl:fillByte="%#r00;"/>
      </xs:sequence>
    </xs:group>

This silences the warning because it eliminates any possibility of ambiguity, regardless of whether the inner sequence above has contents or is just about alignment.

Maybe there are other solutions to this, but I am not sure it is well motivated to make Daffodil recognize the special case where an empty sequence exists purely for alignment reasons, and to avoid the warning in that case.

OTOH, this workaround does seem awfully hard to invent. So maybe we do need some more schema compiler support for detecting alignment that doesn't create potential ambiguity.

-mikeb

________________________________
From: Costello, Roger L. <[email protected]>
Sent: Monday, August 5, 2019 11:41:06 AM
To: [email protected] <[email protected]>
Subject: Is this a bug in Daffodil?

Hello DFDL community,

My (binary) input is this:

[cid:[email protected]]

Parsing the input with this DFDL schema:

    <xs:element name="input">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="Section_1" type="Section_1_type" minOccurs="0" maxOccurs="unbounded" dfdl:occursCountKind="implicit"/>
          <xs:element name="Section_2" type="Section_2_type" minOccurs="0" maxOccurs="unbounded" dfdl:occursCountKind="implicit"/>
          <xs:element name="Section_3" type="Section_3_type" minOccurs="0" maxOccurs="unbounded" dfdl:occursCountKind="implicit"/>
          <xs:sequence dfdl:hiddenGroupRef="padToByteBoundary" />
        </xs:sequence>
      </xs:complexType>
    </xs:element>

produces this XML:

    <input>
      <Section_1>
        <code>1</code>
        <three-bits>3</three-bits>
        <one-bit>1</one-bit>
      </Section_1>
      <Section_1>
        <code>1</code>
        <three-bits>3</three-bits>
        <one-bit>1</one-bit>
      </Section_1>
    </input>

If I remove maxOccurs="unbounded" on Section_3, then I get this warning message:

    [warning] Schema Definition Warning: Section_3 is an optional element or a variable-occurrence array and its alignment (1) is not the same as group[4]'s alignment (8).

There is only one xs:group in my DFDL schema, so the warning message is clearly wrong.

Despite the warning message, I get the same XML output.

This seems like a bug in Daffodil. Yes?

Below is my complete DFDL schema.

/Roger

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">

  <xs:include schemaLocation="default-dfdl-properties/defaults.dfdl.xsd" />
  <xs:include schemaLocation="unsignedint-types.dfdl.xsd" />

  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
      <dfdl:format ref="default-dfdl-properties" />
    </xs:appinfo>
  </xs:annotation>

  <xs:element name="input">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Section_1" type="Section_1_type" minOccurs="0" maxOccurs="unbounded" dfdl:occursCountKind="implicit"/>
        <xs:element name="Section_2" type="Section_2_type" minOccurs="0" maxOccurs="unbounded" dfdl:occursCountKind="implicit"/>
        <xs:element name="Section_3" type="Section_3_type" minOccurs="0" maxOccurs="unbounded" dfdl:occursCountKind="implicit"/>
        <xs:sequence dfdl:hiddenGroupRef="padToByteBoundary" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:complexType name="Section_1_type">
    <xs:sequence>
      <xs:element name="code" type="unsignedint2">
        <xs:annotation>
          <xs:appinfo source="http://www.ogf.org/dfdl/">
            <dfdl:assert test="{. eq 1}" />
          </xs:appinfo>
        </xs:annotation>
      </xs:element>
      <xs:element name="three-bits" type="unsignedint3" />
      <xs:element name="one-bit" type="unsignedint1" />
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="Section_2_type">
    <xs:sequence>
      <xs:element name="code" type="unsignedint2">
        <xs:annotation>
          <xs:appinfo source="http://www.ogf.org/dfdl/">
            <dfdl:assert test="{. eq 2}" />
          </xs:appinfo>
        </xs:annotation>
      </xs:element>
      <xs:element name="four-bits" type="unsignedint4" />
      <xs:element name="two-bits" type="unsignedint2" />
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="Section_3_type">
    <xs:sequence>
      <xs:element name="code" type="unsignedint2">
        <xs:annotation>
          <xs:appinfo source="http://www.ogf.org/dfdl/">
            <dfdl:assert test="{. eq 3}" />
          </xs:appinfo>
        </xs:annotation>
      </xs:element>
      <xs:element name="four-bits" type="unsignedint4" />
      <xs:element name="five-bits" type="unsignedint5" />
    </xs:sequence>
  </xs:complexType>

  <xs:group name="padToByteBoundary">
    <xs:sequence dfdl:alignment="8" dfdl:alignmentUnits="bits" dfdl:fillByte="%#r00;"/>
  </xs:group>

</xs:schema>
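
As a rough check of what the padToByteBoundary group is asked to do, the following Python sketch computes how many fill bits an 8-bit alignment would skip after the two Section_1 elements shown in the XML output. It assumes that unsignedint2, unsignedint3 and unsignedint1 are 2, 3 and 1 bits wide, which is inferred from the type names only, since unsignedint-types.dfdl.xsd is not shown. The `fill_bits` helper is hypothetical and only models the arithmetic, not Daffodil's implementation.

```python
def fill_bits(bit_position, alignment_bits=8):
    """Hypothetical helper: number of fill bits needed to advance bit_position
    to the next multiple of alignment_bits (0 if it is already aligned)."""
    return (-bit_position) % alignment_bits


# Assumed widths, inferred from the type names: code + three-bits + one-bit.
SECTION_1_BITS = 2 + 3 + 1

# The XML output in the message contains two Section_1 elements.
position = 2 * SECTION_1_BITS
padding = fill_bits(position)
print(position, padding)
```

Under those assumptions the data ends at bit 12, so the empty aligned sequence skips 4 fill bits to reach the 16-bit boundary. When the data already ends on a byte boundary the group skips nothing.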
