So our warning is not correct here. It doesn't distinguish discriminators/asserts with testKind 'pattern' from those with testKind 'expression'.

The DFDL spec, Section 9.5 "Evaluation Order for Statement Annotations", says that statement annotations on a sequence execute in this order:

1. dfdl:discriminator or dfdl:assert(s) with testKind 'pattern' (parsing only)
2. dfdl:newVariableInstance(s) - in lexical order, innermost schema component first
3. dfdl:setVariable(s) - in lexical order, innermost schema component first
4. dfdl:sequence or dfdl:choice or dfdl:group following property scoping rules and evaluating any property expressions (corresponds to ComplexContent grammar region)
5. dfdl:discriminator or dfdl:assert(s) with testKind 'expression' (parsing only)

Step 4 is the parsing of the contents of the sequence. Steps 1, 2, and 3 happen before the sequence contents are parsed. Only step 5 happens after.

So only step 5, written at the top of a sequence that has some content besides just the annotation, has the counterintuitive placement issue: the sequence body is parsed first, and only then is the discriminator/assert that appears lexically before it evaluated.

There was some rationale for this odd rule that I can't crisply recall, having to do with schema authors being able to write discriminators without having to worry about whether they are forward referencing or not.

But a pattern discriminator should not be triggering the warning. I'll submit a bug report on this.

On Thu, Jun 13, 2024 at 8:12 AM Steve Lawrence <slawre...@apache.org> wrote:

> Discriminators/asserts annotated on sequences are evaluated at the end of
> the sequence, after the content has been evaluated. This is usually not
> what is intended, so Daffodil outputs this warning. To get it to evaluate
> before the content (like what you intend in this case) you need to wrap
> the discriminator/assert in a sequence, for example:
>
> <xs:sequence>
>   <xs:sequence>
>     <xs:annotation>
>       <xs:appinfo source="http://www.ogf.org/dfdl/">
>         <dfdl:discriminator ... />
>       </xs:appinfo>
>     </xs:annotation>
>   </xs:sequence>
>   <xs:element name='A' type="unsignedint1"/>
> </xs:sequence>
>
> That should resolve the warning.
>
> Also, note that lengthKind="pattern" is character based, so it might not
> work as expected since your data is bit-based. In order for a pattern
> discriminator to work you need to decode 4-bit characters (since both
> fields you care about are 4 bits) after skipping the first bit. We have
> the X-DFDL-HEX-MSBF encoding that can decode 4-bit characters to a hex
> string, but there is no way to skip the first bit. You could move the
> discriminator sequence to after the A element, which works since at that
> point A is already parsed and the first bit is skipped, e.g.
>
> <xs:sequence>
>   <xs:element name='A' type="unsignedint1"/>
>   <xs:sequence dfdl:encoding="X-DFDL-HEX-MSBF">
>     <xs:annotation>
>       <xs:appinfo source="http://www.ogf.org/dfdl/">
>         <dfdl:discriminator testKind="pattern" testPattern="(79)|(7A)" />
>       </xs:appinfo>
>     </xs:annotation>
>   </xs:sequence>
> </xs:sequence>
>
> But that feels a bit complex and unclear to me.
>
> I would suggest instead using the dfdlx:lookAhead extension function,
> which was added exactly for this kind of thing. For example, your
> discriminator would look something like this:
>
> <dfdl:discriminator testKind="expression" test="{
>   ((dfdlx:lookAhead(1, 4) eq 7) and (dfdlx:lookAhead(5, 4) eq 9)) or
>   ((dfdlx:lookAhead(1, 4) eq 7) and (dfdlx:lookAhead(5, 4) eq 10))
> }" />
>
> It's a bit longer, but I think the intention is much more clear. It's also
> probably going to be significantly more efficient since it doesn't have to
> decode characters and then run a regex on them.
>
> On 2024-06-13 07:09 AM, Roger L Costello wrote:
> > Hi Folks,
> >
> > I need my DFDL to look-ahead:
> >
> > The decimal value of the first bit of the input should be wrapped in an
> > <A> tag if the decimal value of the next 4 bits equals 7 and the decimal
> > value of the 4 bits after that equals 9, or the decimal value of the
> > next 4 bits equals 7 and the decimal value of the 4 bits after that
> > equals 10, otherwise the first bit of the input should be wrapped in a
> > <B> tag.
> >
> > Below is my attempt at implementing this. Daffodil gives this warning
> > message:
> >
> > Schema Definition Warning: Counterintuitive placement detected. Wrap the
> > discriminator or assert in an empty sequence to evaluate before the
> > contents. (id: discouragedDiscriminatorPlacement)
> >
> > Here is my DFDL:
> >
> > <xs:element name="test">
> >   <xs:complexType>
> >     <xs:sequence>
> >       <xs:choice>
> >         <xs:sequence>
> >           <xs:annotation>
> >             <xs:appinfo source="http://www.ogf.org/dfdl/">
> >               <dfdl:discriminator testKind="pattern" testPattern="(.x7x9)|(.x7xA)" />
> >             </xs:appinfo>
> >           </xs:annotation>
> >           <xs:element name='A' type="unsignedint1"/>
> >         </xs:sequence>
> >         <xs:sequence>
> >           <xs:element name='B' type="unsignedint1"/>
> >         </xs:sequence>
> >       </xs:choice>
> >       <xs:element name='C' type="unsignedint4"/>
> >       <xs:element name='D' type="unsignedint4"/>
> >     </xs:sequence>
> >   </xs:complexType>
> > </xs:element>
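
For reference, here is a minimal sketch that combines the two suggestions from Steve's reply: the dfdlx:lookAhead discriminator placed inside an empty wrapper sequence, so that it is evaluated before element A is parsed and the offsets 1 and 5 are measured from the very first bit. It is a fragment, not a complete schema. It assumes the unsignedint1 and unsignedint4 types and the dfdl:format settings from the original schema (not shown in the thread), and that the dfdlx prefix is bound to the Daffodil extensions namespace as noted in the comment. It has not been run against Daffodil.

```xml
<!-- Assumption: xs:schema declares
     xmlns:dfdlx="http://www.ogf.org/dfdl/dfdl-1.0/extensions"
     and the unsignedint1 / unsignedint4 simple types are defined elsewhere. -->
<xs:element name="test">
  <xs:complexType>
    <xs:sequence>
      <xs:choice>
        <!-- Branch A: taken only if the discriminator evaluates to true -->
        <xs:sequence>
          <!-- Empty wrapper sequence: it has no content, so the
               expression discriminator runs before A is parsed -->
          <xs:sequence>
            <xs:annotation>
              <xs:appinfo source="http://www.ogf.org/dfdl/">
                <!-- Bits 2-5 must equal 7; bits 6-9 must equal 9 or 10 -->
                <dfdl:discriminator testKind="expression" test="{
                  ((dfdlx:lookAhead(1, 4) eq 7) and (dfdlx:lookAhead(5, 4) eq 9)) or
                  ((dfdlx:lookAhead(1, 4) eq 7) and (dfdlx:lookAhead(5, 4) eq 10))
                }" />
              </xs:appinfo>
            </xs:annotation>
          </xs:sequence>
          <xs:element name='A' type="unsignedint1"/>
        </xs:sequence>
        <!-- Branch B: fallback when the discriminator fails -->
        <xs:sequence>
          <xs:element name='B' type="unsignedint1"/>
        </xs:sequence>
      </xs:choice>
      <xs:element name='C' type="unsignedint4"/>
      <xs:element name='D' type="unsignedint4"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```

The wrapper matters for the offsets. If the same expression discriminator were placed directly on the sequence that contains A, step 5 of the evaluation order would run it after A had consumed the first bit, so the look-ahead offsets would presumably need to be 0 and 4 rather than 1 and 5.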