Thank you, Steve! I implemented your idea (with a small change to the
expression in the discriminator). When I run it, parsing does not stop at the
all-zero entry; it keeps going to the end of the input file. Here is an
excerpt from the generated XML file; you can see that parsing continued past
entries whose values are all zeroes:
<Import_Directory_Table>
...
<Import_Directory_Entry>
<Import_Lookup_Table_RVA>
<address_in_hex>BFB50002</address_in_hex>
</Import_Lookup_Table_RVA>
<Date_Time_Stamp>1970-01-04T00:49:04</Date_Time_Stamp>
<Forwarder_Chain>0</Forwarder_Chain>
<Name_RVA>
<address_in_hex>00407E90</address_in_hex>
</Name_RVA>
<Import_Address_Table_RVA>
<address_in_hex>0000015E</address_in_hex>
</Import_Address_Table_RVA>
</Import_Directory_Entry>
<Import_Directory_Entry>
<Import_Lookup_Table_RVA>
<address_in_hex>00000000</address_in_hex>
</Import_Lookup_Table_RVA>
<Date_Time_Stamp>1970-01-01T00:00:00</Date_Time_Stamp>
<Forwarder_Chain>0</Forwarder_Chain>
<Name_RVA>
<address_in_hex>00000000</address_in_hex>
</Name_RVA>
<Import_Address_Table_RVA>
<address_in_hex>00000000</address_in_hex>
</Import_Address_Table_RVA>
</Import_Directory_Entry>
<Import_Directory_Entry>
<Import_Lookup_Table_RVA>
<address_in_hex>00000000</address_in_hex>
</Import_Lookup_Table_RVA>
<Date_Time_Stamp>1970-01-01T00:00:00</Date_Time_Stamp>
<Forwarder_Chain>0</Forwarder_Chain>
<Name_RVA>
<address_in_hex>00000000</address_in_hex>
</Name_RVA>
<Import_Address_Table_RVA>
<address_in_hex>00000000</address_in_hex>
</Import_Address_Table_RVA>
</Import_Directory_Entry>
...
</Import_Directory_Table>
Below is my schema. Do you see any errors in it? Perhaps there is a bug in
Daffodil? /Roger
<xs:element name="Import_Directory_Table">
<xs:complexType>
<xs:sequence>
<xs:element ref="Import_Directory_Entry" maxOccurs="unbounded" />
<xs:element name="Terminating_Bytes" type="xs:hexBinary"
dfdl:lengthKind="explicit" dfdl:length="20">
<xs:annotation>
<xs:appinfo source="http://www.ogf.org/dfdl/">
<dfdl:assert>
{ . eq
xs:hexBinary("0000000000000000000000000000000000000000") }
</dfdl:assert>
</xs:appinfo>
</xs:annotation>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="Import_Directory_Entry">
<xs:annotation>
<xs:appinfo source="http://www.ogf.org/dfdl/">
<dfdl:discriminator test="{
fn:not(
(./Import_Lookup_Table_RVA/address_in_hex eq '00000000') and
(xs:dateTime(./Date_Time_Stamp) eq
xs:dateTime('1970-01-01T00:00:00')) and
(xs:int(./Forwarder_Chain) eq 0) and
(./Name_RVA/address_in_hex eq '00000000') and
(./Import_Address_Table_RVA/address_in_hex eq '00000000')
)
}" />
</xs:appinfo>
</xs:annotation>
<xs:complexType>
<xs:sequence>
<xs:element name="Import_Lookup_Table_RVA" type="addressType" />
<xs:element name="Date_Time_Stamp" type="xs:dateTime"
dfdl:length="4"
dfdl:lengthKind="explicit"
dfdl:binaryCalendarRep="binarySeconds"
dfdl:lengthUnits="bytes"
dfdl:binaryCalendarEpoch="1970-01-01T00:00:00" />
<xs:element name="Forwarder_Chain" type="unsignedint32" />
<xs:element name="Name_RVA" type="addressType" />
<xs:element name="Import_Address_Table_RVA" type="addressType" />
</xs:sequence>
</xs:complexType>
</xs:element>
-----Original Message-----
From: Steve Lawrence <[email protected]>
Sent: Tuesday, February 5, 2019 8:05 AM
To: [email protected]; Costello, Roger L. <[email protected]>
Subject: [EXT] Re: The number of occurrences is dictated by the last value
being all nulls ... how to express this?
Instead of using dfdl:lengthKind="pattern", I think it might be better to use a
discriminator that fails when all the values in an entry are zero (whatever
zero means for each of the values), followed by an element to consume those
zero bytes. So something along these lines:
<xs:element name="Import_Directory_Table">
<xs:complexType>
<xs:sequence>
<xs:element ref="Import_Directory_Entry" maxOccurs="unbounded" />
<xs:element name="Terminating_Bytes" type="xs:hexBinary"
dfdl:lengthKind="explicit" dfdl:length="20">
<xs:annotation>
<xs:appinfo source="http://www.ogf.org/dfdl/">
<dfdl:assert>
{ . eq xs:hexBinary("0000000000000000000000000000000000000000") }
</dfdl:assert>
</xs:appinfo>
</xs:annotation>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="Import_Directory_Entry">
<xs:annotation>
<xs:appinfo source="http://www.ogf.org/dfdl/">
<dfdl:discriminator>
{
(./Import_Lookup_Table_RVA ne 0) or
(./Date_Time_Stamp ne xs:dateTime("1970-01-01T00:00:00")) or
(./Forwarder_Chain ne 0) or
(./Name_RVA ne 0) or
(./Import_Address_Table_RVA ne 0)
}
</dfdl:discriminator>
</xs:appinfo>
</xs:annotation>
<xs:complexType>
<xs:sequence>
<!-- your elements here -->
</xs:sequence>
</xs:complexType>
</xs:element>
So this will parse an unbounded number of Import_Directory_Entry elements until
the discriminator fails. This discriminator is designed to fail when all of the
values are zero. Note that this requires some thought about what "zero" means
for each of the values. I'm not sure of the type of addressType, but for
Date_Time_Stamp, a zero value is represented as the Unix epoch. The above
assumes addressType is an int, but that may need to be adjusted.
Eventually that discriminator will fail when the values are all zero, and so
the parser will backtrack over that last terminating entry. We then consume
those 20 bytes and assert that they were all zero. It might be reasonable to
put the Terminating_Bytes element in a hidden group since it's effectively a
constant. This would leave you with just the non-zero entries in the infoset.
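For the hidden group, a rough, untested sketch might look like the following.
The group name Terminating_Bytes_Group is made up for illustration, and I'm
assuming the element needs a dfdl:outputValueCalc so that unparsing can
recreate the 20 zero bytes, since a hidden element does not appear in the
infoset. It also assumes the xs and dfdl prefixes are bound as in the rest of
your schema:

```xml
<!-- Hypothetical group name; holds the 20-byte all-zero terminator. -->
<xs:group name="Terminating_Bytes_Group">
  <xs:sequence>
    <xs:element name="Terminating_Bytes" type="xs:hexBinary"
                dfdl:lengthKind="explicit" dfdl:length="20"
                dfdl:lengthUnits="bytes"
                dfdl:outputValueCalc='{ xs:hexBinary("0000000000000000000000000000000000000000") }'>
      <xs:annotation>
        <xs:appinfo source="http://www.ogf.org/dfdl/">
          <!-- Parsing fails unless all 20 bytes are zero. -->
          <dfdl:assert>
            { . eq xs:hexBinary("0000000000000000000000000000000000000000") }
          </dfdl:assert>
        </xs:appinfo>
      </xs:annotation>
    </xs:element>
  </xs:sequence>
</xs:group>

<xs:element name="Import_Directory_Table">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Import_Directory_Entry" maxOccurs="unbounded" />
      <!-- The terminator is parsed and checked, but left out of the infoset. -->
      <xs:sequence dfdl:hiddenGroupRef="Terminating_Bytes_Group" />
    </xs:sequence>
  </xs:complexType>
</xs:element>
```

With something like this, the resulting XML should contain only the
Import_Directory_Entry elements and no Terminating_Bytes element.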
- Steve
On 2/4/19 6:00 PM, Costello, Roger L. wrote:
> Hello DFDL community,
>
> My input file contains a bunch of Import Directory Entries. The last
> Entry is empty (filled with null values). I figured that the way to
> handle this is with a lengthPattern (see below), but that results in this
> error message:
>
> [error] Schema Definition Error: Element element reference
> {}Import_Directory_Entry does not meet the requirements of
> Pattern-Based lengths and Scanability.
>
> What is the right way to express this? /Roger
>
> <xs:element name="Import_Directory_Table">
> <xs:complexType>
> <xs:sequence>
> <xs:element ref="Import_Directory_Entry" maxOccurs="unbounded"
> dfdl:lengthKind="pattern" dfdl:lengthUnits="bytes"
> dfdl:lengthPattern=".+?(?=((\x00){20,20}|$))" />
> </xs:sequence>
> </xs:complexType>
> </xs:element>
>
> <xs:element name="Import_Directory_Entry">
> <xs:complexType>
> <xs:sequence>
> <xs:element name="Import_Lookup_Table_RVA" type="addressType" />
> <xs:element name="Date_Time_Stamp" type="xs:dateTime" dfdl:length="4"
> dfdl:lengthKind="explicit" dfdl:binaryCalendarRep="binarySeconds"
> dfdl:lengthUnits="bytes" dfdl:binaryCalendarEpoch="1970-01-01T00:00:00" />
> <xs:element name="Forwarder_Chain" type="unsignedint32" />
> <xs:element name="Name_RVA" type="addressType" />
> <xs:element name="Import_Address_Table_RVA" type="addressType" />
> </xs:sequence>
> </xs:complexType>
> </xs:element>
>