Thank you, Steve! I implemented your idea (with a small change to the
expression in the discriminator). When I run it, parsing does not stop at the
all-zero entry; it keeps going to the end of the input file. Here is an
excerpt from the generated XML file; you can see that parsing continued past
the entries that are all zeroes:

<Import_Directory_Table>
    ...
    <Import_Directory_Entry>
        <Import_Lookup_Table_RVA>
            <address_in_hex>BFB50002</address_in_hex>
        </Import_Lookup_Table_RVA>
        <Date_Time_Stamp>1970-01-04T00:49:04</Date_Time_Stamp>
        <Forwarder_Chain>0</Forwarder_Chain>
        <Name_RVA>
            <address_in_hex>00407E90</address_in_hex>
        </Name_RVA>
        <Import_Address_Table_RVA>
            <address_in_hex>0000015E</address_in_hex>
        </Import_Address_Table_RVA>
    </Import_Directory_Entry>
    <Import_Directory_Entry>
        <Import_Lookup_Table_RVA>
            <address_in_hex>00000000</address_in_hex>
        </Import_Lookup_Table_RVA>
        <Date_Time_Stamp>1970-01-01T00:00:00</Date_Time_Stamp>
        <Forwarder_Chain>0</Forwarder_Chain>
        <Name_RVA>
            <address_in_hex>00000000</address_in_hex>
        </Name_RVA>
        <Import_Address_Table_RVA>
            <address_in_hex>00000000</address_in_hex>
        </Import_Address_Table_RVA>
    </Import_Directory_Entry>
    <Import_Directory_Entry>
        <Import_Lookup_Table_RVA>
            <address_in_hex>00000000</address_in_hex>
        </Import_Lookup_Table_RVA>
        <Date_Time_Stamp>1970-01-01T00:00:00</Date_Time_Stamp>
        <Forwarder_Chain>0</Forwarder_Chain>
        <Name_RVA>
            <address_in_hex>00000000</address_in_hex>
        </Name_RVA>
        <Import_Address_Table_RVA>
            <address_in_hex>00000000</address_in_hex>
        </Import_Address_Table_RVA>
    </Import_Directory_Entry>
    ...
</Import_Directory_Table>

Below is my schema. Do you see any errors in it? Perhaps there is a bug in 
Daffodil?  /Roger

<xs:element name="Import_Directory_Table">
    <xs:complexType>
        <xs:sequence>
            <xs:element ref="Import_Directory_Entry" maxOccurs="unbounded" />
            <xs:element name="Terminating_Bytes" type="xs:hexBinary"
                dfdl:lengthKind="explicit" dfdl:length="20">
                <xs:annotation>
                    <xs:appinfo source="http://www.ogf.org/dfdl/">
                        <dfdl:assert>
                            { . eq xs:hexBinary("0000000000000000000000000000000000000000") }
                        </dfdl:assert>
                    </xs:appinfo>
                </xs:annotation>
            </xs:element>
        </xs:sequence>
    </xs:complexType>
</xs:element>

<xs:element name="Import_Directory_Entry">
    <xs:annotation>
        <xs:appinfo source="http://www.ogf.org/dfdl/">
            <dfdl:discriminator test="{ 
                fn:not(
                (./Import_Lookup_Table_RVA/address_in_hex eq '00000000') and
                (xs:dateTime(./Date_Time_Stamp) eq xs:dateTime('1970-01-01T00:00:00')) and
                (xs:int(./Forwarder_Chain) eq 0) and
                (./Name_RVA/address_in_hex eq '00000000') and
                (./Import_Address_Table_RVA/address_in_hex eq '00000000')
                )
                }" />
        </xs:appinfo>
    </xs:annotation>
    <xs:complexType>
        <xs:sequence>
            <xs:element name="Import_Lookup_Table_RVA" type="addressType" />
            <xs:element name="Date_Time_Stamp" type="xs:dateTime"
                dfdl:length="4"
                dfdl:lengthKind="explicit"
                dfdl:lengthUnits="bytes"
                dfdl:binaryCalendarRep="binarySeconds"
                dfdl:binaryCalendarEpoch="1970-01-01T00:00:00" />
            <xs:element name="Forwarder_Chain" type="unsignedint32" />
            <xs:element name="Name_RVA" type="addressType" />
            <xs:element name="Import_Address_Table_RVA" type="addressType" />
        </xs:sequence>
    </xs:complexType>
</xs:element>


-----Original Message-----
From: Steve Lawrence <[email protected]> 
Sent: Tuesday, February 5, 2019 8:05 AM
To: [email protected]; Costello, Roger L. <[email protected]>
Subject: [EXT] Re: The number of occurrences is dictated by the last value 
being all nulls ... how to express this?

Instead of using dfdl:lengthKind="pattern", I think it might be better to use a 
discriminator that fails when all the values in an entry are zero (whatever 
zero means for each of the values), followed by an element to consume those 
zero bytes. So something along these lines:

 <xs:element name="Import_Directory_Table">
   <xs:complexType>
     <xs:sequence>
       <xs:element ref="Import_Directory_Entry" maxOccurs="unbounded" />
       <xs:element name="Terminating_Bytes" type="xs:hexBinary"
         dfdl:lengthKind="explicit" dfdl:length="20">
         <xs:annotation>
           <xs:appinfo source="http://www.ogf.org/dfdl/">
             <dfdl:assert>
 { . eq xs:hexBinary("0000000000000000000000000000000000000000") }
             </dfdl:assert>
           </xs:appinfo>
         </xs:annotation>
       </xs:element>
     </xs:sequence>
   </xs:complexType>
 </xs:element>

 <xs:element name="Import_Directory_Entry">
   <xs:annotation>
     <xs:appinfo source="http://www.ogf.org/dfdl/">
       <dfdl:discriminator>
 {
   (./Import_Lookup_Table_RVA ne 0) or
   (./Date_Time_Stamp ne xs:dateTime("1970-01-01T00:00:00")) or
   (./Forwarder_Chain ne 0) or
   (./Name_RVA ne 0) or
   (./Import_Address_Table_RVA ne 0)
 }
       </dfdl:discriminator>
     </xs:appinfo>
   </xs:annotation>
   <xs:complexType>
     <xs:sequence>
        <!-- your elements here -->
     </xs:sequence>
   </xs:complexType>
 </xs:element>

So this will parse an unbounded number of Import_Directory_Entry elements until
the discriminator fails. This discriminator is designed to fail when all of the
values are zero. Note that this requires some thought about what "zero" means
for each of the values. I'm not sure of the type of addressType, but for
Date_Time_Stamp, a zero value is represented as the Unix epoch. The above
assumes addressType is an int, but that may need to be adjusted.

Eventually that discriminator will fail when the values are all zero, and so
the parser will backtrack over that last terminating entry. We then consume
those 20 bytes and assert that they were all zero. It might be reasonable to
put the Terminating_Bytes element in a hidden group since it's effectively a
constant. This would leave you with just the non-zero entries in the infoset.
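
As a rough sketch of that hidden-group idea: the group name
Terminating_Bytes_Group below is made up for illustration, and the unprefixed
reference assumes the schema has no target namespace (otherwise it would need
a prefix such as tns:). Something along these lines might work:

 <xs:group name="Terminating_Bytes_Group">
   <xs:sequence>
     <!-- same 20-byte terminator element as above, moved into a group -->
     <xs:element name="Terminating_Bytes" type="xs:hexBinary"
       dfdl:lengthKind="explicit" dfdl:length="20">
       <xs:annotation>
         <xs:appinfo source="http://www.ogf.org/dfdl/">
           <dfdl:assert>
 { . eq xs:hexBinary("0000000000000000000000000000000000000000") }
           </dfdl:assert>
         </xs:appinfo>
       </xs:annotation>
     </xs:element>
   </xs:sequence>
 </xs:group>

 <xs:element name="Import_Directory_Table">
   <xs:complexType>
     <xs:sequence>
       <xs:element ref="Import_Directory_Entry" maxOccurs="unbounded" />
       <!-- parsed and checked, but left out of the infoset -->
       <xs:sequence dfdl:hiddenGroupRef="Terminating_Bytes_Group" />
     </xs:sequence>
   </xs:complexType>
 </xs:element>

If the schema is also used for unparsing, the hidden element would probably
need a dfdl:outputValueCalc (or a default value) so the 20 zero bytes can be
written back out.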

- Steve


On 2/4/19 6:00 PM, Costello, Roger L. wrote:
> Hello DFDL community,
> 
> My input file contains a bunch of Import Directory Entries. The last 
> Entry is empty (filled with null values). I figured that the way to 
> handle this is with a lengthPattern (see below), but that results in this 
> error message:
> 
> [error] Schema Definition Error: Element element reference 
> {}Import_Directory_Entry does not meet the requirements of 
> Pattern-Based lengths and Scanability.
> 
> What is the right way to express this?  /Roger
> 
> <xs:element name="Import_Directory_Table">
>     <xs:complexType>
>         <xs:sequence>
>             <xs:element ref="Import_Directory_Entry" maxOccurs="unbounded"
>                 dfdl:lengthKind="pattern" dfdl:lengthUnits="bytes"
>                 dfdl:lengthPattern=".+?(?=((\x00){20,20}|$))" />
>         </xs:sequence>
>     </xs:complexType>
> </xs:element>
> 
> <xs:element name="Import_Directory_Entry">
>     <xs:complexType>
>         <xs:sequence>
>             <xs:element name="Import_Lookup_Table_RVA" type="addressType" />
>             <xs:element name="Date_Time_Stamp" type="xs:dateTime" dfdl:length="4"
>                 dfdl:lengthKind="explicit" dfdl:binaryCalendarRep="binarySeconds"
>                 dfdl:lengthUnits="bytes" dfdl:binaryCalendarEpoch="1970-01-01T00:00:00" />
>             <xs:element name="Forwarder_Chain" type="unsignedint32" />
>             <xs:element name="Name_RVA" type="addressType" />
>             <xs:element name="Import_Address_Table_RVA" type="addressType" />
>         </xs:sequence>
>     </xs:complexType>
> </xs:element>
> 
