Hi Mike,

Thank you very much! Based on your excellent information I succeeded in getting 
the schema to work. See below for the working schema.

The key was the addition of dfdl:assert:

<xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
        <dfdl:assert>{ fn:string-length(.) gt 0 }</dfdl:assert>
    </xs:appinfo>
</xs:annotation>

Without it, I get the "infinite loop" error.

I don't understand why the dfdl:assert should be necessary. After all, the plus 
sign ( + ) in the regex

dfdl:lengthPattern="[\x20-\x7F]+?(?=\x00)"

specifies that the string must contain at least one character. Can you describe 
a bit more why the dfdl:assert is needed, please?

Happy New Year! /Roger

<xs:element name="input">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="String" maxOccurs="unbounded">
                <xs:complexType>
                    <xs:sequence>
                        <xs:element name="value" type="xs:string"
                            dfdl:lengthKind="pattern"
                            dfdl:lengthPattern="[\x20-\x7F]+?(?=\x00)"
                            dfdl:representation="text"
                            dfdl:encoding="ASCII">
                            <xs:annotation>
                                <xs:appinfo source="http://www.ogf.org/dfdl/">
                                    <dfdl:assert>{ fn:string-length(.) gt 0 }</dfdl:assert>
                                </xs:appinfo>
                            </xs:annotation>
                        </xs:element>
                        <xs:sequence dfdl:hiddenGroupRef="hidden_nulls_Group" />
                    </xs:sequence>
                </xs:complexType>
            </xs:element>
        </xs:sequence>
    </xs:complexType>
</xs:element>

<xs:group name="hidden_nulls_Group">
    <xs:sequence>
        <xs:element name="Hidden_nulls" type="xs:hexBinary"
            dfdl:lengthKind="pattern"
            dfdl:lengthUnits="bytes"
            dfdl:lengthPattern="[\x00]+?(?=([^\x00]|$))"
            dfdl:outputValueCalc='{ . }' />
    </xs:sequence>
</xs:group>


From: Beckerle, Mike <[email protected]>
Sent: Monday, December 31, 2018 12:50 PM
To: [email protected]
Subject: [EXT] Re: Why am I getting an "infinite loop" error message?


I have hit what I think is this problem a bunch of times.

I have come to think of it as a flaw in DFDL.
The problem is lengthKind "pattern", and what it means when there is no match.
Intuitively we think no match should cause a failure and a backtrack, but what 
it actually means is that the length is "however much is matched", so no match 
means length zero. I.e., no match is a successful parse, producing zero length.
Seriously, I think DFDL may need a new length kind, say patternMatch, where the 
pattern must positively match and failure to match is a true failure, i.e., a 
parse error.
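
A rough Python sketch of that rule, as an analogy only. Python's re module 
stands in for whatever regex engine the DFDL processor uses, pattern_length is 
a made-up helper, and the sample data is invented for illustration.

```python
import re

def pattern_length(data: str, pattern: str, pos: int = 0) -> int:
    """Length of the regex match anchored at pos, or 0 if nothing matches.

    Hypothetical helper imitating lengthKind="pattern": a failed match is
    not reported as an error, it simply yields a zero-length element.
    """
    m = re.compile(pattern).match(data, pos)
    return m.end() - pos if m else 0

# The lengthPattern from the working schema.
STRING_PATTERN = r"[\x20-\x7F]+?(?=\x00)"

data = "abc\x00\x00def\x00"  # invented sample: string, nulls, string, null

print(pattern_length(data, STRING_PATTERN, 0))  # 3: "abc" up to the first null
print(pattern_length(data, STRING_PATTERN, 3))  # 0: position 3 is a null, no match
```

The second call is the interesting one: the "+" in the regex only constrains 
what a successful match looks like. It says nothing about what happens when 
the regex does not match at all, and in that case the length is simply zero.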

You can simulate this by adding a dfdl:assert to the string element insisting 
that its length is greater than 0.
E.g.,
<xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
        <dfdl:assert>{ fn:string-length(.) gt 0 }</dfdl:assert>
    </xs:appinfo>
</xs:annotation>

This will force failure and therefore backtracking if the regex match length is 
actually zero, which it should never be in your case.

What I think is happening here is that at some point your match fails, which 
results in zero length for the element, and then your repeating element has 
zero length. A zero-length repeating element with maxOccurs="unbounded" is an 
error, because it would result in an infinite loop.
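
The loop can be sketched in Python as follows. This is an analogy under 
stated assumptions, not how Daffodil is implemented: parse_strings and 
matched_length are hypothetical names, the sample data is invented, and the 
failure is triggered at end of data only because that is the simplest case to 
show.

```python
import re

STRING = re.compile(r"[\x20-\x7F]+?(?=\x00)")
NULLS = re.compile(r"[\x00]+")

def matched_length(regex, data, pos):
    """Zero when the regex does not match at pos (the lengthKind pattern rule)."""
    m = regex.match(data, pos)
    return m.end() - pos if m else 0

def parse_strings(data, require_nonempty):
    """Parse repeated (string, nulls) pairs, imitating maxOccurs="unbounded"."""
    values, pos = [], 0
    while True:
        start = pos
        n = matched_length(STRING, data, pos)
        if require_nonempty and n == 0:
            break  # the assert fails, so this occurrence is rejected and the array ends
        value = data[pos:pos + n]
        pos += n
        pos += matched_length(NULLS, data, pos)
        if pos == start:
            # The occurrence "succeeded" but consumed nothing: it would repeat forever.
            raise RuntimeError(f"No forward progress at position {pos}")
        values.append(value)
    return values, pos

data = "abc\x00\x00de\x00"  # invented sample data

print(parse_strings(data, require_nonempty=True))   # (['abc', 'de'], 8)

try:
    parse_strings(data, require_nonempty=False)
except RuntimeError as err:
    print(err)                                       # No forward progress at position 8
```

With the length check in place the third occurrence fails cleanly and the 
array stops. Without it, the third occurrence succeeds with zero length, which 
is the situation the "infinite loop" error message is guarding against.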

As for what's causing your match to fail, I'm less sure. Just some ideas here.

Keep in mind that in a regex match for lengthKind pattern, those \xHH patterns 
match character code points, not bytes. The correspondence of character code 
point to byte is only 1 to 1 for every byte value if you specify iso-8859-1.
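
A small Python illustration of the code point versus byte distinction. 
Python's codecs and re module are used here only as stand-ins for the 
decoding and matching a DFDL processor does; the sample bytes are invented.

```python
import re

raw = bytes([0x41, 0xE9, 0x00])  # invented sample: 'A', byte 0xE9, null

# With iso-8859-1 every byte becomes one character with the same numeric value.
latin1 = raw.decode("iso-8859-1")
print([hex(ord(c)) for c in latin1])      # ['0x41', '0xe9', '0x0']

# In UTF-8 the character U+00E9 occupies two bytes, but \xE9 in a regex
# still matches the single decoded code point, not either of the bytes.
utf8_bytes = "\u00e9".encode("utf-8")     # b'\xc3\xa9'
text = utf8_bytes.decode("utf-8")
print(len(utf8_bytes), len(text))         # 2 1
print(bool(re.fullmatch(r"\xE9", text)))  # True
```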

I think even though your hidden group is hexBinary, there may be some Daffodil 
bug there. I suggest you try making the hidden group element not hidden (for 
testing), and making the element a string with encoding iso-8859-1 rather than 
a hexBinary.

Your regex might be simplified. Really it's just [\x00]+ I think, i.e., match 
as many nulls as possible. I don't think you need the added complexity of 
telling it to match reluctantly up until a non-null or end of data. I'm not 
sure what that added stuff achieves.
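
A quick way to check that the two patterns consume the same run of nulls, 
sketched in Python. The caveats: Python's re is not the regex engine Daffodil 
uses, match_len is a made-up helper, and the samples are invented, so treat 
this as a sanity check rather than proof.

```python
import re

RELUCTANT = re.compile(r"[\x00]+?(?=([^\x00]|$))")  # pattern from the schema
GREEDY = re.compile(r"[\x00]+")                     # suggested simplification

def match_len(regex, data):
    """Length matched at the start of data, or 0 if there is no match."""
    m = regex.match(data)
    return m.end() if m else 0

# Invented samples: one null, several nulls, nulls at end of data, no nulls.
samples = ["\x00abc", "\x00\x00\x00abc", "\x00\x00", "abc"]

for s in samples:
    print(repr(s), match_len(RELUCTANT, s), match_len(GREEDY, s))
```

On these samples both patterns report the same length (1, 3, 2 and 0). The 
reluctant quantifier has to keep extending until the lookahead sees a non-null 
or end of data, which is exactly where the greedy version stops anyway.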

I don't know that this is your error, but a common error is to forget that 
ASCII is 7 bits only. So, for example, \xFF will never be a valid ASCII char, 
and if the byte 0xFF is found in the data it will be decoded as a replacement 
character, and that replacement character will NOT match your regex. So the 
encoding really matters. If you are using \xFF as a byte, you need iso-8859-1 
encoding for sure.
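
The effect can be reproduced in Python, with the assumption that Python's 
errors="replace" decoding is a fair stand-in for the replacement behaviour 
described above. The sample bytes are invented.

```python
import re

# The lengthPattern from the original schema.
PATTERN = re.compile(r"[\x01-\xFF]+?(?=\x00)")

raw = b"caf\xff\x00"  # invented sample containing the byte 0xFF

as_ascii = raw.decode("ascii", errors="replace")  # 0xFF becomes U+FFFD
as_latin1 = raw.decode("iso-8859-1")              # 0xFF stays U+00FF

print(hex(ord(as_ascii[3])))               # 0xfffd
print(PATTERN.match(as_ascii))             # None: U+FFFD is outside [\x01-\xFF]
print(PATTERN.match(as_latin1).end())      # 4: "caf" plus the 0xFF character
```

Under ASCII the replacement character sits between "caf" and the null, so the 
lookahead for \x00 can never be reached and the whole match fails.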


I hope that all helps.



Happy New Year



Mike Beckerle

Tresys Technology.









From: Costello, Roger L.
Sent: Monday, December 31, 11:30 AM
Subject: Why am I getting an "infinite loop" error message?
To: [email protected]

Hello DFDL community,

I have a binary input file containing:

               string null(s) string null(s) ....

Here is my input file:

[Image]

Notice that each string is followed by one or more null symbols.

One way to characterize the input is that there is a list of:

string followed by one or more nulls

The schema below is my attempt to faithfully implement that characterization. 
However, when I execute the schema, I get this "infinite loop" error message:

[error] Parse Error: Repeating or Optional Element -
No forward progress at byte 47. Attempt to parse
List_of_strings succeeded but consumed no data.
Please re-examine your schema to correct this infinite loop.

I do not understand where the infinite loop is occurring. Would you explain, 
please? How to fix it?  /Roger

<xs:element name="input">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="List_of_strings" maxOccurs="unbounded">
                <xs:complexType>
                    <xs:sequence>
                        <xs:element name="string" type="xs:string"
                            dfdl:lengthKind="pattern"
                            dfdl:lengthPattern="[\x01-\xFF]+?(?=\x00)"
                            dfdl:representation="text"
                            dfdl:encoding="ISO-8859-1"/>
                        <xs:sequence dfdl:hiddenGroupRef="hidden_null_Group" />
                    </xs:sequence>
                </xs:complexType>
            </xs:element>
        </xs:sequence>
    </xs:complexType>
</xs:element>
<xs:group name="hidden_null_Group">
    <xs:sequence>
        <xs:element name="Hidden_null" type="xs:hexBinary"
            dfdl:lengthKind="pattern"
            dfdl:lengthUnits="bytes"
            dfdl:lengthPattern="[\x00]+?(?=([^\x00]|$))"
            dfdl:outputValueCalc='{ . }' />
    </xs:sequence>
</xs:group>


