I think NL* and NL+ are both reasonable suggestions. But no, today there is nothing like that.
Already on the 'we also need this' list are LSP, LSP*, and LSP+, which match whitespace but NOT line endings, i.e., linear (as in within a line) whitespace.

Reverse engineering data out of human-authored documents is not DFDL's forte. I've done it quite a bit, but DFDL is not the ideal kit for that work.

________________________________
From: Costello, Roger L. <[email protected]>
Sent: Monday, November 26, 2018 1:35:45 PM
To: [email protected]
Subject: DFDL schema for input file with rows separated by one or more newlines?

Hello DFDL community,

I have an input file containing a sequence of rows, separated by one or more newlines, for example:

[cid:[email protected]]

If I assume that there are no more than 3 newlines between rows, then this works:

    <xs:sequence dfdl:separator="%NL; %NL;%NL; %NL;%NL;%NL;"
                 dfdl:separatorPosition="infix"
                 dfdl:separatorSuppressionPolicy="trailingEmptyStrict">

Of course, if the input contains 4 newlines between rows, then it breaks.

I guess the following is not legal, right?

    dfdl:separator="%NL;+?"

Is there a more robust solution? Below is my DFDL schema.

/Roger

    <xs:element name="really-simple-format">
      <xs:complexType>
        <xs:sequence dfdl:separator="%NL; %NL;%NL; %NL;%NL;%NL;"
                     dfdl:separatorPosition="infix"
                     dfdl:separatorSuppressionPolicy="trailingEmptyStrict">
          <xs:element name="row" maxOccurs="unbounded">
            <xs:complexType>
              <xs:sequence dfdl:separator=":" dfdl:separatorPosition="infix">
                <xs:element name="label" type="xs:string" />
                <xs:element name="message" type="xs:string" />
              </xs:sequence>
            </xs:complexType>
          </xs:element>
        </xs:sequence>
      </xs:complexType>
    </xs:element>
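
One possible workaround for the "one or more newlines" question, offered as an untested sketch and not as something proposed in this thread: DFDL does define the %WSP+; character class entity, which matches one or more whitespace characters, line endings included. Used as the row separator, it should accept any number of newlines between rows. The big assumption is that labels and messages contain no whitespace at all: a delimited string ends at the first match of any in-scope delimiter, so a space or tab inside a message would be taken as a row separator. The fragment changes only the outer dfdl:separator value of the schema in the question and is meant to sit inside that same schema.

```xml
<!-- Sketch only. Assumes no label or message contains spaces, tabs,
     or any other whitespace, since %WSP+; matches those as well as newlines. -->
<xs:element name="really-simple-format">
  <xs:complexType>
    <!-- %WSP+; = one or more whitespace characters, including line endings -->
    <xs:sequence dfdl:separator="%WSP+;"
                 dfdl:separatorPosition="infix"
                 dfdl:separatorSuppressionPolicy="trailingEmptyStrict">
      <xs:element name="row" maxOccurs="unbounded">
        <xs:complexType>
          <!-- Inner structure unchanged: label and message split on ":" -->
          <xs:sequence dfdl:separator=":" dfdl:separatorPosition="infix">
            <xs:element name="label" type="xs:string" />
            <xs:element name="message" type="xs:string" />
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```

If the rows can contain spaces, this does not apply, and collapsing repeated newlines in a preprocessing step before the DFDL parse is probably the safer route.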
