Ah, so you have some simple problems here, and this thorny little issue about the NUL character.
In your regex, each character entity must have a trailing ";" to terminate it.

However, &#x0; is just plain disallowed by XML, period. You can't put a NUL into XML, even by using a character entity. This is one of the things I distinctly dislike about XML. To cope with this, given that in DFDL people have to talk about real data with NUL in it, DFDL does a bi-directional remapping from 0 to &#xE000;.

But you are trying to express a numeric range that runs from char code 0 to char code 7F. So you can't just change your regex to use &#xE000; as the start of the range, because that's not the bottom of the range. To do what you want, your regex needs to say

    [&#xE000;&#x1;-&#x7F;]

Notice the semicolons in there.

With respect to the final CRLF at end of file, there are techniques to cope with this. We need to clarify what the canonical/preferred representation is, and whether you want your schema to accept data that is missing this final CRLF.

Assuming the final CRLF is required (non-optional), you can keep the newline separator and add the DFDL property

    dfdl:separatorPosition="postfix"

on just the sequence that contains the rows of data. This means you get all the infix separator line endings, plus one more at the end. However, that one at the end is NOT optional. If it is not present, you'll get parse errors.

If you want a missing final CRLF to be tolerated on parsing, and its presence or absence to be preserved when unparsing, then you actually have to model it as a data element:

    <element name="finalLineEnding" type="xs:string" minOccurs="0" dfdl:lengthKind="explicit" dfdl:length="0" dfdl:initiator="%CR;%LF;"/>

That final element will absorb, and represent, a final CRLF, and on unparsing will lay it down again so the output matches the input data.
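For the required-CRLF case, here is a rough sketch of where the postfix property would sit. The element names "file", "row", and "field" are hypothetical, and I am assuming your dfdl:format defaults already supply things like dfdl:lengthKind="delimited" and dfdl:occursCountKind="implicit", so adapt it to your actual schema rather than treating it as a drop-in:

```xml
<!-- Sketch only: "file", "row", and "field" are made-up names. -->
<xs:element name="file">
  <xs:complexType>
    <!-- postfix: a CRLF is required after every row, including the last -->
    <xs:sequence dfdl:separator="%CR;%LF;" dfdl:separatorPosition="postfix">
      <xs:element name="row" maxOccurs="unbounded">
        <xs:complexType>
          <!-- fields inside a row stay infix: commas between fields only -->
          <xs:sequence dfdl:separator="," dfdl:separatorPosition="infix">
            <xs:element name="field" type="xs:string" maxOccurs="unbounded"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```

Only the outer sequence changes. The inner, comma-separated sequence stays infix, so rows do not pick up a trailing comma.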
________________________________
From: Attila Horvath <attila.j.horv...@gmail.com>
Sent: Monday, March 1, 2021 2:03 PM
To: users@daffodil.apache.org <users@daffodil.apache.org>
Subject: Re: regex |AND| left over data

1) b) should read ...value="&#x00-&#x7F"

On 2021/03/01 18:58:08, Attila Horvath <attila.j.horv...@gmail.com> wrote:
> All - two quick questions...
>
> 1) regex
>
> I am trying to use character range query in regex-pression like:
> a)...
> <xs: restriction base="xs:string">
> <xs:pattern value="[\x00-\x7F]{0,10}"/>
> </cs:restriction>
> |OR|
> b)...
> <xs: restriction base="xs:string">
> <xs:pattern value="[&#x00-&#x7F]{0,10}"/>
> </cs:restriction>
> - either way both throw error(s) re: invalid regex expression syntax.
> - what is correct syntax for range of hex values?
>
> 2) my CSV files has CR/LF at end of last line in file
> - when parsing, I get numerous warnings ultimately "left over data"
> ...starting at byte xyz (0x0d0a...)
> a) how to consume (parse) last two bytes and avoid warnings
> b) how to reconstitute (unparse) so last two bytes are included
>
> Thx in advance
>
> Attila (newbie)
>