There are a number of characters that are illegal to use in XML. The NUL
character (0x00) is one of these. When Daffodil projects these characters to
XML, we map them to the Unicode Private Use Area (PUA). Here are the details on
how we map these characters:
https://daffodil.apache.org/infoset/#xml-illegal-characters
This means that when 0x00 appears in an infoset, we map it to 0xE000 in XML.
Validation regexes must use these PUA characters as well, since those are the
characters that validators see. For example, your schema might look something
like this:
  <element name="field" dfdl:lengthKind="explicit" dfdl:length="{ ../length }">
    <simpleType>
      <restriction base="xs:string">
        <pattern value="&#xE000;*" />
      </restriction>
    </simpleType>
  </element>
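To make the effect of the mapping concrete, here is a rough Python sketch of what the pattern check amounts to once NUL has been remapped. It is only an illustration, not Daffodil code: the helper names to_xml_text and is_all_nul are made up, and the helper remaps NUL alone rather than the full set of illegal characters described at the link above.

```python
import re

NUL = "\x00"
PUA_NUL = "\uE000"  # where 0x00 lands once it is projected to XML

def to_xml_text(value: str) -> str:
    """Hypothetical stand-in for the mapping: remaps NUL only."""
    return value.replace(NUL, PUA_NUL)

# Rough equivalent of a pattern facet matching zero or more U+E000 characters.
# XSD patterns must match the whole value, hence fullmatch below.
ALL_NUL = re.compile("\uE000*")

def is_all_nul(value: str) -> bool:
    """True if the value consists solely of NUL characters (or is empty)."""
    return ALL_NUL.fullmatch(to_xml_text(value)) is not None
```

A pattern written against 0x00 itself would never match, because the validator only ever sees the remapped character.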
On 2025-11-26 03:59 PM, Mark Kozak wrote:
Hi Folks,
I am reaching out after trying a dozen approaches today.
My goal is to validate that a string, whose length is given as an expression from a
previous field, contains only the Null character (hex 00). I have tried a variety
of regular expression patterns which didn’t work. Just setting the entity to be
Nil would be great, but I can’t because when dfdl:nilKind is literalCharacter
the entity must be fixed-length.
Looking for a clever solution please.
Thanks, and have a great Thanksgiving Holiday!
-Mark
Mark Kozak
Director of Engineering
Adeptus Cyber Solutions
Adeptus-CS.com