On Fri, Jan 5, 2024 at 12:22 PM Roger L Costello <coste...@mitre.org> wrote:

> [True/False] DFDL is intended for specifying the form of the input, e.g.,
> “this field is a two text digit unsigned integer.”
>
>
>
True


> [True/False] DFDL is not a validation language, e.g., don’t use DFDL to
> specify that the field’s value must be between 00 and 44.
>
>
False. We use XSD facets for this in DFDL v1.0, but if one invented a
non-XSD-based syntax for DFDL, it would include a way to specify these same
constraints.
But such an implementation would still want a way to disable enforcement of
the validation checks, and a validation error would still not be a parse
error that causes backtracking or failure to parse.


>
>
> [True/False] If you want to validate the input data, do it in a downstream
> activity, after DFDL parsing.
>

False. The line between well-formed and valid is sometimes grey. One
person's well-formedness is another's validity.

Consider that original JPEG DFDL schema you created years ago. The DFDL
part of that would accept line noise and say it was well formed, because it
just treated the file as an unstructured bag of JPEG segments, one of which
is a BLOB segment. The Schematron rules that enforced the segment ordering
and nesting structure were needed, not for data validation, but to tell you
if the data was even well formed. So in that case, the schematron
processing was really part of the parsing process.

False. For performance reasons, it is pretty important to handle the data
once if possible. Why not check the ranges of values at the point where you
create that value? To do otherwise means another whole pass over the data.

True. Some kinds of validity checking cannot be done as part of the natural
traversal of data done by a parser/unparser. E.g., verifying that a key
field is unique.
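
As a rough illustration of why uniqueness is different, here is a small
Python sketch of such a downstream check. It assumes the parsed records are
available as dictionaries and that the key field is called "id"; both are
assumptions made for the example, as is the function name duplicate_keys.
The check needs to see every record before it can answer, which is why it
does not fit the field-by-field traversal.

```python
from collections import Counter


def duplicate_keys(records, key="id"):
    # Count every key value across the whole record set; no single record
    # can be judged valid or invalid on its own.
    counts = Counter(rec[key] for rec in records)
    # Report each key value that occurs more than once.
    return sorted(k for k, n in counts.items() if n > 1)
```

For example, duplicate_keys([{"id": 1}, {"id": 2}, {"id": 1}]) reports that
key 1 is duplicated.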


>
>
> [True/False] Design DFDL to be liberal in what input it accepts. Accept
> input as long as it is well-formed, e.g., accept 87 because it is a two
> text digit unsigned integer.
>

False - a DFDL schema should reject malformed data. It should NOT reject
invalid data.

True - in this case you are separating well-formedness from validity. The
judgement here is: if the value is 87, which is out of range, it's still
of interest to examine the data. Maybe the spec is wrong and the de facto
data in the world doesn't respect the range bound. You want to see this (so
complete the parse of the data), or you can't even observe this phenomenon.
