Re: DFDL as a filter rule language?

Mike Beckerle Thu, 07 Jul 2022 08:07:47 -0700

Yes,

In numerous cases, data security filtering is easily expressed as
additional tests on the data done in the DFDL schema.

The simplest example is a large data format with dozens of message types,
where a cybersecurity fence may want to allow only specific message types
through.
By making the DFDL schema contain only those messages, and leaving out the
rest of the large data format, the schema implicitly rejects all the
non-authorized message types.

This comes at the cost of flexibility: it is hard to quickly enable
another message type, because the schema for it isn't even available.

I prefer the notion of the DFDL schema being only about the format, with
other concerns being expressed separately, such as in Schematron, which
runs after parsing is complete.
One can do "rules" similarly in DFDL by having a pile of assertions at the
end of the parse, so the message is fully parsed and then its contents are
subjected to additional "rule" assertions.
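
As a rough illustration of that parse-first, rules-second ordering, here is
a minimal Python sketch (not DFDL) using Roger's car example. Everything in
it is hypothetical: the column layout, the parse_cars and release helpers,
and the reading of the rule as "every car must have year > 2000". Plain CSV
parsing stands in for the DFDL parse.

```python
import csv
import io

FIELDS = ["make", "model", "year", "cost"]  # hypothetical column layout


def parse_cars(text):
    """Stand-in for the format parse: return a list of car dicts or raise ValueError."""
    cars = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        if len(row) != len(FIELDS):
            raise ValueError(f"malformed record: {row!r}")
        car = dict(zip(FIELDS, (cell.strip() for cell in row)))
        car["year"] = int(car["year"])  # raises ValueError if not an integer
        cars.append(car)
    return cars


# Rules are evaluated only after the whole instance has been parsed.
# Assumption: the filter rule means *every* car must have year > 2000.
RULES = [
    ("every car has year > 2000", lambda cars: all(c["year"] > 2000 for c in cars)),
]


def release(text):
    """Return the parsed instance if it parses and passes every rule, else None."""
    try:
        cars = parse_cars(text)
    except ValueError:
        return None  # not well-formed: a format problem, not a rule violation
    for _name, rule in RULES:
        if not rule(cars):
            return None  # well-formed, but blocked by a filter rule
    return cars
```

Keeping RULES apart from parse_cars mirrors the separation of format from
filtering: a rule can be added or changed without touching the parser.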

This position of the rules in the schema is important. You can't just
throw a DFDL assert into a DFDL schema to "fail" wherever you detect data
you want to block, because whether that failure is masked by backtracking
depends on the context. But you can put a bunch of DFDL asserts on the
"top level" element, which are then applied to the entire parsed message.
At that point they are very much like Schematron rules. They use the DFDL
expression language, not the Schematron expression language, but can
still express many things.
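
To make the backtracking concern concrete, here is a toy Python model (not
DFDL, and not how Daffodil is implemented). It assumes a hypothetical choice
between a "car" branch and a catch-all "opaque" branch; all names are made
up for illustration. A blocking check placed inside the car branch just
makes that branch fail, so the choice falls through to the catch-all and the
data gets through anyway. The same check applied to the finished message has
no alternative left to fall back to.

```python
class ParseError(Exception):
    """A parse or assertion failure; an enclosing choice may backtrack past it."""


def parse_car(line, embedded_assert=False):
    """Branch 1: a 'make,model,year' record, optionally with an embedded blocking check."""
    parts = [p.strip() for p in line.split(",")]
    if len(parts) != 3 or not parts[2].isdigit():
        raise ParseError("not a car record")
    car = {"kind": "car", "make": parts[0], "model": parts[1], "year": int(parts[2])}
    if embedded_assert and car["year"] <= 2000:
        raise ParseError("blocked: year <= 2000")  # only fails this branch
    return car


def parse_opaque(line):
    """Branch 2: a catch-all that accepts any line as uninterpreted text."""
    return {"kind": "opaque", "text": line}


def parse_choice(line, embedded_assert):
    """Try the branches in order; a failure in one backtracks to the next."""
    branches = (lambda l: parse_car(l, embedded_assert), parse_opaque)
    for branch in branches:
        try:
            return branch(line)
        except ParseError:
            continue
    raise ParseError("no branch matched")


def parse_message(line, embedded_assert=False, top_level_assert=False):
    """Parse the whole message, then optionally apply the rule at the top level."""
    msg = parse_choice(line, embedded_assert)
    if top_level_assert and msg["kind"] == "car" and msg["year"] <= 2000:
        raise ParseError("blocked: year <= 2000")  # nothing left to backtrack to
    return msg
```

With embedded_assert=True, the line "Ford,Pinto,1975" comes back as an
opaque message rather than being rejected; with top_level_assert=True the
same line raises ParseError.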

Daffodil allows user-defined functions to be added so that DFDL schemas can
call those functions in expressions. This would allow more complex data
filtering, such as filtering by geographic regions of interest, to be
expressed.

On Thu, Jul 7, 2022 at 7:14 AM Roger L Costello <coste...@mitre.org> wrote:

> Hi Folks,
>
> Has anyone used DFDL as a filter rule language?
>
> For an example of what I mean by "filter rule language", suppose we have a
> comma-separated value (CSV) data format where instances contain data about
> cars (make, model, year, cost, etc.) and we have this filter rule
>
>         Release a CSV instance only if it contains data about cars
>         with year greater than 2000. Otherwise, delete the instance.
>
> It seems like DFDL/Daffodil should be able to express that filter rule.
> Yes? No?
>
> /Roger
>
>
