[ 
https://issues.apache.org/jira/browse/DAFFODIL-2896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17886200#comment-17886200
 ] 

Steve Lawrence commented on DAFFODIL-2896:
------------------------------------------

I'm in favor of deprecating the "full" behavior. I'm not sure we really need to 
do double validation. And I think it actually makes testing harder because it's 
not clear where your validation messages are coming from. If we are testing 
validation, we probably want to explicitly test that a message came from Daffodil or 
Xerces, and not just that something found a problem somewhere.

I wonder if we should also deprecate "limited" and change it to "daffodil"? I 
feel like we always have to describe what the difference between "limited" and 
"full" is. If they were called "daffodil" and "xerces" it would probably be 
more clear.

Or maybe we should also avoid using specific library names so that we can 
change them in the future? "daffodil" is probably fine instead of "limited", but 
maybe "xerces" wants to be "xsd"?

Regarding pluggability, it would be really nice if we could just do

{code}
daffodil parse --validate xsd -s foo.dfdl.xsd
{code}

And SPI would look up and find the "xsd" validator. One issue with this is we 
need to provide the main schema to Xerces. Right now we do that inside Daffodil 
by hard coding things. If we wanted to go full SPI and not special case 
anything we would need a way to provide that to Xerces. Or require something 
similar to what we do for schematron, e.g.

{code}
daffodil parse --validate xsd=foo.dfdl.xsd -s foo.dfdl.xsd
{code}

That's convenient if you want to validate with a different schema than the one 
used to parse (needed for things like stringAsXml), but it's kind of a pain for 
most other cases since you have to duplicate the DFDL schema in the args. Maybe 
all validation factories (even schematron) are passed the root DFDL schema and 
can just choose to use it or ignore it? That could actually be useful for DFDL 
files with embedded schematron. Right now, schematron embedded in DFDL must be 
done like

{code}
daffodil parse --validate schematron=foo.dfdl.xsd -s foo.dfdl.xsd
{code}

Maybe
{code}
daffodil parse --validate schematron -s foo.dfdl.xsd
{code}
could default to assuming the schematron rules are embedded in the DFDL schema.
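
To make the defaulting idea a bit more concrete, here is a rough sketch of how a "--validate" value could be split into a validator name and an optional argument, falling back to the root DFDL schema given with "-s" when no argument is supplied. This is not Daffodil code: it is written in Python only for brevity, and the names ValidatorSpec and parse_validate_arg are made up for illustration.

```python
from typing import NamedTuple


class ValidatorSpec(NamedTuple):
    """Hypothetical result of parsing one --validate value."""
    name: str       # validator name to look up, e.g. "xsd" or "schematron"
    argument: str   # file handed to the validator factory
    explicit: bool  # True if the user wrote NAME=ARG themselves


def parse_validate_arg(value: str, root_schema: str) -> ValidatorSpec:
    """Split a --validate value of the form NAME or NAME=ARG.

    When no ARG is given, the root DFDL schema (the -s argument) is used,
    so the factory can decide whether to use it or ignore it.
    """
    name, sep, argument = value.partition("=")
    if not name:
        raise ValueError(f"missing validator name in {value!r}")
    if sep and not argument:
        raise ValueError(f"missing argument after '=' in {value!r}")
    if sep:
        return ValidatorSpec(name, argument, True)
    return ValidatorSpec(name, root_schema, False)
```

With this shape, "--validate xsd -s foo.dfdl.xsd" and "--validate xsd=foo.dfdl.xsd -s foo.dfdl.xsd" end up handing the same file to the validator, and only the stringAsXml-style cases need the longer form.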

Another thought: if we really want support for "full", we could always extend 
things to support a list of validators. For example, "--validate full" could 
be done like:

{code}
daffodil parse --validate xsd --validate daffodil -s foo.dfdl.xsd
{code}

It's a bit verbose, but that case is probably rare, so it's probably fine. And 
I'm not sure we'd ever need it, but the current CLI syntax could extend to that 
pretty easily.
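
As a sketch of what a list of validators could buy us for testing, the snippet below runs each requested validator in order and tags every message with the name of the validator that produced it, so a test can assert on the source as well as the text. Everything here is an assumption for illustration: the registry is a plain dict standing in for an SPI lookup, the two validators are toy stand-ins rather than Daffodil or Xerces, and it is in Python only to keep it short.

```python
from typing import Callable, Dict, List, Tuple

# A validator takes the infoset (just a string here) and returns its messages.
Validator = Callable[[str], List[str]]


def run_validators(
    names: List[str], registry: Dict[str, Validator], infoset: str
) -> List[Tuple[str, str]]:
    """Run the named validators in order, tagging each message with its source."""
    tagged: List[Tuple[str, str]] = []
    for name in names:
        if name not in registry:
            raise KeyError(f"unknown validator: {name}")
        for message in registry[name](infoset):
            tagged.append((name, message))
    return tagged


# Toy stand-ins for the real validators; they only look for the word "bad".
def toy_daffodil(infoset: str) -> List[str]:
    return ["facet check failed"] if "bad" in infoset else []


def toy_xsd(infoset: str) -> List[str]:
    return ["schema check failed"] if "bad" in infoset else []


# Hypothetical registry; with SPI this lookup would be done by name instead.
REGISTRY: Dict[str, Validator] = {"daffodil": toy_daffodil, "xsd": toy_xsd}
```

Calling run_validators(["xsd", "daffodil"], REGISTRY, data) mirrors "--validate xsd --validate daffodil", and a test that only cares about one validator can filter on the first element of each pair.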

> validationMode=full enables Daffodils limited validation
> --------------------------------------------------------
>
>                 Key: DAFFODIL-2896
>                 URL: https://issues.apache.org/jira/browse/DAFFODIL-2896
>             Project: Daffodil
>          Issue Type: Bug
>          Components: Back End
>            Reporter: Steve Lawrence
>            Assignee: Olabusayo Kilo
>            Priority: Major
>             Fix For: 3.7.0
>
>
> Daffodil has three validation modes: off, limited, and full. "Limited" 
> enables Daffodils internal validation during parsing and "full" enables 
> Xerces validation at the end of parsing. However, "full" also enables 
> Daffodils limited validation, so we incur extra overhead with full validation.
> We should change full validation so it only does Xerces validation. Note that 
> this change is not backwards compatible and could potentially break tests 
> that use full validation but specifically look for Daffodil validation 
> messages.
> Alternatively, we may want to consider adding a new validation mode that only 
> enables xerces. Then "full" could keep its current behavior of validating 
> with Daffodil and Xerces. This has the added benefit that we can get complete 
> test coverage of both validation mechanisms without having to duplicate tests.



