This is correct. The way Daffodil currently implements full validation (Xerces) and custom validation (e.g. Schematron) is fairly inefficient. We create two infosets: one of the kind the user passed to the parse function, and one that is XML text written to an in-memory ByteArrayOutputStream and used internally for validation once the parse has completed. We do not currently stream validation.

If you wanted streaming, you would probably need to create a custom InfosetOutputter, or maybe use the SAXInfosetOutputter with an XMLReader that chains/tees SAX events to custom Schematron validation.
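
A rough sketch of the "tee" idea, using only the JDK's SAX classes, might look like the following. Everything here is illustrative: TeeFilter, EmptyItemRule, and TeeDemo are hypothetical names, the JDK SAX parser stands in for whatever XMLReader Daffodil's SAX support would supply, and the "item must not be empty" rule is a toy stand-in for a real Schematron check. Schematron implementations are commonly XSLT-based and may need the whole document, so this approach only suits rules that can be evaluated incrementally.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

/** Hypothetical filter: copies each SAX event to a side handler, then passes it downstream. */
class TeeFilter extends XMLFilterImpl {
    private final ContentHandler side;

    TeeFilter(XMLReader parent, ContentHandler side) {
        super(parent);
        this.side = side;
    }

    @Override public void startDocument() throws SAXException {
        side.startDocument();
        super.startDocument();
    }

    @Override public void endDocument() throws SAXException {
        side.endDocument();
        super.endDocument();
    }

    @Override public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        side.startElement(uri, localName, qName, atts);
        super.startElement(uri, localName, qName, atts);
    }

    @Override public void endElement(String uri, String localName, String qName)
            throws SAXException {
        side.endElement(uri, localName, qName);
        super.endElement(uri, localName, qName);
    }

    @Override public void characters(char[] ch, int start, int length) throws SAXException {
        side.characters(ch, start, length);
        super.characters(ch, start, length);
    }
}

/** Toy streaming rule (not Schematron): every <item> must contain non-blank text. */
class EmptyItemRule extends DefaultHandler {
    final List<String> failures = new ArrayList<>();
    private StringBuilder text;   // non-null only while inside an <item>
    private int itemIndex = 0;

    @Override public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("item".equals(localName)) {
            itemIndex++;
            text = new StringBuilder();
        }
    }

    @Override public void characters(char[] ch, int start, int length) {
        if (text != null) {
            text.append(ch, start, length);
        }
    }

    @Override public void endElement(String uri, String localName, String qName) {
        if ("item".equals(localName)) {
            if (text.toString().trim().isEmpty()) {
                failures.add("item #" + itemIndex + " is empty");
            }
            text = null;   // only per-item state is kept, never the whole document
        }
    }
}

class TeeDemo {
    /** Parses the XML once, feeding the rule checker and the normal consumer in the same pass. */
    static List<String> validate(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        // Assumption: a plain JDK parser stands in for the event source here.
        XMLReader source = factory.newSAXParser().getXMLReader();

        EmptyItemRule rule = new EmptyItemRule();
        TeeFilter tee = new TeeFilter(source, rule);
        tee.setContentHandler(new DefaultHandler());   // placeholder for the application's handler
        tee.parse(new InputSource(new StringReader(xml)));
        return rule.failures;
    }
}
```

Calling TeeDemo.validate on a document with one blank and one self-closing item reports two failures, while the downstream handler still receives every event unchanged. The point of the design is that the checker holds only the state its rule needs, rather than a second copy of the infoset.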

- Steve

On 2023-07-22 03:29 AM, Claude Mamo wrote:
Spotted this code, so presumably it's not streaming when custom or full validation is in force: https://github.com/apache/daffodil/blob/main/daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/DataProcessor.scala#L345-L356

Claude

On Sat, Jul 22, 2023 at 8:07 AM Claude Mamo <claude.m...@gmail.com> wrote:

    Hello Daffodil team,

    I'm looking into adding support for Schematron validation since we
    have had many Smooks developers asking for better validation of
    EDIFACT documents. One question I have is whether Schematron
    validation is applied in a streaming fashion. I mean, does Daffodil
    load the whole infoset into memory before applying the Schematron
    rules or is Schematron validating on the fly while accumulating any
    state that is required to be able to evaluate the rules?

    Thanks,

    Claude

