mbeckerle commented on a change in pull request #452:
URL: https://github.com/apache/incubator-daffodil/pull/452#discussion_r528877817



##########
File path: daffodil-schematron/src/main/resources/iso-schematron-xslt2/schematron-skeleton-api.htm
##########
@@ -0,0 +1,757 @@
+<!--

Review comment:
       Do we need this file in our Schematron embodiment? Could we instead 
just reference the ISO Schematron documentation at some external location?

##########
File path: daffodil-schematron/src/main/resources/iso-schematron-xslt2/ExtractSchFromXSD-2.xsl
##########
@@ -0,0 +1,128 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+
+    Licensed to the Apache Software Foundation (ASF) under one or more
+    contributor license agreements.  See the NOTICE file distributed with
+    this work for additional information regarding copyright ownership.
+    The ASF licenses this file to You under the Apache License, Version 2.0
+    (the "License"); you may not use this file except in compliance with
+    the License.  You may obtain a copy of the License at
+
+         http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+
+-->
+<!--
+    Extract embedded Schematron schemas in W3C XML Schemas schemas    

Review comment:
       This makes me think the implementation can handle Schematron rules 
embedded in the DFDL schema, but I didn't see tests for this. 
   
   That's OK for now, but centralizing the Schematron rules in the DFDL schema 
is very (very) desirable in many cases, so we should add tests that illustrate 
how it works. If for some reason it does not work, add a JIRA ticket for 
activating/fixing that feature. 
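
   To make "embedded" concrete, here is a minimal sketch of what such a test 
input might look like, assuming the rules sit in `xs:appinfo` as `sch:pattern` 
elements in the ISO Schematron namespace. `EmbeddedSchematron`, `sample`, and 
`patternIds` are hypothetical names for illustration; this does not use 
ExtractSchFromXSD-2.xsl or any Daffodil API, it only locates the embedded 
patterns with the JDK DOM parser.

```scala
import java.io.ByteArrayInputStream
import java.nio.charset.StandardCharsets
import javax.xml.parsers.DocumentBuilderFactory
import org.w3c.dom.Element

// Hypothetical helper, not part of this PR.
object EmbeddedSchematron {
  val XsdNs = "http://www.w3.org/2001/XMLSchema"
  val SchNs = "http://purl.oclc.org/dsdl/schematron"

  // A schema with one Schematron pattern embedded in an annotation.
  val sample: String =
    """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
      |           xmlns:sch="http://purl.oclc.org/dsdl/schematron">
      |  <xs:element name="root" type="xs:string">
      |    <xs:annotation>
      |      <xs:appinfo>
      |        <sch:pattern id="root-not-empty">
      |          <sch:rule context="root">
      |            <sch:assert test="string-length(.) > 0">root must not be empty</sch:assert>
      |          </sch:rule>
      |        </sch:pattern>
      |      </xs:appinfo>
      |    </xs:annotation>
      |  </xs:element>
      |</xs:schema>""".stripMargin

  // Collects the id of every sch:pattern found inside an xs:appinfo element.
  def patternIds(xsd: String): Seq[String] = {
    val dbf = DocumentBuilderFactory.newInstance()
    dbf.setNamespaceAware(true) // required for the namespace-based lookups below
    val doc = dbf.newDocumentBuilder().parse(
      new ByteArrayInputStream(xsd.getBytes(StandardCharsets.UTF_8)))
    val appinfos = doc.getElementsByTagNameNS(XsdNs, "appinfo")
    for {
      i <- 0 until appinfos.getLength
      patterns = appinfos.item(i).asInstanceOf[Element]
        .getElementsByTagNameNS(SchNs, "pattern")
      j <- 0 until patterns.getLength
    } yield patterns.item(j).asInstanceOf[Element].getAttribute("id")
  }
}
```

   A real test would presumably feed a schema like `sample` through the 
extraction stylesheet and then through the validator; the helper above only 
checks that the embedded rules are where we expect them to be.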

##########
File path: daffodil-schematron/src/main/resources/iso-schematron-xslt2/readme.txt
##########
@@ -0,0 +1,100 @@
+<h1>ISO SCHEMATRON 2010</h1>

Review comment:
       In the context of Daffodil, this file is a bit misleading. I understand 
including it purely for completeness (including all of it makes clear that no 
subsetting is going on). Is that the rationale for including it? 
   
   If so, I suggest that daffodil-schematron/README.md explain that 
*everything* is included without exception, rather than picking and choosing 
just what is required. 

##########
File path: daffodil-schematron/src/main/scala/org/apache/daffodil/validation/Schematron.scala
##########
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.daffodil.validation
+
+import java.io.InputStream
+import java.io.StringWriter
+
+import javax.xml.parsers.ParserConfigurationException
+import javax.xml.parsers.SAXParserFactory
+import javax.xml.transform.Source
+import javax.xml.transform.Templates
+import javax.xml.transform.TransformerFactory
+import javax.xml.transform.URIResolver
+import javax.xml.transform.dom.DOMResult
+import javax.xml.transform.dom.DOMSource
+import javax.xml.transform.sax.SAXSource
+import javax.xml.transform.stream.StreamResult
+import javax.xml.transform.stream.StreamSource
+import org.apache.daffodil.api.ValidatorInitializationException
+import org.xml.sax.InputSource
+import org.xml.sax.SAXException
+import org.xml.sax.XMLReader
+
+
+/**
+ * Schematron engine implementation
+ */
+object Schematron {
+  val templatesRootDir = "iso-schematron-xslt2"
+  private val templatesPipeline = Array("iso_dsdl_include.xsl",
+                                        "iso_abstract_expand.xsl",
+                                        "iso_svrl_for_xslt2.xsl")
+
+  def fromRules(rules: Templates) = new Schematron(xmlReader.get(), rules)
+
+  def templatesFor(sch: InputStream, tf: TransformerFactory): Templates = tf.newTemplates(
+    templatesPipeline.foldLeft(new StreamSource(sch): Source) {
+      (source, template) =>
+        val xsl = getClass.getClassLoader.getResourceAsStream(s"$templatesRootDir/$template")
+        val result: DOMResult = new DOMResult
+        tf.newTransformer(new StreamSource(xsl)).transform(source, result)
+        new DOMSource(result.getNode)
+    }
+  )
+
+  def isoTemplateResolver(child: Option[URIResolver]) = new ClassPathUriResolver(Schematron.templatesRootDir, child)
+
+  // reduce overhead by caching the xml reader, but the SAXParser class is not thread safe so use a thread local
+  private val xmlReader = new ThreadLocal[XMLReader] {
+    override def initialValue(): XMLReader = {
+      val fac = SAXParserFactory.newInstance
+      try {
+        fac.setFeature(javax.xml.XMLConstants.FEATURE_SECURE_PROCESSING, true)
+        fac.setFeature("http://xml.org/sax/features/external-general-entities", false)
+        fac.setFeature("http://xml.org/sax/features/external-parameter-entities", false)
+        fac.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false)
+      } catch {
+        case ex@(_: ParserConfigurationException | _: SAXException) =>
+          throw ValidatorInitializationException(s"Error setting feature on parser: ${ex.getMessage}")
+      }
+      fac.setValidating(false)
+      fac.newSAXParser.getXMLReader
+    }
+  }
+}
+
+final class Schematron private(reader: XMLReader, templates: Templates) {
+  private lazy val transformer = templates.newTransformer
+
+  def validate(is: InputStream): String = {
+    val writer = new StringWriter
+    transformer.transform(new SAXSource(reader, new InputSource(is)), new StreamResult(writer))

Review comment:
       Sweet. All the overhead is compile/startup/first-use time, and at parse 
time we do this one call, then extract diagnostics from its output. Nice. 
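
   For reference, the "extract diagnostics" step could look roughly like the 
sketch below, assuming the string returned by `validate` is an SVRL report 
(which is what the `iso_svrl_for_xslt2.xsl` stage suggests). `SvrlDiagnostics` 
and `failedAsserts` are hypothetical names, not code from this PR.

```scala
import java.io.ByteArrayInputStream
import java.nio.charset.StandardCharsets
import javax.xml.parsers.DocumentBuilderFactory

// Hypothetical helper, not part of this PR.
object SvrlDiagnostics {
  // Namespace of the Schematron Validation Report Language (SVRL).
  val SvrlNs = "http://purl.oclc.org/dsdl/svrl"

  // Returns the trimmed text of every svrl:failed-assert in the report.
  def failedAsserts(svrl: String): Seq[String] = {
    val dbf = DocumentBuilderFactory.newInstance()
    dbf.setNamespaceAware(true) // required for the namespace-based lookup below
    val doc = dbf.newDocumentBuilder().parse(
      new ByteArrayInputStream(svrl.getBytes(StandardCharsets.UTF_8)))
    val nodes = doc.getElementsByTagNameNS(SvrlNs, "failed-assert")
    (0 until nodes.getLength).map(i => nodes.item(i).getTextContent.trim)
  }
}
```

   An empty result would mean the document passed every assertion; each entry 
otherwise maps naturally onto one validation diagnostic.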

##########
File path: daffodil-schematron/README.md
##########
@@ -0,0 +1,48 @@
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+Schematron Validation
+===
+
+Daffodil Validator for Schematron
+
+XSLT implementation adapted from the [Camel Schematron Component](https://github.com/apache/camel/tree/master/components/camel-schematron).

Review comment:
       Adapted in what way?
   There are files below (such as a readme.txt) that may not be meaningful in 
the Daffodil context. Were they included by mistake, or is there an "include 
everything" approach here, where we modify the tree from Camel only minimally 
so as to facilitate, for example, diffs/merges of fixes from Camel to here, or 
vice versa?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

