mbeckerle commented on a change in pull request #431:
URL: https://github.com/apache/incubator-daffodil/pull/431#discussion_r510419029
##########
File path:
daffodil-lib/src/main/scala/org/apache/daffodil/api/ValidationMode.scala
##########
@@ -26,4 +26,8 @@ object ValidationMode extends Enum {
case object Off extends Type(10)
case object Limited extends Type(20)
case object Full extends Type(30)
+
+ // todo;; revisit;; passing the validator like this for backwards compat
with existing
Review comment:
comment no longer relevant. You are passing the validator as the arg to
Custom, which keeps this type safe.
##########
File path:
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/DataProcessor.scala
##########
@@ -419,14 +427,16 @@ class DataProcessor private (
// events are created rather than writing the entire infoset in memory and
// then validating at the end of the parse. See DAFFODIL-2386
//
- val (outputter, maybeValidationBytes) =
- if (validationMode == ValidationMode.Full) {
- val bos = new java.io.ByteArrayOutputStream()
- val xmlOutputter = new XMLTextInfosetOutputter(bos, false)
- val teeOutputter = new TeeInfosetOutputter(output, xmlOutputter)
- (teeOutputter, One(bos))
- } else {
- (output, Nope)
+ val (outputter, maybeValidationBytes) = {
+ validationMode match {
+ case ValidationMode.Full | ValidationMode.Custom(_) =>
+ val bos = new java.io.ByteArrayOutputStream()
+ val xmlOutputter = new XMLTextInfosetOutputter(bos, false)
+ val teeOutputter = new TeeInfosetOutputter(output, xmlOutputter)
+ (teeOutputter, One(bos))
Review comment:
Ok, so this is going to buffer up the data for validation. That's fine.
This is going to get really interesting when we start using the forthcoming
SAX APIs. Then we could, in principle, feed SAX events to the validator while
the parsing is going on. Validation errors might be reported incrementally
even, before parsing is complete.
##########
File path:
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/DataProcessor.scala
##########
@@ -692,40 +699,35 @@ class DataProcessor private (
class ParseResult(dp: DataProcessor, override val resultState: PState)
extends DFDL.ParseResult
with WithDiagnosticsImpl
- with ErrorHandler {
Review comment:
Or any validator right? A non-default one also has to provide an error
handler, yes?
##########
File path:
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/DataProcessor.scala
##########
@@ -692,40 +699,35 @@ class DataProcessor private (
class ParseResult(dp: DataProcessor, override val resultState: PState)
extends DFDL.ParseResult
with WithDiagnosticsImpl
- with ErrorHandler {
+ with Logging {
/**
- * To be successful here, we need to capture xerces parse/validation
+ * To be successful here, we need to capture parse/validation
* errors and add them to the Diagnostics list in the PState.
*
- * @param state the initial parse state.
+ * @param bytes the parsed Infoset
*/
def validateResult(bytes: Array[Byte]): Unit = {
Assert.usage(resultState.processorStatus eq Success)
- val schemaURIStrings =
resultState.infoset.asInstanceOf[InfosetElement].runtimeData.schemaURIStringsForFullValidation
- try {
- val bis = new java.io.ByteArrayInputStream(bytes)
- Validator.validateXMLSources(schemaURIStrings, bis, this)
- } catch {
- //
- // Some SAX Parse errors are thrown even if you specify an error handler
to the
- // validator.
- //
- // So we also need this catch
- //
- case e: SAXException =>
- resultState.validationErrorNoContext(e)
+
+
+ val v = dp.validationMode match {
+ case ValidationMode.Custom(cv) => cv
+ case _ =>
+ /// todo;; potential corner case (or not corner?) of needing this
dynamic value to configure a validator
+ ///
resultState.infoset.asInstanceOf[InfosetElement].runtimeData.schemaURIStringsForFullValidation.mkString(",")
+ val v =
ConfigValueFactory.fromAnyRef(resultState.infoset.asInstanceOf[InfosetElement].runtimeData.schemaURIStringsForFullValidation)
+ Validators.default.make(
+ Some(ConfigFactory.parseMap(Map(Validators.default.name() ->
v).asJava))
Review comment:
Yeah, I've been unhappy with this InfosetElement vs. DIElement thing in
Daffodil since shortly after putting it in. Not clear having that distinction
adds any real value. The notion was that InfosetElement could be available on
the API, and would expose only things API users need. But we end up downcasting
them to DIElement all the time.
But I don't think we need the downcast here. So I think you can shorten:
```
resultState.infoset.asInstanceOf[InfosetElement]....
```
to drop the asInstanceOf. The PState.infoset is of type DIElement which isA
InfosetElement already
##########
File path:
daffodil-lib/src/main/scala/org/apache/daffodil/util/DefaultValidator.scala
##########
@@ -100,3 +111,35 @@ object Validator extends NoBindingFactoryAdapter {
}
}
+class DefaultValidatorFactory extends ValidatorFactory {
+ def name(): String = "default" /// todo;; or should it be "xerces"
+
+ // to not muck around too much with the original logic above just wrapping
access to the
Review comment:
This prevents the default validator from being a good motivation for the
factory API pattern here. Instead you end up wondering why the factory idiom is
here at all if the factory is just going to do nothing.
I think the xerces default validator - all the cache complexity goes away if
you actually implement the factory pattern.
##########
File path: daffodil-lib/src/main/scala/org/apache/daffodil/api/Validator.scala
##########
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.daffodil.api
+
+import com.typesafe.config.Config
+
+import scala.xml.SAXException
+
+trait Validator {
+ def validateXML(document: java.io.InputStream): ValidationResult
+}
+
+trait ValidatorFactory {
+ def name(): String
+ def make(config: Option[Config]): Validator
+}
+
+case class ValidationResult(warnings: Seq[ValidationWarning], errors:
Seq[ValidationError]) {
+ def failed: Boolean = errors.nonEmpty
+}
+
+
+trait ValidationError {
+ def message: String
+ def toThrowable: Throwable
+}
+trait ValidationWarning
+// todo;; can validator return other metadata?
+//trait ValidationReport
+
+
+// todo;; can DRY this up after sanity check
+case class SaxValidationError(e: SAXException) extends ValidationError {
+ def message: String = e.getMessage
+ def toThrowable: Throwable = e
Review comment:
Typically what you have as "def toThrowable" would be "def cause". Java
Exception objects have two members, a message and a cause. One of the two, or
both, have to be defined. (Typically one of the two.) You're not extending
java's Exception class here. I think probably you should be.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]